Explain filtering data and validating data
In computer security, there is often known good data: data the developer is completely certain is safe. "Filtering" keeps only input that matches this known safe set. There are also known bad characters: data the developer is certain is unsafe because it can cause code injection and similar attacks. "Encoding" processes content that is about to be used in another application so that any characters which have potentially special meanings to the receiving application are made safe; characters from a typical known safe charset for the particular destination medium are often left as they are.

Validation may be strict (such as rejecting any address that does not have a valid postal code) or fuzzy (such as correcting records that partially match existing, known records). Some data cleansing solutions clean data by cross-checking it against a validated data set, although this strategy has problems of its own.

In industrial process data, other errors include unmodeled plant dynamics such as holdup changes, and other instabilities in plant operations that violate steady-state (algebraic) models. Additional dynamic errors arise when measurements and samples are not taken at the same time, especially laboratory analyses.
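A minimal sketch of filtering versus encoding, assuming a hypothetical username field and HTML as the destination medium; the character pattern and length limit are illustrative choices, not a standard:

```python
import html
import re

# Filtering: accept only input drawn from a known good set.
# The pattern (letters, digits, underscores, max 32 chars) is an
# illustrative whitelist, not a universal rule.
def filter_username(raw: str) -> str:
    if not re.fullmatch(r"[A-Za-z0-9_]{1,32}", raw):
        raise ValueError("username contains characters outside the known good set")
    return raw

# Encoding: make characters that are special to the receiving
# application (here, an HTML renderer) safe before output, while
# leaving the rest of the safe charset untouched.
def encode_for_html(raw: str) -> str:
    return html.escape(raw)  # &, <, >, " and ' become entities
```

Note the division of labor: filtering rejects unsafe input at the boundary, while encoding makes arbitrary content safe for one specific destination.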
Data validation uses routines, often called "validation rules", "validation constraints", or "check routines", that check for the correctness, meaningfulness, and security of data input to the system.
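Such a check routine can be sketched as follows; the record fields and the five-digit postal-code rule are illustrative assumptions rather than any particular framework's API:

```python
import re

# A simple "validation rule" set: strict checks on one input record.
# Field names and rules are hypothetical examples.
def validate_record(record: dict) -> list:
    errors = []
    # Strict rule: postal code must be exactly five digits.
    if not re.fullmatch(r"\d{5}", record.get("postal_code", "")):
        errors.append("postal_code must be exactly five digits")
    # Meaningfulness rule: a name must be present and non-blank.
    if not record.get("name", "").strip():
        errors.append("name must not be empty")
    return errors  # an empty list means the record passed all checks
```

A record failing any rule would typically be rejected at entry, which is the strict behavior described above.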
Data cleansing differs from data validation in that validation almost invariably means data is rejected from the system at entry, and is performed at the time of entry rather than on batches of data.
The actual process of data cleansing may involve removing typographical errors or validating and correcting values against a known list of entities.
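Correcting values against a known list of entities can be sketched with fuzzy matching; the city list and the 0.8 similarity cutoff are illustrative assumptions:

```python
import difflib

# A known list of valid entities to cleanse against (illustrative).
KNOWN_CITIES = ["London", "Lisbon", "Ljubljana"]

def clean_city(raw: str) -> str:
    # Fuzzy correction: replace the value with its closest known
    # entity when the match is strong enough, else leave it as-is.
    matches = difflib.get_close_matches(raw, KNOWN_CITIES, n=1, cutoff=0.8)
    return matches[0] if matches else raw
```

This mirrors the "fuzzy" validation described earlier: a typo close to a known record is corrected, while a value that matches nothing is passed through for human review rather than silently altered.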
Industrial process data validation and reconciliation, or more briefly, data validation and reconciliation (DVR), is a technology that uses process information and mathematical methods in order to automatically correct measurements in industrial processes.
The use of DVR allows accurate and reliable information about the state of industrial processes to be extracted from raw measurement data, producing a single consistent set of data that represents the most likely process operation.
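A minimal sketch of linear data reconciliation for a single steady-state mass balance x1 + x2 = x3, assuming illustrative measurements and variances; the closed-form weighted least-squares update x = m - V Aᵀ (A V Aᵀ)⁻¹ A m for the linear constraint A x = 0 is standard, but the numbers are made up:

```python
# Reconcile three flow measurements subject to x1 + x2 - x3 = 0,
# adjusting each measurement in proportion to its variance.
def reconcile(m, v):
    a = [1.0, 1.0, -1.0]                                # constraint row A
    residual = sum(ai * mi for ai, mi in zip(a, m))     # A m: the imbalance
    s = sum(ai * ai * vi for ai, vi in zip(a, v))       # A V A^T (a scalar here)
    lam = residual / s
    # Corrected values: less precise measurements (larger v) move more.
    return [mi - vi * ai * lam for mi, vi, ai in zip(m, v, a)]

measured = [10.1, 5.2, 14.9]    # raw flows that should balance but do not
variances = [0.25, 0.25, 0.25]  # equal measurement variances (illustrative)
reconciled = reconcile(measured, variances)
# reconciled now satisfies x1 + x2 == x3 exactly
```

The reconciled values are the most likely process state consistent with the balance, in the weighted least-squares sense; with equal variances the 0.4-unit imbalance is spread evenly across the three measurements.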