Saturday, December 14, 2013

Types of Data Quality (DQ's) Checks in OFSAA

1 comment

As already discussed in the article Introduction to Data Quality Checks (DQ’s) in OFSAA DQ’s are defined to eliminate expected anomalies from the data. Anomalies can be of several types eg. Null values, blank values, negative values etc. in the data. Such data may not be required as per business. In that case, the anomalies will either take a default value or corresponding rows in Stage tables will error out.
Depending upon the types of anomalies DQ checks are defined in OFSAA. OFSAA provides seven types of Data Quality Checks as explained below:

1.       List of Value Check (LoV Check):   LoV Check in OFSAA checks the column values of the table against static list of data elements defined in the check as per business. If the values are other than the list of values defined in LoV check the data is considered inconsistent. In such case value is either defaulted or error out depending upon the severity of the check.
For eg. LoV check defined for Column A of Table T has list of values as ‘a’, ‘b’ and ‘c’. So when the column A will have value other then ‘a’, ‘b’ and ‘c’, the data will be considered inconsistent.

2.        Null Value Check: Null Check in OFSAA is to avoid null values in the table columns. If business decides columns to have non null values, Null check is defined for such columns. Every time a null value is encountered for such columns, Null check reports ‘Warning’ or ‘Error’ depending upon the severity of the check.

3.       Blank Value Check: Blank Value Check in OFSAA is to avoid blank values in the table columns. Do not confuse Null Check with Blank Check in OFSAA. Null check looks for null or no value whereas Blank Value Check checks whether all the characters of the attribute are white spaces i.e. Blank. Every time a blank value is encountered for such columns, Blank Value Check reports ‘Warning’ or ‘Error’ depending upon the severity of the check.

4.       Referential Integrity Check:  Referential Integrity Check in OFSAA checks the Stage table column values against Dimension tables. It acts as a lookup for the Stage column values in DIM tables. Referential Integrity Check and LoV check serves almost the same purpose in OFSAA. The difference between the two is that, in case of LoV check we have hardcoded the list of values which we have to look up. However, in case of Referential integrity check we refer to DIM tables for lookup.

5.       Duplicity Check: Duplicity check in OFSAA looks for redundant/duplicate values in the table attributes.

6.       Data Integrity Check: Under the Data Integrity Check in OFSAA we have three sub checks:

a.       Range Check: Range Check in OFSAA defines the range of the data for particular table columns. If data jumps out of the range defined in Range Check, it reports an error or warning depending upon the severity of the check. Suppose, a column is expected to have values between 1 to 100, Range Check will set range between 1 to 100. Values outside this range will be identified inconsistent by Range Check.

b.      Data Length Check:  Data Length Check in OFSAA checks for number of bytes in the data. If data is required to have 15 bytes, Data Length check will set it to 15 bytes. Data having less or greater than 15 bytes will be identified as inconsistent by Data length Check.

c.       Column Reference Check

7.       Custom/Business Check: Business checks in OFSAA are custom data quality checks defined by the user.

1 comment:

  1. Hi,

    Does it make sense to create a referential check (or a LOV check) and a null value check aswell on the same collumn? Or is the NULL check not necessary?

    Regards Michiel