[ 
https://issues.apache.org/jira/browse/GRIFFIN-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Guo resolved GRIFFIN-289.
---------------------------------
    Fix Version/s: 0.6.0
       Resolution: Fixed

Issue resolved by pull request 538
[https://github.com/apache/griffin/pull/538]

> new feature for griffin COMPLETENESS dq type
> --------------------------------------------
>
>                 Key: GRIFFIN-289
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-289
>             Project: Griffin
>          Issue Type: New Feature
>          Components: completeness-batch
>    Affects Versions: 0.3.1-incubating
>            Reporter: Zhao Li
>            Priority: Major
>             Fix For: 0.6.0
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> Hello
>  
> We currently use the Griffin measure module to check batch data quality. For 
> the COMPLETENESS dq type, Griffin counts how many incomplete records a table 
> contains, but it only checks whether a column is null.
>  
> However, a null check alone is not enough to decide whether a column value is 
> invalid. In our case, analysts may treat other values as invalid even though 
> they are not null. For example, for a column named "company", a record is 
> invalid if company is in ("a", "b", "c").
>  
> We need two ways for users to filter incomplete records. One is 
> "enumeration", where users list every value they consider invalid for a 
> column. The other is "regular expression", where users write a regular 
> expression that matches invalid values for a column.
>  
> Could Griffin update the COMPLETENESS dq type to support the "enumeration" and 
> "regular expression" ways of filtering incomplete records?
>  
> Regards
>  
> Zhao
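
For illustration only, the two filtering modes requested in the description can be sketched in plain Python. This is not Griffin's implementation or its configuration format. The names `is_incomplete` and `count_incomplete`, the sample records, and the `n/?a` pattern are all hypothetical, and the sketch assumes that a regular expression must match the whole value.

```python
import re

def is_incomplete(value, invalid_values=frozenset(), invalid_pattern=None):
    """Return True if a column value should count as incomplete.

    Hypothetical helper: null is always incomplete, and the caller may
    add an enumeration of invalid values and/or a compiled regex.
    """
    if value is None:
        return True
    # "enumeration" mode: the value is one of the listed invalid values.
    if value in invalid_values:
        return True
    # "regular expression" mode: the whole value matches the pattern
    # (full-match semantics are an assumption of this sketch).
    if invalid_pattern is not None and invalid_pattern.fullmatch(str(value)):
        return True
    return False

def count_incomplete(records, column, invalid_values=frozenset(), invalid_pattern=None):
    """Count records whose `column` value is incomplete."""
    return sum(
        1
        for record in records
        if is_incomplete(record.get(column), invalid_values, invalid_pattern)
    )

# Made-up sample data for the "company" column from the description.
records = [
    {"company": "a"},
    {"company": "acme"},
    {"company": None},
    {"company": "N/A"},
    {"company": "b"},
]

# Enumeration: "a" and "b" are listed as invalid, and None is null.
enum_count = count_incomplete(records, "company", invalid_values={"a", "b", "c"})

# Regular expression: "N/A" matches case-insensitively, and None is null.
regex_count = count_incomplete(
    records, "company", invalid_pattern=re.compile(r"(?i)n/?a")
)

print(enum_count, regex_count)  # prints "3 2"
```

Both modes keep the existing null check and only widen the set of values treated as incomplete, so a check that supplies neither option behaves as before.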



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
