[
https://issues.apache.org/jira/browse/GRIFFIN-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
William Guo resolved GRIFFIN-289.
---------------------------------
Fix Version/s: 0.6.0
Resolution: Fixed
Issue resolved by pull request 538
[https://github.com/apache/griffin/pull/538]
> new feature for griffin COMPLETENESS dq type
> --------------------------------------------
>
> Key: GRIFFIN-289
> URL: https://issues.apache.org/jira/browse/GRIFFIN-289
> Project: Griffin
> Issue Type: New Feature
> Components: completeness-batch
> Affects Versions: 0.3.1-incubating
> Reporter: Zhao Li
> Priority: Major
> Fix For: 0.6.0
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> Hello
>
> We currently use the Griffin measure module to check batch data quality. For
> the COMPLETENESS dq type, Griffin counts the incomplete records in a table,
> but it only checks whether a column is null.
>
> However, a null check alone is not enough to decide whether a column value is
> invalid. In our case, analysts may treat other values as invalid even
> though they are not null. For example, for a column named "company", a record
> is invalid if company in ("a", "b", "c").
>
> We need two ways for users to filter incomplete records. One is
> "enumeration": users list all the values they consider invalid for a column.
> The other is "regular expression": users write a regular expression that
> matches the invalid values for a column.
>
> Could Griffin update the COMPLETENESS dq type to support the "enumeration" and
> "regular expression" ways of filtering incomplete records?
>
> Regards
>
> Zhao
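
The two filters requested above can be illustrated with a minimal Python sketch. It is not Griffin's implementation or configuration syntax (Griffin's measure module runs on Spark); the function names `is_incomplete` and `count_incomplete`, their parameters, and the sample records are hypothetical and exist only to show the intended semantics: a record counts as incomplete if the column is null, if its value is in a user-supplied enumeration, or if its value matches a user-supplied regular expression.

```python
import re

# Hypothetical helper, not part of Griffin: decides whether one column value
# should count as incomplete under the rules described in the issue.
def is_incomplete(value, invalid_values=frozenset(), invalid_pattern=None):
    # Existing behaviour: a null value is always incomplete.
    if value is None:
        return True
    # "Enumeration" rule: the value is one of the listed invalid values.
    # (Assumes hashable values such as strings.)
    if value in invalid_values:
        return True
    # "Regular expression" rule: the whole value matches the pattern.
    if invalid_pattern is not None and invalid_pattern.fullmatch(str(value)):
        return True
    return False

# Hypothetical helper: counts incomplete records for one column.
def count_incomplete(records, column, invalid_values=frozenset(), invalid_pattern=None):
    return sum(
        1
        for record in records
        if is_incomplete(record.get(column), invalid_values, invalid_pattern)
    )

# Made-up sample data, using the "company" column from the issue's example.
records = [
    {"company": "a"},
    {"company": "acme"},
    {"company": None},
    {"company": "N/A"},
    {"company": "c"},
]

# Enumeration: "a", "b" and "c" are invalid, in addition to null.
enum_count = count_incomplete(records, "company", invalid_values={"a", "b", "c"})

# Regular expression: "NA" or "N/A" in any letter case is invalid, in addition to null.
regex_count = count_incomplete(
    records, "company", invalid_pattern=re.compile(r"N/?A", re.IGNORECASE)
)

print(enum_count, regex_count)
```

The sketch uses a full match rather than a substring search so that a pattern written for "NA" does not also flag a value such as "NASA"; whether the real feature matches the whole value or any part of it is a design choice that the issue text does not specify.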
--
This message was sent by Atlassian Jira
(v8.3.4#803005)