[ https://issues.apache.org/jira/browse/GRIFFIN-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
William Guo closed GRIFFIN-164.
-------------------------------

> Make 'Regular expression detection count' available in UI
> ---------------------------------------------------------
>
>                 Key: GRIFFIN-164
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-164
>             Project: Griffin (Incubating)
>          Issue Type: Improvement
>    Affects Versions: 0.1.6-incubating
>            Reporter: Enrico D'Urso
>            Priority: Minor
>             Fix For: 1.0.0-incubating
>
>
> Hi,
> I have been playing with Griffin for one month now.
> In my experience, some companies (including the one I am working for as a
> consultant) prefer doing things through a UI.
> Personally, I find the following feature very useful:
>
> * Regular expression detection count
> That is, I have a column which should contain only numbers, and I want to
> check whether my ETL process has wrongly populated my table with
> non-numeric values.
> I have been able to run such a job by creating the right config.json
> myself, in particular using spark-sql as the dialect:
> {code:java}
> select count(*) from src where account_id rlike '[^0-9]'
> {code}
> I saw that in pr.component.ts there is a commented-out line of code:
> {code:java}
> // {"id":10,"itemName":"Regular Expression Detection Count","category": "Advanced Statistics"}
> {code}
> which I think is what I am talking about.
> Also, I can read:
> {code:java}
> // case 'Regular Expression Detection Count':
> //   return 'count(source.`'+col.name+'`) where source.`'+col.name+'` LIKE ';
> {code}
> which should be the griffin-dsl dialect, although the regex should
> probably be added just after LIKE.
> Then, once the above griffin-dsl statement is available in the backend,
> the ProfilingRulePlanTrans class should map it to the Spark SQL 'rlike'
> clause.
> I am not sure where (or whether) ProfilingRulePlanTrans should be modified,
> as preGroupbyClause should contain everything, but I do not have enough
> knowledge about it.
>
> Please judge for yourself the priority of this feature; for someone who
> knows the code well, it should not be too hard to implement.
> Thanks,
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
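
For readers of the archive, the check requested in the issue above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Griffin code: `regexDetectionCount` and `buildRlikeCountSql` are hypothetical names, and the builder assumes the pattern contains no single quotes or backslashes, since it does no SQL escaping.

```typescript
// Hypothetical helper (not part of Griffin): counts the values in which the
// pattern matches anywhere in the string, which is how `rlike` behaves in
// Spark SQL (a partial match, not a whole-string match).
function regexDetectionCount(values: string[], pattern: string): number {
  const re = new RegExp(pattern); // no "g" flag, so test() keeps no state
  return values.filter((v) => re.test(v)).length;
}

// Hypothetical helper: builds the spark-sql statement quoted in the issue
// for an arbitrary table and column. Assumption: the pattern is simple
// enough to embed directly in a single-quoted SQL string literal.
function buildRlikeCountSql(table: string, column: string, pattern: string): string {
  if (pattern.includes("'") || pattern.includes("\\")) {
    throw new Error("pattern needs escaping, which this sketch does not do");
  }
  return "select count(*) from " + table +
    " where `" + column + "` rlike '" + pattern + "'";
}

// Example: two of these four account ids contain a non-digit character.
const sampleAccountIds: string[] = ["12345", "12a45", "", "99 "];
const nonNumericCount: number = regexDetectionCount(sampleAccountIds, "[^0-9]");
const sqlStatement: string = buildRlikeCountSql("src", "account_id", "[^0-9]");
```

Note that the empty string is not counted, because `[^0-9]` needs at least one non-digit character to match; empty or null values would need a separate check.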