[ https://issues.apache.org/jira/browse/KYLIN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaofeng SHI closed KYLIN-3634. ------------------------------- “Resolved in release 2.5.1 (2018-11-06)" > When filter column has null value may cause incorrect query result > ------------------------------------------------------------------ > > Key: KYLIN-3634 > URL: https://issues.apache.org/jira/browse/KYLIN-3634 > Project: Kylin > Issue Type: Bug > Components: Query Engine > Affects Versions: v2.0.0 > Reporter: WangBo > Assignee: WangBo > Priority: Major > Fix For: v2.4.2, v2.5.1 > > Attachments: > 0001-KYLIN-3634-when-filter-column-has-null-value-may-cau.patch, > image-2018-10-27-14-11-57-955.png > > > h1. Question > when a column has null value,and using it as a filter column when querying, > and the filter value is not exist in the table,this may cause incorrect result > h1. An Example > h2. Table A > the table A has three rows,city column of one row has null value > > ||day||...||city||price|| > |20180101| |null|10| > |20180101| |beijing|20| > |20180101| |shanghai|10| > h2. Query SQL > select day,sum(price) from a where city <> 'abc' group by day > h2. Correct Result > exclude the row contains null city value > ||day||col|| > |20180101|30| > h2. InCorrect Result > resullt 0 rows > this happens in our production environment,the kylin version is 2.0.0 > h1. Analysis process > 1,city column dosen't have a value,so the CompareTupleFilter will turn into > ConstantTupleFilter(see GTUtil.java) > 2,if dimensions in the sql dosen't match all the columns using in group > by,the bytesComparator used in hbase aggregation map will only compare the > columns using in group by > 3,when GTAggregateScanner constructs key of aggBufMap,the key may contains > null value,because the comparator of aggBufMap only compares group by > columns,so the tuple share same group by columns may also share the same keys > which contains null value;This may cause kylin server receives tuples > contains null value; > 4,when the code which dynamically generated by calcilte deals tuples using > filter,it first judges whether the column is null.Because filter column in > the tuple contains null value,so it always return false, no tuples will > return. > h1. Solution > when the filter column value is a invalid means not in the table,turn the > CompareTupleFiter into IS_NOT_NULL filter,instead of ConstantTupleFilter.TURE > > Now I have test the feature in our production environment ; > test in “mvn test” had passed,but not test in sandbox > -- This message was sent by Atlassian JIRA (v7.6.3#76005)