[ 
https://issues.apache.org/jira/browse/KYLIN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656295#comment-16656295
 ] 

Shaofeng SHI edited comment on KYLIN-3634 at 10/19/18 5:36 AM:
---------------------------------------------------------------

Hi Bo, thanks for the patch. Do you mean that, "NULL" should not be matched as  
" <> 'abc'" ?

I couldn't understand why the incorrect result is 0 row today; It should be the 
sum of all columns. Please clarify. 


was (Author: shaofengshi):
Hi Bo, thanks for the patch. Do you mean that, "NULL" should not be matched as  
" <> 'abc'" ?

> when filter column has null value may cause incorrect query result
> ------------------------------------------------------------------
>
>                 Key: KYLIN-3634
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3634
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v2.0.0
>            Reporter: WangBo
>            Priority: Major
>         Attachments: 
> 0001-KYLIN-3634-when-filter-column-has-null-value-may-cau.patch
>
>
> h1. Question
> when a column has null value,and using it as a filter column when querying, 
> and the filter value is not exist in the table,this may cause incorrect result
> h1. An Example
> h2. Table A
> the table A has three rows,city column of one row has null value
>  
> ||day||...||city||price||
> |20180101| |null|10|
> |20180101| |beijing|20|
> |20180101| |shanghai|10|
> h2. Query SQL
> select day,sum(price) from a where city <> 'abc' group by day
> h2. Correct Result
> exclude the row contains null city value
> ||day||col||
> |20180101|30|
> h2. InCorrect Result
> resullt 0 rows
> this happens in our production environment,the kylin version is 2.0.0
> h1. Analysis process
> 1,city column dosen't have a value,so the CompareTupleFilter will turn into  
> ConstantTupleFilter(see GTUtil.java)
> 2,if dimensions in the sql dosen't match all the columns using in group 
> by,the  bytesComparator used in hbase aggregation map will only compare the 
> columns using in group by
> 3,when GTAggregateScanner constructs key of aggBufMap,the key may contains 
> null value,because the comparator of aggBufMap only compares group by 
> columns,so the tuple share same group by columns may also share the same keys 
> which contains null value;This may cause kylin server receives tuples 
> contains null value;
> 4,when the code which dynamically generated by calcilte deals tuples using 
> filter,it first judges whether the column is null.Because filter column in 
> the tuple contains null value,so it always return false, no tuples will 
> return.
> h1. Solution
> when the filter column value is a invalid means not in the table,turn the 
> CompareTupleFiter into IS_NOT_NULL filter,instead of ConstantTupleFilter.TURE
>  
> Now I have test the feature in our production environment ;
> test in “mvn test” had passed,but not test in sandbox
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to