[
https://issues.apache.org/jira/browse/KYLIN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901820#comment-14901820
]
Dayue Gao edited comment on KYLIN-1039 at 9/22/15 2:50 AM:
-----------------------------------------------------------
I debugged the second query yesterday; here are some clues I've found.
1. In CubeStorageEngine#search, `findSingleValueColumns` failed to detect
lstg_format_name as a single-value column. This information is used in
`isExactAggregation`. However, since week_beg_dt is a derived dimension,
isExactAggregation is already false. So I suspect that singleValuesD is not the
cause of this problem, although +it's indeed wrong and may lead to problems
in other queries+.
2. The scan range returned by `buildScanRanges` differs between the first and
second query. For the first query, the range includes the filter on
lstg_format_name, while for the second query it includes only the filter
on cal_dt. I'm not sure yet whether this is the root cause.
3. I noticed that in CompareTupleFilter#evaluate, there is a comment "TODO
requires generalize, currently only evaluates COLUMN {op} CONST". Could
someone explain a little more about when evaluate is called?
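To make the concern in point 3 concrete, here is a minimal Python sketch of a filter evaluator that accepts a constant on either side of a comparison, not only COLUMN {op} CONST. It is not Kylin's code: `Col`, `Const`, `Compare`, `Or`, `resolve` and `evaluate` are hypothetical names chosen for illustration, and the sample rows are made up. It only shows that once CONST {op} CONST is handled, `lstg_format_name = 'FP-GTC' or 'a' = 'b'` evaluates row by row exactly like the first branch alone.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Col:
    """Reference to a column by name (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class Const:
    """Literal value (hypothetical type)."""
    value: object


Operand = Union[Col, Const]


@dataclass(frozen=True)
class Compare:
    """Binary comparison; either side may be a column or a constant."""
    op: str
    left: Operand
    right: Operand


@dataclass(frozen=True)
class Or:
    """Logical OR over child filters."""
    children: tuple


OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def resolve(operand, row):
    """Turn an operand into a concrete value for the given row."""
    if isinstance(operand, Col):
        return row[operand.name]
    return operand.value


def evaluate(node, row):
    """Evaluate a filter tree against one row (a dict of column values)."""
    if isinstance(node, Or):
        return any(evaluate(child, row) for child in node.children)
    if isinstance(node, Compare):
        # Both sides go through resolve(), so CONST {op} CONST works too.
        return OPS[node.op](resolve(node.left, row), resolve(node.right, row))
    raise TypeError(f"unsupported filter node: {node!r}")


# The filter from the first query and the OR variant from the second query.
single = Compare("=", Col("lstg_format_name"), Const("FP-GTC"))
with_or = Or((single, Compare("=", Const("a"), Const("b"))))

# Made-up sample rows, for illustration only.
rows = [{"lstg_format_name": "FP-GTC"}, {"lstg_format_name": "Auction"}]
```

For every row, `evaluate(with_or, row)` and `evaluate(single, row)` agree, which is the behavior the two queries are expected to share.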
> Wrong answer to query with filters including OR
> -----------------------------------------------
>
> Key: KYLIN-1039
> URL: https://issues.apache.org/jira/browse/KYLIN-1039
> Project: Kylin
> Issue Type: Bug
> Reporter: Dayue Gao
>
> The following query on test dataset produces a result set containing 5 rows.
> {code:sql}
> select test_cal_dt.week_beg_dt, sum(price) as GMV
> from test_kylin_fact
> inner join edw.test_cal_dt as test_cal_dt
>   on test_kylin_fact.cal_dt = test_cal_dt.cal_dt
> where test_cal_dt.week_beg_dt between DATE '2013-09-01' and DATE '2013-10-01'
>   and lstg_format_name = 'FP-GTC'
> group by test_cal_dt.week_beg_dt
> {code}
> However, if I change the where condition to the following, Kylin produces an
> empty result. This is wrong because `A or false` is just `A`, so the result
> should be the same as for the query above.
> {code:sql}
> where test_cal_dt.week_beg_dt between DATE '2013-09-01' and DATE '2013-10-01'
>   and (lstg_format_name = 'FP-GTC' or 'a' = 'b')
> {code}
> I have tried adding the constant folding rule
> `ReduceExpressionsRule.FILTER_INSTANCE` from Calcite, expecting the planner to
> reduce the second query to the first one, but it didn't work. It seems the rule
> itself has a bug; see https://issues.apache.org/jira/browse/DRILL-2218.
> As a result, we need more investigation to see why it goes wrong and to fix the
> problem.
> BTW, the second query may seem ridiculous at first glance, but in fact this
> kind of query is generated by one of our reporting front-ends.
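The reduction the description expects from constant folding can be sketched in a few lines of Python. This is an illustration only, not Calcite's `ReduceExpressionsRule`: the tuple-based filter representation and the `fold` function are assumptions made for this example, only `=` between two literals is folded, and SQL NULL semantics are ignored.

```python
def fold(node):
    """Fold constant sub-expressions in a filter tree.

    Hypothetical node shapes:
      ("cmp", op, left, right)  with operands ("col", name) or ("lit", value)
      ("and", [children]) / ("or", [children])
      ("const", bool)
    """
    kind = node[0]
    if kind == "cmp":
        _, op, left, right = node
        # Only literal = literal is folded in this sketch.
        if op == "=" and left[0] == "lit" and right[0] == "lit":
            return ("const", left[1] == right[1])
        return node
    if kind in ("and", "or"):
        # TRUE absorbs an OR, FALSE absorbs an AND; the opposite constant
        # is the identity element and can simply be dropped.
        absorbing = kind == "or"
        kept = []
        for child in map(fold, node[1]):
            if child[0] == "const":
                if child[1] == absorbing:
                    return ("const", absorbing)
                continue
            kept.append(child)
        if not kept:
            # Every child was the identity element.
            return ("const", not absorbing)
        return kept[0] if len(kept) == 1 else (kind, kept)
    return node


# Simplified stand-ins for the two where clauses in the description.
date_filter = ("cmp", ">=", ("col", "week_beg_dt"), ("lit", "2013-09-01"))
format_filter = ("cmp", "=", ("col", "lstg_format_name"), ("lit", "FP-GTC"))
always_false = ("cmp", "=", ("lit", "a"), ("lit", "b"))

first = ("and", [date_filter, format_filter])
second = ("and", [date_filter, ("or", [format_filter, always_false])])
```

Here `fold(second)` yields the same tree as `first`, which is the rewrite the planner rule was expected to perform on the second query.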