I checked the code and found that the root cause is in DumpMerger.enqueueFromDump(). I created a JIRA issue, KYLIN-2926 <https://issues.apache.org/jira/browse/KYLIN-2926>, to track the bug.

2017-10-09 10:37 GMT+08:00 yu feng <olaptes...@gmail.com>:

> The cube is using hllc15. We are tracing the code to try to find the
> reason.
>
> 2017-10-08 14:52 GMT+08:00 Li Yang <liy...@apache.org>:
>
>> Interesting... is it HLL count distinct or bitmap count distinct?
>>
>> On Wed, Sep 27, 2017 at 11:19 AM, yu feng <olaptes...@gmail.com> wrote:
>>
>>> I added some logging and found that the data coming from HBase is incorrect.
>>>
>>> 2017-09-27 11:17 GMT+08:00 yu feng <olaptes...@gmail.com>:
>>>
>>>> I have a cube like this:
>>>> dimensions: source_type, source_id, name, dt
>>>> measures: count(distinct uid), count(1), count(distinct buyer)
>>>>
>>>> I run the query:
>>>>
>>>> select source_type, source_id, name,
>>>> count(distinct uid), count(uid) as cnum, count(distinct buyer) as buyerNum,
>>>> count(buyer) as bnum
>>>> from
>>>> table_name
>>>> where
>>>> dt between '2017-06-01' and '2017-09-18'
>>>> and source_id is not null
>>>> and source_type is not null
>>>> group by
>>>> source_type, source_id, name
>>>> order by buyerNum desc limit 1 offset 0
>>>>
>>>> It returns:
>>>>
>>>> mv
>>>> 423031
>>>> 起点‧终站
>>>> 193794
>>>> 92
>>>> 42043
>>>> 92
>>>>
>>>> Obviously this result is wrong: the distinct counts (193794 and 42043)
>>>> are larger than the row counts (92). I then queried that source_id directly:
>>>>
>>>> select source_type, source_id, name,
>>>> count(distinct uid), count(uid) as cnum, count(distinct buyer) as buyerNum,
>>>> count(buyer) as bnum
>>>> from
>>>> vip_buying_funnel_cube_view
>>>> where
>>>> dt between '2017-06-01' and '2017-09-18'
>>>> and source_id is not null
>>>> and source_type is not null
>>>> and source_id = '423031'
>>>> group by
>>>> source_type, source_id, name
>>>> order by buyerNum desc limit 1 offset 0
>>>>
>>>> The result is correct:
>>>>
>>>> mv
>>>> 423031
>>>> 起点‧终站
>>>> 77
>>>> 92
>>>> 11
>>>> 92