I checked the code and found that the root cause is in DumpMerger.enqueueFromDump().

I created a JIRA, KYLIN-2926
<https://issues.apache.org/jira/browse/KYLIN-2926>, to track the bug.
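
For anyone following along, here is a rough sketch (plain Python, not Kylin code) of why the first result quoted below cannot be explained by HLL approximation error alone. It assumes hllc15 means HyperLogLog with 2^15 registers and uses the textbook standard-error estimate 1.04/sqrt(m); the helper names hll_standard_error and is_plausible are made up for illustration. The check itself is simply that count(distinct x) can never truly exceed count(x), apart from estimator noise.

```python
import math


def hll_standard_error(precision: int) -> float:
    """Textbook relative standard error of HyperLogLog with 2**precision registers."""
    return 1.04 / math.sqrt(2 ** precision)


def is_plausible(distinct_estimate: int, row_count: int,
                 precision: int, sigmas: float = 5.0) -> bool:
    """A true distinct count cannot exceed the row count; allow only HLL noise on top."""
    tolerance = sigmas * hll_standard_error(precision)
    return distinct_estimate <= row_count * (1 + tolerance)


# (distinct estimate, plain count) pairs quoted in this thread:
# first for uid, then for buyer. Precision 15 is an assumption about hllc15.
without_filter = [(193794, 92), (42043, 92)]
with_filter = [(77, 92), (11, 92)]

print(f"relative standard error: {hll_standard_error(15):.2%}")
print([is_plausible(d, c, 15) for d, c in without_filter])
print([is_plausible(d, c, 15) for d, c in with_filter])
```

Under these assumptions the expected error is well below one percent, so both distinct counts in the unfiltered query are far outside anything HLL noise could produce, while the filtered query passes the check. That is consistent with the data being wrong before it reaches the query layer.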



2017-10-09 10:37 GMT+08:00 yu feng <olaptes...@gmail.com>:

> The cube is using hllc15. We are tracing the code to try to find the
> reason.
>
> 2017-10-08 14:52 GMT+08:00 Li Yang <liy...@apache.org>:
>
>> Interesting... is it HLL count distinct or bitmap count distinct?
>>
>> On Wed, Sep 27, 2017 at 11:19 AM, yu feng <olaptes...@gmail.com> wrote:
>>
>>> I added some logging and found that the data from HBase is incorrect.
>>>
>>> 2017-09-27 11:17 GMT+08:00 yu feng <olaptes...@gmail.com>:
>>>
>>>> I have a cube like this:
>>>> dimensions: source_type, source_id, name, dt
>>>> measures: count(distinct uid), count(1), count(distinct buyer)
>>>>
>>>> I ran this query:
>>>>
>>>> select source_type, source_id, name,
>>>>     count(distinct uid), count(uid) as cnum,
>>>>     count(distinct buyer) as buyerNum,
>>>>     count(buyer) as bnum
>>>> from
>>>>     table_name
>>>> where
>>>>     dt between '2017-06-01' and '2017-09-18'
>>>>     and source_id is not null
>>>>     and source_type is not null
>>>> group by
>>>>     source_type, source_id, name
>>>> order by buyerNum desc limit 1 offset 0
>>>>
>>>> It returns:
>>>>
>>>> source_type | source_id | name | count(distinct uid) | cnum | buyerNum | bnum
>>>> mv | 423031 | 起点‧终站 | 193794 | 92 | 42043 | 92
>>>>
>>>> Obviously this result is wrong: count(distinct uid) cannot be larger
>>>> than count(uid). I then queried that source_id directly, like this:
>>>>
>>>> select source_type, source_id, name,
>>>>     count(distinct uid), count(uid) as cnum,
>>>>     count(distinct buyer) as buyerNum,
>>>>     count(buyer) as bnum
>>>> from
>>>>     vip_buying_funnel_cube_view
>>>> where
>>>>     dt between '2017-06-01' and '2017-09-18'
>>>>     and source_id is not null
>>>>     and source_type is not null
>>>>     and source_id = '423031'
>>>> group by
>>>>     source_type, source_id, name
>>>> order by buyerNum desc limit 1 offset 0
>>>>
>>>> This time the result is correct:
>>>>
>>>> source_type | source_id | name | count(distinct uid) | cnum | buyerNum | bnum
>>>> mv | 423031 | 起点‧终站 | 77 | 92 | 11 | 92
>>>>
>>>
>>>
>>
>
