We have 30+ million event log rows per day, and the 6 dimensions together
have a cardinality of almost 150,000. We used an HLL measure with a 4.86%
error rate, and queried page by page.
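A rough back-of-the-envelope sketch of why such queries can be memory-hungry (this is an illustration, not Kylin's actual implementation; the function names are made up here, and it assumes HLLC-style sketches with one-byte registers and a 3-sigma error bound): an HLL sketch with p precision bits keeps 2^p registers, so a ~4.86% error bound corresponds to about p = 12, i.e. roughly 4 KB per distinct-count cell held in memory.

```python
import math

def hll_register_count(precision_bits: int) -> int:
    """Number of one-byte registers in an HLL sketch with the given precision."""
    return 2 ** precision_bits

def hll_error_bound(precision_bits: int, sigmas: int = 3) -> float:
    """Approximate relative error bound: sigmas * 1.04 / sqrt(register count)."""
    return sigmas * 1.04 / math.sqrt(hll_register_count(precision_bits))

def query_memory_bytes(cells: int, precision_bits: int) -> int:
    """Rough lower bound on memory if one sketch per scanned cell is held at once."""
    return cells * hll_register_count(precision_bits)

# 12 precision bits -> 4096 registers -> about a 4.88% error bound,
# close to the 4.86% figure mentioned above.
print(round(hll_error_bound(12) * 100, 2))          # roughly 4.88 (%)

# At a scan count in the tens of millions (e.g. 12,306,477 rows),
# 4 KB per sketch adds up to tens of gigabytes.
print(query_memory_bytes(12_306_477, 12) / 2**30)   # roughly 47 (GiB)
```

This is why lowering the HLL precision, or reducing the number of rows each query returns, shrinks the memory footprint so sharply: memory grows linearly in both the row count and 2^p.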

hongbin ma <[email protected]>于2016年8月4日周四 下午11:39写道:

> after you run such a query, check out KYLIN_HOME/logs/kylin.log; there
> should be a snippet like:
>
> 2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7]
> service.QueryService:399 : Scan count for each storageContext: 12306477,
> 2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
> controller.QueryController:197 : Stats of SQL response: isException: false,
> duration: 56152, total scan count 12306477
> 2016-08-04 00:48:32,000 WARN  [http-bio-7070-exec-7]
>
> can you let us know the "Scan count for each storageContext" and the size
> of your query result?
>
> On Thu, Aug 4, 2016 at 2:21 PM, Li Yang <[email protected]> wrote:
>
>> Depending on how many rows and how many count-distinct values are
>> returned, the query may take a lot of memory and become slow.
>>
>> When you say you are querying the UV of a month's data, how many rows do
>> you expect? Also, what's the precision of the HLL measure? Lowering the
>> precision can ease the problem too.
>>
>> On Fri, Jul 29, 2016 at 4:54 PM, 张天生 <[email protected]> wrote:
>>
>>> I'm using Kylin 1.5.2.1. I built a cube for a month's event data of
>>> advertisement impressions/clicks/conversions. It consists of 6 dimensions
>>> and 8 measures, including 2 UV measures computed by COUNT DISTINCT. The
>>> cube size is 2 GB. When I queried the UV measures over a month of data,
>>> memory quickly increased to 30 GB+, and the query was also slow. I don't
>>> know why it occupied so much memory; the cube size is only 2 GB, yet the
>>> in-memory data expanded so much. However, when I executed a similar
>>> simple SUM or COUNT query, it was fast and did not use much memory.
>>>
>>
>>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
>