It seems your query reads a lot of records out of HBase, which is not the
normal case by design. Normally the data should be highly aggregated, so only
a few thousand records are read. I guess the query response time is slow too.
Your cube could use some optimization to match the use case.

For the properties:

kylin.query.scan.threshold=10000000
#default is 3M
- This controls the maximum number of records read from HBase. It is a safety
valve to keep Kylin from being overloaded by bad queries.

kylin.query.mem.budget=64424509440
#default is 3G
- This controls the memory cap each query can use in the query server. The
memory is used for the final aggregation before the result is returned.

kylin.query.cube.visit.timeout.times=3
#default is 1
- A timeout for waiting for the HBase scan to return.
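
For scale, here is a quick sanity check on the values above. It is only a sketch in Python: the `parse_properties` helper is hypothetical and not part of Kylin, and it assumes the budget is expressed in bytes, as the 3G default comment suggests.

```python
# Values copied from the property lines quoted in this mail.
props_text = """
kylin.query.scan.threshold=10000000
kylin.query.mem.budget=64424509440
kylin.query.cube.visit.timeout.times=3
"""

def parse_properties(text):
    """Hypothetical helper: parse simple key=value lines, skipping blanks and # comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result

props = parse_properties(props_text)

GIB = 1024 ** 3  # bytes per GiB (assumes the budget is given in bytes)
mem_budget_gib = int(props["kylin.query.mem.budget"]) / GIB
scan_threshold_millions = int(props["kylin.query.scan.threshold"]) / 1_000_000

print(f"scan threshold: {scan_threshold_millions:g} million records")
print(f"memory budget:  {mem_budget_gib:g} GiB")
```

Under that assumption the memory budget works out to 60 GiB per query, 20 times the 3G default noted in the comment, so it is worth checking that the query server really has that much heap to spare.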

Cheers
Yang


On Tue, Jun 7, 2016 at 10:10 AM, 吴钰彬 <[email protected]> wrote:

> Hi, Liyang
>
> Many thanks for replying to this mail. After looking into the source code,
> I have added the 3 parameters below to kylin.properties, which somewhat
> solves the problem.
>
> Btw, can you explain more about the purpose of the parameters below?
>
> kylin.query.scan.threshold=10000000
> #default is 3M
> kylin.query.mem.budget=64424509440
> #default is 3G
> kylin.query.cube.visit.timeout.times=3
> #default is 1
>
>
>
> B.R
> Austin.Woo
>
> -----Original Message-----
> From: Li Yang [mailto:[email protected]]
> Sent: June 7, 2016 8:55
> To: [email protected]
> Cc: 李欣 <[email protected]>; 亓庆国 <[email protected]>
> Subject: Re: Re: how to extend the threshold for kylin query?(from baixing.com)
>
> Hi Austin,
>
> Note the image didn't get through the mailing list, so it was not displayed.
>
> So we don't quite get your issue yet. Could you try describing it again? You
> can use a file hosting service to share attachments.
>
> Also, it's always better to adopt the latest version. If you are early in
> the pilot stage, the shift should be easy.
>
> Cheers
> Yang
>
> On Tue, May 31, 2016 at 3:43 PM, 吴钰彬 <[email protected]> wrote:
>
> > Hi, Kylin developers
> >
> >
> >
> > We investigated our query further. When I change my SQL query and
> > remove the count(distinct), it works fine, and the query only
> > returns 231 records.
> >
> > Adding the count(distinct) measure should not increase the number of
> > returned records, but I don't know how Kylin calculates
> > count(distinct) internally. It seems it needs to scan many more
> > records to return the result?
> >
> > How can we solve this problem?
> >
> >
> >
> > PS: right now we are using Kylin version 1.3
> >
> >
> >
> >
> >
> >
> >
> > B.R
> >
> > Austin.Woo
> >
> >
> >
> > *From:* 吴钰彬
> > *Sent:* May 31, 2016 10:35
> > *To:* '[email protected]' <[email protected]>
> > *Cc:* '[email protected]'
> > <[email protected]>;
> > 李欣 <[email protected]>; 亓庆国 <[email protected]>
> > *Subject:* how to extend the threshold for kylin query?(from baixing.com)
> >
> >
> >
> > Hi, Kylin developers.
> >
> >
> >
> > This is Austin, DW team lead from Baixing.com. Thanks for reading this
> > mail.
> >
> >
> >
> > Right now we are researching whether to adopt Kylin as our big data
> > query engine for our website click and event data.
> >
> >
> >
> > When we built some cubes and tried to query them, we faced the issue
> > below.
> >
> >
> >
> >
> >
> >
> >
> > But when we check the configuration in /conf/kylin.properties, it is
> > set as below:
> >
> > - Kylin.query.scan.threshold=40000000
> >
> >
> >
> > Can you help advise whether there is anything we are missing here?
> >
> >
> >
> > Looking forward to your reply, and many thanks for your time.
> >
> >
> >
> >
> >
> > B.R
> >
> > Austin.Woo
> >
>
