Thank you Ma, for response and making me correct about query filters( WHERE
clause) in Kylin.

I got a an idea that Kylin is using HBase FuzzyFilters
<https://github.com/apache/kylin/blob/master/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/cube/v2/CubeHBaseRPC.java#L101>
while
scanning rows from HBase at coprocessor end.

This also clarified me  importance of order of the dimensions in the
rowkey.

Regards,
Shrikant Bang.


On Wed, Oct 24, 2018 at 7:56 AM Ma Gang <mg4w...@163.com> wrote:

> Hi Shrikant,
> Which Kylin version you used? By default, most of the filters will be
> pushed down to coprocessor, you may grep "Total filtered row:" in your
> log to see how many rows are filtered from coprocessor side.
>
> If you want to change code to add some log there, you can add log at
> method: CubeHBaseEndpointRPC.getGTScanner(final GTScanRequest scanRequest)
> the parameter GTScanRequest has a field:filterPushDown, is the filter
> push down to coprocessor, you may log some information there, but the
> filter is not the original filter you input but the converted filter that
> has constant values encoded.
>
>
> At 2018-10-24 01:00:15, "Shrikant Bang" <b.shrikan...@gmail.com> wrote:
>
> Hi Team,
>
>   I tried debugging code to understand query execution (on Kylin v2.5
> hbase 1.x) . As I understood query is formed and executed using Calcite
> APIs.
>
>   Could you please help me to get answer of below?
>
>   1.  With my understanding default implementation of filter operation
> (WHERE clause) is running on Kylin’s node. Is it correct?
>
>   2.  How to make use of CoprocessorFilter
> <https://github.com/apache/kylin/blob/master/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/common/coprocessor/CoprocessorFilter.java>
> for running filters on HBase side (if it is not enabled by default)?
>
>
> Thank You,
> Shrikant Bang
>
> On Tue, Oct 23, 2018 at 4:33 PM Shrikant Bang <b.shrikan...@gmail.com>
> wrote:
>
>> Thank you Ma for response.
>>
>> I didn't find any logs about applying filters in co-processors. Also I
>> don't see logs for logical/physical query plans in kylin.log. Am I missing
>> anything?
>>
>> I would like to put logger statements in code and build specific
>> module(s) as required for tracing latencies -- would you please suggest any
>> classes which can provide me these details?
>>
>> Regards,
>> Shrikant Bang.
>>
>> On Tue, Oct 23, 2018 at 3:09 PM Ma Gang <mg4w...@163.com> wrote:
>>
>>> You may post your query related log here, there should be some query log
>>> that indicated whether the filter is push down or not, at least in the
>>> returned response stats, there's some log show how many rows are filtered
>>> in the coprocessor side.
>>>
>>> At 2018-10-23 16:58:36, "Sachin Aggarwal" <different.sac...@gmail.com> 
>>> wrote:
>>> >Hi Team,
>>> >
>>> >     Could you please confirm if filtering of rows (WHERE clause) is done
>>> >in CoProcessor side?
>>> >
>>> >     Is there any APIs/logging to get physical plan of query? It will help
>>> >us in optimising the cube.
>>> >
>>> >
>>> >
>>> >On Mon, Oct 22, 2018 at 8:58 PM Shrikant Bang <b.shrikan...@gmail.com>
>>> >wrote:
>>> >
>>> >> Thanks ShaoFeng for response. I will try this and will update the results
>>> >> of my queries.
>>> >>
>>> >> I would like to learn, how to identify the bottleneck in query 
>>> >> executions.
>>> >> Can we trace the query execution in each stage with timestamp?
>>> >>
>>> >> Also is there way we can get physical plan of query? This could help me 
>>> >> to
>>> >> design/tune my cube/queries for better response time.
>>> >>
>>> >> Regards,
>>> >> Shrikant Bang
>>> >>
>>> >> On Mon, Oct 22, 2018 at 8:01 PM ShaoFeng Shi <shaofeng...@apache.org>
>>> >> wrote:
>>> >>
>>> >>> Hi Shrikant,
>>> >>>
>>> >>> What's the order of the dimensions in the rowkey? In this case, you need
>>> >>> to put "d3" at the heading position of the rowkey.
>>> >>>
>>> >>> Here is a good reference on how to design a cube, maybe we need add that
>>> >>> into the FAQ or as a part of document:
>>> >>> https://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
>>> >>>
>>> >>> Shrikant Bang <b.shrikan...@gmail.com> 于2018年10月22日周一 下午3:51写道:
>>> >>>
>>> >>>> Hi Team,
>>> >>>>
>>> >>>> We are working on benchmark test for Kylin v2.5-Hbase-1.x as part of 
>>> >>>> PoC.
>>> >>>>
>>> >>>> Here is my cube (pseudo) :
>>> >>>>
>>> >>>> *Dimension Table* : D1
>>> >>>> *Fact Table* : F1, F2
>>> >>>>
>>> >>>> *Metrics* : SUM(D1.m1), SUM(D2.m2)
>>> >>>> *Dimension Columns* -- Normal (D1.d1, D1.d2, D1.d3, F1.a1, F2.b1 )
>>> >>>>
>>> >>>> JOIN (D1.d1 = F1.a1 AND D2.d2 = F2.b1)
>>> >>>>
>>> >>>> When I run a query matching to the cuboids it runs very fast :
>>> >>>> pseudo example query:
>>> >>>>
>>> >>>> SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
>>> >>>> FROM D1
>>> >>>> JOIN F1
>>> >>>> ON D1.d1 = F1.a1
>>> >>>> JOIN F2
>>> >>>> ON D1.d2 = F2.b1
>>> >>>> GROUP BY d1, d2, d3
>>> >>>>
>>> >>>>
>>> >>>> But when I add where clause to query it become very slow in response
>>> >>>> pseudo example query:
>>> >>>>
>>> >>>> SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
>>> >>>> FROM D1
>>> >>>> JOIN F1
>>> >>>> ON D1.d1 = F1.a1
>>> >>>> JOIN F2
>>> >>>> ON D1.d2 = F2.b1
>>> >>>> *WHERE d3 > 100 AND d3 < 1000*
>>> >>>> GROUP BY d1, d2, d3
>>> >>>>
>>> >>>> *In my case d3 is High Cardinality dimension which is part of row key (
>>> >>>> Normal Dimension ).*
>>> >>>>
>>> >>>> Here are question:
>>> >>>>
>>> >>>> 1. I have installed Kylin Co-Processor
>>> >>>> <http://kylin.apache.org/docs20/howto/howto_update_coprocessor.html> 
>>> >>>> before
>>> >>>> running queries. Do Kylin query results gets filtered Co-Processor end?
>>> >>>>
>>> >>>> 2. How to find query traces to identify the bottleneck in response 
>>> >>>> time?
>>> >>>>
>>> >>>> 3. Even though I have enabled Query Cache, it seems its not getting 
>>> >>>> used
>>> >>>> when query runs ( in case of multiple times also) .
>>> >>>>
>>> >>>> 4. Any best practises to tune the queries with WHERE clause?
>>> >>>>
>>> >>>>
>>> >>>> Thank You,
>>> >>>> Shrikant Bang.
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Best regards,
>>> >>>
>>> >>> Shaofeng Shi 史少锋
>>> >>>
>>> >>>
>>> >
>>> >--
>>> >
>>> >Thanks & Regards
>>> >
>>> >Sachin Aggarwal
>>>
>>>
>>>
>>>
>>>
>>
>
>
>

Reply via email to