Hi Team,

We are running a benchmark test of Kylin v2.5 (HBase 1.x build) as part of a PoC.

Here is my cube (pseudo):

*Dimension Table* : D1
*Fact Table* : F1, F2

*Metrics* : SUM(D1.m1), SUM(D2.m2)
*Dimension Columns* -- Normal (D1.d1, D1.d2, D1.d3, F1.a1, F2.b1 )

JOIN (D1.d1 = F1.a1 AND D1.d2 = F2.b1)

When I run a query that matches the cuboids, it runs very fast.
Pseudo example query:

SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
FROM D1
JOIN F1
ON D1.d1 = F1.a1
JOIN F2
ON D1.d2 = F2.b1
GROUP BY d1, d2, d3


But when I add a WHERE clause to the query, the response becomes very slow.
Pseudo example query:

SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
FROM D1
JOIN F1
ON D1.d1 = F1.a1
JOIN F2
ON D1.d2 = F2.b1
*WHERE d3 > 100 AND d3 < 1000*
GROUP BY d1, d2, d3

*In my case, d3 is a high-cardinality dimension that is part of the row key
(a Normal dimension).*

Here are my questions:

1. I installed the Kylin coprocessor
<http://kylin.apache.org/docs20/howto/howto_update_coprocessor.html> before
running the queries. Are Kylin query results filtered on the coprocessor side?

2. How can I find query traces to identify the bottleneck in response time?

3. Even though I have enabled the query cache, it does not seem to be used
when the query runs (even when the same query is run multiple times).

4. Are there any best practices for tuning queries with a WHERE clause?
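
For anyone who wants to compare the two queries above, here is a minimal sketch of how they might be submitted and timed through Kylin's REST query endpoint (POST /kylin/api/query). The host, project name, credentials and helper names are placeholders, not my actual setup. The response fields it reads (duration, totalScanCount, storageCacheUsed, cube) are what I understand the endpoint returns, so please treat them as assumptions.

```python
import base64
import json
import urllib.request

# Placeholder values (assumptions): adjust host and project to your deployment.
KYLIN_URL = "http://localhost:7070/kylin/api/query"
PROJECT = "my_project"


def build_request(sql, user="ADMIN", password="KYLIN",
                  url=KYLIN_URL, project=PROJECT):
    """Build (without sending) a POST request for the Kylin query endpoint."""
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": 50000,
        "acceptPartial": False,
    }
    # HTTP Basic authentication header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize(response):
    """Keep only the fields useful for comparing runs (missing ones become None)."""
    keys = ("duration", "totalScanCount", "storageCacheUsed", "cube")
    return {key: response.get(key) for key in keys}


def run_query(sql):
    """Send the query to Kylin and return the summary fields of the response."""
    with urllib.request.urlopen(build_request(sql)) as resp:
        return summarize(json.load(resp))
```

Calling run_query() once with the plain GROUP BY query and once with the WHERE d3 > 100 AND d3 < 1000 variant, then repeating each call, should show whether the scan count grows with the filter and whether the second run of the same query is served from the cache.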


Thank You,
Shrikant Bang.
