[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-21 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773830#comment-16773830
 ] 

Shaofeng SHI commented on KYLIN-3322:
-

TopN does need a separate SUM measure to work. We will update the documentation 
to let users know this. 

> TopN requires a SUM to work
> ---
>
> Key: KYLIN-3322
> URL: https://issues.apache.org/jira/browse/KYLIN-3322
> Project: Kylin
>  Issue Type: Bug
>  Components: Measure - TopN
>Reporter: liyang
>Priority: Major
>
> Currently if user creates a measure of TopN seller by sum of price, it is 
> required that user also creates a measure of SUM(price). Otherwise, NPE will 
> be thrown at query time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-22 Thread Vsevolod Ostapenko (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775591#comment-16775591
 ] 

Vsevolod Ostapenko commented on KYLIN-3322:
---

It's also reported as KYLIN-3687. It's good to have the documentation updated, 
but it would be better to prevent the creation of incomplete TopN cube 
definitions through the UI and via API calls.
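
The kind of check suggested above could look roughly like the following sketch. It assumes a simplified, hypothetical measure format (a dict with `expression` and `column` keys) and a hypothetical helper name, `missing_sum_measures`; neither is Kylin's actual cube descriptor format or API.

```python
from typing import Dict, List


def missing_sum_measures(measures: List[Dict[str, str]]) -> List[str]:
    """Return the columns that have a TopN measure but no SUM measure.

    Each measure is a simplified, hypothetical dict such as
    {"expression": "TOP_N", "column": "PRICE"}; this is not Kylin's
    actual cube descriptor format.
    """
    sum_columns = {m["column"] for m in measures if m["expression"] == "SUM"}
    topn_columns = [m["column"] for m in measures if m["expression"] == "TOP_N"]
    # Preserve definition order and report each offending column once.
    missing: List[str] = []
    for column in topn_columns:
        if column not in sum_columns and column not in missing:
            missing.append(column)
    return missing


# A TopN measure on PRICE without SUM(PRICE) is the incomplete case.
incomplete = [{"expression": "TOP_N", "column": "PRICE"}]
complete = incomplete + [{"expression": "SUM", "column": "PRICE"}]
```

A UI or API layer could reject a cube definition whenever this list is non-empty, instead of letting the problem surface as an NPE at query time.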

> TopN requires a SUM to work
> ---
>
> Key: KYLIN-3322
> URL: https://issues.apache.org/jira/browse/KYLIN-3322
> Project: Kylin
>  Issue Type: Bug
>  Components: Measure - TopN
>Reporter: liyang
>Assignee: Na Zhai
>Priority: Major
>
> Currently if user creates a measure of TopN seller by sum of price, it is 
> required that user also creates a measure of SUM(price). Otherwise, NPE will 
> be thrown at query time.





[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-22 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775621#comment-16775621
 ] 

KANG-SEN LU commented on KYLIN-3322:


I am disappointed that the Kylin group is taking this approach to solve this 
problem.

I know that the sample project has only one cube, so 
TOPN(SUM(X), GROUP-BY-B) does come with SUM(X).

But what if a project needs two cubes: one for non-TOPN applications and the 
other purely for TOPN applications? Now we may end up with both cubes 
supporting SUM(X), and a query may be routed to the wrong cube and take a long 
time to execute.

Even worse, suppose a SUM(X) query was supposed to be routed to the non-TOPN 
cube, but now the TOPN cube also satisfies the requirement, so the SUM(X) query 
is routed to the TOPN cube. Inside the TOPN cube, the correct answer would be 
to go after the SUM(X) metric directly. But my experiment seems to suggest that 
Kylin took the path of going after TOPN(SUM(X)) and then summing over dimension 
B, which in general has a huge cardinality, and therefore takes a long time to 
finish.

Now the question is: if you add documentation to inform users that SUM(X) is 
required when TOPN(SUM(X)) is configured, can you make sure the proper metric 
is used when a non-TOPN metric is accessed?

 



[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-23 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776087#comment-16776087
 ] 

Shaofeng SHI commented on KYLIN-3322:
-

Hi Kang-sen, thanks for your feedback!

 

"Inside the TOPN cube, the correct answer would be to go after the SUM(X) 
metric directly. But my experiment seems to suggest that Kylin took the path of 
going after TOPN(SUM(X)) and then summing over dimension B, which in general 
has a huge cardinality, and therefore takes a long time to finish." If this is 
true, then it is a bug and we should fix it.

 

The reason TopN needs a separate SUM measure is that the user's query may not 
include the high-cardinality column (B in your example). In that case, if we 
use TopN to answer, the result will be wrong (because TopN keeps only a limited 
number of B values) and the performance will be bad. If we add a SUM measure, 
there is no such issue: when the user's query includes B, we use TopN to 
answer; if not, we use the accurate SUM measure. Besides, a SUM measure is much 
smaller than a TopN measure, so it adds almost no overhead.
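
To illustrate the accuracy point, here is a small sketch with made-up data. It uses exact truncation to the top N sellers per day as a stand-in for a TopN measure; Kylin's actual TopN data structure is not modelled here, and all names (`rows`, `exact_sum`, `topn_entries`, `sum_from_topn`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (day, seller, price) rows; made-up data for illustration only.
rows: List[Tuple[str, str, int]] = [
    ("d1", "s1", 50), ("d1", "s2", 30), ("d1", "s3", 20),
    ("d2", "s1", 10), ("d2", "s2", 40), ("d2", "s3", 5),
]


def exact_sum(data: List[Tuple[str, str, int]]) -> Dict[str, int]:
    """What a plain SUM(price) measure stores: one accurate total per day."""
    totals: Dict[str, int] = defaultdict(int)
    for day, _seller, price in data:
        totals[day] += price
    return dict(totals)


def topn_entries(data: List[Tuple[str, str, int]],
                 n: int) -> Dict[str, List[Tuple[str, int]]]:
    """Stand-in for a TopN measure: keep only the n largest sellers per day."""
    per_day: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for day, seller, price in data:
        per_day[day][seller] += price
    return {
        day: sorted(sellers.items(), key=lambda kv: kv[1], reverse=True)[:n]
        for day, sellers in per_day.items()
    }


def sum_from_topn(topn: Dict[str, List[Tuple[str, int]]]) -> Dict[str, int]:
    """Answer SUM(price) per day from the truncated entries (undercounts)."""
    return {day: sum(value for _seller, value in entries)
            for day, entries in topn.items()}
```

With N = 2, the sellers dropped from each day's list no longer contribute, so the totals derived from the truncated entries fall short of the accurate ones whenever a day has more than two sellers.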

 

 

Let's work together to make Kylin better!



[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-25 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776848#comment-16776848
 ] 

KANG-SEN LU commented on KYLIN-3322:


Hi, Shaofeng:

 

Thanks for your response. I have two points to add.
 # What if I have already put SUM(X) in a separate cube? Why do I have to add 
SUM(X) to the second cube when I define TOPN(X) there? If it were just 
redundant metadata, I would not complain about the extra human effort. I am 
worried that Kylin may not be able to find the right cube to compute SUM(X), 
because now there are two cubes that are supposedly equally qualified to answer 
the query. This makes Kylin's cost evaluation function more challenging.
 # My experiment seems to suggest that when a SUM(X) query without GROUP BY B 
was issued, the cost evaluation function sent the query to the cube containing 
both TOPN(SUM(X)) and SUM(X) and, more importantly, it went after TOPN(SUM(X)) 
and then performed SUM(X), which took more than 20 seconds in my test case. 
When it goes after SUM(X) directly, it takes less than 0.2 seconds. I think the 
way Kylin computes SUM(X) in a cube containing both TOPN(SUM(X)) and SUM(X) may 
not be correct. That is the main reason I am against the decision that a cube 
containing TOPN(SUM(X)) must also configure SUM(X).

 



[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779016#comment-16779016
 ] 

ASF subversion and git services commented on KYLIN-3322:


Commit 8e16be631ed74a2b31b5936d1266641ed3be98c8 in kylin's branch 
refs/heads/document from GinaZhai
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=8e16be6 ]

KYLIN-3322 TopN requires a SUM to work




[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-27 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779298#comment-16779298
 ] 

Shaofeng SHI commented on KYLIN-3322:
-

Hi Kang-sen, good points.

 

For the first concern, I'm afraid Kylin may not know which is the best cube to 
answer the SUM(X) query; the cost-based selection has much room for improvement.

 

As for the second, KYLIN-2620 is trying to add more checks for TOPN queries. If 
the query is not a TopN query, Kylin will not use the TopN measure to answer it.
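
As a rough illustration of such a check, the sketch below picks a measure based on the shape of the query. The exact conditions KYLIN-2620 adds are not shown in this thread, so the rule used here (the TopN column appears in GROUP BY, the query orders by the sum descending, and a LIMIT is present) is an assumption, and `choose_measure` is a hypothetical name, not Kylin code.

```python
from typing import Optional, Sequence


def choose_measure(group_by: Sequence[str],
                   order_by_sum_desc: bool,
                   limit: Optional[int],
                   topn_column: str) -> str:
    """Pick which precomputed measure should answer a SUM(X) query.

    Assumed rule (not taken from Kylin): only use the TopN measure when the
    query groups by the TopN column, orders by the sum descending, and has
    a LIMIT. Everything else falls back to the accurate SUM measure.
    """
    is_topn_query = (
        topn_column in group_by
        and order_by_sum_desc
        and limit is not None
    )
    return "TOP_N" if is_topn_query else "SUM"
```

Under this rule, a SUM(X) query that does not group by B never touches the TopN measure, which is the behaviour asked for in the second concern.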

 

 

 



[jira] [Commented] (KYLIN-3322) TopN requires a SUM to work

2019-02-27 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779823#comment-16779823
 ] 

KANG-SEN LU commented on KYLIN-3322:


If the second point is properly handled, then the first point is not important. 
Could it be that when Kylin decides to access TOPN(SUM(X), GROUP-BY B), it does 
not check whether the GROUP-BY list in the actual query includes dimension B? 
That would be a serious problem.

I am curious why you are willing to accept this unnecessary requirement that 
SUM(X) must be defined in the same cube in which TOPN(SUM(X)) is configured. I 
suspect there is some code that somehow assumes SUM(X) is a metric when 
processing TOPN(SUM(X)). Maybe it is just too difficult to debug this buggy 
code.

Kang-sen
