[ https://issues.apache.org/jira/browse/KYLIN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhao jintao updated KYLIN-3961:
-------------------------------
    Description: 
Hi Team:
I use the "Top-N" measure to run queries such as "select sum(AAA) from BBB group by CCC,DDD". It performs much better than a cube without "Top-N".
In my system, Kylin takes just 0.2s to answer the query from a cube with the "Top-N" measure; without the "Top-N" measure it can take 10s.
But I find that the Top-N measure can be optimized to reduce errors.
I use the Kylin demo to test "TopN".
I built two cubes using "KYLIN_SALES". The first cube has three dimensions: "SELLER_ID", "BUYER_ID" and "PART_DT", and one measure: "SUM(PRICE)". The second cube has one dimension: "PART_DT", and two measures: "SUM(PRICE)" and "TOPN(10)". For "TOPN(10)", the "ORDER|SUM by Column" is "PRICE", the "Group by Column" is "SELLER_ID" and "BUYER_ID", and the "Return Type" is "Top 10". Then I built the cubes from "2012-01-01" to "2014-01-01".
I ran the same SQL against both cubes and found a large discrepancy between their results.
The top 5 "SUM PRICE" values from the first cube, without "TopN", are "167.7269", "99.9908", "99.9888", "99.9865", "99.978".
The top 5 "SUM PRICE" values from the second cube, with "TopN", are "179.27699...", "167.6320...", "167.3050...", "167.2069...", "166.7429...".
Has anyone met the same problem?

Best regards.

  was:
Hi Team:
I use "Top-N "measure to query such sql "select sum(AAA) from BBB group by CCC,DDD", It is much better than a cube without "Top-N".
In my system, kylin cost just 0.2s to query sql with "Top-N" measure cube; If without "Top-N" measure it may be cost 10s.
But I find that Top-N measure can be optimized to reduce mistaks.
I use kylin demo to test "TopN".
I build two cube using "KYLIN_SALES". The first cube has three dimentions:"SELLER_ID","BUYER_ID" and "PART_DT", has one measures: "SUM(PRICE)" .
The second cube has one dimention:"PART_DT", has twon measures: "SUM(PRICE)" and "TOPN(10)", the "ORDER|SUM by Column" of "TOPN(10)" is "PRICE", the "Group by Column" of “TOPN(10)” is "SELLER_ID" and "BUYER_ID",the "Return Type" of "TOPN(10)" is "Top 10". Then I build cube from "2012-01-01" to "2014-01-01".
I use same sql to query two cube. I find that 2 cubes have a larger error.
The top5 "SUM PRICE" of first cube without "TopN" is "167.7269", "99.9908", "99.9888","99.9865","99.978".
The top5 "SUM PRICE" of second cube with "TopN" is "179.27699...","167.6320...","167.3050...","167.2069...","166.7429...".
Does any one meet same problem?

Best regards.


> Optimize TopN measure merge function to reduce TopNCounter errors
> ---------------------------------------------------------------------
>
>                 Key: KYLIN-3961
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3961
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Measure - TopN
>    Affects Versions: v2.5.2
>         Environment: Huawei FusionInsight
>            Reporter: zhao jintao
>            Assignee: zhao jintao
>            Priority: Major
>              Labels: easyfix
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi Team:
> I use the "Top-N" measure to run queries such as "select sum(AAA) from BBB
> group by CCC,DDD". It performs much better than a cube without "Top-N".
> In my system, Kylin takes just 0.2s to answer the query from a cube with the
> "Top-N" measure; without the "Top-N" measure it can take 10s.
> But I find that the Top-N measure can be optimized to reduce errors.
> I use the Kylin demo to test "TopN".
> I built two cubes using "KYLIN_SALES". The first cube has three dimensions:
> "SELLER_ID", "BUYER_ID" and "PART_DT", and one measure: "SUM(PRICE)". The
> second cube has one dimension: "PART_DT", and two measures: "SUM(PRICE)" and
> "TOPN(10)". For "TOPN(10)", the "ORDER|SUM by Column" is "PRICE", the
> "Group by Column" is "SELLER_ID" and "BUYER_ID", and the "Return Type" is
> "Top 10". Then I built the cubes from "2012-01-01" to "2014-01-01".
> I ran the same SQL against both cubes and found a large discrepancy between
> their results.
> The top 5 "SUM PRICE" values from the first cube, without "TopN", are
> "167.7269", "99.9908", "99.9888", "99.9865", "99.978".
> The top 5 "SUM PRICE" values from the second cube, with "TopN", are
> "179.27699...", "167.6320...", "167.3050...", "167.2069...", "166.7429...".
> Has anyone met the same problem?
>
> Best regards.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)