[jira] [Issue Comment Deleted] (KYLIN-2248) TopN merge further optimization after KYLIN-1917

Shaofeng SHI (JIRA) Mon, 05 Dec 2016 01:45:29 -0800

     [ 
https://issues.apache.org/jira/browse/KYLIN-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shaofeng SHI updated KYLIN-2248:
--------------------------------
    Comment: was deleted

(was: Change made in 
https://github.com/apache/kylin/commit/59a30f66d47cc1838e6852405699fd7957bfac29)

> TopN merge further optimization after KYLIN-1917
> ------------------------------------------------
>
>                 Key: KYLIN-2248
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2248
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Shaofeng SHI
>            Assignee: Shaofeng SHI
>             Fix For: v1.6.1
>
>
> After KYLIN-1917, there is still room for performance optimization when 
> building a cube that has a very large number of rows but whose dimensions 
> all have quite small cardinality.
> In that case, a lot of aggregation happens while building the base cuboid, 
> and the reducer comes under heavy CPU pressure. With jstack we observed that 
> the CPU time was spent in TopNCounter.merge(), in the HashMap.get() method.
> {code}
> Thread 28679: (state = IN_JAVA)
>  - java.util.HashMap.getEntry(java.lang.Object) @bci=81, line=465 (Compiled 
> frame; information may be imprecise)
>  - java.util.HashMap.get(java.lang.Object) @bci=11, line=417 (Compiled frame)
>  - 
> org.apache.kylin.measure.topn.TopNCounter.merge(org.apache.kylin.measure.topn.TopNCounter)
>  @bci=117, line=174 (Compiled frame)
>  - 
> org.apache.kylin.measure.topn.TopNAggregator.aggregate(org.apache.kylin.measure.topn.TopNCounter)
>  @bci=38, line=44 (Compiled frame)
>  - org.apache.kylin.measure.topn.TopNAggregator.aggregate(java.lang.Object) 
> @bci=5, line=27 (Compiled frame)
>  - org.apache.kylin.measure.MeasureAggregators.aggregate(java.lang.Object[]) 
> @bci=42, line=76 (Compiled frame)
>  - 
> org.apache.kylin.engine.mr.steps.CuboidReducer.doReduce(org.apache.hadoop.io.Text,
>  java.lang.Iterable, org.apache.hadoop.mapreduce.Reducer$Context) @bci=95, 
> line=97 (Compiled frame)
>  - org.apache.kylin.engine.mr.steps.CuboidReducer.doReduce(java.lang.Object, 
> java.lang.Iterable, org.apache.hadoop.mapreduce.Reducer$Context) @bci=7, 
> line=42 (Interpreted frame)
>  - org.apache.kylin.engine.mr.KylinReducer.reduce(java.lang.Object, 
> java.lang.Iterable, org.apache.hadoop.mapreduce.Reducer$Context) @bci=4, 
> line=40 (Interpreted frame)
>  - 
> org.apache.hadoop.mapreduce.Reducer.run(org.apache.hadoop.mapreduce.Reducer$Context)
>  @bci=22, line=171 (Interpreted frame)
>  - 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(org.apache.hadoop.mapred.JobConf,
>  org.apache.hadoop.mapred.TaskUmbilicalProtocol, 
> org.apache.hadoop.mapred.Task$TaskReporter, 
> org.apache.hadoop.mapred.RawKeyValueIterator, 
> org.apache.hadoop.io.RawComparator, java.lang.Class, java.lang.Class) 
> @bci=119, line=627 (Interpreted frame)
>  - org.apache.hadoop.mapred.ReduceTask.run(org.apache.hadoop.mapred.JobConf, 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=384, line=389 
> (Interpreted frame)
>  - org.apache.hadoop.mapred.YarnChild$2.run() @bci=36, line=164 (Interpreted 
> frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction,
>  java.security.AccessControlContext) @bci=0 (Interpreted frame)
>  - javax.security.auth.Subject.doAs(javax.security.auth.Subject, 
> java.security.PrivilegedExceptionAction) @bci=42, line=415 (Interpreted frame)
>  - 
> org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction)
>  @bci=14, line=1709 (Interpreted frame)
>  - org.apache.hadoop.mapred.YarnChild.main(java.lang.String[]) @bci=514, 
> line=158 (Interpreted frame)
>  
> {code}
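
To illustrate why a merge of this kind can be dominated by hash lookups, here is a minimal sketch. It is not Kylin's TopNCounter: the class TopNMergeSketch and its merge() method are hypothetical, and the counter is reduced to a plain item-to-count map. The assumption is only that merging one counter into another needs a lookup in the target map for every item of the incoming counter, so a reducer that aggregates many rows per key repeats that lookup cost on every merge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a top-N counter: item -> accumulated count.
public class TopNMergeSketch {

    // Merges 'other' into 'target'. Each entry of 'other' costs one hash
    // lookup in 'target'; Map.merge does the lookup and the update in one call
    // instead of a separate get() followed by put().
    public static void merge(Map<String, Double> target, Map<String, Double> other) {
        for (Map.Entry<String, Double> e : other.entrySet()) {
            target.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> a = new HashMap<>();
        a.put("x", 3.0);
        a.put("y", 1.0);

        Map<String, Double> b = new HashMap<>();
        b.put("y", 2.0);
        b.put("z", 5.0);

        merge(a, b);
        System.out.println(a); // iteration order of HashMap is not guaranteed
    }
}
```

With few distinct dimension values and many input rows, a loop like this runs once per incoming record for the same reduce key, which matches the hot frames in the stack trace above.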



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
