Guangxu Cheng created KYLIN-4567:
------------------------------------

             Summary: Improve TopN merge performance in MR engine
                 Key: KYLIN-4567
                 URL: https://issues.apache.org/jira/browse/KYLIN-4567
             Project: Kylin
          Issue Type: Improvement
          Components: Measure - TopN
            Reporter: Guangxu Cheng
            Assignee: Guangxu Cheng


We have a cube that needs to calculate TopN for 13 columns. The data source 
contains only 500k records, but the cubing job always fails when building the 
base cuboid.

We found that the map task is always killed by the ApplicationMaster due to a 
timeout:

{noformat}
ERROR-[-10001]-[MR]:[Mr Task 
Timeout]:[AttemptID:attempt_1591996262448_229922_m_000000_1 Timed out after 
3600 secs!] ERROR-[-10015]-[MR]:[Container Exit Accidentally]:[Container killed 
by the ApplicationMaster. Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 ]
{noformat}

The stack trace of the map task is as follows:
{noformat}
"SpillThread" #35 daemon prio=5 os_prio=0 tid=0x00007f9a89771800 nid=0x133a2 
runnable [0x00007f9a56e3f000]
   java.lang.Thread.State: RUNNABLE
        at java.util.LinkedList.toArray(LinkedList.java:1052)
        at java.util.List.sort(List.java:477)
        at java.util.Collections.sort(Collections.java:175)
        at org.apache.kylin.measure.topn.TopNCounter.sortAndRetain(TopNCounter.java:96)
        at org.apache.kylin.measure.topn.TopNCounter.merge(TopNCounter.java:183)
        at org.apache.kylin.measure.topn.TopNAggregator.aggregate(TopNAggregator.java:44)
        at org.apache.kylin.measure.topn.TopNAggregator.aggregate(TopNAggregator.java:27)
        at org.apache.kylin.measure.MeasureAggregators.aggregate(MeasureAggregators.java:83)
        at org.apache.kylin.engine.mr.steps.CuboidReducer.doReduce(CuboidReducer.java:108)
        at org.apache.kylin.engine.mr.steps.CuboidReducer.doReduce(CuboidReducer.java:44)
        at org.apache.kylin.engine.mr.KylinReducer.reduce(KylinReducer.java:77)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1688)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1645)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$900(MapTask.java:884)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1540)
{noformat}

From the stack trace, we found that sorting takes a lot of time: after 
another counter is merged into this counter, this counter has to be re-sorted 
(TopNCounter.merge calls sortAndRetain). Maybe we can reduce the frequency of 
sorting.
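
To illustrate the idea of sorting less often, here is a minimal sketch. It is 
not Kylin's TopNCounter: the class LazyTopN, its methods, and the "twice the 
capacity" threshold are all hypothetical, and it ignores any error compensation 
a real TopN counter may apply during merge. It only shows that merge can 
accumulate counts in a map and defer the sort-and-trim step until the map has 
grown well past its capacity, or until the result is actually read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical simplified counter for illustration; not Kylin's TopNCounter. */
class LazyTopN {
    private final int capacity;
    private final Map<String, Double> counts = new HashMap<>();
    int sortCalls = 0; // exposed only to show how often a sort really happens

    LazyTopN(int capacity) {
        this.capacity = capacity;
    }

    /** Adds a value for one item without sorting. */
    void offer(String item, double value) {
        counts.merge(item, value, Double::sum);
    }

    /** Merges another counter; sorts only when the map is far over capacity. */
    void merge(LazyTopN other) {
        for (Map.Entry<String, Double> e : other.counts.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Double::sum);
        }
        // Assumed threshold: trim only after growing past 2x capacity.
        if (counts.size() > capacity * 2) {
            sortAndRetain();
        }
    }

    /** Sorts by count (descending) and keeps the top `capacity` entries. */
    void sortAndRetain() {
        if (counts.size() <= capacity) {
            return; // nothing to trim, so skip the sort
        }
        List<Map.Entry<String, Double>> kept = sorted().subList(0, capacity);
        Map<String, Double> trimmed = new HashMap<>();
        for (Map.Entry<String, Double> e : kept) {
            trimmed.put(e.getKey(), e.getValue());
        }
        counts.clear();
        counts.putAll(trimmed);
    }

    /** Returns the item names of the top entries, highest count first. */
    List<String> topKeys() {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Double> e : sorted()) {
            if (keys.size() == capacity) {
                break;
            }
            keys.add(e.getKey());
        }
        return keys;
    }

    private List<Map.Entry<String, Double>> sorted() {
        sortCalls++;
        List<Map.Entry<String, Double>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return entries;
    }
}
```

With this shape, a combiner that merges many small counters pays for a sort 
only occasionally instead of once per merge. Whether deferring the trim is 
acceptable for Kylin's TopN accuracy guarantees would still need to be checked.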




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
