[ 
https://issues.apache.org/jira/browse/KYLIN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734517#comment-14734517
 ] 

ZhouQianhao commented on KYLIN-941:
-----------------------------------

We ran a small test on a data set of 3M records.
MR took about 6 minutes, while Spark took about 12 minutes.
Since we temporarily cannot access the Spark web UI (due to some firewall 
issues), we are not able to figure out where the bottleneck is.
Our guess is that there is some extra shuffle.
We are trying to tune the performance; once the tuning is complete, we will run 
a benchmark on a larger data set.

> poc and benchmark for cubing on spark
> -------------------------------------
>
>                 Key: KYLIN-941
>                 URL: https://issues.apache.org/jira/browse/KYLIN-941
>             Project: Kylin
>          Issue Type: Sub-task
>          Components: Spark Engine
>            Reporter: ZhouQianhao
>            Assignee: ZhouQianhao
>             Fix For: v2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
