[ 
https://issues.apache.org/jira/browse/HIVE-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248622#comment-14248622
 ] 

Brock Noland commented on HIVE-9124:
------------------------------------

About 5.6 minutes are spent in split generation ({{getSplits}} in {{CombineHiveInputFormat}}):

{noformat}
2014-12-16 12:06:30,757 INFO  [sparkDriver-akka.actor.default-dispatcher-15]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418749551759 end=1418749590756 duration=38997 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2014-12-16 12:13:39,512 INFO  [sparkDriver-akka.actor.default-dispatcher-15]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418749978847 end=1418750019512 duration=40665 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2014-12-16 12:14:52,475 INFO  [sparkDriver-akka.actor.default-dispatcher-15]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418750020483 end=1418750092475 duration=71992 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2014-12-16 12:16:19,405 INFO  [sparkDriver-akka.actor.default-dispatcher-15]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418750094132 end=1418750179405 duration=85273 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2014-12-16 12:18:42,716 INFO  [sparkDriver-akka.actor.default-dispatcher-15]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418750181259 end=1418750322716 duration=141457 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2014-12-16 12:45:13,361 INFO  [sparkDriver-akka.actor.default-dispatcher-3]: 
log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits 
start=1418751829900 end=1418751913361 duration=83461 
from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
{noformat}
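
The arithmetic behind that figure can be checked with a short script. This is only a sketch under my own assumptions: the record strings below are the PERFLOG fields copied from the excerpt above (timestamps and thread names dropped), and the names {{PERFLOG_LINES}}, {{parse}} and {{total_ms}} are made up for illustration. I am also assuming that the 5.6 minutes refers to the four {{getSplits}} calls that ran between the two stage submissions mentioned in the issue description (12:06:30 and 12:18:42); the comment itself does not say which calls were counted.

```python
import re

# PERFLOG records from the log excerpt above, one per getSplits call.
# Timestamps and thread names are dropped; the fields are copied verbatim.
PERFLOG_LINES = [
    "</PERFLOG method=getSplits start=1418749551759 end=1418749590756 duration=38997 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "</PERFLOG method=getSplits start=1418749978847 end=1418750019512 duration=40665 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "</PERFLOG method=getSplits start=1418750020483 end=1418750092475 duration=71992 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "</PERFLOG method=getSplits start=1418750094132 end=1418750179405 duration=85273 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "</PERFLOG method=getSplits start=1418750181259 end=1418750322716 duration=141457 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "</PERFLOG method=getSplits start=1418751829900 end=1418751913361 duration=83461 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
]

RECORD = re.compile(r"</PERFLOG method=(\w+) start=(\d+) end=(\d+) duration=(\d+)")


def parse(lines):
    """Return (start_ms, end_ms, duration_ms) for every getSplits record."""
    records = []
    for line in lines:
        match = RECORD.search(line)
        if match and match.group(1) == "getSplits":
            start, end, duration = (int(g) for g in match.groups()[1:])
            records.append((start, end, duration))
    return records


def total_ms(records, window_start, window_end):
    """Sum the durations of calls that lie entirely inside the window."""
    return sum(d for s, e, d in records if s >= window_start and e <= window_end)


records = parse(PERFLOG_LINES)

# Assumed window: from the end of the first call (logged at 12:06:30) to the
# end of the fifth call (logged at 12:18:42). These match the two stage
# submission times in the issue description, but the pairing is my guess.
gap_start = records[0][1]
gap_end = records[4][1]

gap_ms = total_ms(records, gap_start, gap_end)
all_ms = sum(d for _, _, d in records)

print(f"in window: {gap_ms} ms = {gap_ms / 60000:.2f} min")
print(f"all six:   {all_ms} ms = {all_ms / 60000:.2f} min")
```

With the numbers above, the four calls inside the window add up to 339387 ms (about 5.66 minutes), while all six calls together come to 461845 ms (about 7.70 minutes). Every {{duration}} also equals {{end - start}}, so the records are internally consistent.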

> Performance of query 28 from tpc-ds
> -----------------------------------
>
>                 Key: HIVE-9124
>                 URL: https://issues.apache.org/jira/browse/HIVE-9124
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Brock Noland
>         Attachments: Screen Shot 2014-12-16 at 9.30.41 AM.png, 
> query28-explain.txt
>
>
> As you can see from the attached screenshot, one stage was submitted at 
> {{2014/12/16 12:06:30}} and took 6 minutes (ending around 12:12). However, the 
> next stage was not submitted until {{2014/12/16 12:18:42}}. We should 
> understand:
> * What is going on in the meantime
> * Why it is taking so long
> {noformat}
> select  *
> from (select avg(ss_list_price) B1_LP
>             ,count(ss_list_price) B1_CNT
>             ,count(distinct ss_list_price) B1_CNTD
>       from store_sales
>       where ss_quantity between 0 and 5
>         and (ss_list_price between 11 and 11+10 
>              or ss_coupon_amt between 460 and 460+1000
>              or ss_wholesale_cost between 14 and 14+20)) B1,
>      (select avg(ss_list_price) B2_LP
>             ,count(ss_list_price) B2_CNT
>             ,count(distinct ss_list_price) B2_CNTD
>       from store_sales
>       where ss_quantity between 6 and 10
>         and (ss_list_price between 91 and 91+10
>           or ss_coupon_amt between 1430 and 1430+1000
>           or ss_wholesale_cost between 32 and 32+20)) B2,
>      (select avg(ss_list_price) B3_LP
>             ,count(ss_list_price) B3_CNT
>             ,count(distinct ss_list_price) B3_CNTD
>       from store_sales
>       where ss_quantity between 11 and 15
>         and (ss_list_price between 66 and 66+10
>           or ss_coupon_amt between 920 and 920+1000
>           or ss_wholesale_cost between 4 and 4+20)) B3,
>      (select avg(ss_list_price) B4_LP
>             ,count(ss_list_price) B4_CNT
>             ,count(distinct ss_list_price) B4_CNTD
>       from store_sales
>       where ss_quantity between 16 and 20
>         and (ss_list_price between 142 and 142+10
>           or ss_coupon_amt between 3054 and 3054+1000
>           or ss_wholesale_cost between 80 and 80+20)) B4,
>      (select avg(ss_list_price) B5_LP
>             ,count(ss_list_price) B5_CNT
>             ,count(distinct ss_list_price) B5_CNTD
>       from store_sales
>       where ss_quantity between 21 and 25
>         and (ss_list_price between 135 and 135+10
>           or ss_coupon_amt between 14180 and 14180+1000
>           or ss_wholesale_cost between 38 and 38+20)) B5,
>      (select avg(ss_list_price) B6_LP
>             ,count(ss_list_price) B6_CNT
>             ,count(distinct ss_list_price) B6_CNTD
>       from store_sales
>       where ss_quantity between 26 and 30
>         and (ss_list_price between 28 and 28+10
>           or ss_coupon_amt between 2513 and 2513+1000
>           or ss_wholesale_cost between 42 and 42+20)) B6
> limit 100
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
