[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207016#comment-15207016
 ] 

Sergey Shelukhin commented on HIVE-13226:
-

Is it possible to rename "Start" and "Finish" to something less confusing? 
Perhaps "DAG startup" and "DAG runtime"?

> Improve tez print summary to print query execution breakdown
> 
>
> Key: HIVE-13226
> URL: https://issues.apache.org/jira/browse/HIVE-13226
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Fix For: 2.1.0
>
> Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, HIVE-13226.3.patch, sampleoutput.png
>
>
> When the Tez print summary is enabled, a summary of methods is printed, which 
> is difficult to correlate with the actual execution time. We can improve that 
> by printing the execution times in the sequence of operations that happen 
> behind the scenes.
> Instead of printing the method names, it would be useful to print something 
> like the following:
> 1) Query Compilation time
> 2) Query Submit to DAG Submit time
> 3) DAG Submit to DAG Accept time
> 4) DAG Accept to DAG Start time
> 5) DAG Start to DAG End time
> With this it will be easier to find out where the actual time is spent. 
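
The five-step breakdown quoted above can be sketched as differences between consecutive event timestamps. This is only an illustration, not the code in the attached patches: the class name `QueryExecutionBreakdown`, its methods, the assumption that compilation ends exactly at query submit, and the sample values are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Hive's implementation: converts ordered event
// timestamps into the five phase durations listed in the issue description.
class QueryExecutionBreakdown {

    // Phase labels taken from the issue description.
    static final String[] PHASES = {
        "Query Compilation",
        "Query Submit to DAG Submit",
        "DAG Submit to DAG Accept",
        "DAG Accept to DAG Start",
        "DAG Start to DAG End"
    };

    // timestampsMs holds six assumed event times in order: compile start,
    // query submit, DAG submit, DAG accept, DAG start, DAG end.
    static Map<String, Long> breakdown(long[] timestampsMs) {
        if (timestampsMs.length != PHASES.length + 1) {
            throw new IllegalArgumentException(
                "expected " + (PHASES.length + 1) + " timestamps");
        }
        // LinkedHashMap keeps the phases in execution order when printed.
        Map<String, Long> result = new LinkedHashMap<>();
        for (int i = 0; i < PHASES.length; i++) {
            long elapsed = timestampsMs[i + 1] - timestampsMs[i];
            if (elapsed < 0) {
                throw new IllegalArgumentException("timestamps must be non-decreasing");
            }
            result.put(PHASES[i], elapsed);
        }
        return result;
    }

    // Renders one line per phase, with the duration in seconds.
    static String format(Map<String, Long> phases) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : phases.entrySet()) {
            sb.append(String.format("%-28s %8.2fs%n", e.getKey(), e.getValue() / 1000.0));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up timestamps in milliseconds, for illustration only.
        long[] sample = {0L, 1200L, 1500L, 1550L, 2000L, 9000L};
        System.out.print(format(breakdown(sample)));
    }
}
```

Because each phase is the difference between adjacent timestamps, the durations sum to the total wall-clock time, so no interval is left unaccounted for.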



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195839#comment-15195839
 ] 

Gopal V commented on HIVE-13226:


LGTM - +1.



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194446#comment-15194446
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

Test failures are unrelated. [~gopalv] Can you please review the latest patch? 



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192098#comment-15192098
 ] 

Hive QA commented on HIVE-13226:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792967/HIVE-13226.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9732 tests executed
*Failed tests:*
{noformat}
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7251/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7251/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7251/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792967 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184681#comment-15184681
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

Sounds good. Updated patch.



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184675#comment-15184675
 ] 

Gopal V commented on HIVE-13226:


Compile Query
Prepare Plan
Submit Plan
Start 
Finish 

?



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184674#comment-15184674
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

At least we don't have space constraints here :)



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184667#comment-15184667
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

The problem is that, from the user's perspective, printing methods is not 
really helpful. "Analyze", for example, has no context. It is also a 
combination of semantic analysis, logical optimization, and task compilation. 
It also misses some steps in between that would be useful for finding where 
time is spent. For example, the time between TezBuildDag and 
TezSubmitToRunningDag is not accounted for; that is the time taken for 
resource localization, session restart, etc. 
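
To make the unaccounted interval concrete, here is a rough sketch that reports the gap between consecutive measured method spans. Only the names TezBuildDag and TezSubmitToRunningDag come from this discussion; the `UnaccountedTime` class, its `Span` type, and the timestamps in the usage note are hypothetical and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: finds time that falls between two measured method
// spans and is therefore missing from a per-method summary.
class UnaccountedTime {

    // One measured method span, with start and end times in milliseconds.
    static final class Span {
        final String name;
        final long startMs;
        final long endMs;

        Span(String name, long startMs, long endMs) {
            this.name = name;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    // Describes each positive gap between consecutive spans.
    // Assumes the list is already sorted by start time.
    static List<String> gaps(List<Span> spans) {
        List<String> out = new ArrayList<>();
        for (int i = 1; i < spans.size(); i++) {
            Span prev = spans.get(i - 1);
            Span next = spans.get(i);
            long gap = next.startMs - prev.endMs;
            if (gap > 0) {
                out.add(prev.name + " -> " + next.name + ": " + gap + " ms unaccounted");
            }
        }
        return out;
    }

    // Sums all positive gaps between consecutive spans.
    static long totalGapMs(List<Span> spans) {
        long total = 0;
        for (int i = 1; i < spans.size(); i++) {
            total += Math.max(0, spans.get(i).startMs - spans.get(i - 1).endMs);
        }
        return total;
    }
}
```

With made-up spans where TezBuildDag ends at 300 ms and TezSubmitToRunningDag starts at 2300 ms, `gaps` reports a 2000 ms interval that a per-method summary would not show.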

"DAG Submit to DAG Accept" -> "DAG Submit to Accept".. is that any better?



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184653#comment-15184653
 ] 

Gopal V commented on HIVE-13226:


Agree that printing the methods is sub-optimal, but we're dealing with computer 
science's #1 problem now - naming things.

Too much use of the word "DAG" :)



[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-08 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184640#comment-15184640
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

[~gopalv] Could you please take a look?
