[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207016#comment-15207016 ]

Sergey Shelukhin commented on HIVE-13226:
-----------------------------------------

Is it possible to rename "Start" and "Finish" to something less confusing? DAG startup, DAG runtime?

> Improve tez print summary to print query execution breakdown
> ------------------------------------------------------------
>
>                 Key: HIVE-13226
>                 URL: https://issues.apache.org/jira/browse/HIVE-13226
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 2.1.0
>
>         Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, HIVE-13226.3.patch, sampleoutput.png
>
>
> When tez print summary is enabled, a summary of methods is printed, which is
> difficult to correlate with the actual execution time. We can improve that to
> print the execution times in the sequence of operations that happen behind
> the scenes.
> Instead of printing the method names, it would be useful to print something
> like the following:
> 1) Query Compilation time
> 2) Query Submit to DAG Submit time
> 3) DAG Submit to DAG Accept time
> 4) DAG Accept to DAG Start time
> 5) DAG Start to DAG End time
> With this it will be easier to find out where the actual time is spent.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
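The breakdown proposed in the issue description can be read as the differences between consecutive timestamps. The following is only a minimal sketch of that idea, not the code from the HIVE-13226 patches: the class name, the `Milestone` enum and the timestamps are all hypothetical, and it assumes that the end of compilation coincides with query submission.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryBreakdownSketch {

    // Hypothetical milestones, named after the phases in the issue description.
    // Assumption: compilation ends at the moment the query is submitted.
    enum Milestone { COMPILE_START, QUERY_SUBMIT, DAG_SUBMIT, DAG_ACCEPT, DAG_START, DAG_END }

    // LABELS[i] names the interval that begins at the milestone with ordinal i.
    private static final String[] LABELS = {
        "Query Compilation",
        "Query Submit to DAG Submit",
        "DAG Submit to DAG Accept",
        "DAG Accept to DAG Start",
        "DAG Start to DAG End"
    };

    // Returns the phase durations in milliseconds, in execution order.
    static Map<String, Long> breakdown(Map<Milestone, Long> timestampsMs) {
        Milestone[] order = Milestone.values();
        Map<String, Long> phases = new LinkedHashMap<>();
        for (int i = 0; i + 1 < order.length; i++) {
            long start = timestampsMs.get(order[i]);
            long end = timestampsMs.get(order[i + 1]);
            phases.put(LABELS[i], end - start);
        }
        return phases;
    }

    public static void main(String[] args) {
        // Made-up timestamps (ms since the query was received), for illustration only.
        Map<Milestone, Long> ts = new EnumMap<>(Milestone.class);
        ts.put(Milestone.COMPILE_START, 0L);
        ts.put(Milestone.QUERY_SUBMIT, 120L);
        ts.put(Milestone.DAG_SUBMIT, 450L);
        ts.put(Milestone.DAG_ACCEPT, 500L);
        ts.put(Milestone.DAG_START, 900L);
        ts.put(Milestone.DAG_END, 4900L);

        breakdown(ts).forEach((phase, ms) -> System.out.printf("%-28s %6d ms%n", phase, ms));
    }
}
```

Because every phase is the interval between two adjacent milestones, the phases add up to the total wall-clock time, so no part of the query's lifetime goes unreported.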
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195839#comment-15195839 ]

Gopal V commented on HIVE-13226:
--------------------------------

LGTM - +1.
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194446#comment-15194446 ]

Prasanth Jayachandran commented on HIVE-13226:
----------------------------------------------

Test failures are unrelated. [~gopalv] Can you please review the latest patch?
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192098#comment-15192098 ]

Hive QA commented on HIVE-13226:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792967/HIVE-13226.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9732 tests executed
*Failed tests:*
{noformat}
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7251/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7251/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7251/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792967 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184681#comment-15184681 ]

Prasanth Jayachandran commented on HIVE-13226:
----------------------------------------------

Sounds good. Updated patch.
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184675#comment-15184675 ]

Gopal V commented on HIVE-13226:
--------------------------------

Compile Query
Prepare Plan
Submit Plan
Start
Finish

?
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184674#comment-15184674 ]

Prasanth Jayachandran commented on HIVE-13226:
----------------------------------------------

At least we don't have space constraints here :)
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184667#comment-15184667 ]

Prasanth Jayachandran commented on HIVE-13226:
----------------------------------------------

The problem is that, from the user's perspective, printing methods is not really helpful. "Analyze", for example, has no context. It's also a combination of semantic analysis, logical optimization and task compilation. Also, it misses some steps in between that would be useful for finding where time is spent. For example, the time between TezBuildDag and TezSubmitToRunningDag is not accounted for, which is the time taken for resource localization, session restart, etc.

"DAG Submit to DAG Accept" -> "DAG Submit to Accept"... is that any better?
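To illustrate the point about unaccounted time between TezBuildDag and TezSubmitToRunningDag: if only individual method calls are timed, whatever happens between the end of one call and the start of the next never shows up in the summary. The sketch below is a rough illustration under stated assumptions. The `Span` class and `unaccountedMs` helper are hypothetical, the timestamps are made up, and only the two method names come from the comment above.

```java
import java.util.Arrays;
import java.util.List;

public class UnaccountedTimeSketch {

    // One timed method call; timestamps are in milliseconds.
    static final class Span {
        final String name;
        final long startMs;
        final long endMs;

        Span(String name, long startMs, long endMs) {
            this.name = name;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    // Sums the gaps between consecutive spans.
    // Assumes the spans are sorted by start time and do not overlap.
    static long unaccountedMs(List<Span> spans) {
        long gaps = 0;
        for (int i = 0; i + 1 < spans.size(); i++) {
            gaps += spans.get(i + 1).startMs - spans.get(i).endMs;
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Made-up timings: the gap would cover work such as resource
        // localization or a session restart.
        List<Span> spans = Arrays.asList(
            new Span("TezBuildDag", 0L, 200L),
            new Span("TezSubmitToRunningDag", 1500L, 1600L));
        System.out.println("Unaccounted: " + unaccountedMs(spans) + " ms");
    }
}
```

Reporting intervals between milestones, rather than per-method durations, removes this blind spot because the gap becomes part of a named phase.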
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184653#comment-15184653 ]

Gopal V commented on HIVE-13226:
--------------------------------

Agree that printing the methods is sub-optimal, but we're dealing with computer science's #1 problem now - naming things.

Too much use of the word "DAG" :)
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
    [ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184640#comment-15184640 ]

Prasanth Jayachandran commented on HIVE-13226:
----------------------------------------------

[~gopalv] Could you please take a look?