[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058933#comment-15058933 ]

Hari Sankar Sivarama Subramaniyan commented on HIVE-11107:
----------------------------------------------------------

HIVE-12681 is the follow-up jira to address the above comments from Ashutosh.

> Support for Performance regression test suite with TPCDS
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
> Issue Type: Sub-task
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch,
> HIVE-11107.3.patch, HIVE-11107.4.patch, HIVE-11107.5.patch,
> HIVE-11107.6.patch, HIVE-11107.7.patch, HIVE-11107.8.patch
>
> Support to add TPCDS queries to the performance regression test suite with
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the
> optimizer or any hive code do not yield any unexpected plan changes. i.e.
> the intention is to not run the entire TPCDS query set, but just "explain
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that expected hive
> optimizations kick in for the queries (for given stats/dataset). If there is
> a difference in plan within this test suite due to a future commit, it needs
> to be analyzed and we need to make sure that it is not a regression.
> The test suite can be run in master branch from itests by
> {code}
> mvn test -Dtest=TestPerfCliDriver
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057153#comment-15057153 ]

Ashutosh Chauhan commented on HIVE-11107:
-----------------------------------------

* We only need the rawDS and numRows fields. The extra fields aren't needed.
* I don't see much value in TestPerfCliDriver.vm. We can achieve its effect from TestCliDriver, either by passing a mode parameter or by creating a mapping in pom.xml.

+1 for the existing patch. We should take up these improvements in a follow-up.
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048417#comment-15048417 ]

Hive QA commented on HIVE-11107:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776381/HIVE-11107.6.patch

{color:green}SUCCESS:{color} +1 due to 63 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9933 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6289/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6289/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6289/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776381 - PreCommit-HIVE-TRUNK-Build
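A failed-test list like the one in this report is easier to triage when grouped by test class. A minimal sketch, assuming the names between the {noformat} markers have been saved one per line; `summarize_failures` is a hypothetical helper and not part of Hive or its ptest framework:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or ptest): read failed test names on
# stdin, one per line, and print "<count> <test class>" sorted by count.
# Lines that are not fully qualified names (for example the
# "did not produce a TEST-*.xml file" entries) are skipped.
summarize_failures() {
    awk '/^org\./ {
             sub(/\.[^.]+$/, "")   # drop the trailing ".methodName"
             count[$0]++
         }
         END { for (cls in count) print count[cls], cls }' |
        sort -k1,1nr -k2
}
```

For example, `summarize_failures < failed_tests.txt`, where failed_tests.txt is an assumed file name, prints one line per test class with the number of failures in front.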
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15042148#comment-15042148 ]

Hive QA commented on HIVE-11107:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12775692/HIVE-11107.5.patch

{color:green}SUCCESS:{color} +1 due to 63 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 70 failed/errored test(s), 9953 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query12
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query13
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query15
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query18
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query19
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query20
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query21
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query22
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query25
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query26
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query27
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query28
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query29
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query3
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query31
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query32
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query34
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query39
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query40
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query42
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query43
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query45
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query46
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query48
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query50
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query51
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query52
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query54
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query55
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query56
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query58
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query60
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query64
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query65
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query66
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query67
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query68
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query7
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query70
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query71
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query72
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query73
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query75
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query76
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query79
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query80
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query82
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query84
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query87
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query88
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query89
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query90
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query91
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPe
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039525#comment-15039525 ]

Ashutosh Chauhan commented on HIVE-11107:
-----------------------------------------

1. Why do we have SQL scripts for stats copied in? They should be sourced from the original location, which is metastore/scripts/derby/.
2. Please add comments in TestPerfCliDriver.vm on how it differs from TestCliDriver.vm.
3. I also don't see any changes in ptest2/. How are we making sure QA will pick up this new driver?
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038659#comment-15038659 ]

Hari Sankar Sivarama Subramaniyan commented on HIVE-11107:
----------------------------------------------------------

[~ashutoshc] Thanks for the review comments.
1. Modified the tests to run on MiniTezCluster.
2. Moved to QTestUtil.
3. This is something I will look at once I add support for using the HBase metastore to run these queries. As discussed, I will need to turn setupMetaStoreTableColumnStatsFor30TBTPCDSWorkload() into a more general function that can be used with different metastore DB flavors.

Thanks
Hari
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036686#comment-15036686 ]

Ashutosh Chauhan commented on HIVE-11107:
-----------------------------------------

* It will be more useful to have Tez plans for this instead of MR, since MR is deprecated.
* Move the function setupMetaStoreTableColumnStatsFor30TBTPCDSWorkload() to QTestUtil. That way it will be easier to maintain, since it's in a Java file and not in a template.
* Derby is now integrated into JDK 7, so instead of accessing it over a JDBC connection you may want to use its APIs directly; that will make this function much easier to maintain and debug. http://www.oracle.com/technetwork/java/javadb/overview/javadb-156712.html
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036526#comment-15036526 ]

Ashutosh Chauhan commented on HIVE-11107:
-----------------------------------------

[~hsubramaniyan] It seems RB is not updated with the latest patch. Can you update it?
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031422#comment-15031422 ]

Hive QA commented on HIVE-11107:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774693/HIVE-11107.4.patch

{color:green}SUCCESS:{color} +1 due to 50 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9916 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6168/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6168/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6168/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774693 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030449#comment-15030449 ]

Damien Carol commented on HIVE-11107:
-------------------------------------

Profile _hadoop-2_ will be removed from master branch. I think you should change description on this JIRA.

> Support for Performance regression test suite with TPCDS
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
> Issue Type: Task
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch,
> HIVE-11107.3.patch, HIVE-11107.4.patch
>
> Support to add TPCDS queries to the performance regression test suite with
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the
> optimizer or any hive code do not yield any unexpected plan changes. i.e.
> the intention is to not run the entire TPCDS query set, but just "explain
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that expected hive
> optimizations kick in for the queries (for given stats/dataset). If there is
> a difference in plan within this test suite due to a future commit, it needs
> to be analyzed and we need to make sure that it is not a regression.
> The test suite can be run in master branch from itests by
> {code}
> mvn test -Dtest=TestPerfCliDriver -Phadoop-2
> {code}
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030291#comment-15030291 ]

Hive QA commented on HIVE-11107:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774221/HIVE-11107.3.patch

{color:green}SUCCESS:{color} +1 due to 50 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 9916 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query12
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query15
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query18
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query19
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query20
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query22
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query25
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query26
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query27
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query29
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query3
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query40
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query42
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query43
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query45
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query46
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query50
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query52
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query54
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query55
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query68
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query7
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query70
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query72
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query75
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query76
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query79
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query80
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query82
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query84
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query90
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query93
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query94
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query96
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query97
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6147/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6147/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6147/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 42 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774221 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602018#comment-14602018 ]

Hari Sankar Sivarama Subramaniyan commented on HIVE-11107:
----------------------------------------------------------

[~xuefuz] This benchmark is intended to make sure that subsequent changes to the optimizer or any Hive code do not yield any unexpected plan changes, i.e. I don't intend to run the entire TPCDS query set, but just "explain plan" for the TPCDS queries.

As part of this jira, I will manually verify that the expected Hive optimizations kick in for the queries (for the given stats/dataset). If a future commit causes a difference in plan within this test suite, it needs to be analyzed and we need to make sure that it is not a regression. In subsequent patches, I am planning to import stats from 1G (and possibly higher scales) of the TPCDS data set before running the explain queries, instead of adding the .dat files.

The test suite can be run in the master branch by
mvn test -Dtest=TestPerfCliDriver -Phadoop-2

I believe we don't have dedicated unit tests to cover the scenario mentioned here, hence this jira. I will add some of the details to the jira description for better clarity.

Thanks
Hari
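The check described in this comment comes down to comparing a newly generated "explain plan" output against a stored expected plan. A minimal sketch of that comparison, assuming both plans are available as plain-text files; `check_plan` and the file layout are hypothetical, and this is not how TestPerfCliDriver itself is implemented:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): compare a freshly generated EXPLAIN
# plan with a stored expected plan and report whether it changed.
# Both arguments are paths to plain-text files; the names are illustrative.
check_plan() {
    expected_plan="$1"
    actual_plan="$2"
    if diff -u "$expected_plan" "$actual_plan" > /dev/null; then
        echo "UNCHANGED"
    else
        # A change is not automatically a regression; it only flags the
        # plan for the manual analysis the comment asks for.
        echo "CHANGED"
    fi
}
```

A "CHANGED" result here only says that the plan differs (or that one of the files could not be read); deciding whether the new plan is acceptable remains a manual step.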
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601966#comment-14601966 ]

Xuefu Zhang commented on HIVE-11107:
------------------------------------

[~hsubramaniyan], thanks for working on this. One question though: I'm not aware of the so-called "performance regression test suite", so I'm curious about what it is and how we can run it. A little more detail than the description gives may help. Thanks.

> Support for Performance regression test suite with TPCDS
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
> Issue Type: Task
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch
>
> Support to add TPCDS queries to the performance regression test suite with
> Hive CBO turned on.
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601962#comment-14601962 ]

Hive QA commented on HIVE-11107:
--------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741835/HIVE-11107.1.patch

{color:green}SUCCESS:{color} +1 9076 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4384/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4384/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4384/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741835 - PreCommit-HIVE-TRUNK-Build