[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944520#comment-13944520 ] Jitendra Nath Pandey commented on HIVE-6222: I have committed this to branch-0.13 as well. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944142#comment-13944142 ] Jitendra Nath Pandey commented on HIVE-6222: [~rhbutani] For too many distinct keys, map side grouping can become too slow in vectorized mode under memory pressure. This affects hive-0.13 as well, therefore we should port it to branch-0.13 as well. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.14.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944165#comment-13944165 ] Harish Butani commented on HIVE-6222: - +1 for 0.13 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.14.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943714#comment-13943714 ] Hive QA commented on HIVE-6222: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12636015/HIVE-6222.5.patch {color:green}SUCCESS:{color} +1 5437 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1894/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1894/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12636015 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938967#comment-13938967 ] Remus Rusanu commented on HIVE-6222: [~gopalv] I've merged the HIVE-6518 fix into the refactoring of VectorGroupByOperator, see .4.patch. Everything GCCanary related is moved into ProcessingModeHashAggregate class. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932874#comment-13932874 ] Jitendra Nath Pandey commented on HIVE-6222: +1 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925526#comment-13925526 ] Hive QA commented on HIVE-6222: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12633560/HIVE-6222.1.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1687/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1687/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1687/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Ujdbc/src/java/org/apache/hive/jdbc/HiveConnection.java Uservice/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java A service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1575861. Updated to revision 1575861. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12633560 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: HIVE-6222.1.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925529#comment-13925529 ] Remus Rusanu commented on HIVE-6222: Conflict with HIVE-6531, I'll upload a new patch Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: HIVE-6222.1.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925836#comment-13925836 ] Hive QA commented on HIVE-6222: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12633651/HIVE-6222.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5374 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1696/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1696/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12633651 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13926196#comment-13926196 ] Remus Rusanu commented on HIVE-6222: I'm investigating the diff, trying to see if is a regression or correct result Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925004#comment-13925004 ] Remus Rusanu commented on HIVE-6222: https://reviews.apache.org/r/18943/ Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: HIVE-6222.1.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)