[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853189#comment-15853189 ] Hive QA commented on HIVE-15573: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851065/HIVE-15573.04.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 73 failed/errored test(s), 10226 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=63) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_mapjoin1] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_simple] (batchId=42) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] (batchId=65) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=17) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_expressions] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] (batchId=14) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_non_string_partition] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join2] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce1] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce2] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce3] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_simple] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] (batchId=33) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] (batchId=43) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_div0] (batchId=63) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=33) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_offset_limit] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs] (batchId=28) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLo
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853156#comment-15853156 ]

Matt McCline commented on HIVE-15573:
-------------------------------------

New patch has review comment changes except guard-rail. Other changes for EXPLAIN VECTORIZATION.

> Vectorization: ACID shuffle ReduceSink is not specialized
> ---------------------------------------------------------
>
>                 Key: HIVE-15573
>                 URL: https://issues.apache.org/jira/browse/HIVE-15573
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions, Vectorization
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>            Assignee: Matt McCline
>             Fix For: 2.2.0
>
>         Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch,
> HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png
>
>
> The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing
> requirements demanding the writable hashcode for the shuffles.
> {code}
> boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
> if (!useUniformHash) {
>   return false;
> }
> {code}
> This check protects the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern will make ACID insert much
> faster.
> {code}
> Reduce Output Operator
>   sort order:
>   Map-reduce partition columns: _col0 (type: bigint)
>   value expressions:
> {code}
> !screenshot-1.png!

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
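To make the bucketing constraint in the issue description concrete, here is a minimal Java sketch; it is not Hive code. It assumes a bucket is chosen as `(hashCode & Integer.MAX_VALUE) % numBuckets`, uses the JDK's `Long.hashCode` as a stand-in for the writable hash code of a bigint key, and uses a made-up multiplicative mixer (`mixedHash`) as a stand-in for a murmur-style uniform hash. All class and method names are hypothetical. It illustrates why a shuffle that switches hash functions can route a row to a different bucket than the table's bucketing expects.

```java
class BucketHashMismatchSketch {
  // Assumed bucket rule (illustrative): clear the sign bit, then take the remainder.
  static int bucketOf(int hashCode, int numBuckets) {
    return (hashCode & Integer.MAX_VALUE) % numBuckets;
  }

  // Stand-in for the writable hash code of a bigint key (plain JDK Long.hashCode).
  static int writableStyleHash(long key) {
    return Long.hashCode(key);
  }

  // Hypothetical uniform mixer standing in for a murmur-style hash; NOT Hive's murmur.
  static int mixedHash(long key) {
    return (int) ((key * 0x9E3779B97F4A7C15L) >>> 32);
  }

  // Counts keys in [0, limit) whose bucket differs between the two hash functions.
  static int countMismatches(int limit, int numBuckets) {
    int mismatches = 0;
    for (long key = 0; key < limit; key++) {
      final int expected = bucketOf(writableStyleHash(key), numBuckets);
      final int actual = bucketOf(mixedHash(key), numBuckets);
      if (expected != actual) {
        mismatches++;
      }
    }
    return mismatches;
  }

  public static void main(String[] args) {
    System.out.println("key 3, writable-style bucket: " + bucketOf(writableStyleHash(3L), 4));
    System.out.println("key 3, mixed-hash bucket:     " + bucketOf(mixedHash(3L), 4));
    System.out.println("mismatches in first 1000 keys: " + countMismatches(1000, 4));
  }
}
```

Any nonzero mismatch count means rows shuffled with the mixed hash would not line up with buckets computed from the writable-style hash, which is the situation the `useUniformHash` check guards against.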
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837353#comment-15837353 ] Hive QA commented on HIVE-15573: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12849210/acid-test.svg {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3166/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3166/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3166/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-01-25 08:07:59.184 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-3166/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-01-25 08:07:59.188 + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 881deac..c31c296 master -> origin/master + git reset --hard HEAD HEAD is now at 881deac HIVE-15664 : LLAP text cache: improve first query perf I (Sergey Shelukhin, reviewed by Prasanth Jayachandran) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at c31c296 HIVE-15647 Combination of a boolean condition and null-safe comparison leads to NPE + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-01-25 08:08:00.529 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. fatal: unrecognized input The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12849210 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837074#comment-15837074 ]

Gopal V commented on HIVE-15573:
--------------------------------

[~mmccline]: LGTM - +1 tests pending.

Nits on the LOG.debug() calls: wrap the ones that call into Arrays in an isDebugEnabled check.

There needs to be a guard-rail to check the 2 enums together, in one place. Not all combinations in the {{BucketNumKind}} x {{PartitionHashCodeKind}} matrix are valid.

Also, final variables in the loop are very useful for catching issues ahead of time: moving these into the loop and making them final means the compiler ensures there is no leftover state from a previous row and that all branches assign every variable.

{code}
+ int batchIndex;
+ int bucketNum;
+ int hashCode;
+ int keyLength;
{code}
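The three review points above (a guarded debug log, one guard-rail for the enum pair, and final per-row locals) can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the patch: only the enum type names `BucketNumKind` and `PartitionHashCodeKind` come from the comment; their constants, the validity rule in `isValid`, the class name, and the use of `java.util.logging` in place of Hive's logger are all hypothetical.

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardRailSketch {
  // Hypothetical constants; only the enum type names appear in the review comment.
  enum BucketNumKind { NONE, EVALUATE }
  enum PartitionHashCodeKind { NONE, EVALUATE }

  private static final Logger LOG = Logger.getLogger("GuardRailSketch");

  // Made-up rule for illustration: a bucket number cannot be evaluated
  // when no partition hash code is computed.
  static boolean isValid(BucketNumKind bucketNumKind, PartitionHashCodeKind hashKind) {
    return !(bucketNumKind == BucketNumKind.EVALUATE && hashKind == PartitionHashCodeKind.NONE);
  }

  // The single place where the two enums are checked together.
  static void checkCombination(BucketNumKind bucketNumKind, PartitionHashCodeKind hashKind) {
    if (!isValid(bucketNumKind, hashKind)) {
      throw new IllegalStateException("Invalid combination: " + bucketNumKind + " x " + hashKind);
    }
  }

  static int[] process(long[] keys, int size, int numBuckets,
      BucketNumKind bucketNumKind, PartitionHashCodeKind hashKind) {
    checkCombination(bucketNumKind, hashKind);
    final int[] bucketNums = new int[size];
    for (int i = 0; i < size; i++) {
      // Declared final inside the loop: the compiler rejects any branch that
      // forgets an assignment, and nothing can leak in from the previous row.
      final int batchIndex = i;
      final int hashCode;
      final int bucketNum;
      if (hashKind == PartitionHashCodeKind.EVALUATE) {
        hashCode = Long.hashCode(keys[batchIndex]);
      } else {
        hashCode = 0;
      }
      if (bucketNumKind == BucketNumKind.EVALUATE) {
        bucketNum = (hashCode & Integer.MAX_VALUE) % numBuckets;
      } else {
        bucketNum = 0;
      }
      bucketNums[batchIndex] = bucketNum;
    }
    // Guarded so Arrays.toString only runs when the message would be logged.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("bucket numbers: " + Arrays.toString(bucketNums));
    }
    return bucketNums;
  }

  public static void main(String[] args) {
    int[] result = process(new long[] {5L, 6L}, 2, 4,
        BucketNumKind.EVALUATE, PartitionHashCodeKind.EVALUATE);
    System.out.println(Arrays.toString(result));
  }
}
```

Removing either `else` branch makes the class fail to compile, because the final local would not be definitely assigned on every path; that compile-time failure is the benefit described in the comment.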
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834180#comment-15834180 ] Hive QA commented on HIVE-15573: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848847/HIVE-15573.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10989 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=129) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[timestamp_udf] (batchId=129) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=106) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3126/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3126/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3126/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12848847 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834037#comment-15834037 ]

Matt McCline commented on HIVE-15573:
-------------------------------------

Ok, thanks Gopal, I'll look at optimizing the int/bigint case.
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834024#comment-15834024 ]

Gopal V commented on HIVE-15573:
--------------------------------

bq. there are cases where VectorExtractRow has to set Hive writable objects so the Java hash code can be obtained.

The "easy case" is int/bigint, where hashCode() is an identity function.

Please cross-check this section for bucketFieldValues & partitionFieldValues: the row is extracted into one array, but the hash code is computed from the other. NPE?

{code}
+ partitionVectorExtractRow.extractRow(batch, batchIndex, partitionFieldValues);
+ hashCode = ObjectInspectorUtils.getBucketHashCode(bucketFieldValues, partitionObjectInspectors);
{code}
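As a rough illustration of the int/bigint "easy case", the Java sketch below computes a hash code and bucket number per row straight from a primitive `long[]` column, without creating any per-row objects. It uses the JDK's `Integer.hashCode` and `Long.hashCode`, not Hive's writable classes, so whether the values match Hive's bucketing is an assumption, as is the bucket rule `(hashCode & Integer.MAX_VALUE) % numBuckets`. In the JDK, `Integer.hashCode(v)` returns `v`, and `Long.hashCode(v)` returns `(int) (v ^ (v >>> 32))`, which equals `v` for non-negative values that fit in an int. The class and method names are hypothetical.

```java
import java.util.Arrays;

class PrimitiveKeyHashSketch {
  // Computes one bucket number per row directly from a primitive long column.
  // No wrapper or writable object is created per row.
  static int[] bucketNumbers(long[] column, int size, int numBuckets) {
    final int[] result = new int[size];
    for (int i = 0; i < size; i++) {
      final int hashCode = Long.hashCode(column[i]);
      result[i] = (hashCode & Integer.MAX_VALUE) % numBuckets;
    }
    return result;
  }

  public static void main(String[] args) {
    // Identity for int keys, and for small non-negative long keys.
    System.out.println(Integer.hashCode(42) == 42);
    System.out.println(Long.hashCode(42L) == 42);
    // Not an identity in general: the upper 32 bits are folded into the lower 32.
    System.out.println(Long.hashCode(1L << 32));
    System.out.println(Arrays.toString(bucketNumbers(new long[] {0L, 1L, 5L, 7L}, 4, 4)));
  }
}
```

The point of the sketch is only that the hash can be derived from the primitive value in the column vector; any real specialization would need to reproduce exactly the hash the bucketed table expects.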
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834002#comment-15834002 ] Hive QA commented on HIVE-15573: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848836/HIVE-15573.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10990 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3124/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/3124/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3124/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12848836 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833922#comment-15833922 ] Hive QA commented on HIVE-15573: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848822/HIVE-15573.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 86 failed/errored test(s), 10990 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_nullscan] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_1] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap_nonvector] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union] (batchId=145) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_acid3] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_without_gby] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_columns] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_in] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_cast] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_mapjoin1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_data_types] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_date_1] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_10_0] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_1] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_2] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_3] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_4] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_5] (batchId=139) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_6] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round_2] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_trailing] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupb
[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816767#comment-15816767 ]

Gopal V commented on HIVE-15573:
--------------------------------

The timings for Map 1 went from 85s to 13s when vectorization (incorrectly bucketed) was applied to this ReduceSink.