[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853189#comment-15853189
 ] 

Hive QA commented on HIVE-15573:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12851065/HIVE-15573.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 73 failed/errored test(s), 10226 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_mapjoin1] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_simple] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_expressions] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_non_string_partition] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join2] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce1] (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce2] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce3] (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_simple] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_div0] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_offset_limit] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs] (batchId=28)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLo

[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-02-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853156#comment-15853156
 ] 

Matt McCline commented on HIVE-15573:

The new patch addresses the review comments, except for the guard-rail. It also 
includes other changes for EXPLAIN VECTORIZATION.

> Vectorization: ACID shuffle ReduceSink is not specialized 
> --
>
> Key: HIVE-15573
> URL: https://issues.apache.org/jira/browse/HIVE-15573
> Project: Hive
> Issue Type: Improvement
> Components: Transactions, Vectorization
> Affects Versions: 2.2.0
> Reporter: Gopal V
> Assignee: Matt McCline
> Fix For: 2.2.0
>
> Attachments: acid-test.svg, HIVE-15573.01.patch, HIVE-15573.02.patch, 
> HIVE-15573.03.patch, HIVE-15573.04.patch, screenshot-1.png
>
>
> The ACID shuffle disables murmur hash for the shuffle, because the bucketing 
> requirements demand the Writable hashcode.
> {code}
> boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
> if (!useUniformHash) {
>   return false;
> }
> {code}
> This check prevents the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern would make ACID inserts much 
> faster.
> {code}
> Reduce Output Operator
>   sort order: 
>   Map-reduce partition columns: _col0 (type: bigint)
>   value expressions:  
> {code}
> !screenshot-1.png!
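
To illustrate why the choice of hash matters for bucketing, here is a minimal sketch. It is not Hive's code. It assumes the bucket is derived as (hashCode & Integer.MAX_VALUE) % numBuckets, and the class BucketSketch, its methods, and the multiplicative "mixing" hash (a stand-in, not murmur) are hypothetical names introduced only for this example. The point is that the same key can land in a different bucket once the hash function changes, which is why the fast ReduceSink path cannot simply be reused for ACID inserts.

```java
final class BucketSketch {
    // Assumed bucketing rule: clear the sign bit, then take the remainder.
    static int bucketFor(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    // Java-style hash of a bigint key (what a boxed java.lang.Long reports).
    static int javaHash(long key) {
        return Long.hashCode(key);
    }

    // Stand-in for a "uniform" hash. This is NOT murmur, only a simple
    // multiplicative mixer used to show that a different hash moves the row.
    static int mixingHash(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h >>> 32);
    }

    public static void main(String[] args) {
        long key = 5L;
        int numBuckets = 4;
        System.out.println("java hash bucket:   " + bucketFor(javaHash(key), numBuckets));
        System.out.println("mixing hash bucket: " + bucketFor(mixingHash(key), numBuckets));
    }
}
```

If the writer side expects rows placed by the first rule, rows routed by the second would end up in the wrong bucket file.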



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837353#comment-15837353
 ] 

Hive QA commented on HIVE-15573:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12849210/acid-test.svg

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3166/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3166/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3166/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-01-25 08:07:59.184
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-3166/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-01-25 08:07:59.188
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   881deac..c31c296  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 881deac HIVE-15664 : LLAP text cache: improve first query perf I (Sergey Shelukhin, reviewed by Prasanth Jayachandran)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at c31c296 HIVE-15647 Combination of a boolean condition and null-safe comparison leads to NPE
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-01-25 08:08:00.529
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12849210 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837074#comment-15837074
 ] 

Gopal V commented on HIVE-15573:


[~mmccline]: LGTM, +1 pending tests.

Nit on the LOG.debug() calls: wrap the ones that make Arrays. calls in an 
isDebugEnabled check.
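
A minimal sketch of that guard follows. The Log interface is a hypothetical stand-in for whatever logging facade the patch uses, and Arrays.toString is only an assumed example of an Arrays call; the counter exists purely to make the effect observable.

```java
import java.util.Arrays;

final class DebugGuardSketch {
    // Hypothetical stand-in for a logging facade; not a real library type.
    interface Log {
        boolean isDebugEnabled();
        void debug(String message);
    }

    // Counts how often the costly formatting actually ran.
    static int formatCalls = 0;

    static String describe(int[] values) {
        formatCalls++;
        return Arrays.toString(values);
    }

    static void logBatch(Log log, int[] keys) {
        // Java evaluates arguments before the call, so without this guard
        // describe() would run even when debug logging is switched off.
        if (log.isDebugEnabled()) {
            log.debug("keys " + describe(keys));
        }
    }
}
```

With the guard in place, the array is only formatted when the message will actually be written, which matters in a per-batch hot path.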

There needs to be a guard-rail that checks the two enums together, in one place. 
Not all combinations in the {{BucketNumKind}} x {{PartitionHashCodeKind}} matrix 
are valid.
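
One possible shape for such a guard-rail is sketched below. The enum names come from the patch, but the enum values and the table of valid pairs are hypothetical placeholders, since the thread does not list them; ShuffleKindGuard and check are also invented for the example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

final class ShuffleKindGuard {
    // Hypothetical values; the real enums in the patch may differ.
    enum BucketNumKind { NONE, EXPRESSIONS }
    enum PartitionHashCodeKind { UNIFORM, BUCKET_COMPATIBLE }

    // One table holding every combination treated as valid (an assumption).
    private static final Map<BucketNumKind, EnumSet<PartitionHashCodeKind>> VALID =
        new EnumMap<>(BucketNumKind.class);

    static {
        VALID.put(BucketNumKind.NONE,
            EnumSet.of(PartitionHashCodeKind.UNIFORM, PartitionHashCodeKind.BUCKET_COMPATIBLE));
        VALID.put(BucketNumKind.EXPRESSIONS,
            EnumSet.of(PartitionHashCodeKind.BUCKET_COMPATIBLE));
    }

    // Single entry point: fail fast on any pair that is not in the table.
    static void check(BucketNumKind bucketKind, PartitionHashCodeKind hashKind) {
        EnumSet<PartitionHashCodeKind> allowed =
            VALID.getOrDefault(bucketKind, EnumSet.noneOf(PartitionHashCodeKind.class));
        if (!allowed.contains(hashKind)) {
            throw new IllegalStateException(
                "Invalid combination: " + bucketKind + " x " + hashKind);
        }
    }
}
```

Keeping the matrix in one table means a new enum value is rejected by default until someone explicitly marks its combinations as valid.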

Also, final variables in the loop are very useful for catching issues ahead of 
time. Moving these declarations into the loop and making them final means the 
compiler ensures that no state is left over from a previous row and that all 
branches assign every variable.

{code}
+  int batchIndex;
+  int bucketNum;
+  int hashCode;
+  int keyLength;
{code}
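
As a rough illustration of that suggestion (the loop body and the null handling are invented for the example, not taken from the patch), declaring the locals final inside the loop lets javac's definite-assignment check do the work:

```java
final class FinalLocalsSketch {
    // Sums a per-row hash; the branch structure is illustrative only.
    static long sumHashes(long[] keys, boolean[] isNull) {
        long total = 0;
        for (int i = 0; i < keys.length; i++) {
            // Declared inside the loop and final: every iteration starts with
            // fresh variables, and the compiler rejects any branch that reads
            // hashCode without having assigned it.
            final int batchIndex = i;
            final int hashCode;
            if (isNull[batchIndex]) {
                hashCode = 0;
            } else {
                hashCode = Long.hashCode(keys[batchIndex]);
            }
            total += hashCode;
        }
        return total;
    }
}
```

Removing either assignment to hashCode turns a potential stale-value bug into a compile error.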



[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834180#comment-15834180
 ] 

Hive QA commented on HIVE-15573:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848847/HIVE-15573.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10989 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=129)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[timestamp_udf] (batchId=129)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=106)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3126/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3126/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3126/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848847 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834037#comment-15834037
 ] 

Matt McCline commented on HIVE-15573:

OK, thanks Gopal. I'll look at optimizing the int/bigint case.



[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834024#comment-15834024
 ] 

Gopal V commented on HIVE-15573:


bq. there are cases where VectorExtractRow has to set Hive writable objects so 
the Java hash code can be obtained.

The "easy case" is int/bigint, where hashCode() is an identity function.
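
A small sketch of what makes this case cheap, using only java.lang behavior (the class and method names are made up for the example, and the thread does not state which formula Hive's bucketing code applies to bigint). For int the Java hash is the value itself. For long, java.lang.Long folds the upper and lower 32 bits together, so the result equals the value only when it fits in a non-negative int. Either way the hash can be computed straight from the primitive without materializing a Writable.

```java
final class PrimitiveHashSketch {
    // int: java.lang.Integer's hash is the value itself.
    static int hashInt(int value) {
        return Integer.hashCode(value);
    }

    // bigint: same formula as java.lang.Long.hashCode, written out to show
    // that it needs no object allocation, only a shift and an xor.
    static int hashLong(long value) {
        return (int) (value ^ (value >>> 32));
    }

    public static void main(String[] args) {
        System.out.println(hashInt(42));
        System.out.println(hashLong(42L));
        System.out.println(hashLong(1L << 32));
    }
}
```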

Please cross-check this section: the row is extracted into partitionFieldValues, 
but the hash is computed from bucketFieldValues. Possible NPE?

{code}
+  partitionVectorExtractRow.extractRow(batch, batchIndex, partitionFieldValues);
+  hashCode = ObjectInspectorUtils.getBucketHashCode(bucketFieldValues, partitionObjectInspectors);
{code}




[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834002#comment-15834002
 ] 

Hive QA commented on HIVE-15573:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848836/HIVE-15573.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10990 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3124/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3124/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3124/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848836 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833922#comment-15833922
 ] 

Hive QA commented on HIVE-15573:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848822/HIVE-15573.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 86 failed/errored test(s), 10990 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_nullscan] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_1] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap_nonvector] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_acid3] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_without_gby] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_columns] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_in] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_cast] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_mapjoin1] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_data_types] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_date_1] (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_10_0] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_1] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_2] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_3] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_4] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_5] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_6] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round_2] (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_trailing] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupb

[jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-10 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816767#comment-15816767
 ] 

Gopal V commented on HIVE-15573:


The timings for Map 1 went from 85s to 13s (about 6.5x faster) when 
vectorization was applied to this ReduceSink, although the output was then 
incorrectly bucketed.
