[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-08 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858756#comment-15858756
 ] 

Eugene Koifman commented on HIVE-15844:
---

Update hive.txn.manager in HiveConf: remove the reference to enforce bucketing, since 
that setting doesn't exist in Hive 2/master.
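
For context, a minimal sketch of the transaction-manager configuration that description sits 
beside, assuming a Hive 2.x classpath; the underlying keys are hive.txn.manager and 
hive.support.concurrency (if the ConfVars constant names differ on a given branch, the keys 
are what matter):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

// Sketch only (assumes hive-common/hive-exec on the classpath): the ACID-related
// settings the hive.txn.manager description sits next to. In Hive 2/master,
// bucketing is enforced automatically for ACID tables, so there is no separate
// "enforce bucketing" knob left for the description to reference.
public class TxnManagerConfigSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    conf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER,
        "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
    conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, true);
    System.out.println("txn manager = "
        + conf.getVar(HiveConf.ConfVars.HIVE_TXN_MANAGER));
  }
}
{code}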

> Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
> 
>
> Key: HIVE-15844
> URL: https://issues.apache.org/jira/browse/HIVE-15844
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.0.0
>
>
> # Both FileSinkDesc and ReduceSinkDesc have a special code path for 
> Update/Delete operations, but the write type is not always set correctly for 
> ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. Even when it 
> isn't set correctly, elsewhere we set ROW_ID to be the partition column of the 
> ReduceSinkOperator, and UDFToInteger special-cases it to extract the bucketId from 
> ROW_ID. We need to modify the Explain Plan to record the Write Type (i.e. 
> insert/update/delete) so that we have tests that can catch errors here.
> # Add some validation at the end of the plan to make sure that any RSO/FSO which 
> represents the end of the pipeline and writes to an ACID table has WriteType set 
> (to something other than the default); a sketch of such a check follows below.
> # We don't seem to have any tests where the number of buckets is > the number of 
> reducers. Add those.
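
To make item 2 of the description concrete, here is a self-contained sketch of the kind of 
end-of-plan check being proposed. The names (WriteType, SinkDesc, PlanValidator, 
writesToAcidTable) are illustrative stand-ins, not Hive's actual descriptor or planner API:

{code:java}
import java.util.List;

// Toy model of the proposed validation: every terminal sink that writes to an
// ACID table must carry an explicit write type (INSERT/UPDATE/DELETE), never the
// default. Class and field names are hypothetical, not Hive's real classes.
public class PlanValidator {

  enum WriteType { NOT_ACID, INSERT, UPDATE, DELETE }

  static class SinkDesc {
    final String operatorId;
    final boolean writesToAcidTable;
    final WriteType writeType;

    SinkDesc(String operatorId, boolean writesToAcidTable, WriteType writeType) {
      this.operatorId = operatorId;
      this.writesToAcidTable = writesToAcidTable;
      this.writeType = writeType;
    }
  }

  /** Fails if a terminal ACID sink still has the default write type. */
  static void validateTerminalSinks(List<SinkDesc> terminalSinks) {
    for (SinkDesc sink : terminalSinks) {
      if (sink.writesToAcidTable && sink.writeType == WriteType.NOT_ACID) {
        throw new IllegalStateException("Sink " + sink.operatorId
            + " writes to an ACID table but WriteType was never set");
      }
    }
  }

  public static void main(String[] args) {
    // An update pipeline whose sink lost its WriteType (e.g. after
    // ReduceSinkDeDuplication) is caught at compile time instead of producing
    // silently wrong bucket routing at run time.
    try {
      validateTerminalSinks(List.of(
          new SinkDesc("RS_3", true, WriteType.UPDATE),
          new SinkDesc("FS_5", true, WriteType.NOT_ACID)));
    } catch (IllegalStateException e) {
      System.out.println("validation failed: " + e.getMessage());
    }
  }
}
{code}

Once the write type is also printed in the explain plan (item 1), a .q test can diff it 
directly instead of relying on run-time behavior to expose a lost WriteType.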





[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876829#comment-15876829
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853791/HIVE-15844.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 10251 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_non_partitioned]
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_partitioned] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_tmp_table] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid2]
 (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_update_delete] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts_special_characters]
 (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_non_partitioned]
 (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_partitioned] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_two_cols] 
(batchId=19)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_non_partitioned]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_update_delete]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_table_update]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_update]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_table_update]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_after_multiple_inserts]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_non_partitioned]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_two_cols]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=93)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMerge2 (batchId=258)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMerge3 (batchId=258)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge2 
(batchId=268)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge3 
(batchId=268)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2
 (batchId=266)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3
 (batchId=266)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate
 (batchId=205)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel

[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884301#comment-15884301
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854643/HIVE-15844.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid2]
 (batchId=29)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge] 
(batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3785/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3785/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3785/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854643 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886477#comment-15886477
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854946/HIVE-15844.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10268 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid2]
 (batchId=29)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3816/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3816/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3816/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854946 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887006#comment-15887006
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854976/HIVE-15844.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10273 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3822/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3822/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3822/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854976 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887286#comment-15887286
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855047/HIVE-15844.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10273 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testStoreWithNoSchema 
(batchId=173)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3826/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3826/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3826/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855047 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887367#comment-15887367
 ] 

Eugene Koifman commented on HIVE-15844:
---

Failures are not related.

The change is to make ReduceSinkOperator independent of the type of operation 
(SQL statement).


[~prasanth_j] could you review please?

[~mmccline] FYI, changes are in Vectorizer.java
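
A hedged illustration of the "operation-independent ReduceSinkOperator" point above: the 
statement-specific decision (route by the new row's bucketing columns for an insert, or by 
the bucketId carried inside ROW_ID for an update/delete) is made once at plan time, so the 
operator body is identical for every write type. The classes below are a toy model, not the 
real ReduceSinkOperator internals:

{code:java}
import java.util.function.ToIntFunction;

// Toy model: the sink only knows "compute a bucket for this row"; which column
// drives that computation is chosen at plan time, not by branching on the
// statement kind inside the operator. Names are illustrative only.
public class OperationAgnosticSink {

  static final class Row {
    final int rowIdBucket;   // bucketId embedded in the ACID ROW_ID struct
    final int clusterHash;   // hash of the table's bucketing columns
    Row(int rowIdBucket, int clusterHash) {
      this.rowIdBucket = rowIdBucket;
      this.clusterHash = clusterHash;
    }
  }

  private final ToIntFunction<Row> bucketFunction;
  private final int numReducers;

  OperationAgnosticSink(ToIntFunction<Row> bucketFunction, int numReducers) {
    this.bucketFunction = bucketFunction;
    this.numReducers = numReducers;
  }

  /** Same code path regardless of write type; only bucketFunction differs. */
  int reducerFor(Row row) {
    // floorMod also covers the "more buckets than reducers" case from item 3.
    return Math.floorMod(bucketFunction.applyAsInt(row), numReducers);
  }

  public static void main(String[] args) {
    OperationAgnosticSink insertSink = new OperationAgnosticSink(r -> r.clusterHash, 4);
    OperationAgnosticSink updateSink = new OperationAgnosticSink(r -> r.rowIdBucket, 4);

    Row row = new Row(6, 131);
    System.out.println("insert -> reducer " + insertSink.reducerFor(row));
    System.out.println("update -> reducer " + updateSink.reducerFor(row));
  }
}
{code}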



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887526#comment-15887526
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855047/HIVE-15844.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10274 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3832/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3832/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3832/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855047 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888355#comment-15888355
 ] 

Eugene Koifman commented on HIVE-15844:
---

HIVE-16022 was committed while patch 5 was running; patch 6 fixes the new test.



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888518#comment-15888518
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855170/HIVE-15844.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10276 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3844/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3844/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3844/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855170 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888547#comment-15888547
 ] 

Eugene Koifman commented on HIVE-15844:
---

No related failures.



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888601#comment-15888601
 ] 

Prasanth Jayachandran commented on HIVE-15844:
--

Can you please attach the patch to RB?



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888977#comment-15888977
 ] 

Prasanth Jayachandran commented on HIVE-15844:
--

Mostly looks good to me. +1 for the non-vectorization changes.



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889426#comment-15889426
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855251/HIVE-15844.07.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 125 failed/errored test(s), 10298 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_mapjoin1] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_simple] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_data_types] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_expressions]
 (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] 
(batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic]
 (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_non_string_partition]
 (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join2] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] 
(batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce1] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce2] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce3] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal]
 (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_simple] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_div0] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_offset_limit]
 (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs]
 (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_adaptor_usage_mode]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_between_columns]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniL

[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-03-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890766#comment-15890766
 ] 

Hive QA commented on HIVE-15844:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12855419/HIVE-15844.08.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10323 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table]
 (batchId=147)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=212)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3872/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3872/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3872/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12855419 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-03-01 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890829#comment-15890829
 ] 

Eugene Koifman commented on HIVE-15844:
---

TestBeeLineWithArgs.testQueryProgressParallel is unstable, e.g. 
https://builds.apache.org/job/PreCommit-HIVE-Build/3870/
The other failure has an age > 1.


