[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048470#comment-14048470 ]

Hive QA commented on HIVE-7105:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653294/HIVE-7105.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5671 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/640/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/640/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-640/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653294

Enable ReduceRecordProcessor to generate VectorizedRowBatches

Key: HIVE-7105
URL: https://issues.apache.org/jira/browse/HIVE-7105
Project: Hive
Issue Type: Bug
Components: Tez, Vectorization
Reporter: Rajesh Balamohan
Assignee: Gopal V
Fix For: 0.14.0
Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch, HIVE-7105.3.patch

Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048504#comment-14048504 ]

Gopal V commented on HIVE-7105:

Test failure unrelated to Tez.
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048510#comment-14048510 ]

Gopal V commented on HIVE-7105:

Committed to trunk, thanks [~mmccline], [~jnp] [~rajesh.balamohan].
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038043#comment-14038043 ]

Jitendra Nath Pandey commented on HIVE-7105:

+1
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034320#comment-14034320 ]

Jitendra Nath Pandey commented on HIVE-7105:

[~rusanu] Here is the RB link: https://reviews.apache.org/r/22540/
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031498#comment-14031498 ]

Hive QA commented on HIVE-7105:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12650150/HIVE-7105.2.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5611 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/460/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/460/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12650150
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030321#comment-14030321 ]

Remus Rusanu commented on HIVE-7105:

Can you share the rb link?
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029955#comment-14029955 ]

Gunther Hagleitner commented on HIVE-7105:

Comments on rb.
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004572#comment-14004572 ]

Remus Rusanu commented on HIVE-7105:

Extending the vectorized processing to the reduce side is a complex undertaking. None of the vector mode operators are implemented on the reduce side. The thinking is that the bulk of the CPU-intensive processing occurs on the map side, and our goal was to provide maximum feature coverage (i.e., implement as many operators as needed to cover the most queries), but at the moment vectorization only works for the map side of the first stage. I'm not sure whether at this stage we can call the map side effort stable/mature/complete enough to warrant a focus shift to the reduce side.

Enable ReduceRecordProcessor to generate VectorizedRowBatches

Key: HIVE-7105
URL: https://issues.apache.org/jira/browse/HIVE-7105
Project: Hive
Issue Type: Bug
Components: Vectorization
Reporter: Rajesh Balamohan
Assignee: Jitendra Nath Pandey

Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators.
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]

Eric Hanson commented on HIVE-7105:

I agree with Remus. If you do want to get good performance with vectorization on the reduce side, you'll need to think carefully about how you can efficiently create full VectorizedRowBatches. Single-row or small VectorizedRowBatches will not give performance gains. Also, if it is expensive to load rows into the batches on the reduce side, that could dominate total runtime.
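
The batching concern raised in the comment above can be illustrated with a minimal sketch. This is not Hive code: the function name `forward_in_batches`, the (key, value) row shape, the `sink` callback, and the capacity of 1024 are all assumptions chosen for illustration. The sketch buffers incoming rows into column-oriented lists and forwards a batch only once it is full, with a single partial batch at end of input.

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, int]                 # hypothetical (key, value) pair
Batch = Tuple[List[str], List[int]]   # column-oriented batch: (keys, values)


def forward_in_batches(
    rows: Iterable[Row],
    sink: Callable[[Batch], None],
    capacity: int = 1024,             # assumed batch size, for illustration only
) -> int:
    """Buffer rows into column lists and pass full batches to `sink`.

    Returns the number of batches forwarded.
    """
    keys: List[str] = []
    values: List[int] = []
    emitted = 0
    for key, value in rows:
        keys.append(key)
        values.append(value)
        if len(keys) == capacity:
            # Batch is full: forward it and start fresh column lists.
            sink((keys, values))
            emitted += 1
            keys, values = [], []
    if keys:
        # End of input: forward the remaining partial batch.
        sink((keys, values))
        emitted += 1
    return emitted
```

With 2,500 input rows and a capacity of 1024, `sink` is called three times (1024, 1024, and 452 rows) instead of 2,500 times. Per-call overhead is amortized only when batches are mostly full, and the cost of the two `append` calls per row is the loading cost the comment warns about.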