[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]
Eric Hanson commented on HIVE-7105:
-----------------------------------
I agree with Remus. To get good performance from vectorization on the reduce
side, you'll need to think carefully about how to create full
VectorizedRowBatches efficiently. Single-row or small VectorizedRowBatches
will not give performance gains, since the per-batch overhead is then spread
over too few rows. Also, if loading rows into the batches on the reduce side
is expensive, that cost could dominate total runtime.
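
To illustrate the batching point above, here is a minimal sketch in plain
Java. It is not Hive code and not the patch's approach: SketchBatch,
BatchingForwarder, and batchSizes are hypothetical names, the two long columns
stand in for real key/value columns, and the capacity of 1024 is an assumed
batch size. The idea is only that rows are buffered and forwarded downstream
when the batch is full, with one final partial batch on close.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these types are Hive classes.
class ReduceBatchingSketch {

    /** Stand-in for a columnar batch with two long columns. */
    static final class SketchBatch {
        final long[] keys;
        final long[] values;
        int size; // number of valid rows currently in the batch

        SketchBatch(int capacity) {
            keys = new long[capacity];
            values = new long[capacity];
        }
    }

    /** Buffers rows and forwards them only as full batches (or a final partial one). */
    static final class BatchingForwarder {
        private final SketchBatch batch;
        private final Consumer<SketchBatch> downstream;

        BatchingForwarder(int capacity, Consumer<SketchBatch> downstream) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.batch = new SketchBatch(capacity);
            this.downstream = downstream;
        }

        /** Appends one row; flushes when the batch reaches capacity. */
        void add(long key, long value) {
            batch.keys[batch.size] = key;
            batch.values[batch.size] = value;
            batch.size++;
            if (batch.size == batch.keys.length) {
                flush();
            }
        }

        /** Forwards any remaining rows as a last, possibly small, batch. */
        void close() {
            if (batch.size > 0) {
                flush();
            }
        }

        private void flush() {
            downstream.accept(batch);
            batch.size = 0; // reuse the same arrays for the next batch
        }
    }

    /** Feeds `rows` rows through a forwarder and records each forwarded batch size. */
    static List<Integer> batchSizes(int rows, int capacity) {
        List<Integer> sizes = new ArrayList<>();
        BatchingForwarder forwarder =
            new BatchingForwarder(capacity, b -> sizes.add(b.size));
        for (long i = 0; i < rows; i++) {
            forwarder.add(i % 10, i);
        }
        forwarder.close();
        return sizes;
    }

    public static void main(String[] args) {
        // 2500 rows with capacity 1024: two full batches, then the remainder.
        System.out.println(batchSizes(2500, 1024)); // prints [1024, 1024, 452]
    }
}
```

The sketch reuses one batch object across flushes, so the per-row cost is two
array stores. Whether the real reduce-side loading path can be made that cheap
is exactly the open question.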
> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
> Key: HIVE-7105
> URL: https://issues.apache.org/jira/browse/HIVE-7105
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Rajesh Balamohan
> Assignee: Jitendra Nath Pandey
> Attachments: HIVE-7105.1.patch
>
>
> Currently, ReduceRecordProcessor sends one key-value pair at a time to its
> operator pipeline. It would be beneficial to send a VectorizedRowBatch to
> downstream operators instead.