[
https://issues.apache.org/jira/browse/HIVE-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt McCline updated HIVE-7029:
-------------------------------
Attachment: HIVE-7029.1.patch
This is the first Apache patch I have created. I am submitting it to see what
errors occur, to start the review, and to learn what is needed. Thanks.
Major changes to ql/optimizer/physical/Vectorizer.java
When the new configuration parameter
"hive.vectorized.execution.reduce.enabled=true" is set together with the
existing "hive.vectorized.execution.enabled=true" and
"hive.execution.engine=tez", an attempt is made to vectorize the ReduceWork of
a TezTask.
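
As a rough illustration of the three settings named above, a q file might begin
as sketched here. The table and column names (some_orc_table, key_col) are
placeholders that do not come from the patch, and whether a particular reduce
stage is actually vectorized depends on the operators in it:

```sql
-- Settings taken from the description above; all three must be in effect.
SET hive.execution.engine=tez;
SET hive.vectorized.execution.enabled=true;
SET hive.vectorized.execution.reduce.enabled=true;

-- Placeholder query with a GROUP BY so that the plan contains a reduce stage.
-- If the attempt to vectorize the ReduceWork succeeds, the EXPLAIN output for
-- the reducer is expected to include "Execution mode: vectorized".
EXPLAIN
SELECT key_col, COUNT(*)
FROM some_orc_table
GROUP BY key_col;
```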
The following existing vector-specific
ql/src/test/queries/clientpositive q files were modified to set the new
"hive.vectorized.execution.reduce.enabled" option, and in some q files EXPLAIN
statements were added to show "Execution mode: vectorized" for Reduce. Also,
most of these q file tests were added to the Tez shared q files in the
pom.xml:
vector_decimal_aggregate.q
vector_left_outer_join.q
vectorization_12.q
vectorization_13.q
vectorization_14.q
vectorization_15.q
vectorization_16.q
vectorization_9.q
vectorization_part.q
vectorization_part_project.q
vectorization_short_regress.q
vectorized_bucketmapjoin1.q
vectorized_context.q
vectorized_mapjoin.q
vectorized_nested_mapjoin.q
vectorized_rcfile_columnar.q
vectorized_shufflejoin.q
vectorized_timestamp_funcs.q
A new vectorized version of the ExtractOperator operator
(VectorExtractOperator) was added. The basic job of the ExtractOperator is to
project away (i.e. eliminate) the Reduce KEY columns. However, recent changes
seem to have eliminated much of that need, because ReduceShuffle no longer
produces the unneeded Reduce KEY columns in the first place...
Note: This patch includes the patch from
https://issues.apache.org/jira/browse/HIVE-7105 -- so changes to the following
files should be ignored:
ql/exec/tez/RecordProcessor.java
ql/exec/tez/ReduceRecordProcessor.java
ql/exec/vector/VectorizedBatchUtil.java
ql/exec/vector/expressions/VectorExpressionWriterFactory.java
> Add an adapter to EXTRACT operator to serve vectorized data to reduce side
> operator pipeline
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-7029
> URL: https://issues.apache.org/jira/browse/HIVE-7029
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Attachments: HIVE-7029.1.patch
>
>
> This will enable the vectorization team to work independently on reduce-side
> vectorization even before vectorized shuffle is ready.