[jira] [Updated] (HIVE-7029) Add an adapter to EXTRACT operator to serve vectorized data to reduce side operator pipeline

Matt McCline (JIRA) Mon, 16 Jun 2014 19:27:28 -0700

     [ https://issues.apache.org/jira/browse/HIVE-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Matt McCline updated HIVE-7029:
-------------------------------

    Attachment: HIVE-7029.1.patch

This is the first Apache patch I have created.  I am submitting it to see 
what errors occur, to start the review, and to learn what is needed.  Thanks.

Major changes to ql/optimizer/physical/Vectorizer.java:
   When the new configuration parameter 
"hive.vectorized.execution.reduce.enabled=true" is set along with the existing 
"hive.vectorized.execution.enabled=true" and "hive.execution.engine=tez", an 
attempt is made to vectorize the ReduceWork of a TezTask.
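
As a minimal sketch, the three settings named above could be enabled together 
at the top of a q file or in a Hive session as follows (the property names and 
values are the ones quoted above; the comment lines are only illustrative):

    -- existing settings
    set hive.execution.engine=tez;
    set hive.vectorized.execution.enabled=true;
    -- new setting introduced by this patch
    set hive.vectorized.execution.reduce.enabled=true;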

The following existing vectorization-specific q files in 
ql/src/test/queries/clientpositive were modified to set the new 
"hive.vectorized.execution.reduce.enabled" option.  In some of them, one or 
more EXPLAIN statements were added to show "Execution mode: vectorized" for 
Reduce.  Most of these q file tests were also added to the Tez shared q files 
in the pom.xml:

    vector_decimal_aggregate.q
    vector_left_outer_join.q
    vectorization_12.q
    vectorization_13.q
    vectorization_14.q
    vectorization_15.q
    vectorization_16.q
    vectorization_9.q
    vectorization_part.q
    vectorization_part_project.q
    vectorization_short_regress.q
    vectorized_bucketmapjoin1.q
    vectorized_context.q
    vectorized_mapjoin.q
    vectorized_nested_mapjoin.q
    vectorized_rcfile_columnar.q
    vectorized_shufflejoin.q
    vectorized_timestamp_funcs.q
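
For illustration only, an added EXPLAIN statement might look like the 
following.  The query and the alltypesorc table are assumptions chosen for the 
example, not taken from the patch.  With the options above enabled, the plan 
may report "Execution mode: vectorized" for Reduce when the reduce-side 
operators can be vectorized:

    EXPLAIN
    SELECT ctinyint, COUNT(*)
    FROM alltypesorc
    GROUP BY ctinyint;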

A new vectorized version of the ExtractOperator (VectorExtractOperator) was 
added.  The basic job of the ExtractOperator is to project away (i.e. 
eliminate) the Reduce KEY columns.  However, recent changes seem to have 
eliminated much of the need for it, since ReduceShuffle no longer produces the 
unneeded Reduce KEY columns in the first place...

Note: This patch includes the patch from 
https://issues.apache.org/jira/browse/HIVE-7105 -- so changes to the following 
files should be ignored:
   ql/exec/tez/RecordProcessor.java
   ql/exec/tez/ReduceRecordProcessor.java
   ql/exec/vector/VectorizedBatchUtil.java
   ql/exec/vector/expressions/VectorExpressionWriterFactory.java
   

> Add an adapter to EXTRACT operator to serve vectorized data to reduce side 
> operator pipeline
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7029
>                 URL: https://issues.apache.org/jira/browse/HIVE-7029
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>         Attachments: HIVE-7029.1.patch
>
>
> This will enable the vectorization team to work independently on reduce-side 
> vectorization even before vectorized shuffle is ready.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
