[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Remus Rusanu updated HIVE-4850:
-------------------------------
Attachment: HIVE-4850.2.patch
This is a working implementation based on current trunk. It is simpler than the
.1 patch in that it delegates the JOIN entirely to the row-mode MapJoinOperator.
The vectorized operator literally calls the row-mode implementation for each
row in the input batch and collects the rows that the row-mode operator
forwards into the output batch. This is not as bad as it seems, because the
JOIN operator has to resort to row-mode operations anyway: the small tables
(hashtables) are row-mode (objects and object-inspectors). By delegating the
entire join logic to the row mode we piggyback on the correctness of the
existing implementation. I do plan to come up with a fully vectorized
implementation, but that would require changes to the hash table creation and
serialization. Note that the filtering and key evaluation of the big table
*do* use vectorized operators; the row mode applies only to the key hash table
lookup and to the JOIN logic.
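
To make the delegation concrete, here is a rough sketch of the pattern described
above. It is not code from the patch: InputBatch, RowModeJoin and VectorizedJoin
are hypothetical stand-ins rather than Hive classes, the join is assumed to be
an inner join, and the output "batch" is simplified to a list of formatted rows
instead of column vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative only: none of these types are Hive classes.
final class VectorizedJoinSketch {

    /** Hypothetical columnar batch of big-table rows: one array per column. */
    static final class InputBatch {
        final long[] keys;
        final String[] values;
        final int size;

        InputBatch(long[] keys, String[] values) {
            this.keys = keys;
            this.values = values;
            this.size = keys.length;
        }
    }

    /** Stand-in for the row-mode join: probes the small-table hash table one row at a time. */
    static final class RowModeJoin {
        private final Map<Long, List<String>> smallTable;
        private final Consumer<Object[]> forward;

        RowModeJoin(Map<Long, List<String>> smallTable, Consumer<Object[]> forward) {
            this.smallTable = smallTable;
            this.forward = forward;
        }

        void processRow(long key, String bigValue) {
            List<String> matches = smallTable.get(key);
            if (matches == null) {
                return; // assumed inner-join semantics: unmatched rows produce no output
            }
            for (String smallValue : matches) {
                forward.accept(new Object[] {key, bigValue, smallValue});
            }
        }
    }

    /** Vectorized wrapper: feeds each batch row to the row-mode join and re-batches what it forwards. */
    static final class VectorizedJoin {
        private final RowModeJoin delegate;
        private final int outputCapacity;
        private List<String> current = new ArrayList<>();
        final List<List<String>> flushed = new ArrayList<>();

        VectorizedJoin(Map<Long, List<String>> smallTable, int outputCapacity) {
            this.outputCapacity = outputCapacity;
            // The row-mode join "forwards" into this wrapper instead of a child operator.
            this.delegate = new RowModeJoin(smallTable, this::collect);
        }

        private void collect(Object[] row) {
            current.add(row[0] + "|" + row[1] + "|" + row[2]);
            if (current.size() == outputCapacity) {
                flush(); // output batch is full: hand it downstream
            }
        }

        void process(InputBatch batch) {
            for (int i = 0; i < batch.size; i++) {
                delegate.processRow(batch.keys[i], batch.values[i]);
            }
        }

        void flush() {
            if (!current.isEmpty()) {
                flushed.add(current);
                current = new ArrayList<>();
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, List<String>> smallTable = new HashMap<>();
        smallTable.put(1L, Arrays.asList("a", "b"));
        smallTable.put(3L, Arrays.asList("c"));

        VectorizedJoin join = new VectorizedJoin(smallTable, 2);
        join.process(new InputBatch(new long[] {1, 2, 3}, new String[] {"x", "y", "z"}));
        join.flush(); // emit the partially filled final batch
        System.out.println(join.flushed);
    }
}
```

The point of the sketch is only the control flow: the per-row probe stays in
row mode, while the wrapper owns batching on both sides, so one input row can
fan out into several output rows and spill across output batches.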
> Implement vectorized JOIN operators
> -----------------------------------
>
> Key: HIVE-4850
> URL: https://issues.apache.org/jira/browse/HIVE-4850
> Project: Hive
> Issue Type: Sub-task
> Reporter: Remus Rusanu
> Assignee: Remus Rusanu
> Attachments: HIVE-4850.1.patch, HIVE-4850.2.patch
>
>
> Easysauce
--
This message was sent by Atlassian JIRA
(v6.1#6144)