[ https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176506#comment-14176506 ]
Gunther Hagleitner commented on HIVE-8498:
------------------------------------------
Tagging afaik only comes into play for demux/mux. It might be easier to
fix the multi insert case, especially since I know the event broadcast is
already working (and you would disable this). The plan for this multi-insert
query should be something like:
ts -> fil[1] -> fs[1]
   -> fil[2] -> fs[2]
   -> fil[3] -> fs[3]
The problem might be as simple as making sure the TS forwards to all its
children.
It might, however, also be a case of the vectorization code not converting
operators correctly.
If it's simple, the best approach might be to put in a fix for the multi-insert
case, and disable the correlation optimizer (tagging) when vectorization is on.
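Until a fix lands, the two switches involved can be toggled per session to narrow down which one matters. This is only a sketch, assuming the standard Hive properties hive.vectorized.execution.enabled and hive.optimize.correlation; whether either setting changes the result for this repro has not been verified here.

```sql
-- Assumption: run in the same session as the repro, before the multi-insert.
-- Option 1: disable vectorized execution and check that all 3 rows come back.
set hive.vectorized.execution.enabled=false;

-- Option 2: keep vectorization on, but make sure the correlation optimizer
-- (the source of tagging) is off.
set hive.vectorized.execution.enabled=true;
set hive.optimize.correlation=false;

-- Then re-run the multi-insert from the issue description and compare:
select * from orc_rn1
union all
select * from orc_rn2
union all
select * from orc_rn3;
```

If option 1 returns 1, 100, 10000 and option 2 still returns a single row, that would point at the vectorized multi-insert path rather than at tagging.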
[~jnp] do you have any insights?
> Insert into table misses some rows when vectorization is enabled
> ----------------------------------------------------------------
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Affects Versions: 0.14.0, 0.13.1
> Reporter: Prasanth J
> Assignee: Matt McCline
> Priority: Critical
> Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch
>
>
> Following is a small reproducible case for the issue
> create table orc1
> stored as orc
> tblproperties("orc.compress"="ZLIB")
> as
> select rn
> from
> (
> select cast(1 as int) as rn from src limit 1
> union all
> select cast(100 as int) as rn from src limit 1
> union all
> select cast(10000 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> -- These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 10000
> But with vectorization enabled we get
> 1