Abhishek Somani created HIVE-23143:
--------------------------------------

             Summary: Transactions: PPD in Delete deltas is broken
                 Key: HIVE-23143
                 URL: https://issues.apache.org/jira/browse/HIVE-23143
             Project: Hive
          Issue Type: Bug
          Components: Transactions
            Reporter: Abhishek Somani


The optimization introduced in HIVE-16812 appears to be broken. PPD is not
applied to delete deltas, and the optimization also returns wrong results when
data column names conflict with ACID ROW__ID column names (bucket,
originalTransactionId etc.).

This seems to happen because, after ORC-491, PPD for ACID ORC files is applied
to data columns only. The filters for delete PPD are therefore never applied to
the metadata columns; they are applied to the data columns instead. When a data
column shares a name with a metadata column (like "bucket" in the example
below), the query returns wrong results.
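
The name clash described above can be illustrated with a toy model. This is only a sketch and is not the ORC or Hive implementation: the class DeletePpdNameClash, both resolve methods and the "row." prefix are hypothetical names made up for illustration, and the metadata column list is an assumed simplification of the ACID file layout. It shows how a filter on "bucket" that is resolved against data columns only binds to the user's column instead of the ACID metadata column.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names come from ORC or Hive.
public class DeletePpdNameClash {

    // Assumed, simplified list of ACID metadata column names.
    static final List<String> ACID_META = Arrays.asList(
        "operation", "originalTransaction", "bucket", "rowId", "currentTransaction");

    // Resolve a filter column against the user (data) columns only,
    // which is the behaviour the report attributes to ORC-491.
    // Returns null when no data column has that name.
    static String resolveDataOnly(String name, List<String> dataCols) {
        return dataCols.contains(name) ? "row." + name : null;
    }

    // Resolve against the metadata columns first, which is what a
    // delete-event filter would need.
    static String resolveMetaFirst(String name, List<String> dataCols) {
        if (ACID_META.contains(name)) {
            return name;
        }
        return dataCols.contains(name) ? "row." + name : null;
    }

    public static void main(String[] args) {
        // Same user columns as the table in the repro steps.
        List<String> dataCols = Arrays.asList("a", "bucket");

        // The filter silently binds to the user column.
        System.out.println(resolveDataOnly("bucket", dataCols));
        // No data column with this name, so the filter has nothing to bind to.
        System.out.println(resolveDataOnly("originalTransaction", dataCols));
        // Metadata-first lookup binds to the ACID column.
        System.out.println(resolveMetaFirst("bucket", dataCols));
    }
}
```

In this model the first lookup corresponds to the wrong-results case (the filter lands on the user's "bucket" column) and the second to the case where PPD simply does not happen.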

Steps to repro:
{code:java}
set hive.fetch.task.conversion=none;
set hive.query.results.cache.enabled=false;
create table test(a int, bucket int) stored as orc 
tblproperties("transactional"="true");
insert into table test values (1, 1111), (2, 2222), (3, 3333);
delete from test where a = 2;
select * from test; -- Wrong: also returns the deleted row
set hive.txn.filter.delete.events=false;
select * from test; -- Correct: does not return the deleted row
{code}
cc [~pvary] [~gopalv]



