Dataloss on skewjoin for FullAcid table.

Aditya Shah Thu, 19 Dec 2019 10:40:10 -0800

I am trying to test skewjoin and writing the result into a FullAcid table.
I see incorrect results for it.


Steps to reproduce:

It can be reproduced with the following qtest (similar to one done for mm
tables in mm_all.q) in the master branch.

--! qt:dataset:src1
> --! qt:dataset:src
> -- MASK_LINEAGE
> set hive.mapred.mode=nonstrict;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.skewjoin=true;
> set hive.skewjoin.key=2;
> set hive.optimize.metadataonly=false;
> CREATE TABLE skewjoin_acid(key INT, value STRING) STORED AS ORC
> tblproperties ("transactional"="true");
> FROM src src1 JOIN src src2 ON (src1.key = src2.key) INSERT into TABLE
> skewjoin_acid SELECT src1.key, src2.value;
> select count(distinct key) from skewjoin_acid;
> drop table skewjoin_acid;


The expected result of count was 309 but I got it to be 173.

On, looking into it I figured a skewjoin produces 2 delta directories one
is delta_x_x with x being the correct write id and another as
delta_0000000_0000000 (called d0 hereafter). The d0 is unexpected and
cannot be read. Some observations were:
1) The filesinkdesc carries the writeid which FSOP then uses to make the
correct paths.
2) This write id is set while acquiring the lock in the driver
3) In case of skew, I found the skewed data was written in the same job and
a new job (which perhaps does the map joins) is launched by the running job
with its own tasks. The filesinkdesc is again made for each of the tasks
and we loose the writeid here.

The questions I had regarding correcting this were, Is my understanding
correct and if so, I couldn't figure out where do I set this write id in
the filesink desc for each task or is there any other way around it?

I have created a JIRA for the same: HIVE-22636.

Thanks,
Aditya

Dataloss on skewjoin for FullAcid table.

Reply via email to