I am trying to test skewjoin and writing the result into a FullAcid table. I see incorrect results for it.
Steps to reproduce: It can be reproduced with the following qtest (similar to one done for mm tables in mm_all.q) in the master branch. --! qt:dataset:src1 > --! qt:dataset:src > -- MASK_LINEAGE > set hive.mapred.mode=nonstrict; > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.optimize.skewjoin=true; > set hive.skewjoin.key=2; > set hive.optimize.metadataonly=false; > CREATE TABLE skewjoin_acid(key INT, value STRING) STORED AS ORC > tblproperties ("transactional"="true"); > FROM src src1 JOIN src src2 ON (src1.key = src2.key) INSERT into TABLE > skewjoin_acid SELECT src1.key, src2.value; > select count(distinct key) from skewjoin_acid; > drop table skewjoin_acid; The expected result of count was 309 but I got it to be 173. On, looking into it I figured a skewjoin produces 2 delta directories one is delta_x_x with x being the correct write id and another as delta_0000000_0000000 (called d0 hereafter). The d0 is unexpected and cannot be read. Some observations were: 1) The filesinkdesc carries the writeid which FSOP then uses to make the correct paths. 2) This write id is set while acquiring the lock in the driver 3) In case of skew, I found the skewed data was written in the same job and a new job (which perhaps does the map joins) is launched by the running job with its own tasks. The filesinkdesc is again made for each of the tasks and we loose the writeid here. The questions I had regarding correcting this were, Is my understanding correct and if so, I couldn't figure out where do I set this write id in the filesink desc for each task or is there any other way around it? I have created a JIRA for the same: HIVE-22636. Thanks, Aditya