[ https://issues.apache.org/jira/browse/HIVE-18429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-18429:
----------------------------------
    Description:
Suppose we start with empty delta_8_8 and delta_9_9 and compaction runs. It currently produces an MR job with 0 splits, so {{CompactorMR.TMP_LOCATION}} never gets created. This causes {{CompactorOutputCommitter.commitJob()}} to fail when it tries to do {{FileStatus[] contents = fs.listStatus(tmpLocation);}} since tmpLocation doesn't exist.

If the compactor fails to produce delta_8_9 here, it will fail to do further compaction unless a new delta with data is created. If the number of empty deltas is greater than HiveConf.ConfVars.COMPACTOR_MAX_NUM_DELTA, compaction will not be able to proceed at all.

It should produce a delta_8_9 in this case even if it's empty.

The error (in the log of the standalone metastore process) would look like this:
{noformat}
2018-01-10T13:27:10,521 WARN [Thread-209] mapred.LocalJobRunner: job_local44610510_0003
java.io.FileNotFoundException: File file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1515619503884/warehouse/t/_tmp_60ce7a11-d798-474f-b223-7d0acdb6dd5c does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:464) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1853) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1895) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:919) ~[classes/:?]
    at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567) [hadoop-mapreduce-client-common-3.0.0-beta1.jar:?]
2018-01-10T13:27:10,522 ERROR [main] compactor.Worker: Caught exception while trying to compact id:1,dbname:default,tableName:t,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0. Marking failed to avoid repeated failures, java.io.IOException: Major
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:346)
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:291)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:167)
{noformat}

  was:
Suppose we start with empty delta_8_8 and delta_9_9 and compaction runs. It currently produces an MR job with 0 splits, so {{CompactorMR.TMP_LOCATION}} never gets created. This causes {{CompactorOutputCommitter.commitJob()}} to fail when it tries to do {{FileStatus[] contents = fs.listStatus(tmpLocation);}} since tmpLocation doesn't exist.

If the compactor fails to produce delta_8_9 here, it will fail to do further compaction unless a new delta with data is created. If the number of empty deltas is greater than HiveConf.ConfVars.COMPACTOR_MAX_NUM_DELTA, compaction will not be able to proceed at all.

It should produce a delta_8_9 in this case even if it's empty.

> Compaction should handle a case when it produces no output
> ----------------------------------------------------------
>
>                 Key: HIVE-18429
>                 URL: https://issues.apache.org/jira/browse/HIVE-18429
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> Suppose we start with empty delta_8_8 and delta_9_9 and compaction runs.
> It currently produces an MR job with 0 splits, so
> {{CompactorMR.TMP_LOCATION}} never gets created. This causes
> {{CompactorOutputCommitter.commitJob()}} to fail when it tries to do
> {{FileStatus[] contents = fs.listStatus(tmpLocation);}} since tmpLocation
> doesn't exist.
> If the compactor fails to produce delta_8_9 here, it will fail to do further
> compaction unless a new delta with data is created.
> If the number of empty deltas is greater than
> HiveConf.ConfVars.COMPACTOR_MAX_NUM_DELTA, compaction will not be able to
> proceed at all.
> It should produce a delta_8_9 in this case even if it's empty.
> The error (in the log of the standalone metastore process) would look like this:
> {noformat}
> 2018-01-10T13:27:10,521 WARN [Thread-209] mapred.LocalJobRunner: job_local44610510_0003
> java.io.FileNotFoundException: File file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1515619503884/warehouse/t/_tmp_60ce7a11-d798-474f-b223-7d0acdb6dd5c does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:464) ~[hadoop-common-3.0.0-beta1.jar:?]
>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1853) ~[hadoop-common-3.0.0-beta1.jar:?]
>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1895) ~[hadoop-common-3.0.0-beta1.jar:?]
>     at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678) ~[hadoop-common-3.0.0-beta1.jar:?]
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:919) ~[classes/:?]
>     at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291) ~[hadoop-mapreduce-client-core-3.0.0-beta1.jar:?]
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567) [hadoop-mapreduce-client-common-3.0.0-beta1.jar:?]
> 2018-01-10T13:27:10,522 ERROR [main] compactor.Worker: Caught exception while trying to compact id:1,dbname:default,tableName:t,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
> Marking failed to avoid repeated failures, java.io.IOException: Major
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:346)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:291)
>     at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:167)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
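
For illustration only, the failure mode and the requested behaviour can be sketched with plain java.nio instead of the Hadoop FileSystem API. This is not the Hive code or the actual fix: the class name EmptyCompactionSketch, the method commit(), and the directory names are hypothetical. The sketch assumes the commit step should still create the (empty) target delta directory when the job had 0 splits and the temporary location was never created, rather than listing a directory that does not exist.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EmptyCompactionSketch {

    /**
     * Hypothetical stand-in for a commit step: moves everything the job wrote
     * under tmpLocation into finalLocation and returns the moved file names.
     * If tmpLocation was never created (a job with 0 splits), it still creates
     * an empty finalLocation instead of failing on the directory listing.
     */
    static List<String> commit(Path tmpLocation, Path finalLocation) throws IOException {
        List<String> moved = new ArrayList<>();
        // Always produce the target directory, even when there is no data.
        Files.createDirectories(finalLocation);
        if (!Files.exists(tmpLocation)) {
            // Nothing was written; the empty directory is the whole result.
            return moved;
        }
        // Snapshot the listing first so that moving files does not race with iteration.
        List<Path> contents;
        try (Stream<Path> listing = Files.list(tmpLocation)) {
            contents = listing.collect(Collectors.toList());
        }
        for (Path p : contents) {
            Files.move(p, finalLocation.resolve(p.getFileName()));
            moved.add(p.getFileName().toString());
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("compactor-sketch");
        Path tmp = base.resolve("_tmp_never_created"); // deliberately not created
        Path delta = base.resolve("delta_8_9");
        List<String> moved = commit(tmp, delta);
        System.out.println("moved=" + moved + " deltaExists=" + Files.isDirectory(delta));
    }
}
```

Without the existence check, Files.list on the missing directory would throw, which mirrors the FileNotFoundException from fs.listStatus(tmpLocation) in the log above.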