KSHITIJ GAUTAM created HBASE-21211:
--------------------------------------

             Summary: Can't Read Partitions File - Partitions File Deleted
                 Key: HBASE-21211
                 URL: https://issues.apache.org/jira/browse/HBASE-21211
             Project: HBase
          Issue Type: Bug
    Affects Versions: 1.5.0, 1.6.0
         Environment: * HBase version: 1.2.0-cdh5.11.1 (the line that deletes the file still exists)
* Hadoop version: Hadoop 2.6.0-cdh5.11.1
* Subversion http://github.com/cloudera/hadoop -r b581c269ca3610c603b6d7d1da0d14dfb6684aa3
* From source with checksum c6cbc4f20a8a571dd7c9f743984da1
* This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.11.1.jar
            Reporter: KSHITIJ GAUTAM
             Fix For: 1.5.0, 1.6.0
         Attachments: 0001-do-not-delete-the-partitions-file-if-the-session-is-.patch
Hi team, we have a MapReduce job that uses the bulkload option instead of direct puts to import data, e.g.:

{code:java}
HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
{code}

However, we have been running into a situation where the partitions file is deleted by the termination of the JVM process that submitted the job: the JVM calls {{configureIncrementalLoad}} (which executes {{configurePartitioner}}) and kicks off the MapReduce job, but if that JVM exits while the job is still running, the tasks fail with:

_Error: java.lang.IllegalArgumentException: Can't read partitions file at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)_

We think line 827 of [HFileOutputFormat2|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java#L827] could be the root cause:

{code:java}
fs.deleteOnExit(partitionsPath);
{code}

We have created a custom HFileOutputFormat that does not delete the partitions file, and this has fixed the problem for our cluster. We propose adding a cleanup method that deletes the partitions file only once all the mappers have finished.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
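To illustrate the failure mode described above: Hadoop's {{FileSystem.deleteOnExit}} ties the path's lifetime to the client-side JVM/FileSystem shutdown, not to the completion of the job that reads it, much like {{java.io.File.deleteOnExit}} in the standard library. A minimal plain-Java sketch of that semantic (the {{DeleteOnExitDemo}} class name and the temp-file stand-in are ours, not HBase code):

{code:java}
import java.io.File;
import java.io.IOException;

public class DeleteOnExitDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the partitions file written by configurePartitioner.
        File partitions = File.createTempFile("partitions_", ".seq");

        // Tie the file's lifetime to THIS JVM: it is removed when the JVM
        // exits, not when whatever reads the file has finished with it.
        partitions.deleteOnExit();

        System.out.println("exists while submitter JVM is alive: " + partitions.exists());

        // Once main() returns and the JVM shuts down, the file is gone,
        // even if a still-running consumer (here: the MapReduce tasks
        // loading it in TotalOrderPartitioner.setConf) still needs it.
    }
}
{code}

This is why the problem only shows up when the submitting process terminates before the job does; a long-lived submitter never hits it.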