Hey all, I'm having some trouble with the HBase bulk load, following the instructions from https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad. In the last step ("Sort Data") I get:

java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: No files found in hdfs://localhost/tmp/hive-cloudera/hive_2011-11-17_10-30-11_023_3494196694520237582/_tmp.-ext-10000/_tmp.000001_2
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:311)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:479)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: No files found in hdfs://localhost/tmp/hive-cloudera/hive_2011-11-17_10-30-11_023_3494196694520237582/_tmp.-ext-10000/_tmp.000001_2
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:171)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:642)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:303)
    ... 7 more
Caused by: java.io.IOException: No files found in hdfs://localhost/tmp/hive-cloudera/hive_2011-11-17_10-30-11_023_3494196694520237582/_tmp.-ext-10000/_tmp.000001_2
    at org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$2.close(HiveHFileOutputFormat.java:144)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:168)
    ... 11 more

When I look at the source of HiveHFileOutputFormat.java it has:

    // Move the region file(s) from the task output directory
    // to the location specified by the user.  There should
    // actually only be one (each reducer produces one HFile),
    // but we don't know what its name is.
    FileSystem fs = outputdir.getFileSystem(jc);
    fs.mkdirs(columnFamilyPath);
    Path srcDir = outputdir;
    for (;;) {
      FileStatus [] files = fs.listStatus(srcDir);
      if ((files == null) || (files.length == 0)) {
        throw new IOException("No files found in " + srcDir);
      }

So I am getting the issue where the "task output directory" is empty. I assume this is because the earlier task failed, but I'm not sure how to check this.

Does anyone know what is going on or how I can find the error log of whatever was supposed to populate this directory?

Thanks!
-Ben