Has anyone seen anything like this? Google searches turned up nothing, so I thought I'd ask here, then file a JIRA if no one thinks I'm doing it wrong.
If I ALTER a particular table with three partitions once, it works. The second time it also works, but it reports moving a directory to the Trash that doesn't exist (still, this doesn't kill it). The third time I ALTER the table, it crashes, because the directory structure has been modified to something invalid. Here's nearly-complete output of the 2nd and 3rd runs. The ALTER is exactly the same both times (I just press UP ARROW):

*HQL, 2nd Run:*
hive (analytics)> alter table bidtmp partition (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;

*Output:*
Starting Job = job_1412894367814_0017, Tracking URL = ....application_1412894367814_0017/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1412894367814_0017
Hadoop job information for null: number of mappers: 97; number of reducers: 0
2014-10-13 20:28:23,143 null map = 0%, reduce = 0%
2014-10-13 20:28:36,042 null map = 1%, reduce = 0%, Cumulative CPU 49.69 sec
...
2014-10-13 20:31:56,415 null map = 99%, reduce = 0%, Cumulative CPU 812.65 sec
2014-10-13 20:31:57,458 null map = 100%, reduce = 0%, Cumulative CPU 813.88 sec
MapReduce Total cumulative CPU time: 13 minutes 33 seconds 880 msec
Ended Job = job_1412894367814_0017
Loading data to table analytics.bidtmp partition (log_type=bidder, dt=2014-05-01, hour=11)
rmr: DEPRECATED: Please use 'rm -r' instead.
Moved: '.../apps/hive/warehouse/analytics.db/bidtmp/*dt=2014-05-01/hour=11/log_type=bidder*' to trash at: .../user/hdfs/.Trash/Current
*(note the bold-faced path doesn't exist; the partition is specified as log_type first, then dt, then hour)*
Partition analytics.bidtmp*{log_type=bidder, dt=2014-05-01, hour=11}* stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
*(here, the partition ordering is correct!)*
MapReduce Jobs Launched:
Job 0: Map: 97   Cumulative CPU: 813.88 sec   HDFS Read: 30298871932   HDFS Write: 28746848923   SUCCESS
Total MapReduce CPU Time Spent: 13 minutes 33 seconds 880 msec
OK
Time taken: 224.128 seconds

*HQL, 3rd Run:*
hive (analytics)> alter table bidtmp partition (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;

*Output:*
java.io.FileNotFoundException: File does not exist: .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder
*(because it should be log_type=.../dt=.../hour=..., not this order)*
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:419)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:214)
at org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:511)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:458)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

––
*Tim Ellis:* 510-761-6610
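To make the symptom concrete: the error looks like the partition directory path is being rebuilt from the order the keys appear in the partition spec, rather than from the table's declared partition-column order (PARTITIONED BY log_type, dt, hour). Below is a minimal Python sketch of that suspected behavior — this is an illustrative model, not Hive's actual code; the function names and the "buggy" ordering are my assumptions:

```python
# Illustrative model of the suspected bug (NOT Hive's real implementation).
# Correct behavior: build the path from the table's declared partition-column
# order. Suspected buggy behavior: build it from the spec's own key order.

def partition_path(table_root, declared_cols, spec):
    """Build the path using the table's declared partition-column order."""
    return "/".join([table_root] + ["%s=%s" % (c, spec[c]) for c in declared_cols])

def partition_path_buggy(table_root, spec):
    """Build the path using whatever order the spec's keys happen to be in."""
    return "/".join([table_root] + ["%s=%s" % (k, v) for k, v in spec.items()])

declared = ["log_type", "dt", "hour"]  # order from CREATE TABLE ... PARTITIONED BY
spec = {"dt": "2014-05-01", "hour": "11", "log_type": "bidder"}  # reordered keys

print(partition_path("bidtmp", declared, spec))
# bidtmp/log_type=bidder/dt=2014-05-01/hour=11  (the layout that exists on HDFS)
print(partition_path_buggy("bidtmp", spec))
# bidtmp/dt=2014-05-01/hour=11/log_type=bidder  (the path in the FileNotFoundException)
```

The second function reproduces exactly the nonexistent path from the 3rd run's exception, which is why I suspect an ordering bug somewhere in the concatenate/merge path handling.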