[ https://issues.apache.org/jira/browse/HIVE-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242082#comment-14242082 ]
Navis commented on HIVE-9076:
-----------------------------

Result
{noformat}
hive (default)> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/orc_merge5b/;
Found 3 items
-rwxr-xr-x 1 navis supergroup 1203 2014-12-11 12:42 /user/hive/warehouse/orc_merge5b/000000_0
-rwxr-xr-x 1 navis supergroup 600 2014-12-11 12:33 /user/hive/warehouse/orc_merge5b/000000_1_copy1
-rwxr-xr-x 1 navis supergroup 600 2014-12-11 12:33 /user/hive/warehouse/orc_merge5b/000000_1_copy2
{noformat}

log
{noformat}
2014-12-11 12:42:14,843 INFO exec.Task (SessionState.java:printInfo(825)) - Starting Job = job_1418171849140_0008, Tracking URL = http://localhost:8088/proxy/application_1418171849140_0008/
2014-12-11 12:42:14,843 INFO exec.Task (SessionState.java:printInfo(825)) - Kill Command = /home/navis/hadoop-0.20/bin/hadoop job -kill job_1418171849140_0008
2014-12-11 12:42:17,010 INFO exec.Task (SessionState.java:printInfo(825)) - Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-12-11 12:42:17,064 WARN mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:17,065 INFO exec.Task (SessionState.java:printInfo(825)) - 2014-12-11 12:42:17,060 null map = 0%, reduce = 0%
2014-12-11 12:42:21,200 WARN mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:21,201 INFO exec.Task (SessionState.java:printInfo(825)) - 2014-12-11 12:42:21,200 null map = 100%, reduce = 0%
2014-12-11 12:42:21,206 WARN mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:21,210 INFO exec.Task (SessionState.java:printInfo(825)) - Ended Job = job_1418171849140_0008
2014-12-11 12:42:21,224 WARN exec.Utilities (Utilities.java:removeTempOrDuplicateFiles(1977)) - Duplicate taskid file removed: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000/000000_1 with length 600. Existing file: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000/000000_0 with length 1203
2014-12-11 12:42:21,225 INFO exec.AbstractFileMergeOperator (Utilities.java:mvFileToFinalPath(1805)) - Moving tmp dir: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000 to: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/-ext-10000
2014-12-11 12:42:21,233 INFO exec.AbstractFileMergeOperator (AbstractFileMergeOperator.java:jobCloseOp(247)) - jobCloseOp moved merged files to output dir: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/-ext-10000
{noformat}

> incompatFileSet in AbstractFileMergeOperator should be marked to skip task id check
> -----------------------------------------------------------------------------------
>
> Key: HIVE-9076
> URL: https://issues.apache.org/jira/browse/HIVE-9076
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
>
> In some file composition, AbstractFileMergeOperator removes incompatible files. For example,
> {noformat}
> 000000_0 (v12)
> 000000_0_copy_1 (v12)
> 000000_1 (v11)
> 000000_1_copy_1 (v11)
> 000000_1_copy_2 (v11)
> 000000_2 (v12)
> {noformat}
> 000000_1 (v11) will be removed because 000000 is assigned to new merged file.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
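
For readers following the issue, the removal reported in the log ("Duplicate taskid file removed") can be illustrated with a small model. This is only a sketch, not Hive's actual code: the class TaskIdDedupSketch and its helpers taskId and dedupByTaskId are hypothetical, the task id is assumed to be everything before the first underscore, and the rule "keep the largest file per task id" is inferred from the log message above. The file lengths 1203 and 600 are taken from that log.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TaskIdDedupSketch {

    // Hypothetical helper: treats everything before the first '_' as the task id.
    // Hive's real file-name parsing is more involved; this is a simplification.
    static String taskId(String fileName) {
        int idx = fileName.indexOf('_');
        return idx < 0 ? fileName : fileName.substring(0, idx);
    }

    // Keeps only the largest file (name -> length) for each task id.
    static Map<String, Long> dedupByTaskId(Map<String, Long> files) {
        Map<String, String> keptName = new LinkedHashMap<>(); // task id -> kept file name
        Map<String, Long> kept = new LinkedHashMap<>();       // kept file name -> length
        for (Map.Entry<String, Long> e : files.entrySet()) {
            String id = taskId(e.getKey());
            String prev = keptName.get(id);
            if (prev == null || files.get(prev) < e.getValue()) {
                if (prev != null) {
                    kept.remove(prev); // a larger file with the same task id wins
                }
                keptName.put(id, e.getKey());
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("000000_0", 1203L); // newly merged file
        files.put("000000_1", 600L);  // incompatible file that was not merged
        // Both names share task id 000000, so the smaller one is dropped.
        System.out.println(dedupByTaskId(files).keySet()); // prints [000000_0]
    }
}
```

In this model the incompatible file is lost only because its name collides with the merged file's task id, which is why the issue proposes that files in incompatFileSet skip the task id check.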