[ 
https://issues.apache.org/jira/browse/HIVE-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242082#comment-14242082
 ] 

Navis commented on HIVE-9076:
-----------------------------

Result
{noformat}
hive (default)> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/orc_merge5b/;
Found 3 items
-rwxr-xr-x   1 navis supergroup       1203 2014-12-11 12:42 /user/hive/warehouse/orc_merge5b/000000_0
-rwxr-xr-x   1 navis supergroup        600 2014-12-11 12:33 /user/hive/warehouse/orc_merge5b/000000_1_copy1
-rwxr-xr-x   1 navis supergroup        600 2014-12-11 12:33 /user/hive/warehouse/orc_merge5b/000000_1_copy2
{noformat}

Log
{noformat}
2014-12-11 12:42:14,843 INFO  exec.Task (SessionState.java:printInfo(825)) - Starting Job = job_1418171849140_0008, Tracking URL = http://localhost:8088/proxy/application_1418171849140_0008/
2014-12-11 12:42:14,843 INFO  exec.Task (SessionState.java:printInfo(825)) - Kill Command = /home/navis/hadoop-0.20/bin/hadoop job  -kill job_1418171849140_0008
2014-12-11 12:42:17,010 INFO  exec.Task (SessionState.java:printInfo(825)) - Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-12-11 12:42:17,064 WARN  mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:17,065 INFO  exec.Task (SessionState.java:printInfo(825)) - 2014-12-11 12:42:17,060 null map = 0%,  reduce = 0%
2014-12-11 12:42:21,200 WARN  mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:21,201 INFO  exec.Task (SessionState.java:printInfo(825)) - 2014-12-11 12:42:21,200 null map = 100%,  reduce = 0%
2014-12-11 12:42:21,206 WARN  mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-11 12:42:21,210 INFO  exec.Task (SessionState.java:printInfo(825)) - Ended Job = job_1418171849140_0008
2014-12-11 12:42:21,224 WARN  exec.Utilities (Utilities.java:removeTempOrDuplicateFiles(1977)) - Duplicate taskid file removed: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000/000000_1 with length 600. Existing file: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000/000000_0 with length 1203
2014-12-11 12:42:21,225 INFO  exec.AbstractFileMergeOperator (Utilities.java:mvFileToFinalPath(1805)) - Moving tmp dir: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/_tmp.-ext-10000 to: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/-ext-10000
2014-12-11 12:42:21,233 INFO  exec.AbstractFileMergeOperator (AbstractFileMergeOperator.java:jobCloseOp(247)) - jobCloseOp moved merged files to output dir: hdfs://localhost:9000/tmp/hive/navis/80443669-b835-4a09-a73f-8b783d104f61/hive_2014-12-11_12-42-12_963_1945093075009681620-1/-ext-10000
{noformat}

> incompatFileSet in AbstractFileMergeOperator should be marked to skip task id check
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-9076
>                 URL: https://issues.apache.org/jira/browse/HIVE-9076
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>
> With certain file compositions, AbstractFileMergeOperator removes incompatible 
> files. For example,
> {noformat}
> 000000_0 (v12)
> 000000_0_copy_1 (v12)
> 000000_1 (v11)
> 000000_1_copy_1 (v11)
> 000000_1_copy_2 (v11)
> 000000_2 (v12)
> {noformat}
> 000000_1 (v11) will be removed because task id 000000 is already assigned to the new merged file.
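
The removal described above can be modelled with a small Python sketch. This is not Hive's code: the helper names (task_id, remove_duplicates) and the skip parameter are hypothetical, and the "keep the larger file per task id" rule is an assumption inferred from the "Duplicate taskid file removed" line in the log. The file names and lengths are the two files from the _tmp.-ext-10000 directory in that log.

```python
import re

def task_id(name):
    """Hypothetical helper: leading digits before the first underscore,
    e.g. '000000' for '000000_1'."""
    m = re.match(r"^(\d+)_", name)
    return m.group(1) if m else name

def remove_duplicates(files, skip=frozenset()):
    """Simplified model of a duplicate-task-id check.

    files: dict mapping file name -> length in bytes.
    skip:  names exempt from the check (the idea proposed in this issue).
    Returns (kept, removed) as sorted lists of names.
    """
    best = {}      # task id -> name of the file currently kept for it
    kept = []
    removed = []
    for name in sorted(files):
        if name in skip:
            kept.append(name)          # exempt files are never compared
            continue
        tid = task_id(name)
        current = best.get(tid)
        if current is None:
            best[tid] = name
        elif files[name] > files[current]:
            removed.append(current)    # assumed rule: larger file wins
            best[tid] = name
        else:
            removed.append(name)
    kept.extend(best.values())
    return sorted(kept), sorted(removed)

# Staging directory contents taken from the log above.
staged = {"000000_0": 1203, "000000_1": 600}

kept, removed = remove_duplicates(staged)
print(kept, removed)

kept_skip, removed_skip = remove_duplicates(staged, skip={"000000_1"})
print(kept_skip, removed_skip)
```

Without the skip set, the incompatible file 000000_1 shares task id 000000 with the merged file and is dropped; with it, both files survive, which is the behaviour the issue title asks for.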



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
