[ 
https://issues.apache.org/jira/browse/HIVE-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047682#comment-15047682
 ] 

Mithun Radhakrishnan commented on HIVE-12364:
---------------------------------------------

Brilliant! Thank you, Prashanth. I'll check on these bugs for our internal 
branch. 

IMHO, given that they solve problems with an officially released version, we 
really should consider pulling them into branch-1. 

> Distcp job fails when run under Tez
> -----------------------------------
>
>                 Key: HIVE-12364
>                 URL: https://issues.apache.org/jira/browse/HIVE-12364
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12364-branch-1.patch, HIVE-12364.patch
>
>
> PROBLEM:
> insert into/overwrite directory '/path' invokes distcp for moveTask and fails
> query when execution engine is Tez 
> set hive.exec.copyfile.maxsize=40000;
> insert overwrite into '/tmp/testinser' select * from customer;
> failed at moveTask
> hive client log:
> {code}
> 2015-11-05 16:02:53,254 INFO  [main]: exec.FileSinkOperator 
> (Utilities.java:mvFileToFinalPath(1882)) - Moving tmp dir: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/_tmp.-ext-10000
>  to: 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
> 2015-11-05 16:02:53,611 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG 
> method=task.DEPENDENCY_COLLECTION.Stage-2 
> from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task 
> [Stage-2:DEPENDENCY_COLLECTION] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.MOVE.Stage-0 
> from=org.apache.hadoop.hive.ql.Driver>
> 2015-11-05 16:02:53,612 INFO  [main]: ql.Driver 
> (Driver.java:launchTask(1653)) - Starting task [Stage-0:MOVE] in serial mode
> 2015-11-05 16:02:53,612 INFO  [main]: exec.Task 
> (SessionState.java:printInfo(951)) - Moving data to: /tmp/testindir from 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
> 2015-11-05 16:02:53,637 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(551)) - Source is 491763261 bytes. (MAX: 40000)
> 2015-11-05 16:02:53,638 INFO  [main]: common.FileUtils 
> (FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
> 2015-11-05 16:03:03,924 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:04,081 INFO  [main]: impl.TimelineClientImpl 
> (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: 
> http://hdpsece02.sece.hwxsup.com:8188/ws/v1/timeline/
> 2015-11-05 16:03:20,210 INFO  [main]: hdfs.DFSClient 
> (DFSClient.java:getDelegationToken(1047)) - Created HDFS_DELEGATION_TOKEN 
> token 1069 for haha on ha-hdfs:hdpsecehdfs
> 2015-11-05 16:03:20,249 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: HDFS_DELEGATION_TOKEN, Service: 
> ha-hdfs:hdpsecehdfs, Ident: (HDFS_DELEGATION_TOKEN token 1069 for haha)
> 2015-11-05 16:03:20,250 WARN  [main]: token.Token 
> (Token.java:getClassForIdentifier(121)) - Cannot find class for token kind 
> kms-dt
> 2015-11-05 16:03:20,250 INFO  [main]: security.TokenCache 
> (TokenCache.java:obtainTokensForNamenodesInternal(125)) - Got dt for 
> hdfs://hdpsecehdfs; Kind: kms-dt, Service: 172.25.17.102:9292, Ident: 00 04 
> 68 61 68 61 02 72 6d 00 8a 01 50 da 1a ca 29 8a 01 50 fe 27 4e 29 03 02
> 2015-11-05 16:03:22,561 INFO  [main]: Configuration.deprecation 
> (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 2015-11-05 16:03:22,562 INFO  [main]: Configuration.deprecation 
> (Configuration.java:warnOnceIfDeprecated(1173)) - io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 2015-11-05 16:03:33,733 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Unable to move 
> source 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
>  to destination /tmp/testindir
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://hdpsecehdfs/tmp/testindir/.hive-staging_hive_2015-11-05_15-59-44_557_1429894387987411483-1/-ext-10000
>  to destination /tmp/testindir
>         at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2665)
>         at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
>         at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>         at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>         at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>         at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>         at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: Cannot execute DistCp process: 
> java.io.IOException: mapreduce.job.inputformat.class is incompatible with map 
> compatability mode.
>         at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1156)
>         at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>         at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2647)
>         ... 21 more
> Caused by: java.io.IOException: mapreduce.job.inputformat.class is 
> incompatible with map compatability mode.
>         at org.apache.hadoop.mapreduce.Job.ensureNotSet(Job.java:1194)
>         at org.apache.hadoop.mapreduce.Job.setUseNewAPI(Job.java:1229)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:183)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:153)
>         at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1153)
>         ... 23 more
> 2015-11-05 16:03:33,734 INFO  [main]: hooks.ATSHook (ATSHook.java:<init>(84)) 
> - Created ATS Hook
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to