[
https://issues.apache.org/jira/browse/FALCON-455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016410#comment-14016410
]
Satish Mittal commented on FALCON-455:
--------------------------------------
Verified with Hive 0.13.1 and it is working now. I was using a 0.13-SNAPSHOT
that was a couple of fixes short of HIVE-6591.
[~svenkat] we would be using 0.13 for our testing here.
> Replication of output feed of an HCatalog process not working
> -------------------------------------------------------------
>
> Key: FALCON-455
> URL: https://issues.apache.org/jira/browse/FALCON-455
> Project: Falcon
> Issue Type: Bug
> Affects Versions: 0.5
> Reporter: Satish Mittal
> Attachments: hcat-in-feed.xml, hcat-out-feed.xml, hcat-process.xml,
> workflow.xml
>
>
> Suppose there is an HCatalog process (java type) that takes an HCat input
> feed and outputs another HCat feed. Further, this output feed is configured
> for replication across 2 clusters.
> The replication of the output feed fails during the Hive import step. The
> reason is that the HCat process job output on HDFS contains a '_logs'
> directory if the process writes to a static partition (or an empty
> '_temporary' directory if the process writes to a dynamic partition).
> The Hive import job logs contain the following error:
> {noformat}
> 9036 [main] INFO org.apache.hadoop.hive.ql.Driver - Starting command:
> import table table5 partition
> (minute='25',month='05',year='2014',hour='12',day='29') from
> 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data'
> 9036 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG
> method=TimeToSubmit start=1401367057244 end=1401367057579 duration=335
> from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG
> method=runTasks from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG
> method=task.COPY.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Copying data from
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25
> to
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9069 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_SUCCESS
> 9096 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_logs
> 9190 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/part-r-00000
> 9222 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG
> method=task.DDL.Stage-1 from=org.apache.hadoop.hive.ql.Driver>
> 9580 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG
> method=task.COPY.Stage-0 start=1401367057579 end=1401367058123 duration=544
> from=org.apache.hadoop.hive.ql.Driver>
> 9580 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG
> method=task.MOVE.Stage-2 from=org.apache.hadoop.hive.ql.Driver>
> 9581 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Loading data to
> table default.table5 partition (day=29, hour=12, minute=25, month=05,
> year=2014) from
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9598 [main] INFO org.apache.hadoop.hive.ql.exec.MoveTask - Partition is:
> {day=29, hour=12, minute=25, month=05, year=2014}
> 9668 [main] ERROR org.apache.hadoop.hive.ql.exec.Task - Failed with
> exception checkPaths:
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> has nested
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
> org.apache.hadoop.hive.ql.metadata.HiveException: checkPaths:
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> has nested
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
> at org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2108)
> at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2298)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1230)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1532)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1305)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1136)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:976)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:966)
> at
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
> at
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:457)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:467)
> at
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
> at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:318)
> at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:279)
> at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
> at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
> at org.apache.hadoop.mapred.Child.main(Child.java:260)
> 9668 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG
> method=task.MOVE.Stage-2 start=1401367058123 end=1401367058211 duration=88
> from=org.apache.hadoop.hive.ql.Driver>
> 9672 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}
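The checkPaths failure above can be reproduced in isolation: Hive's
Hive.checkPaths rejects a load directory that contains any subdirectory, so
the '_logs' directory left behind by the HCat process job is enough to fail
the MoveTask. A minimal local sketch of that check (a hypothetical
check_paths helper that mirrors the behavior, not Hive's actual code):

```python
import os
import tempfile

def check_paths(load_dir):
    """Mimic Hive's Hive.checkPaths: reject any nested directory
    inside the directory being loaded into a partition."""
    for name in os.listdir(load_dir):
        path = os.path.join(load_dir, name)
        if os.path.isdir(path):
            raise RuntimeError(
                "checkPaths: %s has nested directory %s" % (load_dir, path))

# A staging dir with a '_logs' subdirectory (as left by the HCat process)
# triggers the same kind of error as in the MoveTask log above.
staging = tempfile.mkdtemp()
open(os.path.join(staging, "part-r-00000"), "w").close()
open(os.path.join(staging, "_SUCCESS"), "w").close()
os.mkdir(os.path.join(staging, "_logs"))

try:
    check_paths(staging)
except RuntimeError as e:
    print(e)
```

Note that plain '_SUCCESS' and '_logs' *files* would copy fine; it is only a
nested *directory* that trips the check.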
> Apparently, Hive import does not tolerate any nested directory in the
> import path. This behavior can also be seen on the Hive CLI.
> {noformat}
> hive> import table table5 partition
> (minute='32',month='05',year='2014',hour='12',day='29') from
> 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data'
> > ;
> Copying data from
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32
> Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_SUCCESS
> Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_logs
> Copying file:
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/part-r-00000
> Loading data to table default.table5 partition (day=29, hour=12, minute=32,
> month=05, year=2014)
> Failed with exception checkPaths:
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000
> has nested
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000/_logs
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask
> hive>
> {noformat}
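One possible workaround (a sketch only; whatever fix Falcon ships may differ)
is to prune the job-marker directories from the staging data path before
running the Hive import. Illustrated below on a local directory with a
hypothetical prune_marker_dirs helper; on HDFS the equivalent would be
'hadoop fs -rm -r <staging>/_logs' or the FileSystem API:

```python
import os
import shutil
import tempfile

# Marker directories left behind by MapReduce/HCat jobs.
MARKER_DIRS = {"_logs", "_temporary"}

def prune_marker_dirs(data_dir):
    """Remove job-marker directories so a subsequent
    'import table ... from <dir>' sees only plain data files."""
    removed = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if name in MARKER_DIRS and os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(name)
    return removed

# Illustration on a local directory shaped like the staging path above.
staging = tempfile.mkdtemp()
open(os.path.join(staging, "part-r-00000"), "w").close()
os.mkdir(os.path.join(staging, "_logs"))

print(prune_marker_dirs(staging))   # ['_logs']
print(sorted(os.listdir(staging)))  # ['part-r-00000']
```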
--
This message was sent by Atlassian JIRA
(v6.2#6252)