[
https://issues.apache.org/jira/browse/FALCON-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347773#comment-15347773
]
Balu Vellanki commented on FALCON-2049:
---------------------------------------
This issue was introduced by FALCON-1844
(https://issues.apache.org/jira/browse/FALCON-1844), where setDeleteMissing is
set to true by default. Assume the source dir is
/tmp/source/${YEAR}/${MONTH}/${DAY} and the target dir is
/tmp/target/${YEAR}/${MONTH}/${DAY}. Feed replication then triggers a DistCp job
equivalent to the following CLI distcp command:
{code}
hadoop distcp -update -delete \
  hdfs://c6401.ambari.apache.org:8020/tmp/source/1/2/3 \
  hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/
{code}
The following scenarios can occur:
Case 1. Source dir is created with no data files, but the availabilityFlag is
created.
Result : DistCp succeeds, hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/
is created and the availabilityFlag is copied to the target.
Case 2. Source dir is created and has files.
Result : DistCp succeeds, hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/
is created and the target dir has the same files as sourceDir with the same dir
structure.
Case 3. Source dir is created without any files and the target dir is also
created.
Result : DistCp succeeds, both source and target have empty dirs.
Case 4. Source dir is created but is empty, and the availabilityFlag is not
created.
Result : DistCp fails with error "Job commit failed:
org.apache.hadoop.tools.CopyListing$InvalidInputException:
hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3 does not exist"
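Read together, the four cases suggest a single rule: the job fails only when
the source dir holds nothing at all (not even the availabilityFlag) and the
target dir does not already exist. The sketch below just encodes the outcomes
reported in this comment; it is not DistCp's actual logic, and the function and
parameter names are hypothetical.

```python
def distcp_outcome(source_entries: int, target_exists: bool) -> str:
    """Outcome of `hadoop distcp -update -delete` as reported in Cases 1-4.

    source_entries counts everything under the source dir, including the
    availabilityFlag file (hypothetical parameter, for illustration only).
    """
    # Case 4: nothing to copy, so the target dir is never created, and the
    # delete-missing step at commit time cannot list it.
    if source_entries == 0 and not target_exists:
        return "fails"
    # Cases 1 and 2: copying at least one entry creates the target dir.
    # Case 3: the target dir already exists.
    return "succeeds"


# Case 1: only the availabilityFlag is present, target dir missing.
print(distcp_outcome(source_entries=1, target_exists=False))
# Case 4: empty source dir, target dir missing.
print(distcp_outcome(source_entries=0, target_exists=False))
```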
There seem to be two solutions to this problem:
1. Return success when sourceDir has no files and targetDir is missing, thus
avoiding Case 4.
OR
2. Create targetDir and then attempt the DistCp. This will trigger Case 3 and
the replication job will succeed.
I recommend option 2 because having an empty source/target dir is a valid use
case for data directories.
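At the CLI level, option 2 amounts to running `hadoop fs -mkdir -p` on the
target before the distcp command shown earlier. The following is a minimal
sketch of that ordering only, not the Falcon fix itself (which would live in
Falcon's Java replication code); the helper names and the dry_run switch are
hypothetical.

```python
import subprocess
from typing import List


def replication_commands(source: str, target: str) -> List[List[str]]:
    """Return the two CLI invocations for option 2, in execution order."""
    return [
        # Create the target dir first; -p makes this a no-op if it exists.
        ["hadoop", "fs", "-mkdir", "-p", target],
        # Then replicate; with the target present, an empty source is Case 3.
        ["hadoop", "distcp", "-update", "-delete", source, target],
    ]


def replicate(source: str, target: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on failure."""
    for cmd in replication_commands(source, target):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


replicate(
    "hdfs://c6401.ambari.apache.org:8020/tmp/source/1/2/3",
    "hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/",
)
```

With dry_run left at its default, the script only prints the two commands, so
it can be checked without a cluster.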
> Feed Replication with Empty Directories are failing
> ---------------------------------------------------
>
> Key: FALCON-2049
> URL: https://issues.apache.org/jira/browse/FALCON-2049
> Project: Falcon
> Issue Type: Bug
> Components: feed
> Affects Versions: 0.10
> Reporter: Murali Ramasami
> Priority: Critical
> Fix For: 0.10
>
>
> Feed replication with empty directories is failing with the following error
> in the application log:
> {noformat}
> 2016-06-23 08:35:21,475 INFO [eventHandlingThread]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to
> done:
> hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml_tmp
> to
> hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml
> 2016-06-23 08:35:21,476 INFO [eventHandlingThread]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to
> done:
> hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist_tmp
> to
> hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist
> 2016-06-23 08:35:21,477 INFO [Thread-66]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped
> JobHistoryEventHandler. super.stop()
> 2016-06-23 08:35:21,479 INFO [Thread-66]
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics
> to No of maps and reduces are 0 job_1466658266370_0059
> Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException:
> hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/tmp/falcon-regression/FeedReplicationTest/target/2016/06/23/08/32
> doesn't exist
> at
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
> at
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
> at
> org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:241)
> at
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
> at
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
> at
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Feed submitted:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?><feed xmlns="uri:falcon:feed:0.1"
> name="A7769e4e0-49663d60" description="Input File">
> <partitions>
> <partition name="colo"/>
> <partition name="eventTime"/>
> <partition name="impressionHour"/>
> <partition name="pricingModel"/>
> </partitions>
> <availabilityFlag>availabilityFlag.txt</availabilityFlag>
> <frequency>minutes(5)</frequency>
> <late-arrival cut-off="days(100000)"/>
> <clusters>
> <cluster name="A7769e4e0-0af6c74b" type="source">
> <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
> <retention limit="days(1000000)" action="delete"/>
> </cluster>
> <cluster name="A7769e4e0-25f87f0e" type="target">
> <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
> <retention limit="days(1000000)" action="delete"/>
> <locations>
> <location type="data"
> path="/tmp/falcon-regression/FeedReplicationTest/target/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
> </locations>
> </cluster>
> </clusters>
> <locations>
> <location type="data"
> path="/tmp/falcon-regression/FeedReplicationTest/source/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
> <location type="stats" path="/data/regression/fetlrc/billing/stats"/>
> <location type="meta"
> path="/data/regression/fetlrc/billing/metadata"/>
> </locations>
> <ACL owner="hrt_qa" group="users" permission="0x755"/>
> <schema location="/databus/streams_local/click_rr/schema/"
> provider="protobuf"/>
> <properties>
> <property name="field1" value="value1"/>
> <property name="field2" value="value2"/>
> <property name="job.counter" value="true"/>
> </properties>
> </feed>
> {noformat}
> It is failing because the target directories to replicate into do not exist.