[
https://issues.apache.org/jira/browse/FALCON-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105263#comment-14105263
]
Shwetha G S edited comment on FALCON-623 at 8/21/14 10:42 AM:
--------------------------------------------------------------
Actually, I don't think it will work (we verified this with these configs as well).
Here is the code in oozie that sets the JobConf:
{code}
public JobConf createBaseHadoopConf(Context context, Element actionXml) {
    Namespace ns = actionXml.getNamespace();
    String jobTracker = actionXml.getChild("job-tracker", ns).getTextTrim();
    String nameNode = actionXml.getChild("name-node", ns).getTextTrim();
    JobConf conf = Services.get().get(HadoopAccessorService.class).createJobConf(jobTracker);
    conf.set(HADOOP_USER, context.getProtoActionConf().get(WorkflowAppService.HADOOP_USER));
    conf.set(HADOOP_JOB_TRACKER, jobTracker);
    conf.set(HADOOP_JOB_TRACKER_2, jobTracker);
    conf.set(HADOOP_YARN_RM, jobTracker);
    conf.set(HADOOP_NAME_NODE, nameNode);
    conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "true");
    return conf;
}
{code}
{{Services.get().get(HadoopAccessorService.class).createJobConf(jobTracker)}}
loads the correct conf (NN and JT endpoints) from oozie-site. But later, the NN
and JT endpoints are overridden with the endpoints specified in the export
action, which are for the source cluster. This conf is used by JobClient to
load the libraries (the jar path doesn't contain the endpoint, just the path;
older Hadoop, even CDH4, uses ':' as the lib separator in the conf, so the jar
path contains just the path without the host).
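To make the failure mode concrete, here is a minimal sketch (not Oozie code; the endpoint and jar path below are placeholders) of how a host-less lib path gets qualified against whatever name-node the conf ends up carrying:
{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LibPathResolution {
    public static void main(String[] args) {
        // The lib entry in the conf carries only the path, no scheme/host
        // (':' is the lib separator, so the host part cannot be kept).
        Path libJar = new Path("/staging/workflows/lib/falcon-client.jar"); // placeholder path

        // The name-node from the export action ends up as the default FS.
        Configuration conf = new Configuration(false);
        conf.set("fs.default.name", "hdfs://source-nn:8020"); // placeholder source-cluster endpoint

        // JobClient/DistributedCache qualify the host-less path against the
        // default FS, so it points at the source cluster even though the jar
        // was staged on the target cluster.
        URI defaultFs = FileSystem.getDefaultUri(conf);
        System.out.println(defaultFs.resolve(libJar.toUri()));
        // -> hdfs://source-nn:8020/staging/workflows/lib/falcon-client.jar
    }
}
{code}
With the target cluster's name-node left in the conf, the same path would resolve correctly, which is why the override matters.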
Not sure if it's anything to do with recent changes in Oozie. We are using
Oozie trunk (1-2 months old). Which version of Oozie are you using? Did you
test end to end with these changes in oozie-site?
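(For context, the per-cluster conf that {{createJobConf(jobTracker)}} loads is selected through the HadoopAccessorService mapping in oozie-site; a rough sketch of such a mapping, with placeholder authorities and conf dirs rather than the actual configs referenced above:)
{code}
<!-- placeholder authorities and conf dirs; each job-tracker/name-node authority maps to a Hadoop conf dir -->
<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=hadoop-conf,source-jt:8021=/etc/hadoop/conf-source,target-jt:8021=/etc/hadoop/conf-target</value>
</property>
{code}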
> HCat replication fails on table-export
> --------------------------------------
>
> Key: FALCON-623
> URL: https://issues.apache.org/jira/browse/FALCON-623
> Project: Falcon
> Issue Type: Bug
> Components: replication
> Environment: QA
> Reporter: Karishma Gulati
>
> On scheduling a one-source, one-target HCat Replication job, the table export
> fails with the error message:
> {code}
> JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-73741e09/1373320570ef25b7d7c1ee474f1f0428_1408529998170/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> {code}
> Oozie stack trace:
> {code}
> 2014-08-20 11:13:01,477 ERROR pool-2-thread-9 UserGroupInformation - SERVER[ip-192-168-138-139] PriviledgedActionException as:karishma (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> 2014-08-20 11:13:01,585 WARN pool-2-thread-9 ActionStartXCommand - SERVER[ip-192-168-138-139] USER[karishma] GROUP[-] TOKEN[] APP[FALCON_FEED_REPLICATION_raaw-logs16-105f5895] JOB[0000078-140813072435213-oozie-oozi-W] ACTION[0000078-140813072435213-oozie-oozi-W@table-export] Error starting action [table-export]. ErrorType [ERROR], ErrorCode [JA008], Message [JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar]
> org.apache.oozie.action.ActionExecutorException: JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
> at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:396)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:930)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1085)
> at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
> at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:395)
> at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:73)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:273)
> at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:60)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:241)
> at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:55)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:701)
> Caused by: java.io.FileNotFoundException: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:824)
> at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
> at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:821)
> at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestampsAndCacheVisibilities(TrackerDistributedCacheManager.java:778)
> at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:852)
> at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743)
> at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:174)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:960)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:919)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:915)
> ... 20 more
> {code}
> I set up falcon in distributed mode, using different clusters for source and
> target.
--
This message was sent by Atlassian JIRA
(v6.2#6252)