[
https://issues.apache.org/jira/browse/FALCON-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108863#comment-14108863
]
Karishma Gulati edited comment on FALCON-623 at 8/25/14 7:57 AM:
-----------------------------------------------------------------
Don't have the original XMLs. Re-ran the same test and got the same error logs.
Attaching the source/target clusters and the feed XML for this test.
Source cluster:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><cluster xmlns="uri:falcon:cluster:0.1"
name="corp-6d9f1878" description="" colo="ua1">
<interfaces>
<interface type="readonly" endpoint="hdfs://192.168.138.137:8020"
version="0.20.2"/>
<interface type="write" endpoint="hdfs://192.168.138.137:8020"
version="0.20.2"/>
<interface type="execute" endpoint="192.168.138.137:8021"
version="0.20.2"/>
<interface type="workflow"
endpoint="http://192.168.138.137:11000/oozie/" version="3.1"/>
<interface type="messaging"
endpoint="tcp://localhost:61617?daemon=true" version="5.1.6"/>
<interface type="registry" endpoint="thrift://192.168.138.137:14000"
version="0.11.0"/>
</interfaces>
<locations>
<location name="staging" path="/projects/ivory/staging"/>
<location name="temp" path="/tmp"/>
<location name="working" path="/projectsTest/ivory/working"/>
</locations>
<properties>
<property name="hive.metastore.client.socket.timeout" value="120"/>
<property name="field1" value="value1"/>
<property name="field2" value="value2"/>
</properties>
</cluster>
{code}
Target cluster:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><cluster xmlns="uri:falcon:cluster:0.1"
name="corp-0cd10609" description="" colo="ua2">
<interfaces>
<interface type="readonly" endpoint="hdfs://192.168.138.139:8020"
version="0.20.2"/>
<interface type="write" endpoint="hdfs://192.168.138.139:8020"
version="0.20.2"/>
<interface type="execute" endpoint="192.168.138.139:8021"
version="0.20.2"/>
<interface type="workflow"
endpoint="http://192.168.138.139:11000/oozie/" version="3.1"/>
<interface type="messaging"
endpoint="tcp://localhost:61617?daemon=true" version="5.1.6"/>
<interface type="registry" endpoint="thrift://192.168.138.139:14000"
version="0.11.0"/>
</interfaces>
<locations>
<location name="staging" path="/projects/ivory/staging"/>
<location name="temp" path="/tmp"/>
<location name="working" path="/projectsTest/ivory/working"/>
</locations>
<properties>
<property name="hive.metastore.client.socket.timeout" value="120"/>
<property name="field1" value="value1"/>
<property name="field2" value="value2"/>
</properties>
</cluster>
{code}
Feed xml:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><feed xmlns="uri:falcon:feed:0.1"
name="raaw-logs16-69d5f138" description="clicks log">
<frequency>hours(1)</frequency>
<timezone>UTC</timezone>
<late-arrival cut-off="hours(6)"/>
<clusters>
<cluster name="corp-6d9f1878" type="source">
<validity start="2010-01-01T20:00Z" end="2099-01-01T00:00Z"/>
<retention limit="months(9000)" action="delete"/>
</cluster>
<cluster name="corp-0cd10609" type="target">
<validity start="2010-01-01T20:00Z" end="2099-01-01T00:00Z"/>
<retention limit="months(9000)" action="delete"/>
<table
uri="catalog:default:HCatReplication_oneSourceOneTarget_hyphen#dt=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
</cluster>
</clusters>
<table
uri="catalog:default:HCatReplication_oneSourceOneTarget_hyphen#dt=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
<ACL owner="karishma" group="default" permission="0x755"/>
<schema location="hcat" provider="hcat"/>
<properties>
<property name="field1" value="value1"/>
<property name="field2" value="value2"/>
</properties>
</feed>
{code}
oozie-site.xml, with the property changed as asked:
{code:xml}
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=hadoop-conf,192.168.138.137:8021=/home/users/dataqa/srcconf,192.168.138.139:8021=/etc/hadoop/conf,192.168.138.137:8020=/home/users/dataqa/srcconf,192.168.138.139:8020=/etc/hadoop/conf</value>
<description>
Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is used
when there is no exact match for an authority. The HADOOP_CONF_DIR contains
the relevant Hadoop *-site.xml files. If the path is relative is looked within
the Oozie configuration directory; though the path can be absolute (i.e. to
point to Hadoop client conf/ directories in the local filesystem.
</description>
</property>
{code}
Here 8021 is the JobTracker (JT) port and 8020 is the NameNode (NN) port.
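As a rough illustration of the lookup rule in the property description above (exact AUTHORITY match first, otherwise the '*' wildcard), here is a small Python sketch. It is not Oozie's actual implementation; the function names `parse_hadoop_configurations` and `resolve_conf_dir` are made up for this example, and only the property value itself comes from the config above.

```python
def parse_hadoop_configurations(value):
    """Split a comma separated 'AUTHORITY=HADOOP_CONF_DIR' string into a dict."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        authority, _, conf_dir = entry.partition("=")
        mapping[authority.strip()] = conf_dir.strip()
    return mapping


def resolve_conf_dir(mapping, authority):
    """Return the conf dir for an exact HOST:PORT match, else the '*' entry.

    Returns None when there is neither an exact match nor a wildcard.
    """
    return mapping.get(authority, mapping.get("*"))


# Property value copied from the oozie-site.xml snippet above.
VALUE = (
    "*=hadoop-conf,"
    "192.168.138.137:8021=/home/users/dataqa/srcconf,"
    "192.168.138.139:8021=/etc/hadoop/conf,"
    "192.168.138.137:8020=/home/users/dataqa/srcconf,"
    "192.168.138.139:8020=/etc/hadoop/conf"
)
CONF = parse_hadoop_configurations(VALUE)

if __name__ == "__main__":
    # Source NameNode, target JobTracker, and an authority that is not listed.
    for authority in ("192.168.138.137:8020",
                      "192.168.138.139:8021",
                      "localhost:8020"):
        print(authority, "->", resolve_conf_dir(CONF, authority))
```

Under this reading, both ports on 192.168.138.137 map to the source conf dir, both ports on 192.168.138.139 map to /etc/hadoop/conf, and any other authority falls back to the relative hadoop-conf entry.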
> HCat replication fails on table-export
> --------------------------------------
>
> Key: FALCON-623
> URL: https://issues.apache.org/jira/browse/FALCON-623
> Project: Falcon
> Issue Type: Bug
> Components: replication
> Environment: QA
> Reporter: Karishma Gulati
>
> On scheduling a one-source, one-target HCat Replication job, table export
> fails, with error message:
> {code}
> JA008: File does not exist:
> /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-73741e09/1373320570ef25b7d7c1ee474f1f0428_1408529998170/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> {code}
> Oozie stack trace:
> {code}
> 2014-08-20 11:13:01,477 ERROR pool-2-thread-9 UserGroupInformation -
> SERVER[ip-192-168-138-139] PriviledgedActionException as:karishma
> (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.FileNotFoundException:
> File does not exist:
> /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> 2014-08-20 11:13:01,585 WARN pool-2-thread-9 ActionStartXCommand -
> SERVER[ip-192-168-138-139] USER[karishma] GROUP[-] TOKEN[]
> APP[FALCON_FEED_REPLICATION_raaw-logs16-105f5895]
> JOB[0000078-140813072435213-oozie-oozi-W]
> ACTION[0000078-140813072435213-oozie-oozi-W@table-export] Error starting
> action [table-export]. ErrorType [ERROR], ErrorCode [JA008], Message [JA008:
> File does not exist:
> /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar]
> org.apache.oozie.action.ActionExecutorException: JA008: File does not exist:
> /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> at
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
> at
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:396)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:930)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1085)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at
> org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:395)
> at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:73)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at
> org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:273)
> at
> org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:60)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at org.apache.oozie.command.XCommand.call(XCommand.java:352)
> at
> org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:241)
> at
> org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:55)
> at org.apache.oozie.command.XCommand.call(XCommand.java:283)
> at
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:701)
> Caused by: java.io.FileNotFoundException: File does not exist:
> /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:824)
> at
> org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
> at
> org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:821)
> at
> org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestampsAndCacheVisibilities(TrackerDistributedCacheManager.java:778)
> at
> org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:852)
> at
> org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743)
> at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:174)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:960)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:919)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:915)
> ... 20 more
> {code}
> I set up Falcon in distributed mode, using different clusters for source and
> target.