Hi Harsh,

Many thanks! I got rid of the problem by updating the InputSplit's getLocations() to return hosts.

Patcharee

On 04/11/2014 06:16 AM, Harsh J wrote:
Do not use the InputSplit's getLocations() API to supply your file
path, it is not intended for such things, if thats what you've done in
your current InputFormat implementation.

If you're looking to store a single file path, use the FileSplit
class, or if not as simple as that, do use it as a base reference to
build you Path based InputSplit derivative. Its sources are at
https://github.com/apache/hadoop-common/blob/release-2.4.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileSplit.java.
Look for the Writable method overrides in particular to understand how
to use custom fields.

On Thu, Apr 10, 2014 at 9:54 PM, Patcharee Thongtra
<patcharee.thong...@uni.no> wrote:
Hi,

I wrote a custom InputFormat and InputSplit to handle netcdf file. I use
with a custom pig Load function. When I submitted a job by running a pig
script. I got an error below. From the error log, the network location name
is "hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02" -
my input file, containing "/", and hadoop does not allow.

It could be something missing in my custom InputFormat and InputSplit. Any
ideas? Any help is appreciated,

Patcharee


2014-04-10 17:09:01,854 INFO [CommitterEvent Processor #0]
org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing
the event EventType: JOB_SETUP

2014-04-10 17:09:01,918 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl:
job_1387474594811_0071Job Transitioned from SETUP to RUNNING

2014-04-10 17:09:01,982 INFO [AsyncDispatcher event handler]
org.apache.hadoop.yarn.util.RackResolver: Resolved
hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 to
/default-rack

2014-04-10 17:09:01,984 FATAL [AsyncDispatcher event handler]
org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.IllegalArgumentException: Network location name contains /:
hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02
     at org.apache.hadoop.net.NodeBase.set(NodeBase.java:87)
     at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:65)
     at
org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:111)
     at
org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.<init>(TaskAttemptImpl.java:548)
     at
org.apache.hadoop.mapred.MapTaskAttemptImpl.<init>(MapTaskAttemptImpl.java:47)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.MapTaskImpl.createAttempt(MapTaskImpl.java:62)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAttempt(TaskImpl.java:594)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAndScheduleAttempt(TaskImpl.java:581)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.access$1300(TaskImpl.java:100)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:871)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:866)
     at
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
     at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
     at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
     at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:632)
     at
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:99)
     at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1237)
     at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231)
     at
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
     at
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)
     at java.lang.Thread.run(Thread.java:662)
2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler]
org.apache.hadoop.



Reply via email to