Re: Multiple files with AvroStorage and comma separated lists

Philipp Tue, 24 Jan 2012 08:28:04 -0800

Hi Stan,

thanks a lot for your answer. We would be very interested and more than glad if you could provide a patch.


Regards, Philipp

On 01/24/2012 05:22 PM, Stan Rosenberg wrote:

Philipp,

I would say that it is a bug.  I ran into the same problem some time
ago.  Essentially, AvroStorage recognizes neither globs nor commas,
both of which are supported by Hadoop's FileInputFormat.  I ended up
patching AvroStorage to make it compatible with Hadoop's semantics of
input paths.  I haven't submitted a patch, though.  If there is some
interest, I'd be more than glad to submit it.

Best,

stan
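
To illustrate the comma handling described above, here is a minimal Java sketch. It is not Stan's patch and not Hadoop's actual code; the class name InputPathSplitter and the method split are hypothetical. It assumes the desired behavior is to split the LOAD location on commas while leaving commas inside a {...} glob alone, which is how I understand FileInputFormat treats comma-separated input paths.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: not part of Pig, piggybank, or Hadoop.
final class InputPathSplitter {

    // Splits a comma-separated list of paths. Commas inside a {...} glob
    // group are kept, so "repo_{1,2}/x" remains a single path pattern.
    static List<String> split(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char c : commaSeparatedPaths.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        // The location string from the LOAD statement in this thread.
        System.out.println(split("repo_1/part-r-00000.avro,repo_2/part-r-00000.avro"));
        // A glob with an inner comma stays in one piece.
        System.out.println(split("repo_{1,2}/part-r-*.avro,repo_3"));
    }
}
```

Each resulting entry could then be registered as a separate input path; expanding the globs themselves is left to the file system layer and is not shown here.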


On Tue, Jan 24, 2012 at 4:26 AM, Philipp<[email protected]>  wrote:

Dear Pig users,

I tried to load several files with AvroStorage by using a comma separated
list. The statement I used is:

test_data = LOAD 'repo_1/part-r-00000.avro,repo_2/part-r-00000.avro' USING
org.apache.pig.piggybank.storage.avro.AvroStorage();

Pig states that no input paths were specified in the job. Please see the
stacktrace below.
I tried Pig versions 0.8.1-cdh3u2 and 0.9.1.

Does anyone observe the same behavior? Is it a bug or a feature?

Thanks, Philipp





/Stacktrace:/

org.apache.pig.backend.executionengine.ExecException: ERROR 2118: No input paths specified in job
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.io.IOException: No input paths specified in job
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:186)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
    ... 7 more
