[ 
https://issues.apache.org/jira/browse/PIG-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281201#comment-14281201
 ] 

Daniel Dai commented on PIG-4386:
---------------------------------

There is no hard limit on the number of files in the input directory. You need
to go to the JobTracker UI to find the real error message.

> How many files can be submitted to a pig job at once?
> -----------------------------------------------------
>
>                 Key: PIG-4386
>                 URL: https://issues.apache.org/jira/browse/PIG-4386
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.13.1
>         Environment: {code}
> $pig --version
> Apache Pig version 0.13.1-mapr-1410 (rexported) 
> compiled Nov 05 2014, 10:16:28
> {code}
>            Reporter: Madhavi Nadig
>
> Pig fails mysteriously when I specify the root of a large directory tree as 
> the LOAD input in my script. The exception that it throws offers no insight 
> into what's happening. The same script works perfectly when there are fewer 
> files.
> It's a very simple script as you can see below:
> {code}
> SET pig.noSplitCombination true;
> raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
> filtered = FILTER raw_record by $1 == 251068;
> filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2;
> STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
> {code}
> Here's the error message I see :
> {code}
>    ERROR 2244: Job scope-594 failed, hadoop does not return any error message
> org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 failed, hadoop does not return any error message
>     at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
>     at org.apache.pig.Main.run(Main.java:608)
>     at org.apache.pig.Main.main(Main.java:156)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
> How many files can Pig process at once?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
