[ https://issues.apache.org/jira/browse/PIG-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281201#comment-14281201 ]
Daniel Dai commented on PIG-4386:
---------------------------------
There is no hard limit on the number of files in the input directory. You need
to go to the JobTracker UI to find the real error message.
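
To look up a job in the JobTracker you first need its Hadoop job ID. A minimal sketch, assuming the Pig client output was saved to a log file (the name pig_run.log is hypothetical) and that the job got far enough to be assigned an ID of the usual job_<timestamp>_<sequence> form:

```shell
#!/bin/bash
# Sketch only: pull Hadoop job IDs out of a saved Pig client log so the
# matching job can be found in the JobTracker. Nothing here contacts a cluster.

# Print each distinct job ID (job_<timestamp>_<sequence>) found in the file $1.
extract_job_ids() {
    grep -oE 'job_[0-9]+_[0-9]+' "$1" | sort -u
}

# Possible follow-up on a machine with a configured Hadoop 1.x client
# (assumption: classic JobTracker setup; not executed by this sketch):
#   for id in $(extract_job_ids pig_run.log); do
#       hadoop job -status "$id"
#   done
```

If the log contains no job ID, the job probably failed before or during submission, and the Pig client log itself is the place to look.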
> How many files can be submitted to a pig job at once?
> -----------------------------------------------------
>
> Key: PIG-4386
> URL: https://issues.apache.org/jira/browse/PIG-4386
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.1
> Environment: {code}
> $pig --version
> Apache Pig version 0.13.1-mapr-1410 (rexported)
> compiled Nov 05 2014, 10:16:28
> {code}
> Reporter: Madhavi Nadig
>
> Pig fails mysteriously when I specify the root of a large directory tree as
> the LOAD input in my script. The exception that it throws offers no insight
> into what's happening. The same script works perfectly when there are fewer
> files.
> It's a very simple script as you can see below:
> {code}
> SET pig.noSplitCombination true;
> raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
> filtered = FILTER raw_record by $1 == 251068;
> filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2;
> STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
> {code}
> Here's the error message I see:
> {code}
> ERROR 2244: Job scope-594 failed, hadoop does not return any error message
> org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 failed, hadoop does not return any error message
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
> at org.apache.pig.Main.run(Main.java:608)
> at org.apache.pig.Main.main(Main.java:156)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
> How many files can Pig process at once?