Do you get the same effect if you specify exactly one of those folders?
If it still happens, and the files are gzipped, you may be using a buggy
InputFormat.
See this issue:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/MAPREDUCE-2094
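
A quick sanity check on the numbers in the original message, as a rough
sketch only (the variable names are mine, the figures are the ones quoted
below): the observed count is an exact multiple of the expected count, and
the factor matches the number of folders in the glob. That is consistent
with every record being read once per folder, but it does not by itself
say where the duplication comes from.

```python
# Figures taken from the quoted message; variable names are illustrative.
expected_records = 62174   # what the input should contain
observed_records = 248696  # what the script actually produced
folders = 4                # folder_1 .. folder_4 in the LOAD glob

# If the duplication is exact, the remainder is zero and the factor
# equals the number of folders.
factor, remainder = divmod(observed_records, expected_records)
print("duplication factor:", factor)
print("leftover records:", remainder)
print("factor matches folder count:", factor == folders)
```

If the factor drops to 1 when only one folder is given, the glob expansion
is the more likely suspect; if the count is still inflated, the input format
is.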
On Dec 24, 2014 5:54 PM, "Rodrigo Ferreira" <web...@gmail.com> wrote:

> Hi everyone, happy holidays!
>
> I have a Pig script that reads from 4 different folders in Amazon S3. This
> is the code:
>
> load_1 = LOAD 's3n://mybucket/{folder_1,folder_2,folder_3,folder_4}'
> USING...;
>
> It happens that, instead of reading each folder just once and appending
> the files, Pig/Hadoop reads each folder 4 times.
>
> The input should have 62174 records, but in the end I get 248696.
>
> Why is that? Any ideas?
>
> Thanks,
> Rodrigo.
>
