Take a look at the -tagFile and -tagPath options of PigStorage:

http://pig.apache.org/docs/r0.13.0/api/org/apache/pig/builtin/PigStorage.html
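
As a rough, untested sketch applied to the query quoted below: passing '-tagFile' as the second PigStorage argument should prepend the source file name as the first field of each tuple ('-tagPath' prepends the full path instead). The field name "filename" is my own choice, and I am assuming the AS schema has to list that extra leading field.

```pig
-- '-tagFile' adds the input file name as the first field of every tuple.
-- "filename" is an illustrative name; the other fields come from the original query.
-- Paths are comma-separated with no space, so the space is not read as part of a path.
csv_all = LOAD 'sample1.csv,sample2.csv'
          USING PigStorage('|', '-tagFile')
          AS (filename:chararray, upc:chararray, store_id:int,
              date:chararray, product_description:chararray);

-- Check that the first field holds the file name before building on it.
DESCRIBE csv_all;
```

If only part of the file name is needed (for example a user id), it could be extracted in a later FOREACH with a string function such as REGEX_EXTRACT.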

On Monday, August 18, 2014, Harrison Cavallero <[email protected]>
wrote:

> Hey all,
>
> I'm loading a group of CSV files with PigStorage, and I would like to
> include the filename in each tuple loaded from that file, so that each
> tuple can be identified as coming from that file (each file is for a
> particular user).
>
> So for example:
> csv_all = LOAD 'sample1.csv, sample2.csv' USING PigStorage('|')
> AS (upc:chararray, store_id:int, date:chararray,
> product_description:chararray);
>
> Is there a way to load each tuple from each CSV so that it includes
> another field containing the filename or part of it (like
> filename:chararray)?
>
> Thanks in advance!
>
> --
> Harrison Cavallero
>
> *cavallero.me <http://cavallero.me>*
>
