Take a look at the -tagFile and -tagPath options of PigStorage:
http://pig.apache.org/docs/r0.13.0/api/org/apache/pig/builtin/PigStorage.html
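
Something along these lines should do it. This is only a minimal sketch based on your LOAD statement and I have not run it against your files. It assumes that '-tagFile' puts the source file name in the first field of each tuple, as the linked Javadoc describes, so the schema gets one extra leading column. The column name "filename" is just my choice, and I dropped the space after the comma in the path list on the assumption that it would otherwise be read as part of the second file name.

```pig
-- '-tagFile' adds the input file name as the first field of every tuple,
-- so the AS clause declares an extra leading column for it.
-- "filename" is an illustrative name, not something Pig requires.
csv_all = LOAD 'sample1.csv,sample2.csv'
          USING PigStorage('|', '-tagFile')
          AS (filename:chararray, upc:chararray, store_id:int,
              date:chararray, product_description:chararray);
```

If you need the full path rather than just the file name, pass '-tagPath' in place of '-tagFile'.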

On Monday, August 18, 2014, Harrison Cavallero <[email protected]> wrote:

> Hey all,
>
> I'm loading a group of csv files into pig storage, and I would like to
> include the filename in each tuple loaded from that file. So as to
> differentiate the tuple as unique to coming from that file (each file is
> for a particular user).
>
> So for example:
> csv_all =LOAD 'sample1.csv, sample2.csv' USING PigStorage('|')
> AS (upc:chararray, store_id:int, date:chararray,
> product_description:chararray);
>
> Is there a way to load each tuple from each csv to include another field
> that contains the filename or part of it (like filename:chararry)?
>
> Thanks in advance!
>
> --
> Harrison Cavallero
>
> *cavallero.me <http://cavallero.me>*
>
