Re: Use Filename in Tuple

2011-02-04 Thread Kim Vogt
I switched to using the CSVLoader in piggybank, and appended the filepath to the current RecordReader instead.
-Kim
On Thu, Feb 3, 2011 at 10:11 PM, Dmitriy Ryaboy wrote:
> There's a CSV loader in the piggybank that does proper CSV escaping,
> if you are interested.
>
> On Thu, Feb 3, 2011 at 9

Re: Use Filename in Tuple

2011-02-03 Thread Dmitriy Ryaboy
There's a CSV loader in the piggybank that does proper CSV escaping, if you are interested.
On Thu, Feb 3, 2011 at 9:53 PM, Kim Vogt wrote:
> And to include the filename in the tuple with the data, I copied PigStorage
> (I'm loading csv), created a private PigSplit object, set this object in
> "p

Re: Use Filename in Tuple

2011-02-03 Thread Kim Vogt
And to include the filename in the tuple with the data, I copied PigStorage (I'm loading csv), created a private PigSplit object, set this object in "prepareToRead", and added this code before returning the tuple in "getNext":
if (mSplit != null) {
    FileSplit fs = (FileSplit) mSplit.getWrappedS
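
The pattern described in this message (remember the split in "prepareToRead", then append its path to each tuple in "getNext") can be sketched roughly as below. This is only an illustration and not the code from the thread: SketchFileSplit and FilenameAppendingLoader are hypothetical stand-ins so the example runs without Pig or Hadoop on the classpath. In a real Pig 8 loader the split would be the PigSplit passed to prepareToRead, unwrapped with getWrappedSplit() and cast to FileSplit as the thread describes. The comma split is naive and does no CSV escaping.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the wrapped file split. In the thread this role is
// played by PigSplit.getWrappedSplit() cast to FileSplit.
final class SketchFileSplit {
    private final String path;

    SketchFileSplit(String path) {
        this.path = path;
    }

    String getPath() {
        return path;
    }
}

// Hypothetical loader that mimics the prepareToRead/getNext shape of a Pig loader.
final class FilenameAppendingLoader {
    private SketchFileSplit split;   // remembered in prepareToRead
    private Iterator<String> lines;  // stand-in for the RecordReader

    // Called once per split: keep the reader and the split for later use.
    void prepareToRead(Iterator<String> reader, SketchFileSplit split) {
        this.lines = reader;
        this.split = split;
    }

    // Returns the next tuple with the file path appended as the last field,
    // or null when the input is exhausted.
    List<Object> getNext() {
        if (lines == null || !lines.hasNext()) {
            return null;
        }
        List<Object> tuple = new ArrayList<>();
        // Naive comma split; a real CSV loader would handle quoting and escaping.
        for (String field : lines.next().split(",", -1)) {
            tuple.add(field);
        }
        if (split != null) {
            tuple.add(split.getPath());
        }
        return tuple;
    }
}
```

Each call to getNext() then yields the original fields followed by the path of the file the record came from. As noted elsewhere in the thread, this relies on one file per split, so it would not hold up with splitCombination turned on.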

Re: Use Filename in Tuple

2011-02-03 Thread Dexin Wang
wow, I almost got it right. Double quote, fails. Single quote, works. Thanks.
On Thu, Feb 3, 2011 at 9:40 PM, Kim Vogt wrote:
> This should work:
>
> grunt> B = FOREACH A GENERATE f1, 'filename-2011-02-03';
>
> or
>
> grunt> B = FOREACH A GENERATE f1, '$paramName';
>
> -Kim
>
> On Thu, Feb 3, 2

Re: Use Filename in Tuple

2011-02-03 Thread Kim Vogt
This should work:
grunt> B = FOREACH A GENERATE f1, 'filename-2011-02-03';
or
grunt> B = FOREACH A GENERATE f1, '$paramName';
-Kim
On Thu, Feb 3, 2011 at 8:32 PM, Dexin Wang wrote:
> Similarly, is it possible to insert some literal values to a tuple stream?
>
> For example, when I invoke my

Re: Use Filename in Tuple

2011-02-03 Thread Dexin Wang
Similarly, is it possible to insert some literal values to a tuple stream? For example, when I invoke my Pig script, I already know what the data source is (say, it's from filename_2011-02-03), so I can just pass it to Pig using -param, and I want to insert this known file name into the tuple stream. Ho

Re: Use Filename in Tuple

2011-02-03 Thread Kim Vogt
Thanks Dmitriy! I'm using pig 8 and no splitCombination (I don't think). I accept this challenge and will keep you pig'ites updated.
-Kim
On Feb 3, 2011, at 7:49 PM, Dmitriy Ryaboy wrote:
> In pig 6, you can hook into bindTo() and save the file name.
>
> In pig 8 you have to find your way t

Re: Use Filename in Tuple

2011-02-03 Thread Dmitriy Ryaboy
In pig 6, you can hook into bindTo() and save the file name. In pig 8 you have to find your way to the underlying InputSplit via PigSplit.getWrappedSplit(), cast it to FileSplit, and call getPath() on it... I think. Haven't done this. This will totally break if you have splitCombination turned on,

Use Filename in Tuple

2011-02-03 Thread Kim Vogt
Hey, I have a bunch of files where the filename is significant. I'm loading the files by supplying the top level directory that contains the files. Is there a way to capture the filename of the file and append it to the tuple of data that's in that file? -Kim