I switched to using the CSVLoader in piggybank, and appended the filepath to
the current RecordReader instead.
-Kim
On Thu, Feb 3, 2011 at 10:11 PM, Dmitriy Ryaboy wrote:
There's a CSV loader in the piggybank that does proper CSV escaping,
if you are interested.
On Thu, Feb 3, 2011 at 9:53 PM, Kim Vogt wrote:
And to include the filename in the tuple with the data, I copied PigStorage
(I'm loading csv), created a private PigSplit object, set this object in
"prepareToRead", and added this code before returning the tuple in
"getNext",
if (mSplit != null) {
FileSplit fs = (FileSplit) mSplit.getWrappedS
Wow, I almost got it right. Double quotes fail; single quotes work.
Thanks.
On Thu, Feb 3, 2011 at 9:40 PM, Kim Vogt wrote:
This should work:
grunt> B = FOREACH A GENERATE f1, 'filename-2011-02-03';
or
grunt> B = FOREACH A GENERATE f1, '$paramName';
-Kim
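
For reference, the parameter form can be sketched as a standalone script. This is only a minimal sketch under assumptions: the script name add_source.pig, the input path 'input', the comma delimiter, and the field names are all hypothetical. Only the -param flag and the single-quoted '$paramName' come from this thread.

```
-- add_source.pig (hypothetical script name)
-- Invoke with: pig -param paramName=filename_2011-02-03 add_source.pig

-- Hypothetical input path and schema, for illustration only.
A = LOAD 'input' USING PigStorage(',') AS (f1:chararray, f2:chararray);

-- $paramName is substituted into the script text before parsing, so the
-- single quotes turn the substituted value into a chararray literal.
B = FOREACH A GENERATE f1, '$paramName' AS src_file;

DUMP B;
```

Every tuple in B then carries the same literal value in its last field, which is the behavior asked about when the data source is already known at invocation time.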
On Thu, Feb 3, 2011 at 8:32 PM, Dexin Wang wrote:
Similarly, is it possible to insert some literal values into a tuple stream?
For example, when I invoke my Pig script, I already know what the data source is
(say, it's from filename_2011-02-03), so I can just pass it to Pig using
-param, and I want to insert this known file name to the tuple stream. Ho
Thanks Dmitriy!
I'm using pig 8 and no splitCombination (I don't think). I accept this
challenge and will keep you pig'ites updated.
-Kim
On Feb 3, 2011, at 7:49 PM, Dmitriy Ryaboy wrote:
In pig 6, you can hook into bindTo() and save the file name.
In pig 8 you have to find your way to the underlying InputSplit via
PigSplit.getWrappedSplit(), cast it to FileSplit, and call getPath()
on it, I think. I haven't done this myself.
This will totally break if you have splitCombination turned on,
Hey,
I have a bunch of files where the filename is significant. I'm loading the
files by supplying the top level directory that contains the files. Is
there a way to capture each file's name and append it to the tuples of
data loaded from that file?
-Kim