Thanks for your suggestion.

I wouldn't want to run a MapReduce job just to get the file into a
single tuple. Also, I can't be sure that the lines within each group
would come back in the same order they have in the file.
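
For reference, this is roughly how I read the suggestion. It is only a
sketch: '/files' is the path from my original question, the alias and
field names are made up, and the '\u0001' delimiter is an assumption (a
character I don't expect in the data, so each line stays in one field).

```pig
-- '-tagFile' prepends the source file name to every record.
lines = LOAD '/files' USING PigStorage('\u0001', '-tagFile')
        AS (filename:chararray, line:chararray);

-- One bag of lines per file. The GROUP is what triggers the extra
-- MapReduce job, and the order of tuples inside each bag is not
-- guaranteed to match the order of lines in the file.
by_file = GROUP lines BY filename;
```

As far as I can tell, nothing in the loaded records carries a line
number, so there is no field to order the bag by afterwards.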

Thanks

On 10 March 2015 at 06:39, Arvind S <arvind18...@gmail.com> wrote:

> While loading the files you can try using
> PigStorage(',','-tagFile'),
> then apply a regex to each line of the file, then group by file name.
>
>
> https://pig.apache.org/docs/r0.14.0/api/org/apache/pig/builtin/PigStorage.html
>
> *Cheers !!*
> Arvind
>
> On Fri, Mar 6, 2015 at 2:26 AM, Daniel Dai <da...@hortonworks.com> wrote:
>
> > Didn't realize any, but it should be pretty easy to write a customized
> > Loader/InputFormat for that.
> >
> > Daniel
> >
> > On 3/5/15, 6:18 AM, "Ronald Green" <green.ron...@gmail.com> wrote:
> >
> > >Hi,
> > >
> > >I'm looking for a loader function that will let me read each file as a
> > >record on its own so I'll be able to treat each as a single
> > >record/field.
> > >For example:
> > >
> > >a = load '/files' USING TheLoader() as (file:chararray);
> > >b = foreach a GENERATE REGEX_EXTRACT(file,'...');
> > >
> > >PigStorage and TextLoader return each line in the file as a
> > >record/tuple.
> > >
> > >Do you know of any other loader that can return an entire file as a
> > >record?
> > >
> > >Thanks,
> > >Ron
> >
> >
>
