Carrying headers and trailers through Pig (or really any ETL pipeline) as data rows is awkward. If the sections arrive as separate files with the header and trailer already stripped out, either because you split the concatenated file first or because you get hold of the files before they are concatenated, you can load them with the PigStorage loader and its -tagPath option. That option prepends the source file path to each record, so you can tell the records apart by source in your script.
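If the data only exists as the single concatenated file, the split could be done outside Pig before loading. Below is a rough Python sketch of that pre-processing step, not something from the Pig distribution. It assumes a section starts with a header line, ends with an all-digit trailer holding the record count, and that no data record consists only of digits. The function names and the one-file-per-header naming are my own placeholders.

```python
from pathlib import Path


def split_sections(lines):
    """Group lines into (header, records) pairs.

    Assumed layout: header line, N record lines, then a trailer line
    that is the decimal count N. Blank lines are skipped.
    """
    sections = []
    header, records = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        if header is None:
            # First non-blank line of a section is treated as its header.
            header = line
        elif line.isdigit():
            # All-digit line is treated as the trailer; check the count.
            if int(line) != len(records):
                raise ValueError(
                    f"section {header!r}: trailer says {line}, "
                    f"found {len(records)} records"
                )
            sections.append((header, records))
            header, records = None, []
        else:
            records.append(line)
    if header is not None:
        raise ValueError(f"section {header!r} has no trailer")
    return sections


def write_sections(sections, out_dir):
    """Write each section's records (no header/trailer) to out_dir/<header>.txt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for header, records in sections:
        path = out / f"{header}.txt"  # hypothetical naming scheme
        path.write_text("".join(record + "\n" for record in records))
        paths.append(path)
    return paths
```

The resulting directory could then be loaded with something like PigStorage('\t', '-tagPath'), so the first field of each record identifies the section it came from. If you need the header and trailer back in the output files, they can be re-derived from the file name and a COUNT per group.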

On Tue, Jan 6, 2015 at 9:29 AM, Jumsheed <[email protected]> wrote:

> Yes i checked SPLIT and MultiStorage , but i didn't find find any way to
> group each section.
>
> On Tue, Jan 6, 2015 at 8:55 AM, Shahab Yunus <[email protected]>
> wrote:
>
> > Have you looked at the SPLIT operator in Pig? Does that help?
> > http://pig.apache.org/docs/r0.12.0/basic.html#SPLIT
> >
> > Regards,
> > Shahab
> >
> > On Tue, Jan 6, 2015 at 8:51 AM, Jumsheed <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I have a file with data in below format,
> > >
> > > A
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 3
> > > B
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 2
> > > C
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 4
> > >
> > > i need to create three files like
> > >
> > > file1:
> > > A
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 3
> > >
> > > file2:
> > > B
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 2
> > >
> > > file3:
> > > C
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > abcdefghijklmnop
> > > 4
> > >
> > > is there any way you can suggest?
> > >
> > > Thanks
> > > Jumsheed
