Re: left join on multiple columns

2015-01-08 Thread David Warshaw
Hi Patcharee, I wasn't able to reproduce either issue on Pig 0.14.0. 1: -- grunt> dump join_height; (1,1,2009,0,559,447,1,-4.964739,1,1,2009,0,559,447,1,109.71929) grunt> describe join_height; join_height: {r_four_dim1::date:

Re: split data into multiple files

2015-01-06 Thread David Warshaw
Carrying headers and trailers through Pig (or really any ETL pipeline) as data rows will be awkward. De-concatenated (or pre-concatenated) files with the metadata already stripped out could be loaded using the PigStorage loader with the -tagPath option. This would allow you to differentiate the