You are applying DISTINCT to a tuple, not a bag. In your example, a DISTINCT on a field of each record/tuple would not make sense, because inside the FOREACH each tuple holds only a single value for that field. DISTINCT works on a bag or a whole relation, so to use it inside a nested FOREACH you first need to GROUP BY some key.
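A minimal sketch of both approaches, reusing the LOAD statement and field names from your message. The aliases pairs, uniq, grouped and per_file are just illustrative names, and grouping by FILE_NAME is only an assumption about which key you might want. I have not run this against your data.

```pig
A = LOAD '/examples/form_out/part-m-00000' USING PigStorage('\t')
    AS (FILE_NAME:chararray, FORM_ID:chararray, SET_ID:chararray);

-- Rough equivalent of "select distinct a, b from c":
-- project the two fields, then DISTINCT over the whole relation.
pairs = FOREACH A GENERATE FORM_ID, SET_ID;
uniq  = DISTINCT pairs;

-- Nested DISTINCT: only valid after a GROUP, because then A is a bag
-- inside each group (FILE_NAME as the key is an assumption).
grouped  = GROUP A BY FILE_NAME;
per_file = FOREACH grouped {
    p    = A.(FORM_ID, SET_ID);
    dist = DISTINCT p;
    GENERATE group, dist;
};
```

If all you need is the SQL-style distinct, the first form (uniq) should be enough; the nested form is for getting the distinct pairs per key.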
On Wed, Apr 11, 2012 at 1:53 PM, Mohit Anchlia <[email protected]> wrote:
> I am trying to get distinct from 2 fields in a record. something like
> select distinct a, b from c; So I wrote this in pig which is actually not
> working. I did:
>
>
> A = LOAD '/examples/form_out/part-m-00000' USING PigStorage('\t') AS
> (FILE_NAME:chararray,FORM_ID:chararray,SET_ID:chararray);
>
> B = foreach A {dist = DISTINCT A.FORM_ID, A.SET_ID; GENERATE dist;}
>
> ERROR 1000: Error during parsing. Invalid alias: A in {FILE_NAME: chararray
> ...
>
> But this doesn't seem to be working. I thought A is a tuple and form_id and
> set_id are fields that I can do DISTINCT on. I saw similar example online
> but not exactly same.
>
