Thanks Jonathan. I did the following:
a = load 'data1';
b = load 'data2';
c = join a by $0 left outer, b by $0;
d = filter c by $1 is null;
e = foreach d generate $0;

________________________________________
From: Jonathan Coveney [[email protected]]
Sent: Tuesday, January 24, 2012 1:53 PM
To: [email protected]
Subject: Re: pig script similar to select from not in in SQL

I would do the following (obviously this is a bit shorthand):

a = load 'data1';
b = load 'data2';
c = cogroup a by $0, b by $0;
d = filter c by IsEmpty(b);

d would be a relation with only the keys and their corresponding rows which exist in a

2012/1/24 Chan, Tim <[email protected]>

> I would like to generate a set of data that represents the items not found
> in another set.
> How would I do this using Pig?
>
> I'm thinking I would do an outer join and then filter off the items that
> were matched.
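
For anyone comparing the two scripts in this thread: both compute an anti-join (rows of data1 whose key has no match in data2). The following is a rough Python sketch of the semantics only, not of how Pig executes either script. The function names, the sample rows, and the a_width/b_width parameters are made up for illustration. One thing it makes visible: after a join, the columns of a come first and the columns of b follow, so b's key sits at position a_width. A filter on `$1 is null` therefore tests b's key only if data1 has a single column, which is an assumption about the data not stated in the thread.

```python
from collections import defaultdict


def anti_join_cogroup(a, b):
    """Model of: c = cogroup a by $0, b by $0; d = filter c by IsEmpty(b)."""
    groups = defaultdict(lambda: ([], []))  # key -> (rows from a, rows from b)
    for row in a:
        groups[row[0]][0].append(row)
    for row in b:
        groups[row[0]][1].append(row)
    # Keep only the groups whose bag of b rows is empty.
    return {key: a_rows for key, (a_rows, b_rows) in groups.items() if not b_rows}


def anti_join_left_outer(a, b, a_width, b_width):
    """Model of: left outer join on $0, keep rows where b's key is null, emit a's key."""
    b_by_key = defaultdict(list)
    for row in b:
        b_by_key[row[0]].append(row)

    joined = []
    for row in a:
        matches = b_by_key.get(row[0])
        if matches:
            joined.extend(row + match for match in matches)
        else:
            # Unmatched rows of a are padded with nulls for every column of b.
            joined.append(row + (None,) * b_width)

    # b's key is the first column after a's columns, i.e. index a_width.
    return [row[0] for row in joined if row[a_width] is None]


# Hypothetical sample data: data1 has two columns, data2 has two columns.
data1 = [("x", 1), ("y", 2), ("z", 3)]
data2 = [("y", "foo")]

only_in_a_cogroup = anti_join_cogroup(data1, data2)
only_in_a_join = anti_join_left_outer(data1, data2, a_width=2, b_width=2)
```

With this sample data both versions agree on the keys "x" and "z". The cogroup version keeps the full rows from data1 grouped per key, while the join version returns just the keys, matching the final `foreach ... generate $0` step.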
