Thanks Jonathan.

I did the following:

a = load 'data1';
b = load 'data2';
c = join a by $0 left outer, b by $0;
d = filter c by $1 is null;
e = foreach d generate $0;
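
For reference, here is a sketch of the same left outer join with named fields. The field names (id, val) and their types are assumptions for illustration, not the real layout of data1 or data2. Positional $1 points at b's key only when a has exactly one column, so naming the fields and testing b::id keeps the null check unambiguous. As I understand the Pig documentation, an outer join also expects a schema on the side that has to produce nulls.

```pig
-- Assumed schemas: id and val are hypothetical field names.
a = load 'data1' as (id:chararray, val:chararray);
b = load 'data2' as (id:chararray, val:chararray);

-- Keep every row of a; rows with no match in b get nulls in all b columns.
c = join a by id left outer, b by id;

-- An unmatched row is one where b's join key is null.
d = filter c by b::id is null;

-- Project only the key from the a side.
e = foreach d generate a::id;
```

Rows of a whose own key is null never match anything in b, so they also pass the filter.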
________________________________________
From: Jonathan Coveney [[email protected]]
Sent: Tuesday, January 24, 2012 1:53 PM
To: [email protected]
Subject: Re: pig script similar to select from not in in SQL

I would do the following (obviously this is a bit shorthand):

a = load 'data1';
b = load 'data2';
c = cogroup a by $0, b by $0;
d = filter c by IsEmpty(b);

d would be a relation with only the keys, and their corresponding rows,
that exist in a but not in b
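
If the original rows of a are wanted rather than the grouped form, the filtered result can be flattened. This is only a sketch: the field names (id, val) and their types are assumed for illustration and do not come from the real data.

```pig
-- Assumed schemas: id and val are hypothetical field names.
a = load 'data1' as (id:chararray, val:chararray);
b = load 'data2' as (id:chararray, val:chararray);

-- Each tuple of c is (group, bag of matching a rows, bag of matching b rows).
c = cogroup a by id, b by id;

-- Keep only the keys for which b contributed no rows.
d = filter c by IsEmpty(b);

-- Unnest the bag of a rows to get back one tuple per original row of a.
e = foreach d generate flatten(a);
```

Keys that occur only in b have a non-empty b bag, so the filter drops them and the flatten never sees an empty a bag.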

2012/1/24 Chan, Tim <[email protected]>

> I would like to generate a set of data that represents the items not found
> in another set.
> How would I do this using Pig?
>
> I'm thinking I would do an outer join and then filter off the items that
> were matched.
>
>
