Hi,

Just wondering why you need a 'NOT' UDF.
(The comments below are not related to Alex's mail.)

On Wednesday 21 April 2010 10:55 PM, hc busy wrote:
In a previous version I've had to write a UDF called NOT that I invoke with

T = filter U by my.udf.NOT($2 is null);


For this there is a Pig construct: IIRC "$2 IS NOT NULL" works, so you don't need the UDF for that ...



-- and

T = filter U by my.udf.NOT(IsEmpty($3));

Could "IsEmpty($3) != true" (or "IsEmpty($3) == false") replace the NOT UDF?
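
For reference, a minimal sketch of both filters written with Pig's built-in operators instead of a UDF. The input file name 'u_data' and the field names are made up for illustration, and whether the boolean NOT operator is accepted in front of IsEmpty may depend on the Pig version in use:

```pig
-- Hypothetical input: four columns, the last one a bag.
U = LOAD 'u_data' USING PigStorage()
    AS (a:chararray, b:chararray, c:chararray, d:bag{t:tuple(v:chararray)});

-- Keep rows whose third field is set, using the built-in null test.
T1 = FILTER U BY $2 IS NOT NULL;

-- Keep rows whose fourth field (a bag) is non-empty,
-- using the boolean NOT operator rather than my.udf.NOT.
T2 = FILTER U BY NOT IsEmpty($3);
```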


Regards,
Mridul


It was for an older version of Pig, so YMMV.


2010/4/21 Mridul Muralidharan<mrid...@yahoo-inc.com>


In case of co-group, if nothing matched the group key, you get an empty
bag, not null.

So checking for COUNT(alias) == 0 is what you need.
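
A minimal sketch of that check, assuming two made-up input files 'left_data' and 'right_data' keyed by their first column. Note that this applies to COGROUP output. A LEFT OUTER JOIN, by contrast, pads unmatched rows with null fields rather than empty bags:

```pig
-- Hypothetical inputs, keyed by the first column.
A = LOAD 'left_data'  USING PigStorage() AS (k:chararray, x:chararray);
B = LOAD 'right_data' USING PigStorage() AS (k:chararray, y:chararray);

-- COGROUP yields one row per key, with one bag per input relation.
G = COGROUP A BY k, B BY k;

-- Keys with no match in B: B's bag is empty, not null.
unmatched = FILTER G BY IsEmpty(B);
-- The COUNT-based form of the same test would be: COUNT(B) == 0

DUMP unmatched;
```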


Regards,
Mridul



On Wednesday 21 April 2010 03:37 PM, Alexander Schätzle wrote:

Hello,

I want to use IS NULL in a FILTER, but the behavior looks like a bug:

I make a LeftJoin with a result of 7 tuples with fields 's' and 'nick'.
4 tuples have a value for 'nick', the other 3 don't have a value for
'nick'.
Afterwards I want to filter so that only the 3 tuples without a nick are
left:

Filter1 = FILTER LeftJoin1 BY nick is null;

But as a result I get all 7 tuples, and none of them has a nick any more!
So what's going on there!?
If I use IS NOT NULL instead I get all 7 tuples unchanged!


This is the complete script, input data can be found in the attachment:

indata = LOAD 'foaf' USING PigStorage() AS (s,p,o);

f1 = FILTER indata BY p == 'ex:type' AND o == 'ex:Person';
BGP1 = FOREACH f1 GENERATE s AS s;

f1 = FILTER indata BY p == 'ex:nick';
BGP2 = FOREACH f1 GENERATE s AS s, o AS nick;

lj1 = JOIN BGP1 BY s LEFT OUTER, BGP2 BY s;
LEFTJOIN1 = FOREACH lj1 GENERATE $0 AS s, $2 AS nick;

FILTER1 = FILTER LEFTJOIN1 BY nick is null;

STORE FILTER1 INTO 'outfile' USING PigStorage();


Can anyone help me figure out what's going wrong?

Thx,
Alex



