Hi,

After a COGROUP, I would like to filter the results by one of the fields,
but I get this error: "Operand of Regex can be CharArray only". The relevant
lines in my script are:
x1 = COGROUP p3 BY domain, rdt1 BY from, f4 BY target;
x2 = FILTER x1 BY ( IsEmpty(p3) AND (IsEmpty(rdt1) OR (rdt1.to matches '.*com')) );
x3 = FOREACH x2 GENERATE flatten(f4);

DESCRIBE x1 gives:
x1: {group: chararray,p3: {domain: chararray},rdt1: {from: chararray,to: chararray},f4: {source: chararray,target: chararray}}

I'm not sure why the error occurs. Is it because rdt1 inside x1 is a bag, so
that several rdt1 tuples can exist in the same group?
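
If that is the reason, then rdt1.to is itself a bag of chararrays and not a
single chararray, and I suppose the match would have to be applied to each
tuple inside the bag. Below is an untested sketch of what I have in mind,
using a nested FOREACH. The alias com_to is a name I made up, and I am
assuming that "at least one rdt1.to in the group matches" is the condition
I want:

```pig
-- Untested sketch: apply the regex per tuple inside the rdt1 bag.
x1 = COGROUP p3 BY domain, rdt1 BY from, f4 BY target;
x2 = FOREACH x1 {
    -- com_to (hypothetical alias) keeps only the rdt1 tuples whose 'to' ends in "com"
    com_to = FILTER rdt1 BY to matches '.*com';
    GENERATE group, p3, rdt1, f4, com_to;
};
-- Keep groups with no p3 tuples and either no rdt1 tuples or at least one matching 'to'
x3 = FILTER x2 BY ( IsEmpty(p3) AND (IsEmpty(rdt1) OR NOT IsEmpty(com_to)) );
x4 = FOREACH x3 GENERATE flatten(f4);
```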

I can work around this with the following script:
x1 = COGROUP p3 BY domain, rdt1 BY from, f4 BY target parallel 32;
x2 = FOREACH x1 GENERATE flatten(f4), COUNT(p3) as p3_count, COUNT(rdt1) as rdt1_count, flatten(rdt1.to);
x3 = FILTER x2 BY ( p3_count==0 AND (rdt1_count==0 OR (to matches '.*com')) );
x4 = FOREACH x3 GENERATE source, target;

However, this seems too complicated to me. Is there a way to make my first
version work?

Thanks in advance,
Tamir
