This might be helpful for this use case:
http://hortonworks.com/blog/new-apache-pig-features-part-2-embedding/
On Tue, May 22, 2012 at 11:31 PM, Russell Jurney wrote:
> I need to repeatedly CROSS a data set, then FOREACH it, reduce it with
> a filter, then group/test it to test if it's done yet,
Yeah, you can enable LZO compression the normal way:
set mapred.output.compress true;
set mapred.output.compression.codec org.apache.hadoop.io.compress.LzoCodec;
store a into 'output' using RCFilePigStorage();
Raghu.
On Thu, May 24, 2012 at 11:08 PM, yingnan.ma wrote:
> Hi,
>
> the
Write a UDF that takes a tuple or bag as input, then do whatever processing
you want with the values inside it. Look at how the COUNT and SUM UDFs are
written and you will get a better picture.
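
As a rough illustration of that advice, here is a minimal sketch of two Python (Jython) UDFs. It assumes Pig hands a bag to the function as a list of tuples and a tuple as a Python tuple. The names contains_string and upper_fields are hypothetical, not part of Pig, and the try/except fallback exists only so the file can also be run as plain Python outside Pig.

```python
# Pig injects `outputSchema` when this file is registered as a Jython UDF.
# The no-op fallback below only lets the file run as plain Python for testing.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("found:int")
def contains_string(bag, needle):
    """Return 1 if any field of any tuple in the bag equals needle, else 0."""
    if bag is None:
        return 0
    for tup in bag:
        for field in tup:
            if field == needle:
                return 1
    return 0


@outputSchema("fields:bag{t:tuple(value:chararray)}")
def upper_fields(tup):
    """Apply one function (here: upper-casing) to every non-null field.

    The result is a bag of single-field tuples, so the schema does not
    depend on how many fields the input tuple has.
    """
    if tup is None:
        return []
    return [(str(field).upper(),) for field in tup if field is not None]
```

Assuming the file is saved as field_udfs.py (a made-up name), it could be registered with REGISTER 'field_udfs.py' USING jython AS field_udfs; and then called inside a FOREACH ... GENERATE, for example field_udfs.contains_string(inputData, 'mystring') on a grouped relation.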
Thanks,
Praveenesh
On Fri, May 25, 2012 at 3:17 PM, Fabian Alenius wrote:
> Hi,
>
> Let's say I have a large
Hey Fabian,
You can try this:
inputData = LOAD 'input';
grouped = GROUP inputData BY $0;
result = FOREACH grouped {
    filtered = FILTER inputData BY $1 == 'mystring';
    GENERATE group, ((COUNT(filtered) > 0) ? 'true' : 'false') AS StringExists;
}
Not sure whether
Hi,
Let's say I have a large tuple or a bag and I want to see if one of the
fields matches a string. How would one do that?
Similarly, how do you apply a function to all the fields in a tuple?
Thanks,
Fabian