Re: While/CROSS/FOREACH loop

2012-05-25 Thread Aniket Mokashi
This might be helpful for this use case - http://hortonworks.com/blog/new-apache-pig-features-part-2-embedding/ On Tue, May 22, 2012 at 11:31 PM, Russell Jurney wrote: > I need to repeatedly CROSS a data set, then FOREACH it, reduce it with > a filter, then group/test it to test if it's done yet,
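For illustration only, the repeat-until-done control flow described in the quoted question could be sketched in Python roughly as follows. This is a minimal sketch, not code from the linked post: `iterate_until_done`, `run_step`, and `is_done` are hypothetical names. Under Pig's Jython embedding, `run_step` would presumably wrap something like `Pig.compile(...).bind(...).runSingle()` for the CROSS/FOREACH/FILTER pass, and `is_done` would inspect the output of the group/test step.

```python
def iterate_until_done(run_step, is_done, state, max_iterations=20):
    """Repeat run_step until is_done says to stop, or the iteration cap is hit.

    run_step(state, iteration) -> new state   (hypothetical: one Pig pass)
    is_done(state) -> bool                    (hypothetical: the "done yet" test)
    Returns (final_state, iterations_run).
    """
    for iteration in range(1, max_iterations + 1):
        state = run_step(state, iteration)
        if is_done(state):
            return state, iteration
    # Cap reached without convergence; the caller decides how to handle this.
    return state, max_iterations


if __name__ == "__main__":
    # Stand-in step: halve a counter until it drops to 1 (no Pig involved).
    final_state, rounds = iterate_until_done(
        lambda s, i: s // 2, lambda s: s <= 1, 40)
    print(final_state, rounds)
```

The iteration cap is a design choice to keep a non-converging job from looping forever; in a real run the state would more likely be an input/output path pair than a number.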

Re: Re: RCfile

2012-05-25 Thread Raghu Angadi
yeah, you can enable lzo compression the normal way:
    set mapred.output.compress true;
    set mapred.output.compression.codec org.apache.hadoop.io.compress.LzoCodec;
    store a into 'output' using RCFilePigStorage();
Raghu. On Thu, May 24, 2012 at 11:08 PM, yingnan.ma wrote: > Hi, > > the

Re: in statement

2012-05-25 Thread praveenesh kumar
Write a UDF that takes a tuple or bag as input, then do whatever processing you want with the values inside it. Look at how the COUNT/SUM UDFs are written and you will get a better picture. Thanks, Praveenesh On Fri, May 25, 2012 at 3:17 PM, Fabian Alenius wrote: > Hi, > > let's say I have a large
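As a rough illustration of the UDF approach suggested above, here is a minimal sketch written as a Jython-style Python UDF rather than a Java one. The function names `any_field_matches` and `map_fields` are hypothetical, and the sketch assumes Pig hands the tuple to the function as a Python sequence and supplies the `outputSchema` decorator when the script is registered; the fallback definition only exists so the file also runs as plain Python.

```python
try:
    outputSchema  # assumed to be provided by Pig for registered Jython UDFs
except NameError:
    # Fallback so this file can also be run and tested outside Pig.
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


def map_fields(fields, func):
    """Apply func to every non-null field and return the results as a new tuple."""
    return tuple([None if f is None else func(f) for f in fields])


@outputSchema("found:int")
def any_field_matches(fields, target):
    """Return 1 if any field in the tuple equals target, otherwise 0."""
    if fields is None:
        return 0
    for field in fields:
        if field == target:
            return 1
    return 0


if __name__ == "__main__":
    row = ("a", "mystring", 3)
    print(any_field_matches(row, "mystring"))
    print(map_fields(("a", None, "b"), lambda s: s.upper()))
```

Under Pig, a script like this would typically be registered with a statement along the lines of `REGISTER 'field_udfs.py' USING jython AS field_udfs;` (the file name is made up here) and then called inside a FOREACH ... GENERATE.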

Re: in statement

2012-05-25 Thread Руслан Аль-Факих
Hey Fabian, You can try this:
    inputData = LOAD 'input';
    grouped = GROUP inputData BY $0;
    result = FOREACH grouped {
        filtered = FILTER inputData BY $1 == 'mystring';
        GENERATE group, ( (COUNT(filtered) > 0) ? 'true' : 'false' ) AS StringExists;
    }
Not sure whether

in statement

2012-05-25 Thread Fabian Alenius
Hi, let's say I have a large tuple or a bag and I want to see if one of the fields matches a string. How would one do that? Similarly, how do you apply a function to all the fields in a tuple? Thanks, Fabian