Re: Lookup in a dataset

2013-11-14 Thread Aaron Zimmerman
You’ll want to use COGROUP. Something like:

    x = COGROUP input1 by col3, input2 by col4;
    needed = FILTER x by IsEmpty(input2);

Thanks, Aaron Zimmerman
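
A fuller sketch of the COGROUP approach, assuming the goal is to keep the rows of input1 whose col3 value has no match in input2's col4. The file paths, the schemas, and the aliases col1, col2, and unmatched are hypothetical; only input1, input2, col3, col4, x, and needed come from the message.

```pig
-- Hypothetical inputs: paths and schemas are assumptions for illustration.
input1 = LOAD 'input1.tsv' AS (col1:chararray, col2:chararray, col3:chararray);
input2 = LOAD 'input2.tsv' AS (col4:chararray);

-- One output tuple per key: (group, {input1 rows}, {input2 rows}).
x = COGROUP input1 BY col3, input2 BY col4;

-- Keep only the keys for which the input2 bag is empty (no match in input2).
needed = FILTER x BY IsEmpty(input2);

-- Flatten the input1 bag to get back the original input1 rows.
unmatched = FOREACH needed GENERATE FLATTEN(input1);

DUMP unmatched;
```

The FILTER keeps whole groups, so the final FLATTEN is what turns each surviving group back into individual input1 rows.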

Re: Lookup in a dataset

2013-11-14 Thread Swaroop Kumar Patra
Thanks, Aaron, for the reply. I will try this out.

Thanks, Swaroop

On 14-Nov-2013, at 5:37 pm, Aaron Zimmerman azimmer...@sproutsocial.com wrote:
> You’ll want to use COGROUP. Something like
> x = COGROUP input1 by col3, input2 by col4;
> needed = FILTER x by IsEmpty(input2);
> Thanks,

Using variables generated by FOREACH command

2013-11-14 Thread Mix Nin
Hi, I have GROUP and FOREACH statements as below:

    grouped = GROUP filterdata BY (page_name, web_session_id);
    x = foreach grouped {
        distinct_web_cookie_id = DISTINCT filterdata.web_cookie_id;
        distinct_encrypted_customer_id = DISTINCT filterdata.encrypted_customer_id;
        distinct_web_session_id=

Re: Custom partitioning and order for optimum hbase store

2013-11-14 Thread Dmitriy Lyubimov
Getting back onto this old problem again. Sorry. So, HBase bulk load again. OK, I have a store func that writes HFiles. I have the group-by with a custom partitioner that partitions according to the regions of the HBase table. The only remaining piece is to set parallel correctly. HRegion partitioner