Hi,

Did you get a chance to look into the PiggyBank String functions?

http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/evaluation/string/package-summary.html

I think you need the SUBSTRING function from that package.

REGISTER <path-to-piggybank>/piggybank.jar;
DEFINE StrSub org.apache.pig.piggybank.evaluation.string.SUBSTRING();

Now you can use the SUBSTRING function under the alias StrSub:
B = FOREACH A GENERATE StrSub(sid, 1, 64);
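
If the goal is to drop sid from the tuples inside the bag rather than to shorten it, a nested projection may be closer to what you want. This is only a minimal sketch: it assumes the activities relation was loaded with a schema and has two other fields called ts and action (hypothetical names, replace them with your own, or use positions such as $1, $2).

rich_sessions = GROUP sessions BY sid, activities BY sid;
-- Keep the group key once per row, and project only the non-key
-- fields (ts and action are assumed names) out of each activity tuple.
slim_sessions = FOREACH rich_sessions GENERATE group AS sid, sessions, activities.(ts, action);

The key is still available once per row as sid, so it no longer needs to be repeated in every activity tuple.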

Hope it helps.
Sumit



________________________________
From: Vincent Barat <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Wed, 20 April, 2011 7:37:03 PM
Subject: How to remove the field key from bags tuples after a GROUP ?

Hi,

First, I group 2 tables using a key (named sid):

rich_sessions = GROUP sessions BY sid, activities BY sid;

After this operation, all the tuples in the bag "activities" start with the same
"sid" field.
This field is long (64 bytes) and I would like to remove it from all activity
tuples in order to save space before storing this rich_sessions in a file.

Is there any way to do this?

Thanks for your help,
