I guess this is a known issue, but if I have

A = load 'data' as (a:int, b:int, c:int);

I am able to do

B = foreach A generate (1,2,3);

but not

B = foreach A generate a, (b,c);

I mean, the UDF for this is simple, but why isn't there built-in language
support for this and for the map/tuple operations I am asking for? Does
anybody else use this kind of thing?

On Mon, Feb 22, 2010 at 11:41 AM, hc busy <[email protected]> wrote:
> how do I use this bag? Is there a way for me to specify it in grunt?
>
> BagFactory.getInstance().newSortedBag(comparator)
>
> ?
>
>
> On Mon, Feb 22, 2010 at 10:34 AM, hc busy <[email protected]> wrote:
>
>> ok, it sounds like I have a plan. So I need to write a UDF from tuple to
>> bag (t2b) and bag to tuple (b2t), and then I do
>>
>> exploded= foreach foo generate id, FLATTEN(t2b(field1, field2, field3));
>> implode= group exploded by id;
>> implode= foreach implode generate id, flatten(b2t(implode));
>>
>> to (almost) recover original table, except for field order may be messed
>> up. Is there a way to write a udf like flatten that preserves order?
>>
>> Thanks!
>>
>> On Mon, Feb 22, 2010 at 9:57 AM, Dmitriy Ryaboy <[email protected]> wrote:
>>
>>> Same thing -- a udf to convert a tuple into a bag, then flatten.
>>> Don't rely on any order you see in bags during testing -- there is
>>> explicitly no guarantee there, it may change on you version to version
>>> and execution to execution.
>>>
>>> -D
>>>
>>> On Mon, Feb 22, 2010 at 9:45 AM, hc busy <[email protected]> wrote:
>>>
>>> > Thanks, Dmitriy and Rekha. So I understand the flatten on bag explodes
>>> > to multiple rows now.
>>> >
>>> > The BagConcat seems to work. Actually, doing a simple example using the
>>> > group by, it would appear that the bag contains the results in the order
>>> > that they were before entering the group by. (so, if I group after an
>>> > order by x desc, then when I dump the table it prints the bag, but
>>> > contents are reversed)... So, actually, for my purposes, not having
>>> > results in order is okay.
>>> >
>>> > what about instead of charsplit, the data I have is this:
>>> >
>>> > 1,a,b,c,d
>>> > 2,a,s,d,f
>>> >
>>> > and I want to explode it into
>>> > 1,a
>>> > 1,b
>>> > 1,c
>>> > 1,d
>>> > 2,a
>>> > 2,s
>>> > 2,d
>>> > 2,f
>>> >
>>> > (sorry, I made a mistake in the original question, the string is not a
>>> > string but a tuple.) I think I may be able to get it into:
>>> >
>>> > 1, (a,b,c,d)
>>> > 2, (a,s,d,f)
>>> >
>>> > but still, I need to explode it into several rows to operate on them
>>> > separately.
>>> >
>>> > On Sun, Feb 21, 2010 at 8:03 PM, Rekha Joshi <[email protected]> wrote:
>>> >
>>> > > You would require a udf for this. Please check if you already have an
>>> > > existing one in latest pig-udf.jar.
>>> > > Or since this is a pretty simple one, you can write one yourself - take
>>> > > the tuple, assess the type, append the strings and return it from your
>>> > > exec() method.
>>> > >
>>> > > Cheers,
>>> > > /R
>>> > >
>>> > >
>>> > > On 2/19/10 11:51 PM, "hc busy" <[email protected]> wrote:
>>> > >
>>> > > Guys, I know this must be a common use case, but how do you explode and
>>> > > implode in pig?
>>> > >
>>> > > so, I have a file like this...
>>> > >
>>> > > 1, asdf
>>> > > 2, qewrty
>>> > > 3, zcxvb
>>> > >
>>> > > and I want to apply an explode operation to it:
>>> > >
>>> > > 1, a
>>> > > 1, s
>>> > > 1, d
>>> > > 1, f
>>> > > 2, q
>>> > > 2, e
>>> > > 2, w
>>> > > 2, r
>>> > > 2, t
>>> > > 2, y
>>> > > 3, z
>>> > > 3, c
>>> > > 3, x
>>> > > 3, v
>>> > > 3, b
>>> > >
>>> > > and after some work... I have this file:
>>> > >
>>> > > 1, aa
>>> > > 1, ss
>>> > > 1, dd
>>> > > 1, ff
>>> > > 2, qq
>>> > > 2, ee
>>> > > 2, ww
>>> > > 2, rr
>>> > > 2, tt
>>> > > 2, yy
>>> > > 3, zz
>>> > > 3, cc
>>> > > 3, xx
>>> > > 3, vv
>>> > > 3, bb
>>> > >
>>> > > and I want to perform an implode:
>>> > >
>>> > > 1, aassddff
>>> > > 2, qqeewwrrttyy
>>> > > 3, zzccxxvvbb
>>> > >
>>> > > well, obviously this is a dumb example, but I'd like to do those things.
>>> > > Can somebody help me with this? I looked in the piggy bank and didn't see
>>> > > anything that would do this for me.
>>> > >
>>> > > Thanks!
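
For anyone following the thread, the explode/implode steps discussed here can be modelled in a few lines of plain Java. This is only a rough sketch of the logic, not a Pig UDF: it uses no Pig classes, and the names ExplodeImplode, explode and implode are made up for illustration. It also assumes rows arrive in their original order, which a Pig bag does not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the explode/implode plan; no Pig classes are used.
// ExplodeImplode, explode and implode are hypothetical names.
final class ExplodeImplode {

    // Explode: one (id, f1, f2, ...) record becomes one (id, field) row per field.
    static List<String[]> explode(String id, List<String> fields) {
        List<String[]> rows = new ArrayList<>();
        for (String field : fields) {
            rows.add(new String[] {id, field});
        }
        return rows;
    }

    // Implode: group (id, field) rows by id and concatenate the fields.
    // LinkedHashMap and StringBuilder keep the order in which rows arrive;
    // a Pig bag gives no such guarantee, so this models only the in-order case.
    static Map<String, String> implode(List<String[]> rows) {
        Map<String, StringBuilder> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            grouped.computeIfAbsent(row[0], k -> new StringBuilder()).append(row[1]);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : grouped.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.addAll(explode("1", Arrays.asList("aa", "ss", "dd", "ff")));
        rows.addAll(explode("2", Arrays.asList("qq", "ee")));
        System.out.println(implode(rows)); // prints {1=aassddff, 2=qqee}
    }
}
```

In Pig itself the same shape would come from a tuple-to-bag UDF followed by FLATTEN, then GROUP and a bag-to-string UDF; if field order matters, one option is to carry an explicit position field through the explode step and sort on it before imploding.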
