How do I use this bag? Is there a way for me to specify it in grunt?

BagFactory.getInstance().newSortedBag(comparator)?

On Mon, Feb 22, 2010 at 10:34 AM, hc busy wrote:

> ok, it sounds like I have a plan. So I need to write a UDF from tuple to
> bag(t2b) and bag to tuple(b2t), and then I do
>
> exploded= foreach foo generate id, FLATTEN(t2b(field1, field2, field3));
> implode= group exploded by id;
> implode= foreach implode generate id, flatten(b2t(implode));
>
> to (almost) recover original table, except for field order may be messed
> up. Is there a way to write a udf like flatten that preserve order?
>
> Thanks!
>
> On Mon, Feb 22, 2010 at 9:57 AM, Dmitriy Ryaboy wrote:
>
>> Same thing -- a udf to convert a tuple into a bag, then flatten.
>> Don't rely on any order you see in bags during testing -- there is
>> explicitly no guarantee there, it may change on you version to version and
>> execution to execution.
>>
>> -D
>>
>> On Mon, Feb 22, 2010 at 9:45 AM, hc busy wrote:
>>
>> > Thanks, Dmitriy and Rekha . So I understand the flatten on bag explodes to
>> > multiple rows now.
>> >
>> > The BagConcat seems to work. Actually, doing a simple example using the
>> > group by, it would appear that the bag contains the results in the order
>> > that they were before entering the group by. (so, if I group after an order
>> > by x desc, then when I dump the table it prints the bag, but contents are
>> > reversed)... So, actually, for my purposes, not having results in order is
>> > okay.
>> >
>> > what about instead of charsplit, the data I have is this:
>> >
>> > 1,a,b,c,d
>> > 2,a,s,d,f
>> >
>> > and I want to explode it into
>> > 1,a
>> > 1,b
>> > 1,c
>> > 1,d
>> > 2,a
>> > 2,s
>> > 2,d
>> > 2,f
>> >
>> > (sorry, I made a mistake in the original question, the string is not a
>> > string but a tuple.) I think I may be able to get it into:
>> >
>> > 1, (a,b,c,d)
>> > 2, (a,s,d,f)
>> >
>> > but still, I need to explode it into several rows to operate on them
>> > separately.
>> >
>> > On Sun, Feb 21, 2010 at 8:03 PM, Rekha Joshi wrote:
>> >
>> > > You would require a udf for this. Please check if you already have an
>> > > existing one in latest pig-udf.jar.
>> > > Or since this is a pretty simple one, you can write one yourself - take
>> > > the tuple, assess the type, append the strings and return it from your
>> > > exec() method.
>> > >
>> > > Cheers,
>> > > /R
>> > >
>> > >
>> > > On 2/19/10 11:51 PM, "hc busy" wrote:
>> > >
>> > > Guys, I know this must be a common use case, but how do you explode and
>> > > implode in pig?
>> > >
>> > > so, I have a file like this...
>> > >
>> > > 1, asdf
>> > > 2, qewrty
>> > > 3, zcxvb
>> > >
>> > >
>> > > and I want to apply an explode operation to it:
>> > >
>> > > 1, a
>> > > 1, s
>> > > 1, d
>> > > 1, f
>> > > 2, q
>> > > 2, e
>> > > 2, w
>> > > 2, r
>> > > 2, t
>> > > 2, y
>> > > 3, z
>> > > 3, c
>> > > 3, x
>> > > 3, v
>> > > 3, b
>> > >
>> > > and after some work... I have this file:
>> > >
>> > > 1, aa
>> > > 1, ss
>> > > 1, dd
>> > > 1, ff
>> > > 2, qq
>> > > 2, ee
>> > > 2, ww
>> > > 2, rr
>> > > 2, tt
>> > > 2, yy
>> > > 3, zz
>> > > 3, cc
>> > > 3, xx
>> > > 3, vv
>> > > 3, bb
>> > >
>> > >
>> > > and I want to perform an implode:
>> > >
>> > > 1, aassddff
>> > > 2, qqeewwrrttyy
>> > > 3, zzccxxvvbb
>> > >
>> > >
>> > > well, obviously this is a dumb example, but I'd like to do those things.
>> > > Can somebody help me with this? I looked in the piggy bank and didn't see
>> > > anything that would do this for me.
>> > >
>> > > Thanks!
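
One possible way to get the original field order back after the group, given that bag order is not guaranteed: have the explode step emit each field together with its position, and have the implode step sort on that position before rebuilding the row. The sketch below is plain Java and deliberately does not use the Pig API. The class ExplodeImplode, the Row type, and the explode/implode methods are hypothetical names made up for illustration. In a real script the same idea would presumably live inside the t2b and b2t UDFs discussed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates explode/implode outside of Pig.
public class ExplodeImplode {

    /** One exploded row: the original id, the field's position, and its value. */
    static final class Row {
        final int id;
        final int pos;
        final String value;

        Row(int id, int pos, String value) {
            this.id = id;
            this.pos = pos;
            this.value = value;
        }
    }

    /** Explode: one output row per field, tagged with the field's index. */
    static List<Row> explode(int id, List<String> fields) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < fields.size(); i++) {
            rows.add(new Row(id, i, fields.get(i)));
        }
        return rows;
    }

    /** Implode: group rows by id, then sort each group by the stored position. */
    static Map<Integer, List<String>> implode(List<Row> rows) {
        Map<Integer, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.id, k -> new ArrayList<>()).add(row);
        }
        // TreeMap keeps the ids in ascending order in the result.
        Map<Integer, List<String>> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Row>> entry : groups.entrySet()) {
            List<Row> group = entry.getValue();
            group.sort(Comparator.comparingInt((Row r) -> r.pos));
            List<String> values = new ArrayList<>();
            for (Row row : group) {
                values.add(row.value);
            }
            result.put(entry.getKey(), values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.addAll(explode(1, Arrays.asList("a", "b", "c", "d")));
        rows.addAll(explode(2, Arrays.asList("a", "s", "d", "f")));
        Collections.shuffle(rows); // stand-in for the unordered bag after the group
        System.out.println(implode(rows));
    }
}
```

Because the position travels with each value, the shuffle in main does not affect the rebuilt rows. Whether the sort is done by hand as here, or by collecting the tuples into the sorted bag asked about at the top of the thread, is a design choice. The sketch takes no position on which is better in Pig.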
