Hi, Shi:

This is working in 0.11. Can you try it?

Johnny Zhang

On Tue, Mar 12, 2013 at 1:55 PM, Shi Gao <[email protected]> wrote:

> Hi,
>
> After using TOBAG function and then flatten the bag, the relation can't be
> used any more except dump or store.
>
> Pig version: 0.8.1-cdh3u5
>
> Input data:
> a1 b1 c1
> a2 b2 c2
> a1 b1 c1
>
>
> grunt> A = load '/mnt/hgfs/shared/test.txt' as
> (f1:chararray,f2:chararray,f3:chararray);
> grunt> describe A
> A: {f1: chararray,f2: chararray,f3: chararray}
>
> grunt> B = foreach A generate TOBAG(*);
> grunt> describe B;
> B: {{(f1: chararray,f2: chararray,f3: chararray)}} -- This is wrong, it
> should be {(f1: chararray)}
> grunt> dump B
> ...
> ({(a1),(b1),(c1)}) -- Shows correct result though.
> ({(a2),(b2),(c2)})
> ({(a1),(b1),(c1)})
> ...
>
> grunt> C = foreach B generate flatten($0);
> grunt> describe C;
> C: {(f1: chararray,f2: chararray,f3: chararray)} -- This is wrong.
> grunt> dump C;
> (a1)
> (b1)
> (c1)
> (a2)
> (b2)
> (c2)
> (a1)
> (b1)
> (c1)
>
> -- And from here nothing can be done to C, except to dump as above.
>
> D = foreach C generate $0; -- gives error: java.lang.String cannot be cast
> to org.apache.pig.data.Tuple
>
> To illustrate:
> ---------------------------------------------------------
> | A     | f1: bytearray | f2: bytearray | f3: bytearray |
> ---------------------------------------------------------
> |       | a1            | b1            | c1            |
> ---------------------------------------------------------
> ---------------------------------------------------------
> | A     | f1: chararray | f2: chararray | f3: chararray |
> ---------------------------------------------------------
> |       | a1            | b1            | c1            |
> ---------------------------------------------------------
> --------------------------------------------------------------
> | B     | bag({(f1: chararray,f2: chararray,f3: chararray)}) |
> --------------------------------------------------------------
> |       | {(a1), (b1), (c1)}                                 |
> --------------------------------------------------------------
> --------------------------------------------------------------
> | C     | tuple({f1: chararray,f2: chararray,f3: chararray}) |
> --------------------------------------------------------------
> |       | a1                                                 |
> |       | b1                                                 |
> |       | c1                                                 |
> --------------------------------------------------------------
>
>
> However, this is wrong too:
> E = foreach C generate flatten($0);
> dump E; -- give error.
>
> Could you please help with this?
>
> Thanks,
> Shi
>
