Hi,
I hope there is a simple answer to this. I have a bunch of rows, and
for each row I want to add a column that is derived from some existing
columns. My input tuples have a large number of columns, so I don't
want to repeat every name using "AS" when I GENERATE. Is there an easy
way to just append a column to each tuple without having to touch the
rest of the tuple on output?
Here's my example:
grunt> DESCRIBE X;
X: {id: chararray,v1: int,v2: int}
grunt> DUMP X;
(a,3,42)
(b,2,4)
(c,7,32)
I can do this:
grunt> Y = FOREACH X GENERATE (v2 - v1) as diff, id, v1, v2;
grunt> DUMP Y;
(39,a,3,42)
(2,b,2,4)
(25,c,7,32)
But I would prefer not to have to list all the v's. I may have
v1, v2, v3, ..., v100.
Of course, this doesn't work:
grunt> Y = FOREACH X GENERATE (v2 - v1) as diff, FLATTEN(X);
What can be done to simplify this? And a related question: what is the
schema after the FOREACH? I wish I could do a DESCRIBE after the
FOREACH.
Thanks!!