b = group a.(x,y) by x; /* Less verbose is good to me. */
b = foreach (group a by x) { generate group, a.(x,y); }; /* sad panda */
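
For comparison, here is a rough sketch of what the sugared group would presumably stand for, by analogy with the distinct mapping in the quoted message below. The input file, its schema, and the alias a_xy are assumptions made up for illustration, and I have not checked this against the patch:

```pig
-- Hypothetical input; the file name and schema are illustrative assumptions.
a = load 'data.txt' as (x:int, y:int, z:int);

-- Proposed sugar (would need the patch to parse):
-- b = group a.(x,y) by x;

-- Explicit equivalent, assuming the same rewrite used for distinct:
-- project first, then group the projected relation.
a_xy = foreach a generate x, y;
b = group a_xy by x;

-- With a named intermediate alias, the nested bag should be reported
-- under that alias rather than under a generated name.
describe b;
```

Spelling out the intermediate alias also sidesteps the naming question raised below, since the nested bag has a real relation name to inherit.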
On Tuesday, February 21, 2012, Jonathan Coveney <[email protected]> wrote:
> While thinking and talking through https://issues.apache.org/jira/browse/PIG-2536, something came up:
>
> Should this idea, that relation.projection works in distincts, work in any case where a projection is present?
>
> In the patch I linked to, it allows you to do
>
> b = distinct a.$0;
>
> It accomplishes this by mapping that to:
>
> b = distinct (foreach a generate $0);
>
> It seems that if this is useful, then this should be available wherever relations are used?
>
> i.e.
>
> b = group a.(x,y) by x;
>
> or anything. The case of group is somewhat problematic, however, because if you describe that, you'll get...
>
> b: {group: int,1-2: {(x: int,y: int)}}
>
> Which, per Alan's comment, has to do with no real naming convention for nested relations....
>
> I guess the question is whether this is, in general, useful?
>
> More broadly...
> - Is it worth thinking about how to make this go deeper? Currently you can do b = distinct a.x, but not b = distinct a.x.$0 (if it were appropriate). There are issues with this (and in fact there is an outstanding bug w.r.t. b = foreach (group a by $0) generate $1.$0.$0.$0.$0; <== this works!).
> - Is the strategy of the syntactic sugar ok? I think in this case it should be (the relation name issue notwithstanding), but I could see arguments either way.
>
> Find a super small patch with no tests attached... I wanted to get some thoughts before making yet another JIRA?
>
--
Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com