While thinking and talking through
https://issues.apache.org/jira/browse/PIG-2536, something came up:

Should this idea, that relation.projection works in a distinct, apply in
any case where a projection is present?

The patch I linked to allows you to do

b = distinct a.$0;

It accomplishes this by mapping that to:

b = distinct (foreach a generate $0);
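
To make the mapping concrete, here is a rough string-level sketch in Python. It is only an illustration of the rewrite and not how the patch itself is implemented; the regex and the desugar function are hypothetical names made up for this example, and it only handles a single level of projection.

```python
import re

# Hypothetical pattern (not taken from Pig's grammar): a relation alias
# followed by one projection, e.g. a.$0, a.x, or a.(x,y).
_PROJECTION = re.compile(
    r"\b([A-Za-z_]\w*)\.(\([^()]*\)|\$\d+|[A-Za-z_]\w*)"
)


def desugar(statement: str) -> str:
    """Rewrite each relation.projection as (foreach relation generate ...)."""

    def expand(match: re.Match) -> str:
        relation, projection = match.group(1), match.group(2)
        if projection.startswith("("):
            # a.(x,y) -> generate x, y
            fields = ", ".join(f.strip() for f in projection[1:-1].split(","))
        else:
            # a.$0 or a.x -> generate $0 / generate x
            fields = projection
        return f"(foreach {relation} generate {fields})"

    return _PROJECTION.sub(expand, statement)


print(desugar("b = distinct a.$0;"))
# prints: b = distinct (foreach a generate $0);
```

The same rewrite applies mechanically to any statement that takes a relation, which is really what the question below is about.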

It seems that if this is useful, then this should be available wherever
relations are used?

i.e.

b = group a.(x,y) by x;

or anything. The case of group is somewhat problematic, however, because if
you describe that, you'll get...

b: {group: int,1-2: {(x: int,y: int)}}

Which, per Alan's comment, has to do with there being no real naming
convention for nested relations...

I guess the question is whether this is, in general, useful?

More broadly...
- Is it worth thinking about how to make this go deeper? Currently you can
do b = distinct a.x, but not b = distinct a.x.$0 (if it were appropriate).
There are issues with this (and in fact there is an outstanding bug w.r.t.
b = foreach (group a by $0) generate $1.$0.$0.$0.$0; <== this works!).
- Is the strategy of the syntactic sugar ok? I think in this case it should
be (the relation name issue notwithstanding), but could see arguments
either way.

Please find attached a super small patch with no tests. I wanted to get
some thoughts before filing yet another JIRA.
