I was referring to item 1 below. I think making the cast mandatory instead of merely allowed is sufficient (phased in over time, obviously, so we don't break compatibility).

Alan.

On Apr 9, 2012, at 10:21 AM, Dmitriy Ryaboy wrote:

> Alan, which idea are you +1 on? I think (int) D is the current syntax.
>
> There are a couple problems that people hit in the current scalar
> implementation, all of which I think can be fixed without introducing
> new syntax:
>
> 1) Require the cast, don't do it implicitly. This was actually in the
> design doc but didn't get implemented for some reason.
>
> 2) Throw an error on the frontend if the scalar relation is the
> relation being iterated on. Meaning:
>
> foreach foo generate (int) foo.id; -- this will cause the second "foo"
> to be interpreted as a scalar invocation, although clearly it's just a
> bug, and the programmer meant to say "generate (int) id"
>
> We can just detect this error case and throw during compilation.
>
> 3) Improve MR-side logging to make it clear that a relation is being
> loaded from the side, what the relation is, etc.
>
> I believe we have JIRAs open for all of these.
>
> D
>
> On Mon, Apr 9, 2012 at 10:15 AM, Alan Gates <[email protected]> wrote:
>> I'm +1 on this idea, since it's been a problem since the beginning. Why not
>> use regular casting notation though, rather than develop another notation?
>> That's what we discussed originally when we were deciding whether to require
>> casting or do it silently. So instead of D->a or SCALAR(D) it would be
>> (int)D.
>>
>> Alan.
>>
>> On Apr 8, 2012, at 7:42 AM, Jonathan Coveney wrote:
>>
>>> I like this idea, and I think we should deprecate the old syntax, and we
>>> can discuss later when it'd get deleted (and when that would be worth it...
>>> if we have a new syntax, it seems pretty painless to have the other one
>>> float around for backwards compatibility, and if anyone uses it it's a sort
>>> of "caveat emptor").
>>>
>>> 2012/4/8 Aniket Mokashi <[email protected]>
>>>
>>>> Hi,
>>>>
>>>> I have noticed early users of Pig often hit issues because of confusing
>>>> syntax between scalars and projections. I think scalar syntax should be
>>>> made more explicit for users to use in order to avoid these problems. For
>>>> example: D = foreach C generate B->count; etc.
>>>> I am sure we might break some backward compatibility but we can at least
>>>> deprecate the syntax for a few versions and eventually move to new syntax.
>>>>
>>>> Thoughts?
>>>>
>>>> Thanks,
>>>> Aniket
>>>>
>>
