This does not seem to be an issue in Pig 0.15 (tested in local mode only so
far).
a = load '/tmp/test/test.txt' using PigStorage(',') as
(A:chararray,B:chararray,C:chararray);
b = group a by (A,B);
c = foreach b {
asdf = filter $1 by (1==1);
generate COUNT_STAR($1) as TARGET;
};
d = limit c 10;
e = foreach d generate TARGET;
dump e;
Final output:
(1)
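For anyone who wants to double-check this on their own version, here is a minimal sketch. It assumes the relations a through e from the script above are still defined in the same session, and it was not part of the test run above. It simply describes the schema on both sides of the limit, which is where the 0.12 report below goes wrong:

```pig
-- Assumes relations a..e from the script above are already defined.
describe d;   -- the 0.12 report below shows d: {TARGET: long} here
describe e;   -- only resolves if the TARGET projection on d is accepted
```

If the second describe fails with "Invalid field projection", the behaviour from the original report is still present on that version.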
*Cheers !!*
Arvind
On Sat, Nov 14, 2015 at 12:18 AM, Christopher Maier <
[email protected]> wrote:
> Hi,
>
> I haven't received a response on this. Has anyone had a chance to
> reproduce the error?
>
> Thanks,
> Kit
>
> From: Christopher Maier
> Sent: Tuesday, October 20, 2015 4:02 PM
> To: '[email protected]' <[email protected]>
> Subject: Schema changes based on subquery
>
> Hi,
>
> I am getting wrong counts from Pig for a certain query. I have
> simplified the query to the one below, which fails outright instead of
> returning a wrong count.
>
> Why does the first line of the subquery cause the output schema to revert
> to be the same as the input schema? This line should not have any impact on
> the output.
>
> (I've removed some of the extra logging output.)
>
> pig -version
> Apache Pig version 0.12.0 (rexported)
> compiled Oct 26 2014, 23:43:04
>
> Query
> grunt> a = load 'test1.txt' using PigStorage(',') as
> (A:chararray,B:chararray,C:chararray);
> grunt> b = group a by (A,B);
> grunt> c = foreach b {
> >> asdf = filter $1 by (1==1);
> >> generate COUNT_STAR($1) as TARGET;
> >> };
> grunt> d = limit c 10;
>
> Values
> grunt> dump a;
> (a,b,c)
> grunt> dump b;
> ((a,b),{(a,b,c)})
> grunt> dump c;
> (1)
> grunt> dump d;
> (1)
>
> Schema 'describe' at each step looks good
> grunt> describe a;
> a: {A: chararray,B: chararray,C: chararray}
> grunt> describe b;
> b: {group: (A: chararray,B: chararray),a: {(A: chararray,B: chararray,C: chararray)}}
> grunt> describe c;
> c: {TARGET: long}
> grunt> describe d;
> d: {TARGET: long}
>
> Attempted next step fails
> grunt> e = foreach d generate TARGET;
> <line 8, column 23> Invalid field projection. Projected field [TARGET]
> does not exist in schema: A:chararray,B:chararray,C:chararray.
>
> Progress of real schema through query
> grunt> z = foreach a generate FAKE;
> <line 8, column 23> Invalid field projection. Projected field [FAKE] does
> not exist in schema: A:chararray,B:chararray,C:chararray.
> grunt> z = foreach b generate FAKE;
> <line 8, column 23> Invalid field projection. Projected field [FAKE] does
> not exist in schema:
> group:tuple(A:chararray,B:chararray),a:bag{:tuple(A:chararray,B:chararray,C:chararray)}.
> grunt> z = foreach c generate FAKE;
> <line 8, column 23> Invalid field projection. Projected field [FAKE] does
> not exist in schema: TARGET:long.
> grunt> z = foreach d generate FAKE;
> <line 8, column 23> Invalid field projection. Projected field [FAKE] does
> not exist in schema: A:chararray,B:chararray,C:chararray.
>
> Alternate query shows no error
> grunt> c = foreach b {
> >> generate COUNT_STAR($1) as TARGET;
> >> };
> grunt> d = limit c 10;
> grunt> e = foreach d generate TARGET;
> grunt> dump e;
> (1)
>
> Thanks,
> Kit Maier