[
https://issues.apache.org/jira/browse/PIG-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14998940#comment-14998940
]
Rohini Palaniswamy commented on PIG-4724:
-----------------------------------------
Had a discussion with [~daijy]. Daniel said that this should be fine and that
we can achieve it with some tweaking, such as passing an empty bag. But since
this is backward incompatible, it would be good to have a switch so that the
older behavior can be retained for users who are not expecting the change and
rely on the current behavior.
> GROUP ALL must create an output record in case there is no input
> ----------------------------------------------------------------
>
> Key: PIG-4724
> URL: https://issues.apache.org/jira/browse/PIG-4724
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.15.0
> Reporter: Prashant Kommireddi
>
> {code}
> A = load 'data';
> B = filter A by $0 == 'THIS_DOES_NOT_EXIST';
> C = group B ALL;
> D = foreach C generate group, COUNT(B);
> {code}
> Even if the filter did not output any rows, since we are grouping on ALL the
> expected output should probably be (ALL, 0). The implementation generates a
> pseudo key “all” for every input record on the map side, so that the reduce
> side can combine all input together. However, this does not work for empty
> input, since the reduce side then receives nothing. If the input is empty,
> yield a pseudo “all, 0” to the reduce side.
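
The behavior described above can be modeled with a small simulation. This is
only a sketch in plain Python, not Pig's actual implementation: the function
names and the emit_on_empty flag are hypothetical stand-ins for the map side,
the reduce side, and the compatibility switch suggested in the comment.

```python
from collections import defaultdict

PSEUDO_KEY = "all"  # stand-in for the pseudo key attached on the map side


def map_side(records):
    """Tag every input record with the same pseudo key."""
    return [(PSEUDO_KEY, record) for record in records]


def shuffle(pairs, emit_on_empty=False):
    """Collect records into one bag per key.

    emit_on_empty is a hypothetical switch: when set and there is no input,
    an empty bag is passed under the pseudo key so the reducer still runs.
    """
    bags = defaultdict(list)
    for key, record in pairs:
        bags[key].append(record)
    if not bags and emit_on_empty:
        bags[PSEUDO_KEY] = []
    return dict(bags)


def reduce_side(bags):
    """Emit (group, COUNT(bag)) for every key that reached the reducer."""
    return [(key, len(bag)) for key, bag in bags.items()]


def group_all_count(records, emit_on_empty=False):
    """Model of: C = group B ALL; D = foreach C generate group, COUNT(B);"""
    return reduce_side(shuffle(map_side(records), emit_on_empty))
```

With the flag off, group_all_count([]) yields no record at all, which mirrors
the reported behavior. With the flag on, the reducer receives an empty bag and
yields one record with a count of 0, while non-empty input is unaffected.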
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)