Andy Schlaikjer created PIG-3368:
------------------------------------
Summary: doc pig flatten operator applied to empty vs null bag
Key: PIG-3368
URL: https://issues.apache.org/jira/browse/PIG-3368
Project: Pig
Issue Type: Improvement
Components: documentation
Reporter: Andy Schlaikjer
[Pig docs|http://pig.apache.org/docs/r0.11.0/basic.html#flatten] state that
FLATTEN(field_of_type_bag) may generate a cross-product when an additional
field is projected alongside it, e.g.:
y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;
Additionally, for records in x for which fbag is empty (not null), no output
record is generated.
What is the expected behavior when fbag is null?
Some users might expect similar behavior, but FLATTEN actually passes through
the null, resulting in an output record (f1, f2) where f2 is null.
It would be useful to update the FLATTEN docs to mention this difference:
http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml?view=markup#l5051
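To make the three cases concrete (non-empty, empty, and null bag), here is a small Python model of the behavior described above. It is only a sketch of the semantics as reported in this issue, not Pig's implementation. The function name flatten_foreach and the sample data are made up for illustration, and bag elements are simplified to single values rather than tuples.

```python
from typing import Any, Iterable, Iterator, List, Optional, Tuple

# One input record: (f1, fbag). fbag is a list standing in for a bag,
# or None standing in for a null bag.
Record = Tuple[Any, Optional[List[Any]]]


def flatten_foreach(x: Iterable[Record]) -> Iterator[Tuple[Any, Any]]:
    """Model of: y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;"""
    for f1, fbag in x:
        if fbag is None:
            # Null bag: the null is passed through, so one record is emitted.
            yield (f1, None)
        else:
            # Non-empty bag: one output record per element (cross-product
            # with f1). Empty bag: the loop body never runs, so no output.
            for f2 in fbag:
                yield (f1, f2)


# Hypothetical sample data covering all three cases.
x = [("a", [1, 2]), ("b", []), ("c", None)]
y = list(flatten_foreach(x))
print(y)  # [('a', 1), ('a', 2), ('c', None)]
```

Note that the record with the empty bag ("b") disappears from the output, while the record with the null bag ("c") survives with a null f2.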
I'm guessing these are the relevant bits which affect this behavior:
http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l440
http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l468