Andy Schlaikjer created PIG-3368:
------------------------------------

Summary: doc pig flatten operator applied to empty vs null bag
Key: PIG-3368
URL: https://issues.apache.org/jira/browse/PIG-3368
Project: Pig
Issue Type: Improvement
Components: documentation
Reporter: Andy Schlaikjer

The [Pig docs|http://pig.apache.org/docs/r0.11.0/basic.html#flatten] state that FLATTEN(field_of_type_bag) may generate a cross-product when an additional field is projected, e.g.:

y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;

Additionally, for records in x for which fbag is empty (not null), no output record is generated.

What is the expected behavior when fbag is null? Some users might expect the same behavior as for an empty bag, but FLATTEN actually passes the null through, resulting in an output record (f1, f2) where f2 is null.

It would be useful to update the FLATTEN docs to mention this:

http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml?view=markup#l5051

I'm guessing these are the relevant bits of code that produce this behavior:

http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l440
http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l468
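
To make the empty-vs-null distinction concrete, here is a small Python model of the behavior described above. It is only a sketch of the semantics as reported in this issue, not Pig's implementation: the function name flatten_foreach is hypothetical, bags are modeled as Python lists of plain values rather than tuples, and a null bag is modeled as None.

```python
def flatten_foreach(records):
    """Model of: y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;

    Each input record is a pair (f1, fbag), where fbag is a list
    (standing in for a bag) or None (standing in for a null bag).
    """
    out = []
    for f1, fbag in records:
        if fbag is None:
            # Null bag: as reported in this issue, the null is passed
            # through, so one record is emitted with f2 set to null.
            out.append((f1, None))
        else:
            # Non-null bag: one output record per bag element
            # (the cross-product with f1). An empty bag emits nothing.
            for f2 in fbag:
                out.append((f1, f2))
    return out


# Three records: a populated bag, an empty bag, and a null bag.
x = [("a", [1, 2]), ("b", []), ("c", None)]
y = flatten_foreach(x)
print(y)  # [('a', 1), ('a', 2), ('c', None)]
```

In this model the record with the empty bag ("b") disappears from the output, while the record with the null bag ("c") survives with a null f2. That asymmetry is what the docs currently do not mention.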