Andy Schlaikjer created PIG-3368:
------------------------------------

             Summary: doc pig flatten operator applied to empty vs null bag
                 Key: PIG-3368
                 URL: https://issues.apache.org/jira/browse/PIG-3368
             Project: Pig
          Issue Type: Improvement
          Components: documentation
            Reporter: Andy Schlaikjer


[Pig docs|http://pig.apache.org/docs/r0.11.0/basic.html#flatten] state that 
FLATTEN(field_of_type_bag) may generate a cross-product when an additional 
field is projected, e.g.:

y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;

Additionally, for records in x for which fbag is empty (not null), no output 
record is generated.

What is the expected behavior when fbag is null?

Some users might expect the same behavior as for an empty bag (no output 
record), but FLATTEN actually passes the null through, resulting in an output 
record (f1, f2) where f2 is null.
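
To make the three cases concrete, here is a small Python model of the 
semantics described above. It is only a sketch, not Pig's implementation: the 
function name foreach_flatten and the sample records are made up for 
illustration, and Python's None stands in for a Pig null.

```python
def foreach_flatten(records):
    """Model of: y = FOREACH x GENERATE f1, FLATTEN(fbag) as f2;

    Each input record is a pair (f1, fbag), where fbag is a list
    (standing in for a bag) or None (standing in for null).
    """
    for f1, fbag in records:
        if fbag is None:
            # Null bag: the null is passed through as f2.
            yield (f1, None)
        else:
            # Non-null bag: one output record per element.
            # An empty bag therefore yields no records at all.
            for item in fbag:
                yield (f1, item)


# Hypothetical input: a populated bag, an empty bag, and a null bag.
x = [("a", [1, 2]), ("b", []), ("c", None)]
y = list(foreach_flatten(x))
print(y)  # [('a', 1), ('a', 2), ('c', None)]
```

Record "b" (empty bag) disappears from the output, while record "c" (null 
bag) survives with a null f2. That asymmetry is what the docs currently leave 
unstated.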

It would be useful to update the FLATTEN docs to mention this.

http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml?view=markup#l5051

I'm guessing these are the relevant bits of POForEach that affect this behavior:

http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l440

http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java?view=markup#l468

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
