-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/278/
-----------------------------------------------------------
Review request for pig and Richard Ding.
Summary
-------
The following script:
urlContents = LOAD 'inputdir' USING BinStorage() AS (url:bytearray,
pg:bytearray);
-- describe and dump are in-sync
DESCRIBE urlContents;
DUMP urlContents;
urlContentsG = GROUP urlContents BY url;
DESCRIBE urlContentsG;
urlContentsF = FOREACH urlContentsG GENERATE group,urlContents.pg;
DESCRIBE urlContentsF;
DUMP urlContentsF;
It prints the following for the DESCRIBE commands:
urlContents: {url: chararray,pg: chararray}
urlContentsG: {group: chararray,urlContents: {url: chararray,pg: chararray}}
urlContentsF: {group: chararray,pg: {pg: chararray}}
The reported schemas for urlContentsG and urlContentsF are wrong: they omit the
tuple inside the inner bag. They also contradict the section "Schemas for
Complex Data Types" in
http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_Schemas.
As expected, the actual data observed from DUMP urlContentsG and DUMP
urlContentsF does contain the tuples inside the inner bags.
The correct schema for urlContentsG is:
{group: chararray,urlContents: {t1:(url: chararray,pg: chararray)}}
This may sound like a technicality, but it isn't. For instance, a UDF that
assumes an inner bag of {chararray} will not work with {(chararray)}.
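To illustrate the point above, here is a minimal sketch in plain Python that
models a bag as a list and a tuple as a Python tuple. It does not use any Pig
API; the function names and sample values are hypothetical and only show why
code written against {chararray} fails on data shaped like {(chararray)}.

```python
# Plain-Python model of Pig data (an assumption for illustration only):
# a bag is a list, a tuple is a Python tuple. No Pig classes are used.

def concat_assuming_bare_values(bag):
    """Hypothetical UDF body written against the reported schema {chararray}:
    it expects every bag element to be a plain string."""
    return ",".join(bag)

def concat_assuming_tuples(bag):
    """Hypothetical UDF body written against the layout {(chararray)}:
    it expects every bag element to be a one-field tuple."""
    return ",".join(t[0] for t in bag)

# Shape of the inner bag as DUMP shows it: each value is wrapped in a tuple.
actual_bag = [("page1",), ("page2",)]

try:
    concat_assuming_bare_values(actual_bag)
    bare_ok = True
except TypeError:
    # str.join rejects tuple elements, so the {chararray} assumption breaks.
    bare_ok = False

tuple_result = concat_assuming_tuples(actual_bag)
```

Only the second function handles the data that DUMP actually produces, which is
why the schema reported by DESCRIBE needs to include the tuple.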
This addresses bug PIG-767.
https://issues.apache.org/jira/browse/PIG-767
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCogroup.java 1057928
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOGenerate.java 1057928
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOInnerLoad.java 1057928
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLogicalPlanMigrationVisitor.java 1057928
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNewPlanLogToPhyTranslationVisitor.java 1057928
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSchema.java 1057928
Diff: https://reviews.apache.org/r/278/diff
Testing
-------
Test-patch:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 9 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
Unit-test:
all pass.
Thanks,
Daniel