[ https://issues.apache.org/jira/browse/PIG-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683098#action_12683098 ]
Santhosh Srinivasan commented on PIG-723:
-----------------------------------------

This is a duplicate of PIG-694.

> Pig generates incorrect schema for generated bags after FOREACH.
> ----------------------------------------------------------------
>
>                 Key: PIG-723
>                 URL: https://issues.apache.org/jira/browse/PIG-723
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.1.0
>         Environment: Linux
> $pig --version
> Apache Pig version 0.1.0-dev (r750430)
> compiled Mar 07 2009, 09:20:13
>            Reporter: Dhruv M
>            Priority: Critical
>
> grunt> rf_src = LOAD 'rf_test.txt' USING PigStorage(',') AS (lhs:chararray, rhs:chararray, r:float, p:float, c:float);
> grunt> rf_grouped = GROUP rf_src BY rhs;
> grunt> lhs_grouped = FOREACH rf_grouped GENERATE group as rhs, rf_src.(lhs, r) as lhs, MAX(rf_src.p) as p, MAX(rf_src.c) AS c;
> grunt> describe lhs_grouped;
> lhs_grouped: {rhs: chararray,lhs: {lhs: chararray,r: float},p: float,c: float}
>
> I think it should be:
> lhs_grouped: {rhs: chararray,lhs: {(lhs: chararray,r: float)},p: float,c: float}
>
> Because of this, we are not able to perform UNION on 2 sets because union on incompatible schemas is causing a complete loss of schema information, making further processing impossible.
>
> This is what we want to UNION with:
> grunt> asrc = LOAD 'atest.txt' USING PigStorage(',') AS (rhs:chararray, a:int);
> grunt> aa = FOREACH asrc GENERATE rhs, (bag{tuple(chararray,float)}) null as lhs, -10F as p, -10F as c;
> grunt> describe aa;
> aa: {rhs: chararray,lhs: {(chararray,float)},p: float,c: float}
>
> If there is something wrong with what I am trying to do, please let me know.
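
The report describes the UNION step but does not show it. A minimal sketch of that step follows, assuming the relations lhs_grouped and aa are defined exactly as in the quoted grunt session. The alias combined is hypothetical and does not appear in the original report.

```
-- Assumes lhs_grouped and aa exist as defined in the quoted grunt session.
-- 'combined' is a hypothetical alias chosen for illustration only.
combined = UNION lhs_grouped, aa;

-- Inspect the schema of the result. The report states that schema
-- information is lost here, because the lhs bag is described as
-- {lhs: chararray,r: float} in one input and {(chararray,float)} in the other.
DESCRIBE combined;
```

If lhs_grouped were described with the tuple-wrapped bag the reporter expects, the two lhs fields would have the same shape, which is the condition the reporter says the UNION needs.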