[ https://issues.apache.org/jira/browse/PIG-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929803#action_12929803 ]
Olga Natkovich commented on PIG-1281:
-------------------------------------

Discussed this with Daniel. Here is what needs to happen:

(1) If the type is specified at load time or through a cast, the typechecker should detect the problem.
(2) Otherwise, the frontend needs to insert a cast to a tuple and let the backend figure out whether the real data contains a tuple.

Neither of these is happening right now. Please note that the problem with the script identified in the ticket can't be detected at the frontend because the type info is missing.

> Detect org.apache.pig.data.DataByteArray cannot be cast to
> org.apache.pig.data.Tuple type of errors at Compile Type during creation of
> logical plan
> ---------------------------------------------------------------------------
>
>                 Key: PIG-1281
>                 URL: https://issues.apache.org/jira/browse/PIG-1281
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>             Fix For: 0.9.0
>
>
> This is more of an enhancement request: simple errors could be detected at
> compile time, during creation of the logical plan, rather than at the
> backend.
> I created a script containing an error that is detected in the backend as a
> cast error, when in fact it could be detected in the front end (group is a
> single element, so the group.$0 projection operation will not work).
> {code}
> inputdata = LOAD '/user/viraj/mymapdata' AS (col1, col2, col3, col4);
> projdata = FILTER inputdata BY (col1 is not null);
> groupprojdata = GROUP projdata BY col1;
> cleandata = FOREACH groupprojdata {
>     bagproj = projdata.col1;
>     dist_bags = DISTINCT bagproj;
>     -- group is a single element here, so group.$0 fails with a cast error in the backend
>     GENERATE group.$0 as newcol1, COUNT(dist_bags) as newcol2;
> };
> cleandata1 = GROUP cleandata by newcol2;
> cleandata2 = FOREACH cleandata1 { GENERATE group.$0 as finalcol1, COUNT(cleandata.newcol1) as finalcol2; };
> ordereddata = ORDER cleandata2 by finalcol2;
> store ordereddata into 'finalresult' using PigStorage();
> {code}
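
The two cases in the comment, (1) and (2), can be sketched as a small decision function. This is only an illustration of the proposed rule, not Pig's actual typechecker: the names `Action`, `check_projection`, `UNKNOWN_TYPES` and `PROJECTABLE_TYPES` are hypothetical, and the sketch assumes that a field loaded without a declared type is treated as `bytearray`.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    OK = "ok"                        # projection is valid as written
    COMPILE_ERROR = "compile_error"  # typechecker can reject the script up front
    INSERT_CAST = "insert_cast"      # frontend adds a cast to tuple; backend decides


# Hypothetical type names for this sketch; they mirror Pig's type names
# but are not taken from Pig's implementation.
UNKNOWN_TYPES = {None, "bytearray"}
PROJECTABLE_TYPES = {"tuple", "bag"}


def check_projection(declared_type: Optional[str]) -> Action:
    """Decide how the frontend treats `expr.$0` given the type of `expr`."""
    if declared_type in UNKNOWN_TYPES:
        # Case (2): no type info is available, so insert a cast to tuple
        # and let the backend check the real data.
        return Action.INSERT_CAST
    if declared_type in PROJECTABLE_TYPES:
        return Action.OK
    # Case (1): the type is known and is not projectable, so the error
    # can be reported at compile time.
    return Action.COMPILE_ERROR
```

Under these assumptions, the script in the ticket falls into case (2): `col1` is loaded without a type, so `check_projection(None)` yields `Action.INSERT_CAST`, whereas declaring `col1:chararray` at load time would yield `Action.COMPILE_ERROR`.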