[ https://issues.apache.org/jira/browse/PIG-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-1368.
---------------------------------

    Resolution: Duplicate

This will be addressed as part of PIG-1271

> Utf8StorageConvertor's bytesToTuple and bytesToBag methods need to be tightened for corner cases
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1368
>                 URL: https://issues.apache.org/jira/browse/PIG-1368
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Pradeep Kamath
>
> Consider the following data:
> 1\t ( hello , bye ) \n
> 1\t( hello , bye )a\n
> 2 \t (good , bye)\n
> The following script gives the results below:
> a = load 'junk' as (i:int, t:tuple(s:chararray, r:chararray)); dump a;
> (1,( hello , bye ))
> (1,( hello , bye ))
> (2,(good , bye))
> The current bytesToTuple implementation discards leading and trailing
> characters before the tuple delimiters and parses the tuple out - I think
> instead it should treat any leading and trailing characters (including space)
> near the delimiters as an indication of a malformed tuple and return null.
> Also in the code, consumeBag() should handle the special case of {} and not
> delegate the handling to consumeTuple().
> In consumeBag() null tuples should not be skipped.
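
The stricter behaviour proposed in the description could look roughly like the sketch below. It is only an illustration in Java, not the code from Pig: the class StrictTupleSketch and its methods parseTuple and parseBag are hypothetical names, fields are kept as raw strings, and only flat tuples are handled (no nested tuples, bags, or maps).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed rules; not Pig's actual converter code.
final class StrictTupleSketch {

    // Returns the tuple's fields, or null unless the text is exactly "(...)".
    // Any character (including a space) before '(' or after ')' is malformed.
    static List<String> parseTuple(String text) {
        if (text == null || text.length() < 2
                || text.charAt(0) != '(' || text.charAt(text.length() - 1) != ')') {
            return null;
        }
        String body = text.substring(1, text.length() - 1);
        List<String> fields = new ArrayList<>();
        if (body.isEmpty()) {
            return fields; // assumption: "()" is an empty tuple in this sketch
        }
        // Flat tuples only; the -1 limit keeps trailing empty fields like "(a,)".
        for (String field : body.split(",", -1)) {
            fields.add(field);
        }
        return fields;
    }

    // Returns the bag's tuples, or null unless the text is exactly "{...}".
    static List<List<String>> parseBag(String text) {
        if (text == null || text.length() < 2
                || text.charAt(0) != '{' || text.charAt(text.length() - 1) != '}') {
            return null;
        }
        List<List<String>> tuples = new ArrayList<>();
        String body = text.substring(1, text.length() - 1);
        if (body.isEmpty()) {
            return tuples; // "{}" is handled here, not delegated to parseTuple
        }
        int start = 0;
        while (start < body.length()) {
            int close = body.indexOf(')', start);
            int end = (close < 0) ? body.length() : close + 1;
            // A malformed tuple yields null, and the null is kept, not skipped.
            tuples.add(parseTuple(body.substring(start, end)));
            if (end < body.length() && body.charAt(end) != ',') {
                return null; // tuples must be separated by a comma
            }
            start = end + 1;
        }
        return tuples;
    }

    public static void main(String[] args) {
        // The three tuple fields from the data in the description.
        System.out.println(parseTuple(" ( hello , bye ) "));
        System.out.println(parseTuple("( hello , bye )a"));
        System.out.println(parseTuple(" (good , bye)"));
    }
}
```

Under these rules all three tuple fields from the sample data are rejected, because each has a character outside the parentheses. Whether whitespace inside the parentheses should be trimmed is left open here; the sketch keeps it.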