[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837401#action_12837401 ]
hc busy commented on PIG-1016:
------------------------------

No, it doesn't (yet).

I was pulled away for a while and didn't follow the threads on Pig. I have since read Thejas's comment and discussed it internally with my colleagues at work. They agree with the discussion here that the existing PigStorage has its merits and that if I want nested data structures, I should write my own custom storage.

Basically, I have a separate storage that supports reading nested data, and in my data, long values have been modified to include the 'l' on the end.

The change in PIG-1082 makes it possible for us to join and order on nested data structures. That change is still necessary even with my own loader, because otherwise the data is still compared as DataByteArrays. The current code in

{{src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java}}

{code}
public int compare(Object o1, Object o2) {
    NullableBytesWritable nbw1 = (NullableBytesWritable)o1;
    NullableBytesWritable nbw2 = (NullableBytesWritable)o2;
    int rc = 0;

    // If either are null, handle differently.
    if (!nbw1.isNull() && !nbw2.isNull()) {
        rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType());
    } else {
        // For sorting purposes two nulls are equal.
        if (nbw1.isNull() && nbw2.isNull()) rc = 0;
        else if (nbw1.isNull()) rc = -1;
        else rc = 1;
    }
    if (!mAsc[0]) rc *= -1;
    return rc;
}
{code}

needs to be changed to something like this:

{code}
public int compare(Object o1, Object o2) {
    int rc = 0;

    // FindBugs complains about nulls. This sequence of checks prevents
    // a null from being dereferenced.
    if (o1 != null && o2 != null) {
        // In case the objects are comparable
        if ((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable) ||
                !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable)) {
            NullableBytesWritable nbw1 = (NullableBytesWritable)o1;
            NullableBytesWritable nbw2 = (NullableBytesWritable)o2;

            // If either are null, handle differently.
            if (!nbw1.isNull() && !nbw2.isNull()) {
                rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType());
            } else {
                // For sorting purposes two nulls are equal.
                if (nbw1.isNull() && nbw2.isNull()) rc = 0;
                else if (nbw1.isNull()) rc = -1;
                else rc = 1;
            }
        } else {
            // Enter here only if both o1 and o2 are
            // non-NullableBytesWritable PigNullableWritables.
            PigNullableWritable nbw1 = (PigNullableWritable)o1;
            PigNullableWritable nbw2 = (PigNullableWritable)o2;

            // If either are null, handle differently.
            if (!nbw1.isNull() && !nbw2.isNull()) {
                rc = nbw1.compareTo(nbw2);
            } else {
                // For sorting purposes two nulls are equal.
                if (nbw1.isNull() && nbw2.isNull()) rc = 0;
                else if (nbw1.isNull()) rc = -1;
                else rc = 1;
            }
        }
    } else {
        if (o1 == null && o2 == null) { rc = 0; }
        else if (o1 == null) { rc = -1; }
        else { rc = 1; }
    }
    if (!mAsc[0]) rc *= -1;
    return rc;
}
{code}

This is needed because once we allow non-NullableBytesWritables into the comparator, the comparator fails unless we handle those cases.

> Reading in map data seems broken
> --------------------------------
>
>                 Key: PIG-1016
>                 URL: https://issues.apache.org/jira/browse/PIG-1016
>             Project: Pig
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 0.4.0
>            Reporter: hc busy
>         Attachments: PIG-1016.patch
>
>
> Hi, I'm trying to load a map that has a tuple for value. The read fails in
> 0.4.0 because of a misconfiguration in the parser, whereas almost all
> documentation states that the value of a map can be any type.
> I've attached a patch that allows us to read in complex objects as values, as
> documented. I've done simple verification of loading in maps with tuple/map
> values and writing them back out using LOAD and STORE. All seems to work fine.
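
The null-ordering rule that both versions of the comparator in the comment above share (two nulls are equal, a null sorts before a non-null, and the result is flipped for descending order) can be sketched in isolation. This is only an illustration under stated assumptions: `NullOrderSketch`, `Nullable`, and the `ascending` flag are hypothetical stand-ins for Pig's nullable wrappers and `mAsc[0]`, not Pig classes or the patch itself.

```java
// Minimal sketch of the null-ordering rule; all names here are hypothetical.
class NullOrderSketch {

    /** Stand-in for a nullable wrapper: a null value means "wrapped null". */
    static final class Nullable<T extends Comparable<T>> {
        final T value;

        Nullable(T value) {
            this.value = value;
        }

        boolean isNull() {
            return value == null;
        }
    }

    /**
     * Compares two wrappers. Null references are checked first so nothing
     * null is dereferenced, then wrapped nulls, then the values themselves.
     */
    static <T extends Comparable<T>> int compare(Nullable<T> a, Nullable<T> b,
                                                 boolean ascending) {
        int rc;
        if (a == null || b == null) {
            // Two null references are equal; a lone null sorts first.
            rc = (a == null && b == null) ? 0 : (a == null ? -1 : 1);
        } else if (a.isNull() || b.isNull()) {
            // Same rule for wrapped nulls.
            rc = (a.isNull() && b.isNull()) ? 0 : (a.isNull() ? -1 : 1);
        } else {
            rc = a.value.compareTo(b.value);
        }
        // Flip the sign for descending order, as the original does with mAsc[0].
        return ascending ? rc : -rc;
    }

    public static void main(String[] args) {
        Nullable<Long> one = new Nullable<Long>(1L);
        Nullable<Long> none = new Nullable<Long>(null);
        System.out.println(compare(none, one, true));
        System.out.println(compare(none, one, false));
    }
}
```

Keeping the rule in one place like this would avoid repeating the same three-way null check in each branch, though whether that refactoring suits the actual Pig comparator is a separate question.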