[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771308#action_12771308 ]
hc busy commented on PIG-1016: ------------------------------ Well, I'd like to start by thanking everyone for the attention and support! As a first time contributor, I feel my heart warmed by the encouraging comments and serious time everyone is spending on my problem. I also greatly appreciate the patience everybody has, and of course I am perpetually grateful for everybody's work in making this all work. Line by line, {code} + // find bug is complaining about nulls. This check sequence will prevent nulls from being dereferenced. + if(o1!=null && o2!=null){ ... + }else{ + if(o1==null && o2==null){rc=0;} + else if(o1==null) {rc=-1;} + else{ rc=1; } {code} Does what it says, it prevents a findbug warning. non-null is greater than null by convention. {code} + // In case the objects are comparable + if((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable)|| + !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable) + ){ + + NullableBytesWritable nbw1 = (NullableBytesWritable)o1; + NullableBytesWritable nbw2 = (NullableBytesWritable)o2; + + // If either are null, handle differently. + if (!nbw1.isNull() && !nbw2.isNull()) { + rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType()); + } else { + // For sorting purposes two nulls are equal. + if (nbw1.isNull() && nbw2.isNull()) rc = 0; + else if (nbw1.isNull()) rc = -1; + else rc = 1; + } + } {code} The if statement takes us outside of original comparison code (enclosed in outer if above) ONLY if both compratee are PigNullableWritable that are not NullableBytesWritable. This may seem confusing at first glance, but what it does is do the identical thing as before the patch except for the new case that I introduced by allowing other types. The code is awkward, as Santhosh noted. But I am not too sure I understand the original implementation. But certainly, this way, we preserve original behavior and for new cases that this patch introduces, they are handled in the remaining else: {code} else{ + // enter here only if both o1 and o2 are non-NullableByteWritable PigNullableWritable's + PigNullableWritable nbw1 = (PigNullableWritable)o1; + PigNullableWritable nbw2 = (PigNullableWritable)o2; + // If either are null, handle differently. + if (!nbw1.isNull() && !nbw2.isNull()) { + rc = nbw1.compareTo(nbw2); + } else { + // For sorting purposes two nulls are equal. + if (nbw1.isNull() && nbw2.isNull()) rc = 0; + else if (nbw1.isNull()) rc = -1; + else rc = 1; + } + } {code} This is the safest way I can think of writing this code, and I have been able to order by a value begotten out of a map. Also, join and then sort keyed on values of maps both works. I guess something that flows better might be the following: {code} if(o1!=null && o2!=null){ if((o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable ){ PigNullableWritable nbw1 = (PigNullableWritable)o1; PigNullableWritable nbw2 = (PigNullableWritable)o2; // If either are null, handle differently. if (!nbw1.isNull() && !nbw2.isNull()) { rc = nbw1.compareTo(nbw2); } else { // For sorting purposes two nulls are equal. if (nbw1.isNull() && nbw2.isNull()) rc = 0; else if (nbw1.isNull()) rc = -1; else rc = 1; } }else{ throw new Exception("bad compare"); } }else{ if(o1==null && o2==null){rc=0;} else if(o1==null) {rc=-1;} else{ rc=1; } {code} But I must admit that I don't know what the right thing to do is. I don't know the design well enough to know if throwing an exception is the appropriate thing? Or something else? And would the last code block perform the right comparison in place of the original function? lmk of your thoughts on improvements to the patch. > Reading in map data seems broken > -------------------------------- > > Key: PIG-1016 > URL: https://issues.apache.org/jira/browse/PIG-1016 > Project: Pig > Issue Type: Improvement > Components: data > Affects Versions: 0.4.0 > Reporter: hc busy > Fix For: 0.5.0 > > Attachments: PIG-1016.patch > > > Hi, I'm trying to load a map that has a tuple for value. The read fails in > 0.4.0 because of a misconfiguration in the parser. Where as in almost all > documentation it is stated that value of the map can be any time. > I've attached a patch that allows us to read in complex objects as value as > documented. I've done simple verification of loading in maps with tuple/map > values and writing them back out using LOAD and STORE. All seems to work fine. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.