[ https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480378#comment-13480378 ]
Jonathan Coveney commented on PIG-2975:
---------------------------------------

As a side note, Koji, if you make a new JIRA specifically about improving BinInterSedesRawComparator's handling of DataByteArrays, I will review and commit it. And if you want to learn Pig, you could make another JIRA about improving the performance in general. IMHO BinInterSedes (and the whole code path that touches it) could probably be significantly improved.

W.r.t. this issue, I think we should either directly compare the bytes (currently leaning towards this), or we can just have a special lightweight comparator that special-cases DataByteArrays and delegates to BinInterSedesRawComparator otherwise. We wouldn't need the complexity of the union approach, and we should get the correctness, speed, and stable bytearray sort order.

That said, IF we decide to preserve byte array sort order, I think we should make a decision now about whether or not we want to define that semantic. If not, then just directly comparing the bytes should be a-ok, since all that is important for bytearrays currently is that a global ordering exists, not what that global ordering is.

> TestTypedMap.testOrderBy failing with incorrect result
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>   Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch,
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt,
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt
>
>
> Looked at
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}
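For illustration only, the "lightweight comparator" option from the comment above might look roughly like the sketch below. It is not taken from any of the attached patches and does not use Pig's real classes: the `RawCompare` interface, the `BYTEARRAY_TAG` value, and the assumption that a serialized bytearray is a single tag byte followed by its payload are all hypothetical stand-ins. The idea is simply to compare bytearray payloads as unsigned bytes and hand everything else to an existing comparator.

```java
/** Hypothetical sketch; none of these names are Pig's actual classes. */
final class ByteArrayAwareComparator {

    /** Stand-in for an existing raw comparator such as BinInterSedesRawComparator. */
    interface RawCompare {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    /** Assumed tag byte that marks a serialized bytearray; the value is made up. */
    static final byte BYTEARRAY_TAG = 0x01;

    private final RawCompare fallback;

    ByteArrayAwareComparator(RawCompare fallback) {
        this.fallback = fallback;
    }

    /** Unsigned lexicographic comparison; when one input is a prefix of the other, the shorter sorts first. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff; // mask so bytes compare as 0..255
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return Integer.compare(l1, l2);
    }

    /** Special-cases two bytearrays; delegates every other combination. */
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        boolean bothByteArrays = l1 > 0 && l2 > 0
                && b1[s1] == BYTEARRAY_TAG && b2[s2] == BYTEARRAY_TAG;
        if (bothByteArrays) {
            // Skip the tag byte and compare the payloads directly.
            return compareBytes(b1, s1 + 1, l1 - 1, b2, s2 + 1, l2 - 1);
        }
        return fallback.compare(b1, s1, l1, b2, s2, l2);
    }
}
```

Under this byte-wise ordering the three values from the test case would sort as 1, 22, 3 rather than numerically, which is consistent with the point that only the existence of a global ordering matters for bytearrays, not which ordering it is.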