[ https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480247#comment-13480247 ]
Gianmarco De Francisci Morales commented on PIG-2975:
-----------------------------------------------------

Hi,

We use ByteBuffer in the comparator for convenience.
However, I don't think we should read too much into comparing the 6 minutes of the incorrect version with the 10 minutes of the correct version. IMHO correctness is more important than performance.
The slowness is due to the fact that we need to unnest the ByteArray from the Tuple, and that we are using a Tuple to store any kind of data.

That said, BinInterSedes.BinInterSedesRawComparator is meant for performance, so if there is a way to make it faster, it's more than welcome.
My guess is that it won't be easy to recover the original speed.
I would suggest profiling the code with a micro benchmark to see where the time is spent.

> TestTypedMap.testOrderBy failing with incorrect result
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>   Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, pig-2975-trunk_v03-unionapproach.txt
>
>
> Looked at
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}
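
For the micro benchmark suggested in the comment, a rough starting point might look like the sketch below. It does not touch Pig's actual comparator: the class name ByteCompareBench, the Cmp interface, and both compare methods are hypothetical stand-ins that only contrast a ByteBuffer-based byte[] comparison with a hand-written loop. Timings from a naive loop like this are indicative at best; a dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.nio.ByteBuffer;
import java.util.Random;

// Hypothetical stand-alone sketch; not Pig code.
public class ByteCompareBench {

    // Minimal functional interface so both variants can share one timing loop.
    public interface Cmp {
        int compare(byte[] a, byte[] b);
    }

    // Variant 1: wrap both arrays and rely on ByteBuffer.compareTo,
    // which compares bytes lexicographically as signed values.
    public static int compareWithByteBuffer(byte[] a, byte[] b) {
        return ByteBuffer.wrap(a).compareTo(ByteBuffer.wrap(b));
    }

    // Variant 2: hand-written signed lexicographic loop, no wrapper objects.
    public static int compareWithLoop(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return a[i] < b[i] ? -1 : 1;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Generates reproducible random keys of length 1..maxLen.
    public static byte[][] randomKeys(int count, int maxLen, long seed) {
        Random rnd = new Random(seed);
        byte[][] keys = new byte[count][];
        for (int i = 0; i < count; i++) {
            keys[i] = new byte[1 + rnd.nextInt(maxLen)];
            rnd.nextBytes(keys[i]);
        }
        return keys;
    }

    // Compares each adjacent pair of keys 'rounds' times and returns elapsed nanoseconds.
    public static long timeNanos(Cmp cmp, byte[][] keys, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 1; i < keys.length; i++) {
                sink += cmp.compare(keys[i - 1], keys[i]);
            }
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated result so the JIT cannot drop the loop entirely.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        byte[][] keys = randomKeys(10000, 16, 42L);
        Cmp viaBuffer = ByteCompareBench::compareWithByteBuffer;
        Cmp viaLoop = ByteCompareBench::compareWithLoop;

        // Warm-up rounds before measuring.
        timeNanos(viaBuffer, keys, 50);
        timeNanos(viaLoop, keys, 50);

        System.out.println("ByteBuffer: " + timeNanos(viaBuffer, keys, 200) / 1_000_000 + " ms");
        System.out.println("loop:       " + timeNanos(viaLoop, keys, 200) / 1_000_000 + " ms");
    }
}
```

Both variants should agree on the sign of every comparison, so any difference in the reported times comes from how the bytes are accessed rather than from a different ordering. To say anything about the real comparator, the same timing loop would have to be pointed at serialized tuples rather than bare byte arrays.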