[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480231#comment-13480231
 ] 

Jonathan Coveney commented on PIG-2975:
---------------------------------------

Hmm, ok, well, that's a good change to keep in general.

I think an OK short-term solution is to special-case DataByteArrays with a 
custom WritableComparator that, for a BYTEARRAY/TINYBYTEARRAY/SMALLBYTEARRAY, 
just uses WritableComparator's compareBytes a la BytesWritable, and otherwise 
falls back to BinInterSedesRawComparator.
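
To make the idea concrete, here is a rough, self-contained sketch of that 
dispatch. It is not the actual patch. The type marker values, the 
[type byte][length field][payload] layout, the class name, and the 
RawComparator stand-in are all assumptions for illustration; real code would 
use the constants and serialization from BinInterSedes and extend Hadoop's 
WritableComparator. compareBytes here is a local copy of the unsigned 
lexicographic comparison that WritableComparator.compareBytes performs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteArrayFirstComparator {
    // Hypothetical type markers; the real values are defined in Pig's BinInterSedes.
    public static final byte TINYBYTEARRAY = 1;  // assumed 1-byte length field
    public static final byte SMALLBYTEARRAY = 2; // assumed 2-byte length field
    public static final byte BYTEARRAY = 3;      // assumed 4-byte length field

    /** Stand-in for the general comparator (e.g. BinInterSedesRawComparator). */
    public interface RawComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    private final RawComparator fallback;

    public ByteArrayFirstComparator(RawComparator fallback) {
        this.fallback = fallback;
    }

    /** Unsigned lexicographic byte comparison; a shorter prefix sorts first. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int end1 = s1 + l1;
        int end2 = s2 + l2;
        for (int i = s1, j = s2; i < end1 && j < end2; i++, j++) {
            int a = b1[i] & 0xff;
            int b = b2[j] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    /** Size of type byte plus length field, or -1 if the type is not a byte array. */
    static int headerSize(byte type) {
        switch (type) {
            case TINYBYTEARRAY:  return 1 + 1;
            case SMALLBYTEARRAY: return 1 + 2;
            case BYTEARRAY:      return 1 + 4;
            default:             return -1;
        }
    }

    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int h1 = l1 > 0 ? headerSize(b1[s1]) : -1;
        int h2 = l2 > 0 ? headerSize(b2[s2]) : -1;
        if (h1 > 0 && h2 > 0) {
            // Both sides are byte arrays: skip each header and compare payloads only,
            // so a tiny and a small array holding the same bytes compare as equal.
            return compareBytes(b1, s1 + h1, l1 - h1, b2, s2 + h2, l2 - h2);
        }
        // Anything else goes to the general-purpose comparator.
        return fallback.compare(b1, s1, l1, b2, s2, l2);
    }

    /** Test helper: serialize text in the assumed layout for the given byte-array type. */
    public static byte[] encode(byte type, String text) {
        int h = headerSize(type);
        if (h < 0) {
            throw new IllegalArgumentException("not a byte array type: " + type);
        }
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(h + payload.length);
        buf.put(type);
        if (h == 2) {
            buf.put((byte) payload.length);
        } else if (h == 3) {
            buf.putShort((short) payload.length);
        } else {
            buf.putInt(payload.length);
        }
        buf.put(payload);
        return buf.array();
    }

    public static void main(String[] args) {
        // The fallback here is a dummy that treats everything as equal.
        ByteArrayFirstComparator cmp =
            new ByteArrayFirstComparator((b1, s1, l1, b2, s2, l2) -> 0);
        byte[] three = encode(TINYBYTEARRAY, "3");
        byte[] twentyTwo = encode(BYTEARRAY, "22");
        System.out.println(
            cmp.compare(twentyTwo, 0, twentyTwo.length, three, 0, three.length));
    }
}
```

Note that with a byte-wise comparison "22" sorts before "3", since the values 
are compared as bytes rather than as numbers.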

Let's write some tests and fix this. Correctness trumps performance, though 
let's make sure that this fallback approach is performant (I see no reason it 
shouldn't be).

But then step 2 is to make a separate ticket about optimizing 
BinInterSedesRawComparator. Anyone working on Pig can get a key for YourKit, so 
you can ping me for that. The pro-style approach IMHO is to use Google Caliper 
to build some micro-benchmarks (Caliper is good about warming up the JVM), 
while also using the bigger benchmark you've been using in this thread. Then 
you can use YourKit to isolate where the difference in speed is coming from 
and which method calls are taking the most time.
                
> TestTypedMap.testOrderBy failing with incorrect result 
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main    -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}

