[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480442#comment-13480442
 ] 

Koji Noguchi commented on PIG-2975:
-----------------------------------

bq. One thing I worry about this new sort order is, we might get a worse 
compression ratio for users' output. (Similar record ordered far away.)

To clarify, pig-2975-trunk_v04-purerawcompare.txt doesn't work well with the 
reverse domain name. 
(com.yahoo.news)
(org.apache.wiki)
(com.yahoo.sports)

Of course user can specify 'chararray' to achieve that but I'm afraid some 
users are using the default bytearray type.
                
> TestTypedMap.testOrderBy failing with incorrect result 
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main    -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to