[ 
https://issues.apache.org/jira/browse/THRIFT-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated THRIFT-318:
---------------------------------

    Attachment: thrift-318.patch

This patch adds a custom Set implementation, IntRangeSet, that collapses the 
values into extents (ranges) of contiguous values. contains(int) then needs at 
most 2 * (number of extents) comparisons. This proves to be faster than 
HashSet, likely because it avoids the Integer.valueOf autoboxing and the 
Integer.hashCode call. In my tests, across a variety of value sets and query 
values, it is about 60% faster. 
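
To illustrate the extent idea described above, here is a minimal sketch. It is 
not the code from thrift-318.patch: the class name IntRangeSetSketch, the 
flattened int[] layout, and the extentCount() helper are assumptions made for 
illustration, and the sketch does not implement java.util.Set as the patch does.

```java
import java.util.Arrays;

/** Hypothetical sketch of an extent-based int set; not the patch's code. */
final class IntRangeSetSketch {
    // Flattened extents: [start0, end0, start1, end1, ...], both ends inclusive.
    private final int[] extents;

    IntRangeSetSketch(int... values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        int[] buf = new int[2 * sorted.length];
        int n = 0;
        int i = 0;
        while (i < sorted.length) {
            int start = sorted[i];
            int end = start;
            i++;
            // Grow the current extent over duplicates and consecutive values.
            while (i < sorted.length
                    && (sorted[i] == end || sorted[i] == end + 1)) {
                end = sorted[i];
                i++;
            }
            buf[n++] = start;
            buf[n++] = end;
        }
        extents = Arrays.copyOf(buf, n);
    }

    /** At most two primitive comparisons per extent; no boxing, no hashing. */
    boolean contains(int value) {
        for (int i = 0; i < extents.length; i += 2) {
            if (value >= extents[i] && value <= extents[i + 1]) {
                return true;
            }
        }
        return false;
    }

    /** Number of contiguous extents the values collapsed into. */
    int extentCount() {
        return extents.length / 2;
    }
}
```

For enum values, which are usually dense and few, the whole set typically 
collapses into one or two extents, so the linear scan in contains(int) stays 
short.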

I've also amended the Java compiler to use IntRangeSet when generating enums. 
The struct code itself does not change.

> Performance of HashSet for enumeration VALID_VALUES seems poor
> --------------------------------------------------------------
>
>                 Key: THRIFT-318
>                 URL: https://issues.apache.org/jira/browse/THRIFT-318
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Compiler (Java)
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: thrift-318.patch
>
>
> It looks like using a HashSet for the VALID_VALUES set we now put in 
> enumerated types was a bad move, performance-wise. There's a fair amount of 
> HashSet/HashMap/Integer overhead generated.
> I think that the VALID_VALUES should still be a Set, but we can make a 
> TIntRangeSet or something internal to Thrift that's more efficient for our 
> use cases and saves some CPU.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
