[ https://issues.apache.org/jira/browse/THRIFT-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury updated THRIFT-318:
---------------------------------
Attachment: thrift-318.patch
This patch adds a new custom Set implementation, IntRangeSet, that collapses
the values into extents of contiguous values. A call to contains(int) then does
up to 2 * (number of extents) comparisons. This proves to be faster than
HashSet, likely because it avoids the Integer.valueOf autoboxing and the
Integer.hashCode call. In my tests, across a variety of value sets and query
values, it's about 60% faster.
I've also amended the Java compiler to use IntRangeSet when generating enums.
The struct code itself does not change.
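To illustrate the extent idea described above, here is a minimal sketch. It is
not the code from thrift-318.patch: the class name IntRangeSetSketch, the
flattened int[] layout, and the extentCount() helper are assumptions made for
this example, and unlike the patch's IntRangeSet it does not implement
java.util.Set.

```java
import java.util.Arrays;

/** Hypothetical sketch of an extent-based int set; not the patch's class. */
final class IntRangeSetSketch {
    // Flattened extents: [start0, end0, start1, end1, ...], both ends inclusive.
    private final int[] extents;

    IntRangeSetSketch(int... values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        int[] tmp = new int[sorted.length * 2];
        int n = 0;
        for (int v : sorted) {
            // A duplicate or the next consecutive value extends the last extent.
            if (n > 0 && (v == tmp[n - 1] || v == tmp[n - 1] + 1)) {
                tmp[n - 1] = v;
            } else {
                tmp[n++] = v; // start of a new extent
                tmp[n++] = v; // end of the new extent
            }
        }
        extents = Arrays.copyOf(tmp, n);
    }

    /** Two primitive comparisons per extent; no boxing and no hashing. */
    boolean contains(int value) {
        for (int i = 0; i < extents.length; i += 2) {
            if (value >= extents[i] && value <= extents[i + 1]) {
                return true;
            }
        }
        return false;
    }

    /** Number of contiguous extents the values collapsed into. */
    int extentCount() {
        return extents.length / 2;
    }
}
```

With the values 1, 2, 3, 7, 8, 10, this sketch stores three extents (1-3,
7-8, 10-10), so a lookup costs at most six int comparisons. Enum values are
usually dense, so the extent count tends to stay small.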
> Performance of HashSet for enumeration VALID_VALUES seems poor
> --------------------------------------------------------------
>
> Key: THRIFT-318
> URL: https://issues.apache.org/jira/browse/THRIFT-318
> Project: Thrift
> Issue Type: Improvement
> Components: Compiler (Java)
> Reporter: Bryan Duxbury
> Assignee: Bryan Duxbury
> Priority: Minor
> Fix For: 0.1
>
> Attachments: thrift-318.patch
>
>
> It looks like using a HashSet for the VALID_VALUES set we now put in
> enumerated types was a bad move, performance-wise. There's a fair amount of
> HashSet/HashMap/Integer overhead generated.
> I think that the VALID_VALUES should still be a Set, but we can make a
> TIntRangeSet or something internal to Thrift that's more efficient for our
> use cases and save some CPU.