[ 
https://issues.apache.org/jira/browse/LUCENE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700931#action_12700931
 ] 

Yonik Seeley edited comment on LUCENE-1607 at 4/20/09 2:27 PM:
---------------------------------------------------------------

bq. The fastest hash we can get, should have no collisions. This is achievable 
by resizing on each new collision.

*edit*: agreed for the first version, which was only a cache where a collision 
invalidates the entry and causes another String.intern() call. My comments 
below apply to the second version of my code, where interned strings are never 
dropped from the table.
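
For illustration, the first version (a cache whose entries are overwritten on 
collision) might look roughly like the sketch below. This is not the code from 
the attached patches; the class name LossyInternCache and its constructor 
parameter are hypothetical, and the sketch is not thread-safe.

```java
/** Hypothetical sketch of a lossy intern cache; not the LUCENE-1607 patch code. */
final class LossyInternCache {
    private final String[] slots;

    LossyInternCache(int minCapacity) {
        // Round up to a power of two so a bit mask can replace the modulo.
        int cap = 1;
        while (cap < minCapacity) {
            cap <<= 1;
        }
        slots = new String[cap];
    }

    String intern(String s) {
        int slot = s.hashCode() & (slots.length - 1);
        String cached = slots[slot];
        if (cached != null && (cached == s || cached.equals(s))) {
            return cached; // hit: no call to String.intern()
        }
        // Miss or collision: fall back to String.intern() and overwrite the
        // slot, evicting whatever string was cached there before.
        String canonical = s.intern();
        slots[slot] = canonical;
        return canonical;
    }
}
```

In a cache like this every collision costs a real String.intern() call, which 
is why avoiding collisions matters so much for it.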

Hmmm, in my quick'n'dirty tests of about 256 unique strings, a smaller hash 
table was actually quicker (initialized with 32 and let it resize vs starting 
at 1024).  I imagine that this would be due to a larger part of the table 
fitting in smaller and faster processor caches.  YMMV.  Collisions should also 
be very quick to skip by comparing the hash code (which is cached for Strings).
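
A minimal sketch of the second approach, assuming a chained table that never 
drops entries, starts small and doubles as it fills. The names 
(SimpleInternTable, Entry, capacity()) and the resize threshold are assumptions 
made for illustration, not taken from the attached patches, and the sketch is 
not thread-safe. Colliding entries are skipped by comparing the stored hash 
code before falling back to equals().

```java
/** Hypothetical sketch of a growing intern table; not the LUCENE-1607 patch code. */
final class SimpleInternTable {
    private static final class Entry {
        final String str;
        final int hash;
        final Entry next;

        Entry(String str, int hash, Entry next) {
            this.str = str;
            this.hash = hash;
            this.next = next;
        }
    }

    private Entry[] buckets;
    private int size;

    SimpleInternTable(int minCapacity) {
        // Round up to a power of two so a bit mask can replace the modulo.
        int cap = 1;
        while (cap < minCapacity) {
            cap <<= 1;
        }
        buckets = new Entry[cap];
    }

    String intern(String s) {
        int h = s.hashCode(); // String caches this value after the first call
        int slot = h & (buckets.length - 1);
        for (Entry e = buckets[slot]; e != null; e = e.next) {
            // The cheap int comparison skips most colliding entries
            // without touching the characters of either string.
            if (e.hash == h && (e.str == s || e.str.equals(s))) {
                return e.str;
            }
        }
        String canonical = s.intern();
        buckets[slot] = new Entry(canonical, h, buckets[slot]);
        if (++size > buckets.length) {
            resize();
        }
        return canonical;
    }

    private void resize() {
        Entry[] bigger = new Entry[buckets.length * 2];
        for (Entry head : buckets) {
            for (Entry e = head; e != null; e = e.next) {
                int slot = e.hash & (bigger.length - 1);
                bigger[slot] = new Entry(e.str, e.hash, bigger[slot]);
            }
        }
        buckets = bigger;
    }

    int capacity() {
        return buckets.length;
    }
}
```

With this threshold, a table created with new SimpleInternTable(32) and fed 256 
unique strings ends up with 256 buckets, so starting small costs only a few 
resizes.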





  
> String.intern() faster alternative
> ----------------------------------
>
>                 Key: LUCENE-1607
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1607
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Earwin Burrfoot
>             Fix For: 2.9
>
>         Attachments: intern.patch, LUCENE-1607.patch, LUCENE-1607.patch, 
> LUCENE-1607.patch, LUCENE-1607.patch
>
>
> By using our own interned string pool on top of default, String.intern() can 
> be greatly optimized.
> On my setup (java 6) this alternative runs ~15.8x faster for already interned 
> strings, and ~2.2x faster for 'new String(interned)'
> For java 5 and 4 speedup is lower, but still considerable.

