[ 
https://issues.apache.org/jira/browse/SOLR-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321994#comment-14321994
 ] 

Yonik Seeley commented on SOLR-7110:
------------------------------------

Background for others who don't know how this works: Solr's javabin format 
internally avoids repeating String keys by allowing a string to be specified by 
number if it has already been seen in the current message.
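
As a rough illustration of that idea, here is a minimal sketch of a
per-message string table. The class name ExternStringSketch and the
list-of-tokens representation are assumptions made for this example; the
actual javabin wire format is binary and differs from this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the real JavaBinCodec encoding.
final class ExternStringSketch {

    /**
     * Encodes the keys of one message. The first occurrence of a key is
     * emitted as the literal String; a repeat is emitted as the 1-based
     * number the key was assigned when it was first seen.
     */
    static List<Object> encode(List<String> keys) {
        Map<String, Integer> seen = new HashMap<>();
        List<Object> out = new ArrayList<>();
        for (String key : keys) {
            Integer number = seen.get(key);
            if (number == null) {
                seen.put(key, seen.size() + 1); // assign the next number
                out.add(key);                   // first time: write the string
            } else {
                out.add(number);                // repeat: write only the number
            }
        }
        return out;
    }

    /** Reverses encode() by rebuilding the same table while reading. */
    static List<String> decode(List<Object> tokens) {
        List<String> table = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (Object token : tokens) {
            if (token instanceof String) {
                table.add((String) token);
                out.add((String) token);
            } else {
                out.add(table.get((Integer) token - 1));
            }
        }
        return out;
    }
}
```

Note that the table lives only for the duration of one encode or decode call,
which matches the "current message" scope described above.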

But looking at the patch quickly, this isn't about reusing the "external 
string" across different messages.  This is simply about avoiding String 
creation.  Basically, one reads a sequence of UTF8 bytes off the stream and, 
instead of creating a new String object, checks whether a cache already has a 
String for those bytes.  This isn't unique to JavaBin either... one could use 
the same technique in any of our transports (including HTTP params).
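
The lookup described above might look roughly like the following sketch. It
is not the code from the attached patch: the class Utf8StringCacheSketch, its
BytesKey helper, and the use of a plain unbounded HashMap (rather than a
concurrent LRU cache) are all assumptions chosen to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not thread-safe and never evicts entries.
final class Utf8StringCacheSketch {

    /** Wraps a byte[] so it can be a map key with content-based equality. */
    private static final class BytesKey {
        final byte[] bytes;
        final int hash;

        BytesKey(byte[] bytes) {
            this.bytes = bytes;
            this.hash = Arrays.hashCode(bytes);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof BytesKey
                    && Arrays.equals(bytes, ((BytesKey) other).bytes);
        }
    }

    private final Map<BytesKey, String> cache = new HashMap<>();

    /** Counts how many times a new String actually had to be created. */
    int decodes;

    /** Returns a String for buf[off, off + len), reusing a cached one if present. */
    String get(byte[] buf, int off, int len) {
        BytesKey key = new BytesKey(Arrays.copyOfRange(buf, off, off + len));
        String cached = cache.get(key);
        if (cached == null) {
            cached = new String(key.bytes, StandardCharsets.UTF_8);
            decodes++;
            cache.put(key, cached);
        }
        return cached;
    }
}
```

Even in this simplified form, every lookup hashes the bytes and, here, also
copies them into a key object, so the cache only pays off if that work costs
less than creating the String would have.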

My gut feeling is that, as written, this will be slower.  The extra lookup work 
plus the overhead of our concurrent LRU cache should swamp any savings from 
avoiding String creation.  Has this been benchmarked?

> Optimize JavaBinCodec to minimize string Object creation
> --------------------------------------------------------
>
>                 Key: SOLR-7110
>                 URL: https://issues.apache.org/jira/browse/SOLR-7110
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>         Attachments: SOLR-7110.patch
>
>
> In JavaBinCodec we already optimize string creation when strings are 
> repeated in the same payload. If we use a cache, it is possible to avoid 
> string creation across objects as well.



