Thank you everyone for the valuable input!

On Jun 11, 2013, at 1:52 AM, Aleksey Shipilev <aleksey.shipi...@oracle.com> 
wrote:

> On 06/11/2013 12:31 PM, Remi Forax wrote:
>> On 06/10/2013 08:06 PM, Steven Schlansker wrote:
>> Hi Steven,
>> the main issue is that intern() doesn't work in isolation,
>> 
>> I think it's better to change the JSON Parser implementation to use its
>> own cache (or not) and not rely on String.intern().
> 
> +1.
> 
> IMO, String.intern() is the gateway into the VM symbol table, and should be
> regarded as such. The improvements for String.intern(), if any, then
> should be on the VM (native) side.
> 
> Also, I think most people confuse String interning and String
> de-duplication. Using interning to improve memory footprint is
> overkill. Smart deduplicators may carefully balance the overheads of
> deduplication vs. the memory footprint.


Yes, maybe this is in fact the real problem here.  The JavaDoc for String does 
not in any way reflect what you and the other JDK developers seem to assume -- 
that intern() is mostly a "for JVM use" method and is not really intended for 
use by end users.  Maybe a documentation update to reflect that fact would be 
appropriate?  Something indicating that the implementation is specialized for 
VM usage and is not optimal for end user code might help clear up confusion.  
Does that sound like a good idea?

I understand that this is confusing the contract of the method with the 
implementation a bit.  I just feel that the sentiment I get here ("Why would 
you do that?  Don't use intern, just do it yourself!") is mismatched with the 
implicit fit-for-purpose I expect from core Java classes, and a warning might 
help reduce confusion.
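
For reference, here is roughly what I understand the "do it yourself" 
suggestion to mean.  This is only a minimal sketch under my own assumptions: 
the class name FieldValueCache, the dedupe() method and the size cap are all 
made up for illustration and are not part of any JSON parser or of the JDK.  
It keeps one canonical instance per distinct value in an ordinary 
ConcurrentHashMap instead of going through String.intern().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical application-level de-duplication cache (not a JDK class).
 * Maps each distinct value to the first instance seen with that value.
 */
final class FieldValueCache {
    private final ConcurrentMap<String, String> canonical =
            new ConcurrentHashMap<String, String>();
    private final int maxEntries;

    /** maxEntries is an assumed soft cap so unexpected input cannot grow the map forever. */
    FieldValueCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Returns the canonical instance equal to value, or value itself if the cache is full. */
    String dedupe(String value) {
        if (value == null) {
            return null;
        }
        String existing = canonical.get(value);
        if (existing != null) {
            return existing;
        }
        // Soft limit: under concurrency the map may briefly exceed the cap.
        if (canonical.size() >= maxEntries) {
            return value;
        }
        // putIfAbsent returns the previous mapping if another thread won the race.
        String raced = canonical.putIfAbsent(value, value);
        return raced != null ? raced : value;
    }

    int size() {
        return canonical.size();
    }
}
```

A parser would call something like cache.dedupe(parsedValue) for the one 
field known to repeat.  Unlike intern(), the result is only canonical within 
that cache instance, so == comparisons against string literals are not valid.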


On Jun 11, 2013, at 2:28 AM, Alan Bateman <alan.bate...@oracle.com> wrote:

> On 10/06/2013 19:06, Steven Schlansker wrote:
>> Hi core-libs-dev,
>> 
>> While doing performance profiling of my application, I discovered that 
>> nearly 50% of the time deserializing JSON was spent within String.intern().  
>> I understand that in general interning Strings is not the best approach for 
>> things, but I think I have a decent use case -- the value of a certain field 
>> is one of a very limited number of valid values (that are not known at 
>> compile time, so I cannot use an Enum), and is repeated many millions of 
>> times in the JSON stream.
>> 
> Have you run with -XX:+PrintStringTableStatistics? Might be interesting if 
> you can share the output (it is printed just before the VM terminates).
> 
> There are also tuning knobs such as StringTableSize, and it would be 
> interesting to know if you've experimented with them.
> 
> -Alan.


I have not experimented with any such tunings.  I will do so and report back 
before spending a lot of time changing things.  Thank you for the pointer!
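
To try the flags in isolation first, I will probably start from a small 
probe along these lines.  It is a sketch only: the class InternProbe, its 
method name and the workload numbers are my own assumptions and are not taken 
from my application.  It interns a small set of distinct values many times, 
the way my JSON field behaves, so the table statistics and the effect of a 
different table size can be compared between runs.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/**
 * Hypothetical probe (assumed saved as InternProbe.java). Possible runs:
 *   java -XX:+PrintStringTableStatistics InternProbe
 *   java -XX:+PrintStringTableStatistics -XX:StringTableSize=<n> InternProbe
 * where <n> is whatever table size is being tried.
 */
final class InternProbe {

    /**
     * Interns freshly built strings drawn from distinctValues different values,
     * repetitions times, and returns how many distinct instances intern() returned.
     */
    static int countCanonicalInstances(int distinctValues, int repetitions) {
        // Identity-based set: counts object instances, not equal values.
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < repetitions; i++) {
            // Concatenation builds a new String each time, as a parser would.
            String fresh = "value-" + (i % distinctValues);
            seen.add(fresh.intern());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int distinctValues = 16;        // assumed: "very limited number of valid values"
        int repetitions = 5000000;      // assumed: "many millions" of occurrences
        long start = System.nanoTime();
        int instances = countCanonicalInstances(distinctValues, repetitions);
        long elapsedMillis = (System.nanoTime() - start) / 1000000L;
        System.out.println(instances + " canonical instances, " + elapsedMillis + " ms");
    }
}
```

The timing here is a crude wall-clock number, not a proper benchmark, so I 
would only use it to see whether the flags move things in the right direction.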


Best,
Steven
