Nobody took the bait on this yet? :-)

Certainly there's a lot of semi-myth on this topic, on both sides. Here's my source of mythology (or urban legend, as one might have it):

http://www.ibm.com/developerworks/java/library/j-jtp09275/index.html

My concern here is not so much about leaking of the StringBuilder held in a field, as Chris seemed to be responding to. I'd expect the ObjectInputStream to be GC'd at some point along with any StringBuilders it contains references to.

I'm more concerned about defeating any optimizations that might occur if local variables are converted to fields. The article referenced above mentions escape analysis and possible stack allocation of locally-created objects. Offhand I don't know if stack allocation occurs for any of the builders in ObjectInputStream, but it certainly cannot occur if references are stored in fields.

It would good if there were some evidence we could discuss instead of myths and urban legends. :-) Perhaps the original poster (Robert Stupp) can rerun some benchmarks with and without the conversion from locals to fields. (Earlier in the thread he posted some significant performance improvements, but my suspicion is that most of that improvement came from the conversion to use Unsafe.)

I'm mindful that this may require a lot of effort. I think it would take a fair bit of work to come up with a benchmark that shows any difference between the two cases. I'm also mindful that one's intuition is often wrong.

s'marks



On 1/31/14 5:53 PM, Vitaly Davidovich wrote:

One would have to measure of course, but intuitively, it seems like a good change to reuse the stringbuilder. There's also the issue that the local stringbuilder before was default-sized, and for large content, it would generate even more garbage as the underlying array is expanded.

The "temporary short lived allocations are cheap" is, unfortunately, a semi-myth, or at least somewhat misguided. It's true that individually they may be cheap, but they have macro effects. The higher the allocation rate, the more young GCs we get. Every young GC (on hotspot collectors, at least) is a STW pause. Bringing threads to a safepoint has some cost, especially if there are many of them on large many-core machines. With larger heaps these days, young gens tend to be larger as well. When GC runs, it trashes the cpu caches for the mutators, so when they resume, they may get cache misses. At each young GC, some objects may get promoted prematurely to tenured.

So no, I wouldn't say it's quite inexpensive :). When you have no option but to allocate, it's nice to have collectors that can handle those necessary allocations well. Otherwise, if it's a perf sensitive area and avoiding allocations doesn't obfuscate or make the code significantly less maintainable, it's usually better to avoid allocations.

Just my $.02

Sent from my phone

On Jan 31, 2014 2:06 PM, "Stuart Marks" <stuart.ma...@oracle.com <mailto:stuart.ma...@oracle.com>> wrote:

    On 1/31/14 10:46 AM, Chris Hegarty wrote:

        I think your patch can be split into two logical, independent, parts. 
The
        first is the use of unsafe to access the String UTF length. The seconds 
is
        to reuse, where possible, the existing StringBuilder instances,
        skipping of
        primitive/object reading/writing where applicable, and general cleanup.

        Since this is a very sensitive area I would like to consider these
        separately. To that end, I have taken the changes that are applicable
        to the
        latter, removed any subjective stylistic changes, and made some 
additional
        cleanup improvements.

        http://cr.openjdk.java.net/~chegar/serial_stupp.00/webrev/
        <http://cr.openjdk.java.net/%7Echegar/serial_stupp.00/webrev/>


    I agree with splitting the Unsafe usages so they can be reviewed
    separately. New
    and changed usage of Unsafe will require exacting scrutiny.

    In general, the cleanups and refactorings look fine.

    I have a question about the change to reuse StringBuilder instances. This
    replaces freshly allocated StringBuilders stored in local variables with
    reuse of a StringBuilder stored in a field of BlockDataInputStream, which
    in turn is stored in a field of ObjectInputStream. Thus, instead of
    creating of lots of temporaries that become gc-eligible very quickly, this
    creates a single long-lived object whose size is the high-water mark of
    the longest string that's been read. The change does reduce allocation
    pressure, but the point of generational GC is to make allocation and
    collection of temporaries quite inexpensive. Is this the right tradeoff?

    s'marks


Reply via email to