I did not know that about StringTableSize. I thought it was more of a hard
limit. That's good to know. Thanks.
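
As a rough illustration of the point about StringTableSize, here is a minimal sketch that interns 200,000 distinct strings and then checks that each one can still be looked up as the same instance. The class name, method names, and the "pred=" prefix are made up for the example, and the count is arbitrary. The sketch only shows that interning keeps working for a large number of strings; it does not measure how lookup time changes with the table size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of OpenNLP.
class StringTableSketch {

    // Interns `count` distinct strings and keeps them reachable,
    // so the JVM cannot drop them from the pool during the check.
    public static List<String> internMany(int count) {
        List<String> interned = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            interned.add(("pred=" + i).intern());
        }
        return interned;
    }

    // Builds a fresh, equal String for every entry and verifies that
    // intern() returns the instance that was pooled earlier.
    public static boolean allCanonical(List<String> interned) {
        for (int i = 0; i < interned.size(); i++) {
            String fresh = new String("pred=" + i);
            if (fresh.intern() != interned.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> interned = internMany(200_000);
        System.out.println(allCanonical(interned)); // prints "true"
    }
}
```

Running this with different values of -XX:StringTableSize would be one way to see the speed effect described below, but that would need a proper benchmark.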

On Wed, Feb 8, 2017 at 2:16 PM, Joern Kottmann <[email protected]> wrote:

> The StringTableSize doesn't limit the number of Strings that can be stored
> in the pool; if the size is too small, lookups just get slower.
> Interning would only be done when loading models; querying the model wouldn't
> be affected. The predicate / feature strings would be interned.
>
> Jörn
>
>
>
> On Wed, Feb 8, 2017 at 6:37 PM, Jeffrey Zemerick <[email protected]>
> wrote:
>
> > Would it be possible to have an option or setting somewhere that determines
> > if string pooling is used? The option would provide backward compatibility
> > in case someone has to adjust the -XX:StringTableSize because their
> > existing models exceed the default JVM limit, and an option would also be
> > useful for cases when the models were made from different data sources.
> > (I'm assuming in that case using string pooling would be detrimental to
> > performance.)
> >
> > Jeff
> >
> >
> > On Wed, Feb 8, 2017 at 5:50 AM, Joern Kottmann <[email protected]> wrote:
> >
> > > Hello all,
> > >
> > > I often run multiple models in production, often trained on the same data
> > > but with different types (typical name finder scenario). There could be one
> > > model to detect person names, and another to detect locations. The
> > > predicate Strings inside those models are always the same, but the models
> > > can't share the same String instance.
> > >
> > > I would like to propose that we use String.intern in the model reader to
> > > ensure one string is only loaded once.
> > >
> > > We tried that in the past and this caused lots of issues with PermGen
> > > space, but this was improved over time in Java. In Java 8 (on which we
> > > depend now) this should work properly.
> > >
> > > Here is an interesting article about it:
> > > http://java-performance.info/string-intern-in-java-6-7-8/
> > >
> > > Using String.intern will make the model loading a bit slower (we can
> > > benchmark that).
> > >
> > > Jörn
> > >
> >
>
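
For readers following the thread, the proposal can be sketched roughly as follows. This is not the OpenNLP model reader: the class, the toy "model" format (a count followed by UTF strings), and the `intern` flag are all assumptions made for the illustration. The flag stands in for the kind of option asked about in the thread. Two models with identical predicates are loaded, first without and then with String.intern, to show when the String instances end up shared.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch, not the actual OpenNLP model reader.
class InternSketch {

    // Toy serialized "model": an int count followed by the predicate strings.
    public static byte[] writeModel(String... predicates) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(predicates.length);
            for (String predicate : predicates) {
                out.writeUTF(predicate);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the predicates back; when `intern` is true, each string is
    // replaced by its canonical instance from the JVM string pool.
    public static String[] readPredicates(byte[] model, boolean intern) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(model))) {
            String[] predicates = new String[in.readInt()];
            for (int i = 0; i < predicates.length; i++) {
                String predicate = in.readUTF();
                predicates[i] = intern ? predicate.intern() : predicate;
            }
            return predicates;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two models with the same predicates, e.g. person and location finders.
        byte[] personModel = writeModel("w=john", "pw=mr");
        byte[] locationModel = writeModel("w=john", "pw=mr");

        String[] plainA = readPredicates(personModel, false);
        String[] plainB = readPredicates(locationModel, false);
        System.out.println(plainA[0] == plainB[0]); // prints "false": two copies

        String[] internedA = readPredicates(personModel, true);
        String[] internedB = readPredicates(locationModel, true);
        System.out.println(internedA[0] == internedB[0]); // prints "true": one shared instance
    }
}
```

The interning happens only in the load path, which matches the point made in the thread that querying a loaded model would not be affected.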
