> I'm curious how the hot cache you describe would be maintained and
> accessed.

The only gain I was successful at was with static precomputed cache.
Calculated from a priori data (hot paths) or, in the extreme, the
entire FST is converted to a hash map. :) This allows very fast
lookups of any node-arc pair, but is not usable for enumerating
outgoing arcs (for example)... It's really something we did to
accelerate millions and millions of lookups over the FST (in an
otherwise not even Lucene-related data structure). I don't think it'll
be generally useful.

> I know eg that the current implementation has a cache of the
> first 128 "root" arcs, which I guess are presumed to be
> highly-visited, but I think what you are describing is a more dynamic
> cache based on usage?

I don't think it's an assumption of high-visited ratio... it's just a
limit so that we don't cache very extreme root node fan-outs. The
choice of this limit is very arbitrary I'd say...

> Were you thinking that one would maintain an LRU
> cache say? Or was this some offline analysis you did based on access
> patterns?

On patterns. So not generally useful.

Don't get me wrong -- try to experiment with that constant-expanded
array... This can be useful especially for nodes close to the root (if
they're dense)... So definitely worth looking into.

Dawid

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to