Yes I think so too. Apparently this can be a ram hog if you have zillions of fields?
I plan to post an issue with more details over the weekend when I will have the time to write up the results I saw. On Fri, Apr 26, 2019, 10:31 AM David Smiley <david.w.smi...@gmail.com> wrote: > Maybe what you propose would allow FST.cachedRootArcs to be eliminated? > Basically your algorithm could decide that the root would always be direct > access. > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > > On Thu, Apr 25, 2019 at 5:18 PM Dawid Weiss <dawid.we...@gmail.com> wrote: > >> > I'm curious how the hot cache you describe would be maintained and >> > accessed. >> >> The only gain I was successful at was with static precomputed cache. >> Calculated from a priori data (hot paths) or, in the extreme, the >> entire FST is converted to a hash map. :) This allows very fast >> lookups of any node-arc pair, but is not usable for enumerating >> outgoing arcs (for example)... It's really something we did to >> accelerate millions and millions of lookups over the FST (in an >> otherwise not even Lucene-related data structure). I don't think it'll >> be generally useful. >> >> > I know eg that the current implementation has a cache of the >> > first 128 "root" arcs, which I guess are presumed to be >> > highly-visited, but I think what you are describing is a more dynamic >> > cache based on usage? >> >> I don't think it's an assumption of high-visited ratio... it's just a >> limit so that we don't cache very extreme root node fan-outs. The >> choice of this limit is very arbitrary I'd say... >> >> > Were you thinking that one would maintain an LRU >> > cache say? Or was this some offline analysis you did based on access >> > patterns? >> >> On patterns. So not generally useful. >> >> Don't get me wrong -- try to experiment with that constant-expanded >> array... This can be useful especially for nodes close to the root (if >> they're dense)... So definitely worth looking into. >> >> Dawid >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >>