[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748756#comment-16748756 ]
Mike Sokolov commented on LUCENE-8653:
--------------------------------------
Reverse reading is required because the FST serializes itself from an
object-heavy DAG of Nodes and Arcs into an array of bytes by traversing the DAG
backwards while writing forwards into the byte storage. It also optimizes
straight-line sections of the DAG by eliminating the explicit pointers and
implicitly pointing to the (logically) next Node in the byte array, so "next"
here means *at the next lower byte address*. We can eliminate this reversal by
reversing the byte array after serialization and fixing up the explicit
pointers when we read them. We can't really fix them up in place without more
major surgery because they are VInts, so rewriting a pointer's value can change
the number of bytes it occupies.
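
To make the idea concrete, here is a minimal sketch in Java. It is a toy
model, not the actual FST code: the class and method names
(ReversedBytesSketch, reverse, fixUp, readBackward, readForward, vIntLength)
are hypothetical, and it assumes a stored pointer is simply an absolute
address in the original, backward-read byte array. It shows that reading
backward from a pointer in the original array yields the same bytes as reading
forward from the translated pointer in the reversed array, and why rewriting
the pointer in place is awkward when it is stored as a 7-bits-per-byte VInt.

```java
import java.util.Arrays;

/** Toy illustration only; not Lucene's FST code. All names are hypothetical. */
final class ReversedBytesSketch {

    /** Returns a reversed copy of the serialized bytes. */
    static byte[] reverse(byte[] bytes) {
        byte[] out = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[bytes.length - 1 - i] = bytes[i];
        }
        return out;
    }

    /** Maps an address in the original array to the same byte's address in the reversed array. */
    static int fixUp(int originalAddress, int length) {
        return length - 1 - originalAddress;
    }

    /** Old style: reads count bytes starting at address and moving toward lower addresses. */
    static byte[] readBackward(byte[] bytes, int address, int count) {
        byte[] out = new byte[count];
        for (int i = 0; i < count; i++) {
            out[i] = bytes[address - i];
        }
        return out;
    }

    /** New style: reads count bytes starting at address and moving toward higher addresses. */
    static byte[] readForward(byte[] bytes, int address, int count) {
        return Arrays.copyOfRange(bytes, address, address + count);
    }

    /** Number of bytes a non-negative int occupies as a VInt (7 payload bits per byte). */
    static int vIntLength(int value) {
        int n = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        byte[] original = {10, 20, 30, 40, 50, 60};
        byte[] reversed = reverse(original);

        int pointer = 4; // as stored at serialization time (address in the original array)
        byte[] oldWay = readBackward(original, pointer, 3);
        byte[] newWay = readForward(reversed, fixUp(pointer, original.length), 3);
        System.out.println(Arrays.equals(oldWay, newWay));

        // Rewriting the pointer in place could change its encoded size:
        // address 5 in a 1000-byte array becomes 994 after reversal.
        System.out.println(vIntLength(5) + " byte(s) vs " + vIntLength(fixUp(5, 1000)) + " byte(s)");
    }
}
```

In this sketch the translation happens at read time, so the stored bytes are
never rewritten and no pointer changes its encoded length.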
> Reverse FST storage so it can be read forward
> ---------------------------------------------
>
> Key: LUCENE-8653
> URL: https://issues.apache.org/jira/browse/LUCENE-8653
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/FSTs
> Reporter: Mike Sokolov
> Priority: Major
>
> Discussion of keeping FSTs off-heap led to the idea of ensuring that FSTs can
> be read forward in order to be more cache-friendly and align better with
> standard I/O practice. Today FSTs are read in reverse, which leads to some
> awkwardness: you can't use standard readers, so the code can be confusing
> to work with.