dungba88 opened a new issue, #12697:
URL: https://github.com/apache/lucene/issues/12697
### Description
After writing an FSTStore-backed FST with one DataOutput for the main data and a different DataOutput for the metadata, reading it back (using the public FST constructor) fails with the following exception:
```
java.lang.ArrayIndexOutOfBoundsException: Index 17 out of bounds for length 17
  at __randomizedtesting.SeedInfo.seed([CBCB30F6D2F8FEA1:821F24747AC56DDD]:0)
  at org.apache.lucene.store.ByteArrayDataInput.readVLong(ByteArrayDataInput.java:133)
  at org.apache.lucene.util.fst.FST.<init>(FST.java:494)
  at org.apache.lucene.util.fst.FST.<init>(FST.java:443)
```
The reason is that, when writing the metadata, an FSTStore-backed FST does not write numBytes itself:
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/FST.java#L555-L562
Instead, numBytes is written by the FSTStore to the main DataOutput:
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/OnHeapFSTStore.java
So when metaOut and dataOut are the same DataOutput, numBytes still lands immediately after the metadata and the FST reads back correctly. But when they are different DataOutputs, metaOut lacks numBytes, and reading the metadata runs past its end, which causes the index-out-of-bounds exception.
To illustrate, when writing to the same DataOutput:
```
[ HEADER ] [ EMPTY_OUTPUT_FLAG ] [ EMPTY_OUTPUT ] [ INPUT_TYPE ] [ START_NODE ] [ NUM_BYTES ] [ MAIN ]
```
When writing to different DataOutputs:
```
metaOut: [ HEADER ] [ EMPTY_OUTPUT_FLAG ] [ EMPTY_OUTPUT ] [ INPUT_TYPE ] [ START_NODE ]
dataOut: [ NUM_BYTES ] [ MAIN ]
```
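For reference, here is a minimal sketch of the failing round trip. It assumes an FSTStore-backed `FST<Long>` has already been loaded (e.g. via the `FST(metaIn, in, outputs, fstStore)` constructor); the `roundTrip` helper and the use of `ByteBuffersDataOutput`/`toDataInput()` as in-memory DataOutput/DataInput implementations are just for illustration, not part of the reporting test:
```java
import java.io.IOException;
import org.apache.lucene.store.ByteBuffersDataOutput;
import org.apache.lucene.util.fst.FST;
import org.apache.lucene.util.fst.PositiveIntOutputs;

public class FstSeparateOutputsRepro {

  // "fst" is assumed to be FSTStore-backed, e.g. previously loaded with
  // new FST<>(metaIn, in, PositiveIntOutputs.getSingleton(), fstStore).
  static FST<Long> roundTrip(FST<Long> fst) throws IOException {
    ByteBuffersDataOutput metaOut = new ByteBuffersDataOutput();
    ByteBuffersDataOutput dataOut = new ByteBuffersDataOutput();

    // For an FSTStore-backed FST, numBytes is written by the FSTStore to
    // dataOut instead of being appended to metaOut.
    fst.save(metaOut, dataOut);

    // The public ctor reads numBytes from metaIn, so with two distinct
    // outputs the metadata stream ends one vLong too early and the read fails.
    return new FST<>(
        metaOut.toDataInput(), dataOut.toDataInput(), PositiveIntOutputs.getSingleton());
  }
}
```
With a single shared DataOutput the same save/load sequence works, since numBytes ends up exactly where the constructor expects it.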
I can put up a fix for this.
### Version and environment details
_No response_