tang-hi opened a new issue, #12261:
URL: https://github.com/apache/lucene/issues/12261
### Description
I noticed that when Lucene processes stored fields, it saves
`numStoredFields` and `lengths` as shown in the code below:
```java
// save numStoredFields
saveInts(numStoredFields, numBufferedDocs, fieldsStream);
// save lengths
saveInts(lengths, numBufferedDocs, fieldsStream);

private static void saveInts(int[] values, int length, DataOutput out)
    throws IOException {
  if (length == 1) {
    out.writeVInt(values[0]);
  } else {
    StoredFieldsInts.writeInts(values, 0, length, out);
  }
}
```
Source:
[Lucene90CompressingStoredFieldsWriter.java](https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsWriter.java#L217-L221)
The call `StoredFieldsInts.writeInts(values, 0, length, out);` first iterates
over the `values` array to check whether all of its elements are equal.
I propose an optimization that could improve performance by avoiding this loop.
My suggestion is to add two new fields, `numStoredFieldsAllSame` and
`docLengthAllSame`. Each would track, as values are inserted, whether all the
elements of its respective array are equal. Passing these flags to the
`writeInts` method would eliminate the need for the loop.
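
As a rough illustration of the idea, here is a minimal sketch of a buffer that keeps an "all same" flag up to date on every insertion. The class `AllSameIntBuffer` and its methods are hypothetical names used only for this example; they are not part of Lucene. How the flag would be passed into `writeInts` is left open, since that signature change is exactly what is being proposed. The `scanAllSame` helper is a simplified stand-in for the kind of loop the flag is meant to replace.

```java
import java.util.Arrays;

/** Hypothetical helper (not Lucene code): buffers ints and tracks whether all are equal. */
final class AllSameIntBuffer {
  private int[] values = new int[16];
  private int size = 0;
  // True while every buffered value equals the first one (also true when empty).
  private boolean allSame = true;

  /** Appends a value and updates the flag in O(1), so no later scan is needed. */
  void add(int v) {
    if (size == values.length) {
      values = Arrays.copyOf(values, size * 2);
    }
    if (size > 0 && v != values[0]) {
      allSame = false;
    }
    values[size++] = v;
  }

  boolean allSame() {
    return allSame;
  }

  int size() {
    return size;
  }

  int[] array() {
    return values;
  }

  /** Clears the buffer, e.g. after a chunk has been flushed. */
  void reset() {
    size = 0;
    allSame = true;
  }

  /** Simplified stand-in for the linear scan that the flag would make unnecessary. */
  static boolean scanAllSame(int[] values, int length) {
    for (int i = 1; i < length; i++) {
      if (values[i] != values[0]) {
        return false;
      }
    }
    return true;
  }
}
```

The flag must be reset whenever the buffer is cleared, otherwise a mixed chunk would mark every later chunk as mixed too. Whether the saved scan is measurable in practice would need a benchmark, since the loop only runs once per flushed chunk.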
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]