Tim-Brooks commented on PR #15886: URL: https://github.com/apache/lucene/pull/15886#issuecomment-4143967453
This is motivated by scenarios where the user is mostly doing light indexing (docvalues, stored fields, etc., with few indexed fields). In scenarios like this, the actual validation of the schema starts to dominate the processDocument cost. This change makes no difference in luceneutil benchmarks, which are inverted-index heavy. However, in macro docvalue-oriented runs in Elasticsearch it has a 10% impact on documents per second (combined with another small parent field handling change that I will follow up with). Hopefully there is some interest in this optimization, which targets the extremely common Lucene case of freezing and reusing the same FieldType instance.

<img width="955" height="437" alt="image" src="https://github.com/user-attachments/assets/d8995cd4-0fa5-408b-a43c-b98a0faaea6a" />

<img width="718" height="334" alt="image" src="https://github.com/user-attachments/assets/e7fa41f8-8178-4780-b7f2-c88adedf9dea" />
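For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the general idea of skipping repeated schema validation when the same frozen type instance is seen again. It is not the code in this PR and does not use Lucene classes: `SimpleFieldType`, `FieldSchema`, and the `fullChecks` counter are hypothetical names invented for illustration, and the real validation in Lucene covers many more properties.

```java
public class FrozenTypeValidationSketch {

  /** Hypothetical stand-in for a field type that can be frozen (made immutable). */
  static final class SimpleFieldType {
    private boolean frozen;
    private String docValuesType = "NONE";

    void setDocValuesType(String type) {
      if (frozen) throw new IllegalStateException("type is frozen");
      docValuesType = type;
    }

    void freeze() { frozen = true; }
    boolean isFrozen() { return frozen; }
    String docValuesType() { return docValuesType; }
  }

  /** Hypothetical per-field schema entry that remembers the last frozen type it accepted. */
  static final class FieldSchema {
    final String name;
    final String docValuesType;
    private SimpleFieldType lastValidated; // frozen instance that already passed validation
    int fullChecks;                        // counts slow-path validations, for the demo only

    FieldSchema(String name, String docValuesType) {
      this.name = name;
      this.docValuesType = docValuesType;
    }

    void validate(SimpleFieldType type) {
      // Fast path: a frozen instance cannot change, so one successful
      // validation stays valid for every later use of the same instance.
      if (type == lastValidated) return;

      // Slow path: compare the type against the schema property by property.
      fullChecks++;
      if (!docValuesType.equals(type.docValuesType())) {
        throw new IllegalArgumentException(
            "field \"" + name + "\": expected doc values type " + docValuesType
                + " but got " + type.docValuesType());
      }
      // Only frozen types are safe to remember; a mutable one could change later.
      if (type.isFrozen()) lastValidated = type;
    }
  }

  public static void main(String[] args) {
    SimpleFieldType frozen = new SimpleFieldType();
    frozen.setDocValuesType("NUMERIC");
    frozen.freeze();

    FieldSchema price = new FieldSchema("price", "NUMERIC");
    for (int i = 0; i < 1000; i++) price.validate(frozen);
    System.out.println(price.fullChecks); // prints 1

    SimpleFieldType mutable = new SimpleFieldType();
    mutable.setDocValuesType("NUMERIC");
    FieldSchema qty = new FieldSchema("qty", "NUMERIC");
    for (int i = 0; i < 3; i++) qty.validate(mutable);
    System.out.println(qty.fullChecks); // prints 3
  }
}
```

The sketch caches by reference identity rather than by value equality, which keeps the fast path to a single pointer comparison; types that are never frozen always take the full check.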
