itschrispeck opened a new pull request, #13776: URL: https://github.com/apache/pinot/pull/13776
Enhances the existing `noRawDataForTextIndex` config to skip writing raw data when re-using the mutable index is enabled.

This PR also fixes the configs added in [this PR](https://github.com/apache/pinot/pull/13577), which didn't take effect and instead globally disabled the re-use index feature (only default values were used).

The implementation closely follows the existing `SegmentColumnarIndexCreator` handling of this config, which uses a `SameValueForwardIndexCreator` class in the same way. I'm open to suggestions if there's a better way to accomplish this, though the current implementation keeps the logic relatively isolated.

It seems a bit odd to me that a text index config should affect the forward index; long term, I think this can be handled generically with forward index configs. In the short term, I think the enhancement provides significant benefits (especially when used with `SchemaConformingTransformer`). Some metrics are shown below; the improvements should be fairly evident:

<img width="592" alt="image" src="https://github.com/user-attachments/assets/e38b53f6-fcf2-4378-bfa2-8254b56b2e0c">
<img width="592" alt="image" src="https://github.com/user-attachments/assets/f9fbcdc6-82ef-4ea1-a2a8-98f81f1f7816">
<img width="592" alt="image" src="https://github.com/user-attachments/assets/ca63140f-26cc-4997-adb1-27dd0bd3aaf2">
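
For readers unfamiliar with the "same value" approach mentioned in the description, here is a rough, self-contained sketch of the general idea: the forward index accepts each document's raw value but keeps only a single placeholder, so reads return the same value for every document while the text index (not shown) still serves text queries. The class `SameValueForwardIndexSketch`, its methods, and the placeholder `"n"` are hypothetical names chosen for illustration; this is not Pinot's actual API or the code in this PR.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical sketch (not a Pinot class): a forward index that retains one
 * placeholder value instead of the raw text of every document.
 */
final class SameValueForwardIndexSketch {
  private final String placeholder;
  private int numDocs;

  SameValueForwardIndexSketch(String placeholder) {
    this.placeholder = placeholder;
  }

  /** Accepts a document's raw value but deliberately does not store it. */
  void add(String rawValue) {
    numDocs++;
  }

  /** Every valid docId reads back the same placeholder. */
  String getString(int docId) {
    if (docId < 0 || docId >= numDocs) {
      throw new IndexOutOfBoundsException("docId " + docId + " not in [0, " + numDocs + ")");
    }
    return placeholder;
  }

  int getNumDocs() {
    return numDocs;
  }

  /** Bytes retained for values; independent of how many documents were added. */
  int storedValueBytes() {
    return placeholder.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    SameValueForwardIndexSketch index = new SameValueForwardIndexSketch("n");
    index.add("GET /api/v1/users 200 12ms");
    index.add("POST /api/v1/orders 500 java.lang.IllegalStateException");
    System.out.println(index.getString(1));       // prints "n"
    System.out.println(index.getNumDocs());       // prints 2
    System.out.println(index.storedValueBytes()); // prints 1
  }
}
```

The trade-off in this sketch is that the original text can no longer be read back from the forward index, which is presumably acceptable only when the column is queried through its text index.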
