tustvold opened a new issue, #1851: URL: https://github.com/apache/arrow-rs/issues/1851
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

A while back I implemented an optimized string dictionary builder for [IOx](https://github.com/influxdata/influxdb_iox/blob/main/arrow_util/src/dictionary.rs). This contains two major tricks to provide better performance:

* Use ahash instead of SipHash - this alone provides a 40% speedup
* Use hashbrown's `raw_entry_mut` to not duplicate string values into the hashmap

I have an implementation of this for arrow that needs a bit more polish, but leads to a 60% speedup over the current implementation in arrow. Unfortunately it depends on #1850 as it needs to be able to read the string data from an in-progress `StringBuilder`.

**Describe the solution you'd like**

Implement #1850 and then add this functionality

**Describe alternatives you've considered**

We could not do this
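
To illustrate the second trick (keeping string values out of the hash map), here is a minimal sketch using only the standard library. It is not the IOx or arrow implementation: `InterningDictionary`, `hash_str`, and all field names are hypothetical, and std's `HashMap` with `DefaultHasher` (currently SipHash-based) stands in for hashbrown's `raw_entry_mut` and ahash. The point it shows is that the map stores only ids, and equality checks read the string bytes back out of the in-progress value buffer, which is presumably why read access to the builder's data is needed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical illustration type; not the arrow-rs or IOx builder.
pub struct InterningDictionary {
    /// All distinct values, concatenated back to back.
    values: String,
    /// `offsets[i]..offsets[i + 1]` is the byte range of value `i`.
    offsets: Vec<usize>,
    /// Hash of a value -> ids of stored values with that hash.
    /// The map holds ids only, never a second copy of the string.
    dedup: HashMap<u64, Vec<usize>>,
    /// One dictionary key per appended row.
    keys: Vec<usize>,
}

/// Stand-in hasher (assumption): the issue uses ahash instead.
fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

impl InterningDictionary {
    pub fn new() -> Self {
        Self {
            values: String::new(),
            offsets: vec![0],
            dedup: HashMap::new(),
            keys: Vec::new(),
        }
    }

    /// Returns the stored value for a dictionary id.
    pub fn value(&self, id: usize) -> &str {
        &self.values[self.offsets[id]..self.offsets[id + 1]]
    }

    /// Number of distinct values stored so far.
    pub fn num_values(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Dictionary keys, one per appended row.
    pub fn keys(&self) -> &[usize] {
        &self.keys
    }

    /// Appends a row and returns its dictionary id.
    pub fn append(&mut self, value: &str) -> usize {
        let hash = hash_str(value);
        // Split the borrow so the value buffer can be read while the
        // map entry is held mutably.
        let Self { values, offsets, dedup, keys } = self;
        let bucket = dedup.entry(hash).or_default();
        // Compare against bytes already in the buffer, not a map-owned copy.
        let existing = bucket
            .iter()
            .copied()
            .find(|&id| &values[offsets[id]..offsets[id + 1]] == value);
        let id = match existing {
            Some(id) => id,
            None => {
                // New value: write it to the buffer once and remember its id.
                let id = offsets.len() - 1;
                values.push_str(value);
                offsets.push(values.len());
                bucket.push(id);
                id
            }
        };
        keys.push(id);
        id
    }
}
```

With hashbrown, the per-hash `Vec<usize>` bucket would be unnecessary, since `raw_entry_mut` allows looking up by a precomputed hash with a custom equality closure; the sketch trades that efficiency for having no external dependencies.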
