Stefan Richter created FLINK-32410:
--------------------------------------
Summary: Allocate hash-based collections with sufficient capacity
for expected size
Key: FLINK-32410
URL: https://issues.apache.org/jira/browse/FLINK-32410
Project: Flink
Issue Type: Improvement
Reporter: Stefan Richter
Assignee: Stefan Richter
Fix For: 1.19.0
The JDK API to create hash-based collections for a certain capacity is arguably
misleading because it doesn't size the collections to "hold a specific number
of items" as you would expect. Instead, the argument sets the initial table
capacity, so the collection holds only load-factor% of the specified number
(75% with the default load factor) before it resizes.
For the common pattern of allocating a hash-based collection with the expected
number of elements to avoid rehashes, this means that a rehash is still likely
before the collection reaches that size.
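
To make the arithmetic concrete, here is a minimal sketch that models how
`java.util.HashMap` derives its resize threshold from the constructor argument
with the default load factor of 0.75. The class and method names are
hypothetical and only illustrate the calculation; they are not part of Flink
or the JDK. Under this model, `new HashMap<>(16)` resizes on the 13th insert,
and `new HashMap<>(100)` resizes on the 97th.

```java
// Hypothetical illustration class; models HashMap sizing, does not inspect a real map.
final class HashCapacityDemo {
    // Default load factor used by java.util.HashMap and java.util.HashSet.
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // HashMap rounds the requested capacity up to the next power of two.
    static int tableSizeFor(int requestedCapacity) {
        // Clamp to the range [1, 2^30], the table sizes HashMap supports.
        int n = Math.max(1, Math.min(requestedCapacity, 1 << 30));
        int highest = Integer.highestOneBit(n);
        return highest == n ? highest : highest << 1;
    }

    // Number of entries the map can hold before it resizes.
    static int resizeThreshold(int requestedCapacity) {
        return (int) (tableSizeFor(requestedCapacity) * DEFAULT_LOAD_FACTOR);
    }

    // True if inserting expectedSize entries into new HashMap<>(expectedSize) forces a resize.
    static boolean rehashesBeforeFull(int expectedSize) {
        return expectedSize > resizeThreshold(expectedSize);
    }

    public static void main(String[] args) {
        for (int expected : new int[] {16, 17, 100}) {
            System.out.println(
                    "expected=" + expected
                            + " table=" + tableSizeFor(expected)
                            + " threshold=" + resizeThreshold(expected)
                            + " rehash=" + rehashesBeforeFull(expected));
        }
    }
}
```

Because of the rounding to a power of two, some sizes (such as 17) happen to
avoid the rehash, but callers cannot rely on that.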
We should introduce helper methods (similar to Guava's
`Maps.newHashMapWithExpectedSize(int)`) that allocate for an expected size, and
replace the direct constructor calls with them.
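
One possible shape for such helpers is sketched below. The class name
`ExpectedSizeCollections` and its methods are assumptions made for
illustration, not the API that was eventually added to Flink. The idea is to
divide the expected size by the load factor and round up, so that the resize
threshold is at least the expected size.

```java
import java.util.HashMap;
import java.util.HashSet;

// Hypothetical helper class; names are illustrative only.
final class ExpectedSizeCollections {
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private ExpectedSizeCollections() {}

    // Initial capacity such that expectedSize entries fit without a resize.
    static int capacityFor(int expectedSize) {
        if (expectedSize < 0) {
            throw new IllegalArgumentException("expectedSize must be >= 0: " + expectedSize);
        }
        // Compute in double/long so that large inputs cannot overflow int.
        long capacity = (long) Math.ceil(expectedSize / (double) DEFAULT_LOAD_FACTOR);
        return (int) Math.min(capacity, Integer.MAX_VALUE);
    }

    static <K, V> HashMap<K, V> newHashMapWithExpectedSize(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize), DEFAULT_LOAD_FACTOR);
    }

    static <E> HashSet<E> newHashSetWithExpectedSize(int expectedSize) {
        return new HashSet<>(capacityFor(expectedSize), DEFAULT_LOAD_FACTOR);
    }
}
```

A call site would then change from `new HashMap<>(n)` to
`ExpectedSizeCollections.newHashMapWithExpectedSize(n)`. On JDK 19 and later,
the static factories `HashMap.newHashMap(int)` and `HashSet.newHashSet(int)`
serve the same purpose, but they are not available on older Java versions.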