Hi all,

As you have probably noticed, Stephan has started a discussion thread
about a code style guide for Flink [1]. Recently we discussed some
smaller concerns internally, and I would like to start separate threads
for them.

This thread is about always creating collections with an initial
capacity. As you might have seen, some parts of our code base always
initialise collections with some non-default capacity. You can even
activate an inspection in IntelliJ IDEA that highlights the creation of
collections without an initial capacity.

Pros:
- performance gain if there is good reasoning about the initial capacity
- the capacity is always deterministic and does not depend on any changes
to its default value in Java
- easy to follow: always initialise; IDEs can detect violations

Cons (for initialising w/o good reasoning):
- We are trying to outsmart the JVM. When there is no good reasoning
about the initial capacity, we can rely on the JVM default value.
- It can even be confusing, e.g. for hash maps, as the number of entries
that fit before a resize depends on the load factor.
- It would only add a minor performance gain.
- A bit more code, which increases the maintenance burden.
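
To make the load factor point concrete, here is a rough sketch in Java.
It assumes the default load factor of 0.75 documented for
java.util.HashMap; the class and helper names are made up for
illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapCapacitySketch {
    // Default load factor documented for java.util.HashMap.
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Hypothetical helper: the smallest initial capacity for which
    // expectedSize entries stay within the default load factor.
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expected = 100;

        // Looks like "a map for 100 entries", but the argument is a
        // capacity hint, not the number of entries that fit before the
        // map has to grow.
        Map<Integer, String> naive = new HashMap<>(expected);

        // Accounts for the load factor, at the cost of extra arithmetic
        // the reader has to understand.
        Map<Integer, String> sized = new HashMap<>(capacityFor(expected));

        for (int i = 0; i < expected; i++) {
            naive.put(i, "v" + i);
            sized.put(i, "v" + i);
        }
        System.out.println("capacity used for sized map: " + capacityFor(expected));
    }
}
```

Both maps end up with the same contents; the difference is only whether
an internal resize happens along the way, which is why a capacity chosen
without good reasoning adds little.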

The conclusion at the moment is the following:
Only set the initial capacity if you have a good idea about the expected
size.
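
As a minimal sketch of what this could look like in practice (the class
and method names below are hypothetical, not taken from the Flink code
base):

```java
import java.util.ArrayList;
import java.util.List;

public class InitialCapacitySketch {
    // The result has exactly one element per input element, so the
    // expected size is known and worth passing as the initial capacity.
    static List<Integer> lengths(List<String> names) {
        List<Integer> result = new ArrayList<>(names.size());
        for (String name : names) {
            result.add(name.length());
        }
        return result;
    }

    // The result size depends on the data, so there is no good estimate;
    // rely on the default capacity instead of guessing a number.
    static List<String> nonEmpty(Iterable<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }
}
```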

Please feel free to share your thoughts.

Best,
Andrey

[1]
http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
