Hi,

> - a bit more code, increases maintenance burden.

I think there is even more to it. It is almost a form of code duplication, 
albeit expressed in a very different way, with all the drawbacks of duplicated 
code: the initial capacity can drift out of sync with the code that fills the 
collection, causing confusion. Also, it is not just "a bit more code": working 
out how to set the initial value may require non-trivial reasoning or 
calculation. Whenever we change or refactor the code, the "maintenance burden" 
will mostly come from that. 
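
To make the "non-trivial calculation" point concrete, here is a minimal sketch 
for a HashMap with the default load factor of 0.75. The class and helper names 
(CapacitySketch, capacityFor) are made up for illustration, and the rounding of 
the capacity to a power of two is an OpenJDK implementation detail, so treat 
the numbers in the comments as assumptions about that implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class, not taken from the Flink code base.
final class CapacitySketch {

    // Default load factor used by java.util.HashMap.
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Hypothetical helper: capacity argument that lets a HashMap hold
    // expectedSize entries without exceeding capacity * loadFactor.
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expectedSize = 16;

        // Looks right, but (assuming OpenJDK) the table has 16 buckets and a
        // resize threshold of 12, so the 13th put still triggers a resize.
        Map<Integer, String> naive = new HashMap<>(expectedSize);

        // Accounts for the load factor: capacityFor(16) is 22.
        Map<Integer, String> sized = new HashMap<>(capacityFor(expectedSize));

        for (int i = 0; i < expectedSize; i++) {
            naive.put(i, "v" + i);
            sized.put(i, "v" + i);
        }

        // Both maps behave identically from the outside; only the internal
        // resizing differs, which is why a stale capacity is easy to miss.
        System.out.println(naive.size() + " " + sized.size());
    }
}
```

If the expected size later changes during a refactoring, the capacity argument 
has to be recomputed as well, and nothing in the compiler or the tests will 
point that out.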

Also, I think this usually just falls under the premature optimisation rule.

Besides:

> The conclusion is the following at the moment:
> Only set the initial capacity if you have a good idea about the expected size.

I would add a clause to set the initial capacity "only for good, proven 
reasons". It's not about whether we can set it, but whether it makes sense to 
do so (to avoid the aforementioned "maintenance burden").

Piotrek

> On 1 Aug 2019, at 14:41, Xintong Song <tonysong...@gmail.com> wrote:
> 
> +1 on setting the initial capacity only when we have a good expectation of
> the collection size.
> 
> Thank you~
> 
> Xintong Song
> 
> 
> 
> On Thu, Aug 1, 2019 at 2:32 PM Andrey Zagrebin <and...@ververica.com> wrote:
> 
>> Hi all,
>> 
>> As you probably already noticed, Stephan has triggered a discussion thread
>> about a code style guide for Flink [1]. Recently we were discussing
>> internally some smaller concerns and I would like to start separate threads
>> for them.
>> 
>> This thread is about always creating collections with an initial capacity.
>> As you might have seen, some parts of our code base always initialise
>> collections with some non-default capacity. You can even activate a check
>> in IntelliJ IDEA that can monitor and highlight the creation of collections
>> without an initial capacity.
>> 
>> Pros:
>> - performance gain if there is a good reasoning about initial capacity
>> - the capacity is always deterministic and does not depend on any changes
>> of its default value in Java
>> - easy to follow: always initialise, has IDE support for detection
>> 
>> Cons (for initialising w/o good reasoning):
>> - We are trying to outsmart the JVM. When there is no good reasoning about
>> the initial capacity, we can rely on the JVM default value.
>> - It is even confusing, e.g. for hash maps, as the real size depends on the
>> load factor.
>> - It would only add a minor performance gain.
>> - a bit more code, increases maintenance burden.
>> 
>> The conclusion is the following at the moment:
>> Only set the initial capacity if you have a good idea about the expected
>> size.
>> 
>> Please feel free to share your thoughts.
>> 
>> Best,
>> Andrey
>> 
>> [1]
>> 
>> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
>> 
