On Tue, 15 Apr 2025 14:12:49 GMT, Vicente Romero <vrom...@openjdk.org> wrote:

>> Please review this patch to fix a potential infinite loop in 
>> `JavadocTokenizer.ensure` when `map.length` and `size + need` approach 
>> `Integer.MAX_VALUE`.
>> 
>> While I couldn't reproduce the issue even with large inputs (a ~1.9 GB Java 
>> file where almost the entire file is one javadoc comment), the fix is about 
>> correctness and preventing undefined behavior in extreme cases.
>> 
>> TIA
>
> src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavadocTokenizer.java
>  line 309:
> 
>> 307: 
>> 308:             while (need > grow && grow <= Integer.MAX_VALUE/2) {
>> 309:                 grow <<= 1;
> 
> I wonder why `map` is not an `ArrayList`, which already has a growth 
> strategy. If this is because of performance concerns, I think that stopping 
> the growth of the array as soon as `grow > Integer.MAX_VALUE / 2` is a bit 
> premature, as there is still plenty of room for a less aggressive growth 
> strategy.

`OffsetMap` instances holding an `int[]` array are created and held in memory 
for each processed doc comment, so I think the optimized implementation is 
justified by the boxing and memory overhead an `ArrayList<Integer>` would add. 
However, individual arrays are rather small, usually holding 2 elements per 
line and/or Unicode escape. This amounts to a few dozen elements for a typical 
doc comment, with the longest doc comments in the JDK sources reaching the low 
thousands. So growth by doubling should be fine for all practical uses of this 
class, and supporting huge sizes (or overflowing capacity) is a rather 
theoretical problem.

Regarding the proposed fix, I would prefer a one-step solution rather than the 
two-step approach: let the overflow happen, and check for `grow < 0` directly 
afterwards, here in the loop.
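A minimal sketch of that suggestion (the class, field, and method names here 
are hypothetical illustrations, not the actual JDK sources): double the 
capacity inside the loop, let the `int` left shift overflow, and treat the 
resulting negative value as the overflow signal.

```java
import java.util.Arrays;

// Hypothetical, simplified sketch of the suggested overflow check; not the
// actual JavadocTokenizer.OffsetMap implementation.
public class OffsetMapSketch {
    private int[] map = new int[16]; // assumed non-zero initial capacity
    private int size = 0;

    // Double `current` until it covers `need`. In Java, int overflow wraps,
    // so `grow <<= 1` past 2^30 turns negative; we detect that sign flip
    // and clamp to Integer.MAX_VALUE instead of looping forever.
    static int grownCapacity(int current, int need) {
        int grow = current;
        while (need > grow) {
            grow <<= 1;       // may wrap around to a negative value
            if (grow < 0) {   // overflow happened: clamp instead of looping
                return Integer.MAX_VALUE;
            }
        }
        return grow;
    }

    void ensure(int need) {
        // Note: `size + need` could itself overflow for extreme inputs;
        // a complete fix would need to guard that sum as well.
        if (size + need > map.length) {
            map = Arrays.copyOf(map, grownCapacity(map.length, size + need));
        }
    }
}
```

Compared with the two-step `grow <= Integer.MAX_VALUE / 2` bound in the 
quoted patch, this keeps doubling up to the last representable power of two 
and only falls back to the clamp when the shift actually wraps.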

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24620#discussion_r2055923696
