On Tue, 15 Apr 2025 14:12:49 GMT, Vicente Romero <vrom...@openjdk.org> wrote:
>> Please review this patch to fix a potential infinite loop in
>> `JavadocTokenizer.ensure` when `map.length` and `size + need` approach
>> `Integer.MAX_VALUE`.
>>
>> While I couldn't reproduce the issue even with large inputs (a ~1.9GB Java
>> file where almost the entire file is one javadoc comment), the fix is about
>> correctness and prevention of UB in extreme cases.
>>
>> TIA
>
> src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavadocTokenizer.java line 309:
>
>> 307:
>> 308:         while (need > grow && grow <= Integer.MAX_VALUE/2) {
>> 309:             grow <<= 1;
>
> I wonder why `map` is not an `ArrayList`, which already has a growth strategy.
> If this is because of performance concerns, I think that stopping the growth
> of the array as soon as `grow > Integer.MAX_VALUE / 2` is a bit premature, as
> there is still plenty of room for a less aggressive growth strategy.

`OffsetMap` instances holding an `int[]` array are created and held in memory for each processed doc comment, so I think the optimized implementation is justified by the boxing and memory overhead an `ArrayList<Integer>` would incur. However, individual array sizes are rather small: usually 2 elements per line and/or Unicode escape. This amounts to a few dozen elements for a typical doc comment, with the longest doc comments in the JDK source reaching the low thousands. So growth by doubling should be fine for all practical uses of this class, and supporting huge sizes (or overflowing capacity) is a rather theoretical problem.

Regarding the proposed fix, I would prefer a one-stop solution rather than the two-step approach: let the overflow happen, and check for `grow < 0` directly afterwards, here in the loop.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24620#discussion_r2055923696
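For illustration, a minimal sketch of the one-stop approach suggested in the reply above: let the doubling overflow and detect it by checking for a negative value immediately afterwards. The class and method names here are hypothetical, and this is not the actual JDK patch; it merely demonstrates the overflow-check idea under the assumption that the starting capacity is positive.

```java
public class GrowSketch {
    /**
     * Doubles {@code current} until it covers {@code need}. If the doubling
     * overflows int (turning {@code grow} negative), the capacity is clamped
     * to {@code Integer.MAX_VALUE} instead of looping forever.
     * Assumes {@code current > 0}.
     */
    static int grownCapacity(int current, int need) {
        int grow = current;
        while (need > grow) {
            grow <<= 1;           // may overflow to a negative value
            if (grow < 0) {       // overflow detected in one step, in the loop
                return Integer.MAX_VALUE;
            }
        }
        return grow;
    }

    public static void main(String[] args) {
        System.out.println(grownCapacity(16, 100));                // doubles 16 -> 128
        System.out.println(grownCapacity(1 << 30, (1 << 30) + 1)); // overflow -> clamped
    }
}
```

Compared with bounding the loop by `grow <= Integer.MAX_VALUE / 2`, this keeps the overflow handling in a single check and still allows capacities above `Integer.MAX_VALUE / 2`.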