[ https://issues.apache.org/jira/browse/CODEC-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833916#comment-17833916 ]
Gary D. Gregory edited comment on CODEC-323 at 4/4/24 12:39 PM:
----------------------------------------------------------------

Hello [~arthur.chan]

I don't think this is the case in git master at least. Check there and see what you think.


was (Author: garydgregory):
    I don't think this is the case in git master at least. Check there and see what you think.

> Possible Out-of-Memory problem in Apache Commons Codec PhoneticEngine class
> ---------------------------------------------------------------------------
>
>                 Key: CODEC-323
>                 URL: https://issues.apache.org/jira/browse/CODEC-323
>             Project: Commons Codec
>          Issue Type: Improvement
>            Reporter: Sheung Chi Chan
>            Priority: Minor
>
> In the constructor of Apache Commons Codec PhoneticEngine class, the last
> parameter maxPhonemes accepts any integer. Although a negative or zero
> maxPhonemes value is rejected in a later stage, a very large integer still
> passes the checking. The maxPhonemes variable is used later in the apply()
> method to create a LinkedHashSet object, passing by the invoke() method in
> the PhoneticBuilder object stored in the PhoneticEngine object. By Java
> settings, the creation of LinkedHashSet objects won’t allocate all memory
> immediately. It will allocate a small amount of memory and when more memory
> is needed, the resize() method is called to request more memory. Thus
> creating the LinkedHashSet object with a large integer size will not result
> in errors immediately. When the logic tries adding items to the created
> LinkedHashSet object, it will first check if the number of elements in the
> set is larger than the provided maxPhonemes. The new element will be added to
> the set if and only if the current size of the set is smaller than the
> maxPhonemes. Thus if a very large maxPhonemes is provided, a large amount of
> new data could be added to the set. It could easily use up the memory because
> new elements could be added to the set. This causes a possible out-of-memory
> problem.
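
For readers following the description above, the add-only-while-below-the-cap logic can be sketched roughly as follows. This is a minimal illustration and not the Commons Codec source: the class BoundedPhonemeSketch, the method joinWithCap, and the use of String in place of the library's phoneme type are all hypothetical. Sizing the set from the number of candidate pairs rather than from maxPhonemes alone is an assumption about one possible mitigation, not a statement of what git master does.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the capped join described in the issue.
final class BoundedPhonemeSketch {

    // Joins every left/right pair, keeping at most maxPhonemes results.
    static Set<String> joinWithCap(List<String> left, List<String> right, int maxPhonemes) {
        if (maxPhonemes <= 0) {
            throw new IllegalArgumentException("maxPhonemes must be positive");
        }
        // Assumption: size the set by the work that can actually happen,
        // so a huge caller-supplied cap does not drive the initial capacity.
        long candidates = (long) left.size() * right.size();
        int initialCapacity = (int) Math.min(candidates, (long) maxPhonemes);
        Set<String> result = new LinkedHashSet<>(initialCapacity);

        outer:
        for (String l : left) {
            for (String r : right) {
                // Stop as soon as the cap is reached.
                if (result.size() >= maxPhonemes) {
                    break outer;
                }
                result.add(l + r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four candidate pairs, capped at three results.
        Set<String> capped = joinWithCap(List.of("a", "b"), List.of("x", "y"), 3);
        System.out.println(capped);
    }
}
```

With a very large cap such as Integer.MAX_VALUE, the sketch still returns only the four joined pairs in this example, and the set is sized for four entries rather than for the cap.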


--
This message was sent by Atlassian Jira
(v8.20.10#820010)