[ https://issues.apache.org/jira/browse/NET-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary D. Gregory updated NET-709: -------------------------------- Fix Version/s: 3.11.2 (was: 3.11.1) > IMAP Memory considerations with large ‘FETCH’ sizes. > ---------------------------------------------------- > > Key: NET-709 > URL: https://issues.apache.org/jira/browse/NET-709 > Project: Commons Net > Issue Type: Improvement > Components: IMAP > Affects Versions: 3.8.0 > Reporter: Anders > Priority: Minor > Labels: IMAP, buffer, chunking, large, literal, memory, partial > Fix For: 3.11.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > h2. *IMAP Memory considerations with large ‘FETCH’ sizes.* > > The following comments concern classes in the > [org.apache.common.net.imap|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/package-summary.html] > package. > > Consider the following imap ‘fetch’ exchange between a client (>) and server > (<): > {{> A654 FETCH 1:2 (BODY[TEXT])}} > {{{}< * 1 FETCH (BODY[TEXT] {*}{{*}{*}80000000{*}{*}}{*}\r\n{}}}{{{}…{}}} > {{< * 2 FETCH …}} > {{< A654 OK FETCH completed}} > > The first untagged response (* 1 FETCH …) contains a literal \{80000000} or > 80MB. > > After reviewing the > [source|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298], > it is my understanding, the entire 80MB sequence of data will be read into > Java memory even when using > ‘[IMAPChunkListener|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.IMAPChunkListener.html]’. > According the the documentation: > > {quote}Implement this interface and register it via > [IMAP.setChunkListener(IMAPChunkListener)|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.html#setChunkListener-org.apache.commons.net.imap.IMAP.IMAPChunkListener-] > in order to get access to multi-line partial command responses. Useful when > processing large FETCH responses. > {quote} > > It is apparent the partial fetch response is read in full (80MB) before > invoking the ‘IMAPChunkListener’ and then discarding the read lines (freeing > up memory). > > Back to the example: > > A654 FETCH 1:2 (BODY[TEXT]) > < * 1 FETCH (BODY[TEXT] \{80000000}\r\n > *{color:#ff0000}…. <— read in full into memory then discarded after calling > IMAPChunkListener{color}* > < * 2 FETCH (BODY[TEXT] \{250}\r\n > {color:#ff0000}*…. <— read in full into memory then discarded after calling > IMAPChunkListener*{color} > < A654 OK FETCH completed > > Above, you can see the chunk listener is good for each individual partial > fetch response but does not prevent a large partial from being loaded into > memory. > > Let’s review the > [code|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298]: > > [ > 296|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l296] > int literalCount = IMAPReply.literalCount(line); > {color:#ff0000}Above counts the size of the literal, in our case 80000000 or > 80MB (for the first partial fetch response).{color} > > > [ > 297|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l297] > final boolean isMultiLine = literalCount >= 0; > [ > 298|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298] > while (literalCount >= 0) { > [ > 299|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l299] > line=_reader.readLine(); > [ > 300|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l300] > if (line == null) > { throw new EOFException("Connection closed > without indication."); } > [ > 303|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l303] > replyLines.add(line); > [ > 304|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l304] > literalCount -= line.length() + 2; // Allow for CRLF > [ > 305|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l305] > } > {color:#ff0000}The literal count above starts at 80000000 and is decremented > until reaching a negative non zero value where 80MB is read in full and while > loop returns.{color} > > [ > 306|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l306] > if (isMultiLine) { > [ > 307|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l307] > final IMAPChunkListener il = chunkListener; > [ > 308|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l308] > if (il != null) { > [ > 309|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l309] > final boolean clear = il.chunkReceived(this); > [ > 310|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l310] > if (clear) { > [ > 311|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l311] > fireReplyReceived(IMAPReply.PARTIAL, > getReplyString()); > [ > 312|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l312] > replyLines.clear(); > {color:#ff0000}Now, after all 80MB is loaded into memory in full, invoke the > IMAPChunkListener and throw away the lines freeing memory.{color} > > [ > 313|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l313] > } > [ > 314|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l314] > } > [ > 315|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l315] > } > > I’m considering modifying the getReply() method, shown above, to chunk the > partial responses breaking up the literal so that it’s not loaded into memory > in full. This is to prevent the entire 80MB literal value from being loaded > into memory. > > This would be configurable as not to break the existing users of the API. > Something like .setBreakLargeLiteralSize(true), when breakUpLargeLiteralSize > is true, a maxLiteralBuffer value is used to chunk the literal preventing all > 80MB from being loaded in full, instead loading chunks of it. This would > require implementations of IMAPChunkListener to handle this behavior if it > was turned on. The default behavior will see this chunking disabled as to not > break the existing users. Essentially an opt-in feature reducing the risk. > > *{color:#403294}What are you thoughts or concerns with this? Do you > agree?{color}* -- This message was sent by Atlassian Jira (v8.20.10#820010)