[ 
https://issues.apache.org/jira/browse/NET-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory updated NET-709:
--------------------------------
    Fix Version/s: 3.11.2
                       (was: 3.11.1)

> IMAP Memory considerations with large ‘FETCH’ sizes.
> ----------------------------------------------------
>
>                 Key: NET-709
>                 URL: https://issues.apache.org/jira/browse/NET-709
>             Project: Commons Net
>          Issue Type: Improvement
>          Components: IMAP
>    Affects Versions: 3.8.0
>            Reporter: Anders
>            Priority: Minor
>              Labels: IMAP, buffer, chunking, large, literal, memory, partial
>             Fix For: 3.11.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> h2. *IMAP Memory considerations with large ‘FETCH’ sizes.*
>  
> The following comments concern classes in the 
> [org.apache.common.net.imap|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/package-summary.html]
>  package.
>  
> Consider the following imap ‘fetch’ exchange between a client (>) and server 
> (<):
> {{> A654 FETCH 1:2 (BODY[TEXT])}}
> {{{}< * 1 FETCH (BODY[TEXT] {*}{{*}{*}80000000{*}{*}}{*}\r\n{}}}{{{}…{}}}
> {{< * 2 FETCH …}}
> {{< A654 OK FETCH completed}}
>  
> The first untagged response (* 1 FETCH …) contains a literal \{80000000} or 
> 80MB.
>  
> After reviewing the 
> [source|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298],
>  it is my understanding, the entire 80MB sequence of data will be read into 
> Java memory even when using  
> ‘[IMAPChunkListener|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.IMAPChunkListener.html]’.
>  According the the documentation: 
>  
> {quote}Implement this interface and register it via 
> [IMAP.setChunkListener(IMAPChunkListener)|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.html#setChunkListener-org.apache.commons.net.imap.IMAP.IMAPChunkListener-]
>  in order to get access to multi-line partial command responses. Useful when 
> processing large FETCH responses.
> {quote}
>  
> It is apparent the partial fetch response is read in full (80MB) before 
> invoking the ‘IMAPChunkListener’ and then discarding the read lines (freeing 
> up memory).
>  
> Back to the example:
> > A654 FETCH 1:2 (BODY[TEXT])
> < * 1 FETCH (BODY[TEXT] \{80000000}\r\n
> *{color:#ff0000}…. <— read in full into memory then discarded after calling 
> IMAPChunkListener{color}*
> < * 2 FETCH (BODY[TEXT] \{250}\r\n
> {color:#ff0000}*…. <— read in full into memory then discarded after calling 
> IMAPChunkListener*{color}
> < A654 OK FETCH completed
>  
> Above, you can see the chunk listener is good for each individual partial 
> fetch response but does not prevent a large partial from being loaded into 
> memory.
>  
> Let’s review the 
> [code|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298]:
>  
> [ 
> 296|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l296]
>                  int literalCount = IMAPReply.literalCount(line);
> {color:#ff0000}Above counts the size of the literal, in our case 80000000 or 
> 80MB (for the first partial fetch response).{color}
>  
>  
> [ 
> 297|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l297]
>                  final boolean isMultiLine = literalCount >= 0;
> [ 
> 298|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298]
>                  while (literalCount >= 0) {
> [ 
> 299|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l299]
>                      line=_reader.readLine();
> [ 
> 300|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l300]
>                      if (line == null)  
> {                                  throw new EOFException("Connection closed 
> without indication.");   }
> [ 
> 303|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l303]
>                      replyLines.add(line);
> [ 
> 304|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l304]
>                      literalCount -= line.length() + 2; // Allow for CRLF
> [ 
> 305|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l305]
>                  }
> {color:#ff0000}The literal count above starts at 80000000 and is decremented 
> until reaching a negative non zero value where 80MB is read in full and while 
> loop returns.{color}
>  
> [ 
> 306|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l306]
>                  if (isMultiLine) {
> [ 
> 307|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l307]
>                      final IMAPChunkListener il = chunkListener;
> [ 
> 308|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l308]
>                      if (il != null) {
> [ 
> 309|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l309]
>                          final boolean clear = il.chunkReceived(this);
> [ 
> 310|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l310]
>                          if (clear) {
> [ 
> 311|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l311]
>                              fireReplyReceived(IMAPReply.PARTIAL, 
> getReplyString());
> [ 
> 312|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l312]
>                              replyLines.clear();
> {color:#ff0000}Now, after all 80MB is loaded into memory in full, invoke the 
> IMAPChunkListener and throw away the lines freeing memory.{color}
>  
> [ 
> 313|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l313]
>                          }
> [ 
> 314|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l314]
>                      }
> [ 
> 315|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l315]
>                  }
>  
> I’m considering modifying the getReply() method, shown above, to chunk the 
> partial responses breaking up the literal so that it’s not loaded into memory 
> in full. This is to prevent the entire 80MB literal value from being loaded 
> into memory. 
>  
> This would be configurable as not to break the existing users of the API. 
> Something like .setBreakLargeLiteralSize(true), when breakUpLargeLiteralSize 
> is true, a maxLiteralBuffer value is used to chunk the literal preventing all 
> 80MB from being loaded in full, instead loading chunks of it. This would 
> require implementations of IMAPChunkListener to handle this behavior if it 
> was turned on. The default behavior will see this chunking disabled as to not 
> break the existing users. Essentially an opt-in feature reducing the risk.
>  
> *{color:#403294}What are you thoughts or concerns with this? Do you 
> agree?{color}*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to