Re: IMAP FETCH management

2024-03-20 Thread Benoit TELLIER

Hello all,

Today I put together a POC where the following IMAP command

    a0 FETCH 1:* (BODY[])

would directly stream content from the S3 storage without storing the 
full input in a byte array.
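
For readers less familiar with the wire format: a BODY[] value is sent as an IMAP literal, i.e. the octet count is announced up front as {size} followed by CRLF, then the raw bytes follow. That is what makes streaming possible as long as the size is known before the first byte is written. Below is a minimal sketch of that idea using only the JDK. It is not the code from the PR; FetchBodyStreamer, writeFetchBody and the 8 KB chunk size are hypothetical names and values chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not James code: writes one untagged FETCH response
// whose BODY[] value is an IMAP literal streamed from `content`.
final class FetchBodyStreamer {
    private static final int CHUNK_SIZE = 8192; // assumed chunk size

    static void writeFetchBody(OutputStream out, int msn, long size, InputStream content)
            throws IOException {
        // An IMAP literal announces its octet count up front: {size}CRLF
        String header = "* " + msn + " FETCH (BODY[] {" + size + "}\r\n";
        out.write(header.getBytes(StandardCharsets.US_ASCII));

        // Copy at most CHUNK_SIZE bytes at a time, so memory use does not
        // depend on the message size.
        byte[] buffer = new byte[CHUNK_SIZE];
        long remaining = size;
        while (remaining > 0) {
            int read = content.read(buffer, 0, (int) Math.min(buffer.length, remaining));
            if (read < 0) {
                throw new IOException("Content ended " + remaining
                    + " bytes before the announced size");
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        out.write(")\r\n".getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        byte[] message = "Subject: hi\r\n\r\nHello\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFetchBody(wire, 1, message.length, new ByteArrayInputStream(message));
        System.out.print(new String(wire.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In this sketch the in-memory streams stand in for the object store and the socket; a short read from the store is treated as an error because the announced literal size can no longer be honoured.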


I tested it a bit manually on top of the S3 AES implementation.

Link: https://github.com/apache/james-project/pull/2137

While working on this I stumbled across ReactorUtils::toInputStream, 
which does not implement available() (it returns 0) and always blocks 
when trying to access the next chunk of data.
This would defeat most of the benefits of Netty's ChunkedStream 
abstraction: a reliable available() method allows polling on it in the 
event loop and sending data as soon as it is ready.
Feeling brave, I decided to experiment with a subscriber bridging the 
gap between the NIO world and the Reactor world.
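
To make the idea concrete, here is a rough sketch of such a bridge written against the JDK's java.util.concurrent.Flow interfaces rather than Reactor, so that it stays self-contained. It is not the code from the PR, and the class name SubscriberInputStream is hypothetical. The point it illustrates is that available() reports the bytes already buffered, so a caller can poll it without blocking, while demand is signalled one chunk at a time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Flow;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: an InputStream fed by a reactive-streams subscriber.
// Only the single-byte read() is customised; bulk reads use the JDK default.
final class SubscriberInputStream extends InputStream implements Flow.Subscriber<byte[]> {
    private static final byte[] EOF = new byte[0]; // sentinel, compared by identity

    private final BlockingQueue<byte[]> chunks = new LinkedBlockingQueue<>();
    private final AtomicInteger buffered = new AtomicInteger();
    private volatile Flow.Subscription subscription;
    private volatile Throwable failure;
    private byte[] current = new byte[0];
    private int position;
    private boolean finished;

    @Override public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1); // ask for a single chunk to start with
    }

    @Override public void onNext(byte[] chunk) {
        chunks.add(chunk);
        buffered.addAndGet(chunk.length);
    }

    @Override public void onError(Throwable t) {
        failure = t;
        chunks.add(EOF);
    }

    @Override public void onComplete() {
        chunks.add(EOF);
    }

    // Bytes already received from upstream, readable without blocking.
    @Override public int available() {
        return Math.max(0, buffered.get());
    }

    @Override public int read() throws IOException {
        while (position == current.length) {
            if (finished) {
                return -1;
            }
            byte[] next;
            try {
                next = chunks.take(); // blocks only when nothing is buffered
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException(e);
            }
            if (next == EOF) {
                finished = true;
                if (failure != null) {
                    throw new IOException(failure);
                }
                return -1;
            }
            current = next;
            position = 0;
            subscription.request(1); // a chunk was consumed: ask for one more
        }
        buffered.decrementAndGet();
        return current[position++] & 0xFF;
    }

    @Override public void close() {
        Flow.Subscription s = subscription;
        if (s != null) {
            s.cancel(); // release the upstream source (e.g. a connection)
        }
    }

    public static void main(String[] args) throws IOException {
        SubmissionPublisher<byte[]> publisher = new SubmissionPublisher<>();
        SubscriberInputStream in = new SubscriberInputStream();
        publisher.subscribe(in);
        publisher.submit("hello ".getBytes(StandardCharsets.US_ASCII));
        publisher.submit("world".getBytes(StandardCharsets.US_ASCII));
        publisher.close(); // completes the stream once both chunks are delivered
        int b;
        while ((b = in.read()) != -1) {
            System.out.print((char) b);
        }
    }
}
```

A caller running on an event loop would check available() and only read when it is positive, leaving the blocking path in read() for callers that can afford to wait.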

This work is incomplete, as usage in real-life situations causes crashes.

Link: https://github.com/apache/james-project/pull/2138

Another consideration when doing this is the need to increase the 
number of S3 connections, as they are going to stay open longer...


Those are advanced topics, and I believe they would be crucial to 
making Apache James a better IMAP server...


Best regards,

Benoit TELLIER

On 19/03/2024 16:45, Benoit TELLIER wrote:

Hello all,

As I already wrote here, I encountered significant issues during a 
recent deployment [1].


[1] 
https://www.mail-archive.com/server-dev@james.apache.org/msg73848.html


This led to [2], implementing backpressure for IMAP FETCH, which 
mitigated the issue.


[2] https://issues.apache.org/jira/projects/JAMES/issues/JAMES-3997

But not well enough. As the number of users/mails increased, I ended 
up with some new OutOfMemory exceptions related to IMAP usage since 
this weekend.


I thus took the time to write a test regarding backpressure [3] 
(not reading the socket and instrumenting the mailbox layer to see 
what is actually pulled) and started playing with some related Netty 
settings [4].


[3] https://github.com/apache/james-project/pull/2128

[4] https://github.com/apache/james-project/pull/2129

However, the high/low write buffer watermarks seem ineffective: it 
takes dozens of multi-MB messages being written for the 
backpressure to kick in. And the default values (32KB/64KB) are very 
low compared to a problematic message size. Netty expertise is more 
than welcome here!
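
As background for the discussion, the toy model below shows how high/low watermark writability behaves in principle. It is a simplified model written for this thread, not Netty's implementation, and WatermarkModel is a hypothetical name. In Netty itself the thresholds are set through ChannelOption.WRITE_BUFFER_WATER_MARK with a WriteBufferWaterMark(low, high) value. The property worth noticing is that a write is always accepted and the writable flag only flips afterwards, so the watermark cannot bound memory below the size of whatever is written in one go.

```java
// Simplified model of high/low watermark writability. NOT Netty code.
final class WatermarkModel {
    private final long low;
    private final long high;
    private long pendingBytes;
    private boolean writable = true;

    WatermarkModel(long low, long high) {
        this.low = low;
        this.high = high;
    }

    // A write is never rejected: the flag is purely advisory and is only
    // updated once the bytes are already queued.
    void write(long bytes) {
        pendingBytes += bytes;
        if (pendingBytes > high) {
            writable = false;
        }
    }

    // Called when bytes have actually left for the socket.
    void flushed(long bytes) {
        pendingBytes -= bytes;
        if (pendingBytes < low) {
            writable = true;
        }
    }

    boolean isWritable() {
        return writable;
    }

    long pendingBytes() {
        return pendingBytes;
    }

    public static void main(String[] args) {
        long messageSize = 5L * 1024 * 1024; // assumed size of one large message
        WatermarkModel channel = new WatermarkModel(32 * 1024, 64 * 1024);

        // A well-behaved producer checks isWritable() before each message,
        // yet a whole message is still buffered before the flag turns false.
        int written = 0;
        while (channel.isWritable() && written < 100) {
            channel.write(messageSize);
            written++;
        }
        System.out.println("messages buffered: " + written
            + ", pending bytes: " + channel.pendingBytes());
    }
}
```

In this model a producer that honours isWritable() stops after one message, while a producer that never consults the flag keeps queueing regardless of the thresholds. Whether something comparable explains the behaviour observed in James is an open question, not a conclusion of this sketch.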


Another problem is that, as of today, message content is loaded as a 
byte array by the mailbox layer. For a request like IMAP FETCH 
(BODY[]) this is inefficient, and we could instead stream it 
straight from the object store (even applying backpressure within 
a single message write). Yet this would require a major refactoring of 
the mailbox / IMAP code, and also bulletproof lifecycle management for 
connections / temporary files.
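
On the lifecycle point, one common pattern is to tie the cleanup of a resource to the close() of the stream that reads it, so that whoever finishes (or aborts) the write releases it. The sketch below shows this for a temporary file using only the JDK. SelfDeletingFileInputStream is a hypothetical name, and nothing here reflects how James currently handles temporary files.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: a stream that owns its backing temporary file and
// deletes it when closed, whether the read completed or failed.
final class SelfDeletingFileInputStream extends FilterInputStream {
    private final Path file;

    SelfDeletingFileInputStream(Path file) throws IOException {
        super(Files.newInputStream(file));
        this.file = file;
    }

    @Override public void close() throws IOException {
        try {
            super.close();
        } finally {
            Files.deleteIfExists(file); // runs even if closing the stream throws
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("message", ".eml");
        Files.write(tmp, "Subject: hi\r\n\r\nHello\r\n".getBytes(StandardCharsets.US_ASCII));

        ByteArrayOutputStream socket = new ByteArrayOutputStream(); // stands in for the channel
        try (InputStream in = new SelfDeletingFileInputStream(tmp)) {
            in.transferTo(socket); // copies through a small internal buffer
        }
        System.out.println("bytes sent: " + socket.size()
            + ", temp file still exists: " + Files.exists(tmp));
    }
}
```

The same shape would apply to an object-store connection: the stream handed to the IMAP layer would return the connection to its pool in close(), and try-with-resources (or the equivalent release hook on the Netty side) would guarantee that close() is called.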


Thoughts?

Benoit



-
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org


