Hello all,

## Context

We are working on JMAP, and the Email::hasAttachment metadata is listed
as a fast property.

However, to retrieve it today we need to read the full message in order
to load its attachments (the JMAP hasAttachment property does not take
inlined attachments into account, whereas the mailbox property does).

Also, while inspecting the code, I noticed that callers of
MessageResult::getLoadedAttachments never use the attachment bytes. This
means that, given an email with a 10 MB attachment, a GetMessages call
with the full profile reads the full eml (10 MB) and then loads the
attachment bytes (another 10 MB), although the attachment did not need
to be loaded in the first place. In our little example we read 20 MB
where only 10 MB would have been necessary.

This attachment over-reading causes both performance and cost issues on
the object storage, which is the topic René, Duc and I are currently
working on.

## Involved POJOs

Attachment (mailbox-api)
 - id
 - type
 - bytes

MessageAttachment (mailbox-api)
 - attachment (of type Attachment)
 - name
 - cid
 - isInline

Attachment (jmap)
 - blobId (derived from attachmentId)
 - type
 - name
 - size
 - cid
 - isInline

DAOAttachment (mailbox-cassandra)
 - id
 - blobId
 - type
 - size

 - Message (mailbox-store) & MessageResult (mailbox-api) allow listing
attachments. Usages of the attachment content include:
    - Scanning search

## Proposal

Introduce a new POJO: MessageAttachmentMetadata (mailbox-api)
 - id
 - name
 - cid
 - isInline
 - size
 - type

 - Message (mailbox-store) & MessageResult (mailbox-api) SHOULD return
MessageAttachmentMetadata, NOT MessageAttachment. These metadata will
thus be added to FetchGroup.Profile.MINIMAL.
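
To make the proposal concrete, here is a minimal sketch of what the new POJO could look like. It only mirrors the field list above; the field types (a plain String id, Optional name and cid, a size in bytes) and the getter names are assumptions made for illustration, not the final API.

```java
import java.util.Objects;
import java.util.Optional;

/**
 * Hypothetical sketch of the proposed POJO: it describes an attachment
 * without ever holding the attachment bytes.
 */
final class MessageAttachmentMetadata {
    private final String attachmentId;   // assumption: the id is modelled as a plain String
    private final Optional<String> name;
    private final Optional<String> cid;
    private final boolean isInline;
    private final long size;             // assumption: size is expressed in bytes
    private final String type;           // content type, e.g. "application/pdf"

    MessageAttachmentMetadata(String attachmentId, Optional<String> name, Optional<String> cid,
                              boolean isInline, long size, String type) {
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        this.attachmentId = Objects.requireNonNull(attachmentId);
        this.name = Objects.requireNonNull(name);
        this.cid = Objects.requireNonNull(cid);
        this.isInline = isInline;
        this.size = size;
        this.type = Objects.requireNonNull(type);
    }

    String getAttachmentId() { return attachmentId; }
    Optional<String> getName() { return name; }
    Optional<String> getCid() { return cid; }
    boolean isInline() { return isInline; }
    long getSize() { return size; }
    String getType() { return type; }
}
```

Since the object has no bytes field, returning it from Message and MessageResult cannot trigger a blob read by construction.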

We need to port "scanning search" to parse messages on the fly. This
is OK as:
 - memory-guice is not intended for production usage, so it does not
need to be performant
 - Scanning search is not exposed as a product
 - jpa and maildir do not store attachments, so recomputation is already
the current behaviour.

## Consequences

JMAP Email::hasAttachment property would rely on
FetchGroup.Profile.MINIMAL, allowing the implementation of
https://github.com/apache/james-project/blob/master/src/adr/0013-precompute-jmap-preview.md
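
As an illustration of why the metadata alone is enough here, the sketch below derives hasAttachment from the inline flag only, following the rule stated in the Context section (inlined attachments are not taken into account). The AttachmentMetadata interface and the HasAttachmentSketch class are hypothetical stand-ins, not existing James types.

```java
import java.util.List;

/** Hypothetical, trimmed-down view of the proposed metadata: only the inline flag matters here. */
interface AttachmentMetadata {
    boolean isInline();
}

final class HasAttachmentSketch {
    private HasAttachmentSketch() {
    }

    /** True as soon as one attachment is not inlined; no attachment bytes are read. */
    static boolean hasAttachment(List<? extends AttachmentMetadata> attachments) {
        return attachments.stream().anyMatch(attachment -> !attachment.isInline());
    }
}
```

A message with only inlined attachments, or with no attachment at all, would yield false under this rule.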


JMAP GetMessages with the full profile will read half as much data in
the example above, improving both cost and performance.

Note that all callers reading full messages will also benefit from these
changes (IMAP fetch, mailbox backup, preview recomputation).

## Alternative

We could merge MessageAttachment & Attachment together. However, this
would lead to significant data structure re-arranging for no behavioural
gain, just a slightly leaner API.

Thus I propose not to tackle this now.

## What is coming next

René Cordier started working on this topic to get a proof of concept out.

I volunteer to write an ADR if need be (reusing the content here ;-) )

Best regards,

Benoit

