Benoit Tellier created JAMES-2921:
-------------------------------------

             Summary: Improve blob store usage
                 Key: JAMES-2921
                 URL: https://issues.apache.org/jira/browse/JAMES-2921
             Project: James Server
          Issue Type: Wish
          Components: Blob
            Reporter: Benoit Tellier
         Attachments: adr-0003-blobstore-storage-policies.md, 
adr-0004-objectstorage-blobid-list.md

Please find attached the two technical decisions being proposed.

These need performance feedback before being adopted.

h2. Proposal: Add storage policies for BlobStore

Introduce StoragePolicies at the level of the BlobStore API.

The proposed policies include:

 - SizeBasedStoragePolicy: The underlying storage medium of the blob will be 
chosen depending on its size.
 - LowCostStoragePolicy: The blob is expected to be saved in low-cost storage. 
Access is expected to be infrequent.
 - PerformantStoragePolicy: The blob is expected to be saved in performant 
storage. Access is expected to be frequent.

The UnionBlobStore will be reworked to choose between Cassandra and 
ObjectStorage implementations depending on the policies.

DeletedMessageVault, BlobExport & MailRepository will rely on 
LowCostStoragePolicy. Other BlobStore users will rely on SizeBasedStoragePolicy.
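
As an illustration only, the policy-based choice could be sketched as below. 
All type and constant names, the threshold value, and the mapping of each 
policy to a backend are assumptions made for this example. They are not taken 
from the attached ADR or from the James code base.

```java
// Hypothetical names throughout: none of these types exist in James as written here.
enum Backend { CASSANDRA, OBJECT_STORAGE }

interface StoragePolicySketch {
    // Picks the backend for a blob of the given size in bytes.
    Backend choose(long blobSizeInBytes);
}

final class StoragePoliciesSketch {
    // Assumed threshold: the proposal does not specify a value.
    static final long SIZE_THRESHOLD_BYTES = 32 * 1024;

    // Assumption: low-cost storage maps to the object storage backend.
    static final StoragePolicySketch LOW_COST = size -> Backend.OBJECT_STORAGE;

    // Assumption: performant storage maps to Cassandra.
    static final StoragePolicySketch PERFORMANT = size -> Backend.CASSANDRA;

    // Small blobs go to Cassandra, larger ones to object storage.
    static final StoragePolicySketch SIZE_BASED = size ->
        size <= SIZE_THRESHOLD_BYTES ? Backend.CASSANDRA : Backend.OBJECT_STORAGE;

    private StoragePoliciesSketch() {
    }
}
```

With this shape, a reworked UnionBlobStore would only need to ask the policy 
passed by the caller which backend to delegate to.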

h2. Proposal: Persist BlobIds to avoid persisting the same blobs several times 
within ObjectStorage

Rely on a StoredBlobIdsList API to know whether a blob is already persisted in 
object storage. Provide a Cassandra implementation of it. Located in blob-api 
for convenience, this is not a top-level API. It is intended to be used by some 
BlobStore implementations (here only ObjectStorage).

 - When saving a blob with a precomputed blobId, we can check the existence of 
the blob in storage, possibly avoiding the expensive "save".
 - When saving a blob too big to precompute its blobId, once the blob has been 
streamed using a temporary random blobId, the copy operation can be avoided and 
the temporary blob can be removed directly.
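
The two cases above can be sketched with an in-memory stand-in. This is a 
minimal sketch under stated assumptions: the interface and its method names 
(isStored, add), the SHA-256 based blobId, and the map playing the role of 
object storage are all hypothetical and only illustrate the idea.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for the proposed StoredBlobIdsList; method names are assumptions.
interface StoredBlobIdsListSketch {
    boolean isStored(String blobId);
    void add(String blobId);
}

final class InMemoryStoredBlobIds implements StoredBlobIdsListSketch {
    private final Set<String> ids = new HashSet<>();
    public boolean isStored(String blobId) { return ids.contains(blobId); }
    public void add(String blobId) { ids.add(blobId); }
}

final class DeduplicatingObjectStoreSketch {
    private final StoredBlobIdsListSketch storedIds;
    // Plays the role of the object storage backend in this example.
    final Map<String, byte[]> objectStorage = new HashMap<>();
    int expensiveSaves = 0;
    int copies = 0;

    DeduplicatingObjectStoreSketch(StoredBlobIdsListSketch storedIds) {
        this.storedIds = storedIds;
    }

    // Case 1: the blobId is precomputed, so the save is skipped when the id is known.
    String save(byte[] data) {
        String blobId = sha256Hex(data);
        if (!storedIds.isStored(blobId)) {
            objectStorage.put(blobId, data.clone());
            expensiveSaves++;
            storedIds.add(blobId);
        }
        return blobId;
    }

    // Case 2: the blob is first streamed under a temporary random id.
    String saveStreamed(byte[] data) {
        String temporaryId = UUID.randomUUID().toString();
        objectStorage.put(temporaryId, data.clone());
        expensiveSaves++;
        // In a real stream the hash would be computed while the bytes flow through.
        String blobId = sha256Hex(data);
        if (!storedIds.isStored(blobId)) {
            objectStorage.put(blobId, objectStorage.get(temporaryId));
            copies++;
            storedIds.add(blobId);
        }
        // The temporary blob is removed whether or not a copy was needed.
        objectStorage.remove(temporaryId);
        return blobId;
    }

    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }
}
```

Saving the same content twice through either path leaves a single entry in 
the stand-in storage. The second call only pays for the lookup in the id list 
(plus the temporary upload in the streamed case).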

Cassandra is faster doing "write every time" than "read before write", so 
we should not use the stored blob projection for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
