This is an automated email from the ASF dual-hosted git repository.

btellier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/james-project.git

commit cb1380d41f97b85decf99d397bdcc35c849c6bcb
Author: Benoit Tellier <[email protected]>
AuthorDate: Sun Apr 12 16:53:57 2020 +0700

    JAMES-3148 [ADR] Cassandra metadata cleanup upon deletion
---
 src/adr/0029-Cassandra-mailbox-deletion-cleanup.md | 44 ++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/src/adr/0029-Cassandra-mailbox-deletion-cleanup.md b/src/adr/0029-Cassandra-mailbox-deletion-cleanup.md
new file mode 100644
index 0000000..d97e3cd
--- /dev/null
+++ b/src/adr/0029-Cassandra-mailbox-deletion-cleanup.md
@@ -0,0 +1,44 @@
+# 29. Cassandra mailbox deletion cleanup
+
+Date: 2020-04-12
+
+## Status
+
+Accepted (lazy consensus)
+
+## Context
+
+Cassandra is used within distributed James product to hold messages and mailboxes metadata.
+
+Cassandra holds the following tables:
+ - mailboxPathV2 + mailbox allowing to retrieve mailboxes informations
+ - acl + UserMailboxACL hold denormalized information
+ - messageIdTable & imapUidTable allow to retrieve mailbox context information
+ - messageV2 table holds message metadata
+ - attachmentV2 holds attachments for messages
+ - References to these attachments are contained within the attachmentOwner and attachmentMessageId tables
+
+Currently, the deletion only deletes the first level of metadata. Lower level metadata stay unreachable. The data looks
+deleted but references are actually still present.
+
+Concretely:
+ - Upon mailbox deletion, only mailboxPathV2 & mailbox content is deleted. messageIdTable, imapUidTable, messageV2,
+   attachmentV2 & attachmentMessageId metadata are left undeleted.
+ - Upon mailbox deletion, acl + UserMailboxACL are not deleted.
+ - Upon message deletion, only messageIdTable & imapUidTable content are deleted. messageV2, attachmentV2 &
+   attachmentMessageId metadata are left undeleted.
+
+This jeopardize efforts to regain disk space and privacy, for example through blobStore garbage collection.
+
+## Decision
+
+We need to cleanup Cassandra metadata. They can be retrieved from dandling metadata after the delete operation had been
+conducted out. We need to delete the lower levels first so that upon failures undeleted metadata can still be reached.
+
+This cleanup is not needed for strict correctness from a MailboxManager point of view thus it could be carried out
+asynchronously, via mailbox listeners so that it can be retried.
+
+## Consequences
+
+Mailbox listener failures lead to eventBus retrying their execution, we need to ensure the result of the deletion to be
+idempotent.
\ No newline at end of file
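
The ordering and idempotence requirements stated in the ADR's Decision and Consequences sections can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the class `MailboxDeletionCleanupSketch`, its methods, and the `failOnMessageId` hook are hypothetical and are not James APIs, and plain in-memory collections stand in for the Cassandra tables named in the ADR. The sketch also assumes each message belongs to a single mailbox. The cleanup removes attachments first, then message metadata, then the mailbox-level references, so a failure part-way through leaves the remaining rows reachable for a retry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the tables named in the ADR (not James code).
class MailboxDeletionCleanupSketch {
    // messageIdTable / imapUidTable, collapsed: mailboxId -> messageIds
    final Map<String, Set<String>> messageIdTable = new HashMap<>();
    // messageV2: messageId -> ids of the attachments it references
    final Map<String, Set<String>> messageV2 = new HashMap<>();
    // attachmentV2: ids of stored attachments
    final Set<String> attachmentV2 = new HashSet<>();
    // acl: ids of mailboxes that have ACL rows
    final Set<String> acl = new HashSet<>();
    // Illustration-only hook used to simulate a listener failure.
    String failOnMessageId;

    void addMessage(String mailboxId, String messageId, String... attachmentIds) {
        messageIdTable.computeIfAbsent(mailboxId, k -> new HashSet<>()).add(messageId);
        Set<String> refs = messageV2.computeIfAbsent(messageId, k -> new HashSet<>());
        for (String id : attachmentIds) {
            refs.add(id);
            attachmentV2.add(id);
        }
    }

    // Deletes the lowest level first: attachments, then message metadata,
    // and only then the mailbox-level reference to the message.
    void cleanupMessage(String mailboxId, String messageId) {
        if (messageId.equals(failOnMessageId)) {
            throw new IllegalStateException("simulated failure");
        }
        for (String id : messageV2.getOrDefault(messageId, Collections.emptySet())) {
            attachmentV2.remove(id);
        }
        messageV2.remove(messageId);
        Set<String> ids = messageIdTable.get(mailboxId);
        if (ids != null) {
            ids.remove(messageId);
        }
    }

    // Safe to call repeatedly: removing rows that are already gone is a no-op.
    void cleanupMailbox(String mailboxId) {
        Set<String> ids =
            new HashSet<>(messageIdTable.getOrDefault(mailboxId, Collections.emptySet()));
        for (String messageId : ids) {
            cleanupMessage(mailboxId, messageId);
        }
        messageIdTable.remove(mailboxId);
        acl.remove(mailboxId);
    }

    public static void main(String[] args) {
        MailboxDeletionCleanupSketch store = new MailboxDeletionCleanupSketch();
        store.acl.add("mailbox-1");
        store.addMessage("mailbox-1", "message-1", "attachment-1");
        store.cleanupMailbox("mailbox-1");
        store.cleanupMailbox("mailbox-1"); // a retry after success changes nothing
        System.out.println(store.messageV2.isEmpty() && store.attachmentV2.isEmpty());
    }
}
```

Because the mailbox-level reference to a message is removed last, a retried cleanup can still enumerate whatever an earlier failed attempt left behind. Because every step is a plain removal, repeating the cleanup after it has succeeded has no further effect.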
