This is an automated email from the ASF dual-hosted git repository.

rcordier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/james-project.git

commit 8efc932aa6a942f292a0f118b3d9ce37005034c3
Author: Tran Tien Duc <[email protected]>
AuthorDate: Mon Feb 17 15:48:42 2020 +0700

    JAMES-3052 Solving Cassandra inconsistencies Administration Procedures
---
 .../server/manage-guice-distributed-james.md       | 70 +++++++++++++++++++++-
 src/site/markdown/server/manage-webadmin.md        |  9 ++-
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/src/site/markdown/server/manage-guice-distributed-james.md b/src/site/markdown/server/manage-guice-distributed-james.md
index 6be9cb3..126a02c 100644
--- a/src/site/markdown/server/manage-guice-distributed-james.md
+++ b/src/site/markdown/server/manage-guice-distributed-james.md
@@ -20,7 +20,8 @@ advanced users.
  - [Mailbox Event Bus](#mailbox-event-bus)
  - [Mail Processing](#mail-processing)
  - [ElasticSearch Indexing](#elasticsearch-indexing)
-
+ - [Solving Cassandra inconsistencies](#solving-cassandra-inconsistencies)
+ 
 ## Overall architecture
 
 Guice distributed James server intends to provide a horizontally scalable email server.
@@ -260,3 +261,70 @@ by setting the parameter `elasticsearch.index.mailbox.name` to the name of your
 re-creates index upon restart
 
 _Note_: keep in mind that reindexing can be a very long operation depending on the volume of mails you have stored.
+
+## Solving Cassandra inconsistencies
+
+The Cassandra backend uses data duplication to work around Cassandra query limitations.
+However, Cassandra does not provide transactions when writing to several tables,
+which can lead to consistency issues for a given piece of data.
+As a consequence, data can be left in a transient state that should never appear outside of the system.
+
+Because of the lack of transactions, it is hard to prevent these kinds of issues. We have developed some features to
+fix existing Cassandra inconsistency issues that have been reported to James.
+
+Here is the list of known inconsistencies:
+ - [RRT (RecipientRewriteTable) mapping sources](#rrt-recipientrewritetable-mapping-sources)
+ - [JMAP message fast view projections](#jmap-message-fast-view-projections)
+ - [Mailboxes](#mailboxes)
+
+### RRT (RecipientRewriteTable) mapping sources
+
+The `rrt` and `mappings_sources` tables store information about address mappings.
+The source of truth is `rrt`; `mappings_sources` is the projection table containing all
+mapping sources.
+
+#### How to detect the inconsistencies
+
+Right now there is no tool for detecting these inconsistencies; we are proposing a [development plan](https://issues.apache.org/jira/browse/JAMES-3069).
+In the meantime, the recommendation is to execute the `SolveInconsistencies` task below
+on a regular basis.
+
+#### How to solve
+
+Execute the Cassandra mapping `SolveInconsistencies` task described in the [webadmin documentation](https://james.apache.org/server/manage-webadmin.html#Operations_on_mappings_sources).
+
+### JMAP message fast view projections
+
+When you read a JMAP message, some calculated properties, like `preview` and `hasAttachment`, are expected to be fast to retrieve.
+James achieves this by pre-calculating them and storing them in a message projection table (`message_fast_view_projection`).
+Subsequent fetches are then optimized by reading directly from the projection table instead of calculating the properties again.
+The underlying data is immutable, so there is no inconsistency risk if the projection is outdated.
+You can still face a performance issue, however; how severe it is depends on how far the projection lags behind.
+
+#### How to detect the outdated projections
+
+You can watch the `MessageFastViewProjection` health check described in the [webadmin documentation](manage-webadmin.html#Check_all_components).
+It provides a check based on the ratio of missed projection reads.
+
+#### How to solve
+ 
+Since the MessageFastViewProjection is self-healing, you should be concerned only if
+the health check keeps returning `degraded` for a while. In that case, you
+can look at the James logs for more clues.
+
+### Mailboxes
+
+The `mailboxPath` and `mailbox` tables share common fields like `mailboxId` and mailbox `name`.
+A successful mailbox creation, rename or deletion has to update both the `mailboxPath` and `mailbox` tables.
+Any failure when creating, updating or deleting records in `mailboxPath` or `mailbox` can produce inconsistencies.
+
+#### How to detect the inconsistencies
+
+A suspicious `MailboxNotFoundException` in your logs can be a sign of such inconsistencies.
+Currently, there is no dedicated tool for detecting them. We recommend scheduling
+the `SolveInconsistencies` task below for the mailbox object on a regular basis,
+avoiding peak traffic, in order to both diagnose and fix inconsistencies.
+
+#### How to solve
+
+Under development: Task for solving mailbox inconsistencies
\ No newline at end of file
diff --git a/src/site/markdown/server/manage-webadmin.md b/src/site/markdown/server/manage-webadmin.md
index 76dccc6..40d5d3f 100644
--- a/src/site/markdown/server/manage-webadmin.md
+++ b/src/site/markdown/server/manage-webadmin.md
@@ -99,7 +99,14 @@ Supported health checks include:
  - **ElasticSearch Backend**: ElasticSearch storage. Included in Cassandra Guice based products.
  - **Guice application lifecycle**: included in all Guice products.
  - **JPA Backend**: JPA storage. Included in JPA Guice based products.
- - **MessageFastViewProjection**: included in memory and Cassandra based Guice products.
+ - **MessageFastViewProjection**: included in memory and Cassandra based Guice products.
+ Health check of the component storing JMAP properties which are fast to retrieve.
+ Those properties are computed in advance from messages and persisted in order to achieve better performance.
+ There is some latency between a source update and the update of its projections.
+ Incoherency problems arise when reads are performed in this time window.
+ We piggyback the projection update on missed JMAP reads in order to decrease the outdated time window for a given entry.
+ The health is determined by the ratio of missed projection reads (a miss ratio above 10% causes `degraded`).
+
  - **RabbitMQ backend**: RabbitMQ messaging. Included in Distributed Guice based products.
 
 Response codes:
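
Note on the `MessageFastViewProjection` health check added above: the documentation describes a health status derived from the ratio of missed projection reads. The following is a minimal sketch of that idea, not the James implementation. The function name, the counters, and the reading of the threshold (a miss percentage strictly above 10% reports `degraded`) are assumptions made for illustration.

```python
def projection_health(hits: int, misses: int, max_miss_percent: float = 10.0) -> str:
    """Classify projection health from read counters.

    Hypothetical helper: `hits` and `misses` stand for projection reads
    that were served from, or absent from, the projection table.
    The 10% default threshold and its direction are assumptions.
    """
    total = hits + misses
    # No reads at all, or no missed reads: nothing to report.
    if total == 0 or misses == 0:
        return "healthy"
    miss_percent = 100.0 * misses / total
    # Too many misses means the projection lags behind its source.
    return "degraded" if miss_percent > max_miss_percent else "healthy"


if __name__ == "__main__":
    print(projection_health(hits=95, misses=5))
    print(projection_health(hits=80, misses=20))
```

Because missed reads trigger a projection update, a `degraded` status computed this way would be expected to recover on its own as entries get repopulated.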

