Have a good one.
Matt
On 08.03.25 at 16:12, Benoit TELLIER wrote:
Though Cassandra supports multi-DC, cross-availability-zone deployments
well, this doesn't mean every Cassandra-based implementation does.
And James doesn't:
- IMAP's reliance on incremental monotonic counters requires strong
consistency, which doesn't play well with high latencies (2-4 round trips)
- multiple levels of metadata make it prone to inconsistencies if not
operated with quorum consistency - and quorum consistency means
cross-availability-zone reads and writes, which is a latency and
throughput show-stopper.
TL;DR: James distributed server can work on multi-DC, but with
significant shortcomings, and only with a region-wide setup, not a
worldwide setup.
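As a rough illustration of the two points above, here is a minimal
Python sketch. The replication factors and round-trip times are made-up
assumptions, not measurements of any James or Cassandra deployment; only
the "2-4 round trips" figure comes from the message itself. It shows
that a cluster-wide quorum over two equally sized DCs cannot be reached
inside a single DC, and how the per-operation latency floor grows with
the round-trip time.

```python
# Sketch only: replica counts and RTT values below are illustrative
# assumptions, not figures from a real deployment.

def quorum(total_replicas: int) -> int:
    """Replicas that must answer for a majority quorum: floor(n / 2) + 1."""
    return total_replicas // 2 + 1


def operation_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Lower bound on latency when each sequential round trip costs one RTT."""
    return round_trips * rtt_ms


# Hypothetical cluster: two data centers with three replicas each.
replicas_per_dc = {"dc1": 3, "dc2": 3}
total = sum(replicas_per_dc.values())
needed = quorum(total)

# If the quorum is larger than the biggest DC, every quorum read or
# write has to wait for at least one replica in another DC.
must_cross_dc = needed > max(replicas_per_dc.values())
print(f"quorum needs {needed} of {total} replicas, cross-DC: {must_cross_dc}")

# 2-4 sequential round trips per operation, at two assumed RTTs.
for label, rtt_ms in [("same region", 2.0), ("cross region", 150.0)]:
    low = operation_latency_ms(2, rtt_ms)
    high = operation_latency_ms(4, rtt_ms)
    print(f"{label}: {low:.0f}-{high:.0f} ms per operation")
```

With these assumed numbers the same operation goes from a few
milliseconds within a region to several hundred milliseconds across
regions, which is the shortcoming described above.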
--
Best regards,
Benoit TELLIER
General manager of Linagora VIETNAM.
Product owner for Team-Mail product.
Chairman of the Apache James project.
Mail: btell...@linagora.com
Tel: (0033) 6 77 26 04 58 (WhatsApp, Signal)
On Mar 8, 2025 10:48 AM, from Jean Helou <jhe...@apache.org>
Hi Matt,
This has turned into a rather long answer. The first part is more about
james in general, the second is more about your specific setup :)
As far as I'm aware James itself is stateless. I don't think you lose
counter values when you restart your main server.
Thus, you should be able to spin up as many James instances as you want
and point them to the same storage without issues. Even if there are
some asynchronous state updates, the state should eventually converge.
The difficulty is distributed storage not distributed processing.
For instance, if you spin up a MariaDB on one of your new VPSs and
reload a backup from your main MariaDB, the states of both databases
will immediately start to diverge as they are unaware of each other:
new messages delivered to your main server since the backup will not be
visible to the VPS, and messages read on the VPS will still appear
unread on the main server.
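To make the divergence concrete, here is a toy Python model. It is only
a sketch: the `Mailbox` class is hypothetical and has nothing to do with
James's actual schema or code. It restores a "backup" of a mailbox on a
second node and then lets both copies evolve independently.

```python
import copy


class Mailbox:
    """Toy mailbox with a per-mailbox incrementing UID counter (hypothetical)."""

    def __init__(self):
        self.next_uid = 1
        self.messages = {}   # uid -> subject
        self.seen = set()    # uids flagged as read

    def deliver(self, subject):
        uid = self.next_uid
        self.messages[uid] = subject
        self.next_uid += 1
        return uid


main = Mailbox()
main.deliver("welcome")

# "Reload a backup" on the VPS: an independent copy of the current state.
replica = copy.deepcopy(main)

# From here on, the two copies know nothing about each other.
main_uid = main.deliver("invoice")            # arrives on the main server
replica_uid = replica.deliver("newsletter")   # arrives on the VPS
replica.seen.add(1)                           # "welcome" is read on the VPS

# Both sides handed out the same UID for different messages...
conflict = (main_uid == replica_uid
            and main.messages[main_uid] != replica.messages[replica_uid])
print("same UID, different message:", conflict)
# ...and the read flag set on the VPS never reached the main server.
print("'welcome' read on main:", 1 in main.seen)
```

The duplicated UID is the kind of counter clash that makes naive
copying of the database risky for IMAP clients.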
From there you will want to look into replication, but simple
primary/secondary replication will throw errors on writes to the
secondary, making your secondary James instance fill its error logs
with failed writes.
The next step is multi-master replication, which is something I never
tried.
The distributed James app demonstrates a fully distributed system,
including a distributed database (Cassandra), a distributed message
broker (RabbitMQ iirc), a distributed search engine (OpenSearch), etc.
This allows you to have as many James nodes as you want, all talking to
as many messaging/storage nodes as you want, all fully synced and with
write semantics that offer a reasonable consistency. This is a setup
that makes sense for massive deployments, if you wanted to build the
next Google Mail for example.
The use of blob storage (S3-like) to store message contents is an
orthogonal concern. Database storage is fairly expensive compared to
blob storage. And storing large blobs in databases, while doable, is
usually not recommended, at least not without specific table design.
The same is true for message brokers.
The alternatives are storing on the file system, which is not
distributed, or using a blob store.
I'm almost certain you can configure the distributed app (or build a
variant of it) so that it does not use blob storage, but I wouldn't
recommend it.
Now, how all this applies to your setup :)
My understanding is that for now you have a single, rather powerful
machine hosting both James and MariaDB. The James instance handles both
SMTP and IMAP or POP.
I'll also assume that you don't intend to start operating a multi-DC
Cassandra cluster :)
Finally I'll assume the VPSs are rather small at this price :)
If they are large enough to host a clone of your main MariaDB and its
data, you can use one for MariaDB and another for James. Start from a
backup of the main MariaDB, then use IMAP sync to have eventual
consistency between mailboxes on your main server and the replica.
You can go further and spread the workload of the main server too.
Start a James instance configured for IMAP/POP on a couple of VPS
instances, and keep the db config pointing to the main MariaDB. Change
your clients' config, and eventually you can drop the corresponding
listeners on the main server if you want.
Do the same for SMTP and put the new ones at a higher priority than the
instance running on the main server. After a while you can even stop
the main server's James process entirely :)
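Assuming "priority" here refers to DNS MX records (an interpretation,
not something the message states), note that a lower MX preference
number means the host is tried first. A small Python sketch with
hypothetical host names and preference values:

```python
# Sketch only: host names and preference values are hypothetical.
# In DNS MX records, the LOWEST preference number is tried first.
mx_records = [
    (20, "main.example.org"),   # existing main server, kept as fallback
    (10, "vps1.example.org"),   # new SMTP front ends, preferred
    (10, "vps2.example.org"),
]


def delivery_order(records):
    """Order in which a sending server would try the hosts.

    Hosts with equal preference are normally tried in random order;
    they are sorted by name here only to keep the output deterministic.
    """
    return [host for _, host in sorted(records)]


print(delivery_order(mx_records))
```

With this layout, senders only fall back to the main server when both
VPS instances are unreachable, so it can be retired later by removing
its record.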
The downside of course is increased latency both from client to vps but
also from vps to vps or to the main database server.
I hope that opens avenues for exploration :)
Have fun
On Sat, Mar 8, 2025 at 03:06, cryptearth <cryptea...@cryptearth.de.invalid>
wrote:
Hello there dear James devs and fellow James users,
my hoster OVH currently offers me a great deal on VPSs for less than 12
bucks a year (less than 1 buck per month) in several datacenters around
the world. I'm really tempted to take that deal as I have some ideas to
utilize multiple servers - having them around the world, like in
Australia and Canada, is just a bonus.
One thing I plan to implement is to set up James on each of the servers.
But then the question came up: How to synchronize them?
Currently I use my home server only as a backup without any
synchronization with my main root server. In fact: It's currently not
running due to some issues I have with my home server that I have to fix
first before I get James running again.
Now when scaling up to several servers around the world it would be cool
to take advantage of that by combining them with synchronization. But as
the additional systems are only VPSs, I'd like to set up a master-slave
setup with each slave James on the VPSs syncing up to the master James
on my powerful root server.
First I thought about fetchmail to at least pull in mails from the
slaves to the master - but fetchmail is only part of the deprecated
spring build. As I like to have my mail storage in a database I would
like to keep using the guice-jpa build instead of switching to
guice-distributed, which doesn't use jpa and seems to be meant for use
with AWS S3 buckets.
I could also write some Java code using the Java Mail API that works in
a fetchmail-like way - but I'm unsure how to inject mails from other
servers properly into the main server so that they look as if they were
received by the master server itself.
Could it be done by just synchronizing the MariaDB databases in the
background, or would fiddling with the database while James is running
break things, such as the various counters for mails and mailboxes?
If James 3.x isn't suited for such a use case maybe that's something to
be considered for 4.0? Or is that too late into the current development
now and would delay a 4.0 release?
I would like to explore this idea further to see if and how James can be
used in a distributed cluster like other mailers can. Building a James
mail server cluster just sounds cool - and seeing that "well, big
companies like Google have several hundreds to thousands of mail servers
deployed around the globe all working together", it surely has to be
possible with James as well - as, broken down, it's just some listeners
on some server sockets with some database backend synchronized by a
message bus. This should be extendable across multiple servers.
Have a nice weekend everyone.
Greetings from Germany,
Matt
---------------------------------------------------------------------
To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
For additional commands, e-mail: server-user-h...@james.apache.org