Re: Implement Blob store multi-tenancy
Hi folks, I definitely share the concerns Jean raised in his email. We already have so many concepts that we are losing most of the potential users and contributors before they even manage to do something useful with James. Multi-tenant is a very desirable feature, and I understand why a company making money hosting James would want that feature, but retrofitting such a concept into such a large codebase requires some planning. Would you agree to propose an ADR about that? We could then use the work done for the ADR to update the documentation. Some more comments below. On Wednesday, November 6th, 2024 at 06:44, Jean Helou wrote: > Hello, > > I'm taking my evening to try and answer about this asking my questions. > Especially since I saw the PR is already being pushed forward and I am not > convinced by the design decisions that were made. > > > Today James does not support multi-tenancy for blob store. Therefore blob > > isolation between domains could be an issue for example in a SaaS > > deployment that requires strict data isolation for users. > > > Would you say that James supports multi-tenancy in other parts, why focus > on blob store ? > James can handle multiple domains as long as you don't need to do TLS/SSL > or DKIM properly. Is a domain the same thing as a tenant ? If yes why > introduce a new concept, if no what would you say is the > difference/relationship between a domain and a tenant ? Personally, I would expect a Tenant to be the uppermost segregation key. That way, a Tenant can have multiple Domains, which is a frequent use case. > how do you expect the new concept to reflect in the various apis ? I guess this is the most important question. And we have so many APIs ... > Pulsar has a multi tenant > api that scopes all operations for instance. it is clearly a first class > concept. Should tenants also be a first class concept in james ? My take is: it should. 
> You propose to introduce another concept: `Bucket` which is composed of a > `BucketName` and an optional `Tenant`. > How does the Tenant relate to the `BucketName` ? shouldn't the tenant or > domain be a first class parameter of the storage apis instead ? > > The jira mentions > > > That way blobstore could implement different isolation strategies for > > tenants (configurable): > > > - buckets as today - good for few tenants after all.\ > > > Which suggests that one can already use the buckets to isolate > tenants/domains, this in turn suggests that the BucketName passed in the > BlobStore api today is already usable as a discriminant for tenants/domains > (though I have yet to find any uses of this with something other than the > default bucketname in the code). If BucketName is the current parameter to > isolate tenants shouldn't it be replaced by the new tenant concept instead > of adding another information ? How should a programmer decide when to use > which ? Tenant should never be optional. Each tenant should have its specific configuration, there's no reason to mix bucket concept in the code with tenant. > > Do we really need to propagate the domain/tenant concept to all the storage > apis ? Will it be necessary for the MailRepositories too for instance ? I don't see how you would have a proper tenant segregation if it's not propagated in each and every layer. > Overall, Isn't this change to introduce multi-tenancy large enough to > warrant an ADR if only to answer these questions and document the concepts > (possibly retrodocument them) 100% agree. Cheers, -- Matthieu Baechler - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
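To make the position argued in this reply concrete, here is a minimal Java sketch. It assumes a hypothetical `Tenant` type and a hypothetical in-memory `TenantScopedBlobStore`; neither exists in James under these names, and a real design would be settled in the ADR. It only illustrates two points from the thread: the tenant is a mandatory argument of every storage call, and blobs saved for one tenant are not visible to another.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch: these types are illustrations, not James APIs.
final class Tenant {
    private final String name;

    Tenant(String name) {
        // A tenant is never optional: reject null or empty names up front.
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("tenant name is required");
        }
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tenant && ((Tenant) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class TenantScopedBlobStore {
    // One isolated map of blobs per tenant.
    private final Map<Tenant, Map<String, byte[]>> blobsByTenant = new HashMap<>();

    void save(Tenant tenant, String blobId, byte[] data) {
        Objects.requireNonNull(tenant, "tenant");
        blobsByTenant.computeIfAbsent(tenant, t -> new HashMap<>()).put(blobId, data.clone());
    }

    Optional<byte[]> read(Tenant tenant, String blobId) {
        Objects.requireNonNull(tenant, "tenant");
        Map<String, byte[]> blobs = blobsByTenant.get(tenant);
        if (blobs == null) {
            return Optional.empty();
        }
        // Return a copy so callers cannot mutate stored data.
        return Optional.ofNullable(blobs.get(blobId)).map(b -> b.clone());
    }
}
```

A blob saved under one tenant and read under another yields an empty result, which is the isolation property the thread asks every layer to preserve.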
[jira] [Closed] (JAMES-3700) Dead letter policy for the Pulsar MailQueue
[ https://issues.apache.org/jira/browse/JAMES-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthieu Baechler closed JAMES-3700. > Dead letter policy for the Pulsar MailQueue > --- > > Key: JAMES-3700 > URL: https://issues.apache.org/jira/browse/JAMES-3700 > Project: James Server > Issue Type: Sub-task > Components: pulsar, Queue >Affects Versions: master >Reporter: Benoit Tellier >Priority: Major > Fix For: 3.9.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Currently the Pulsar MailQueue does not come with a dead-letter policy. > A bad JSON payload halts the processing. > This makes the Pulsar MailQueue brittle: > - The ability to inject a single message with a bad payload can cause an > entire James cluster to come to a halt. > - Could be seen as an attack vector > - But also any change to the underlying JSON schema for payloads is > liable to cause major downtime. > We should define a dead-letter policy: > - After a given number of failures, delivery of the message would be abandoned > - And moved to a dead-letter topic for later audit (prevent data loss) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (JAMES-3700) Dead letter policy for the Pulsar MailQueue
[ https://issues.apache.org/jira/browse/JAMES-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthieu Baechler resolved JAMES-3700. -- Fix Version/s: 3.9.0 Resolution: Fixed Solved by https://github.com/apache/james-project/pull/2355 > Dead letter policy for the Pulsar MailQueue > --- > > Key: JAMES-3700 > URL: https://issues.apache.org/jira/browse/JAMES-3700 > Project: James Server > Issue Type: Sub-task > Components: pulsar, Queue >Affects Versions: master >Reporter: Benoit Tellier >Priority: Major > Fix For: 3.9.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Currently the Pulsar MailQueue does not come with a dead-letter policy. > A bad JSON payload halts the processing. > This makes the Pulsar MailQueue brittle: > - The ability to inject a single message with a bad payload can cause an > entire James cluster to come to a halt. > - Could be seen as an attack vector > - But also any change to the underlying JSON schema for payloads is > liable to cause major downtime. > We should define a dead-letter policy: > - After a given number of failures, delivery of the message would be abandoned > - And moved to a dead-letter topic for later audit (prevent data loss)
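The policy described in this issue (abandon delivery after a number of failures, then park the message for audit) can be sketched independently of Pulsar. The Java below is a hypothetical in-memory model, not the Pulsar client API and not the change merged for this ticket; the class name `DeadLetteringQueue` and its methods are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory model of a dead-letter policy; not Pulsar or James code.
final class DeadLetteringQueue {
    private static final class Attempt {
        final String payload;
        final int failures;

        Attempt(String payload, int failures) {
            this.payload = payload;
            this.failures = failures;
        }
    }

    private final int maxFailures;
    private final Deque<Attempt> pending = new ArrayDeque<>();
    private final List<String> deadLetters = new ArrayList<>();

    DeadLetteringQueue(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be at least 1");
        }
        this.maxFailures = maxFailures;
    }

    void enqueue(String payload) {
        pending.addLast(new Attempt(payload, 0));
    }

    // Processes every pending payload. A payload whose handler throws is
    // retried at the back of the queue; after maxFailures failures it is
    // moved to the dead-letter list instead of blocking the other payloads.
    void drain(Consumer<String> handler) {
        while (!pending.isEmpty()) {
            Attempt attempt = pending.pollFirst();
            try {
                handler.accept(attempt.payload);
            } catch (RuntimeException e) {
                int failures = attempt.failures + 1;
                if (failures >= maxFailures) {
                    deadLetters.add(attempt.payload);
                } else {
                    pending.addLast(new Attempt(attempt.payload, failures));
                }
            }
        }
    }

    // Payloads kept for later audit, so nothing is silently lost.
    List<String> deadLetters() {
        return new ArrayList<>(deadLetters);
    }
}
```

The point of the sketch is that one bad payload costs a bounded number of attempts and never halts the rest of the queue, which is the brittleness the issue describes.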
[jira] [Commented] (JAMES-3740) IMAP UID <-> MSN mapping occupies too much memory
[ https://issues.apache.org/jira/browse/JAMES-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851774#comment-17851774 ] Matthieu Baechler commented on JAMES-3740: -- If it's good for your production I guess it's good for others too. > IMAP UID <-> MSN mapping occupies too much memory > - > > Key: JAMES-3740 > URL: https://issues.apache.org/jira/browse/JAMES-3740 > Project: James Server > Issue Type: Improvement > Components: IMAPServer >Affects Versions: 3.7.0 >Reporter: Benoit Tellier >Priority: Major > Attachments: Screenshot from 2022-03-30 17-39-37.png, Screenshot from > 2022-04-08 17-33-35.png > > Time Spent: 1h 20m > Remaining Estimate: 0h > > h3. What is UID <-> MSN mapping ? > In IMAP RFC-3501 there are two ways to address a message: > - By its UID (Unique ID) that is unique (until UID_VALIDITY changes...) > - By its MSN (Message Sequence Number) which is the (mutable) position of a > message in the mailbox. > We then need: > - Given a UID return its MSN which is for instance compulsory upon EXPUNGED > notifications when QRESYNC is not enabled. > - Given a MSN based request we need to convert it back to a UID (rare). > We do store the list of UIDs, sorted, in RAM and perform binary searches to > resolve those. > h3. What is the impact on heap? > Each uid is wrapped in a MessageUid object. This object wrapping comes with > an overhead of at least 12 bytes in addition to the 8 bytes payload (long). > Quick benchmarks show it's actually worse: 10 million uids did take up to > 275 MB. 
> {code:java}
> @Test
> void measureHeapUsage() throws InterruptedException {
>     int count = 1000;
>     // Fill the converter under test with `count` boxed UIDs.
>     testee.addAll(IntStream.range(0, count)
>         .mapToObj(i -> MessageUid.of(i + 1))
>         .collect(Collectors.toList()));
>     Thread.sleep(1000);
>     System.out.println("GCing");
>     // Request a GC so the reading mostly reflects live objects.
>     System.gc();
>     Thread.sleep(1000);
>
>     System.out.println(ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed());
> }
> {code}
> Now, taking a classical production deployment, I get: > - Some users have up to 2.5 million messages in their INBOX > - I can get an average of 100.000 messages for each user > So for a small scale deployment, we are already "consuming" ~300 MB of memory > just for the UID <-> MSN mapping. > Scaling to 1.000 users on a single James instance we clearly see that HEAP > consumption will start being a problem (~3GB) without even speaking of the target > of 10.000 users per James I do have in mind. > It's worth mentioning that IMAP being stateful, and UID <-> MSN mapping > attached to a selected mailbox, such a mapping is long lived: > - Multiple small objects would need to be copied individually by the GC, > putting pressure during long gen > - Those long lived objects will eventually be promoted to old gen, thus the > more there are the longer the resulting stop-the-world GC pauses will be. > h3. Temporary fix ? > We can get rid of the object boxing in UidMsnConverter by using primitive > type collections, such as those provided by the fastutil project. > The same bench was down to 84MB. > Also, we could get things more compact by using an INT representation of > UIDs. (Those are in most cases below 2 billion; to be above this there > would need to be more than 2 billion emails transiting through one's mailbox which > is highly unlikely). A fallback to "long" storage can be set up if a UID > above 2 billion is observed. > With such a compact int storage we are down to 46MB. 
> So taking the prior mentioned numbers we could expect a 1.000 people > deployment to require ~400 MB and a larger scale 10.000 people deployment on > a single James to consume up to 4GB. Not that enjoyable but definitely more > manageable. > Please note that primitive collections are more GC friendly as their elements > are managed together, as a single object (backing array). > h3. What other mail servers do > I found references to Dovecot, which uses an algorithm similar to > ours: binary search on a list of uids. The noticeable difference is that this > list of UIDs is held on disk and not in memory as we do. > References: > https://doc.dovecot.org/developer_manual/design/indexes/mail_index_api/?highlight=time > Of course, such a solution would be attractive... We could imagine keeping > the last 1.000 uids in memory,
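The compact representation discussed in this issue (a sorted primitive array of UIDs, resolved by binary search) can be sketched as follows. This is a hypothetical illustration, not the James `UidMsnConverter`: the class name `CompactUidMsnMapping` is made up, it uses a plain `int[]` rather than a fastutil collection, and it omits the fallback to `long` storage mentioned in the description.

```java
import java.util.Arrays;

// Hypothetical sketch of an int-backed UID <-> MSN mapping; not James code.
final class CompactUidMsnMapping {
    // UIDs sorted ascending; only the first `size` entries are meaningful.
    private int[] uids = new int[16];
    private int size = 0;

    // Adds a UID while keeping the array sorted; duplicates are ignored.
    // Assumption: UIDs fit in an int (this sketch has no long fallback).
    void addUid(long uid) {
        int value = Math.toIntExact(uid);
        int index = Arrays.binarySearch(uids, 0, size, value);
        if (index >= 0) {
            return;
        }
        int insertAt = -index - 1;
        if (size == uids.length) {
            uids = Arrays.copyOf(uids, size * 2);
        }
        System.arraycopy(uids, insertAt, uids, insertAt + 1, size - insertAt);
        uids[insertAt] = value;
        size++;
    }

    // The MSN is the 1-based position of the UID; -1 when the UID is unknown.
    int msnOf(long uid) {
        if (uid > Integer.MAX_VALUE || uid < Integer.MIN_VALUE) {
            return -1;
        }
        int index = Arrays.binarySearch(uids, 0, size, (int) uid);
        return index >= 0 ? index + 1 : -1;
    }

    // Converts an MSN back to a UID; -1 when the MSN is out of range.
    long uidOf(int msn) {
        if (msn < 1 || msn > size) {
            return -1;
        }
        return uids[msn - 1];
    }

    // Removing a UID shifts the MSN of every later message down by one.
    void removeUid(long uid) {
        int msn = msnOf(uid);
        if (msn < 0) {
            return;
        }
        int index = msn - 1;
        System.arraycopy(uids, index + 1, uids, index, size - index - 1);
        size--;
    }
}
```

Because all UIDs live in one backing array, the garbage collector sees a single object per selected mailbox instead of one boxed object per message, which is the GC argument made in the description.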
[jira] [Closed] (JAMES-3943) Assemble a scaling SMTP server backed by pulsar (and PG)
[ https://issues.apache.org/jira/browse/JAMES-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthieu Baechler closed JAMES-3943. Resolution: Fixed > Assemble a scaling SMTP server backed by pulsar (and PG) > > > Key: JAMES-3943 > URL: https://issues.apache.org/jira/browse/JAMES-3943 > Project: James Server > Issue Type: Sub-task > Components: app > Reporter: Matthieu Baechler >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h >
[jira] [Commented] (JAMES-3943) Assemble a scaling SMTP server backed by pulsar (and PG)
[ https://issues.apache.org/jira/browse/JAMES-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851773#comment-17851773 ] Matthieu Baechler commented on JAMES-3943: -- I think we are good. Closing. > Assemble a scaling SMTP server backed by pulsar (and PG) > > > Key: JAMES-3943 > URL: https://issues.apache.org/jira/browse/JAMES-3943 > Project: James Server > Issue Type: Sub-task > Components: app >Reporter: Matthieu Baechler >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h >
Proposal about deprecation and removal
Hi, I had a hack session with Benoit today and the sentence "this would break Spring" came up many times during the day. As I'm less active on James than I used to be, I must admit I have no idea how popular the Spring version of James is nowadays. However, what strikes me when I hack on James is how the size of the project and its legacy make it so slow to make progress. We did some deprecation and removal in the past but we have been conservative about that. I would like to argue that being conservative to preserve existing users may actually prevent us from attracting new ones. Moreover, it probably also prevents new developers from getting involved as they are quickly overwhelmed. So, what would you think about removing features and modules more aggressively, starting with the Spring support? Cheers, -- Matthieu Baechler
Re: OpenJPA
Hi, What about using Scala Doobie (https://tpolecat.github.io/doobie/) as it allows to write actual SQL with the comfort of a compiler and to bind results to case class quite easily? Cheers, -- Matthieu --- Original Message --- On Saturday, October 14th, 2023 at 16:48, Jean Helou wrote: > > > Hello, > > I have been following the discussion on pg mailbox but every jpa mention > makes me think of how bad I found the developer experience with open jpa > when playing with scaling-smtp. > (having to set the agent manually for tests in the ide is a pain. having to > list the classes to instrument isn't much better and we had quite a few > issues when moving scaling-smtp away from Cassandra and to PgSQL) > > So I'm forking the initial discussion to ask if there would be interest in > migrating away from openjpa ? at least for new features. > > I have basically come to dislike most automatic ORMs but pure jdbc is very > low level. I thus tend to favor wrappers that make my life easier without > trying to hide the SQL. My current favorite wrapper is jooq ( > https://www.jooq.org/). > > I am not sure if it would be an acceptable choice for James. While there is > an open source version which uses apache 2, access to proprietary databases > relies on proprietary code with a paying license. > > Apart from this I'm sure jooq would be nice especially if we are going for > a PG only solution. > > Jooq DSL mirrors SQL quite closely to provide typesafety. it still allows > writing SQL explicitly if the DSL is not enough. > > It returns a resultset in a thin wrapper that can be generated from the DDL > to allow typed access to the result rowsm items. > This then allows using a user provided mapper to build applicative objects. > > I tried searching the archives of the mailing list for references to jooq > and found nothing but I may have missed things. 
> > The downside of migrating to jooq is that I'm not sure how easy it is to > swap db backend with it and support for Oracle/DB2/commercial databases > would require to buy a license. > As I said I don´t know how much of a deal breaker that is > > jean > > Le ven. 6 oct. 2023 à 23:49, Benoit TELLIER btell...@linagora.com a > > écrit : > > > Hey there! > > > > The goal: deliver James "stateless email server" concept to smaller > > deployments than those addressable with the Distributed server. > > > > Why Postgres? Rock solid. And more options than other SQL stores (see > > below) > > > > The requirements would be: > > - Leverage the blobStore for binary storage (email bodies + > > attachements). Those big binaries are not meant to be stored into SQL rows > > - blaming you, JPA! > > - Bring choice on blob store : PGSQL native solution ( > > https://www.postgresql.org/docs/7.4/jdbc-binary-data.html ) for small > > deployments OR S3 > > - Bring choice on search: PGSQL native solution ( > > https://www.postgresql.org/docs/current/textsearch.html ) for small > > deployments OR OpenSearch > > - Bring choice on PubSub: PGSQL native solution ( > > https://www.postgresql.org/docs/current/plpgsql-trigger.html ) OR > > RabbitMQ > > - Enforce strict tenant isolation: domain A won't access domain B data > > even if we screw up James access control layer. This can be done with Row > > security https://www.postgresql.org/docs/current/ddl-rowsecurity.html . > > - Be reactive. This can be achieved by using a reactive firendly driver > > like r2dbc... > > - Ensure that we can easily run on some largely scaling postgres... > > CitusData ? > > > > An other outcome might be to drop JPA implementation, ideally... (we > > provide something similar but wy better) > > > > Ideally I would like to deliver this before september 2024... > > > > Thoughts? > > Would this be something interesting people in here? > > Would some people be interested contributing to this effort? 
> > Would some people desire sponsoring this effort? > > > > If this is non consensual, I can also contribute this into > > https://github.com/linagora/tmail-backend/ without annoying people in > > here... > > > > -- > > > > Best regards, > > > > Benoit TELLIER > > > > General manager of Linagora VIETNAM. > > Product owner for Team-Mail product. > > Chairman of the Apache James project. > > > > Mail: btell...@linagora.com > > Tel: (0033) 6 77 26 04 58 (WhatsApp, Signal) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Building a PostgreSQL mailbox for James?
Hi Benoit, This topic has been discussed for years, I'm happy you finally drew up a plan for it. To me, the aim for Postgres is small to medium-sized deployments with minimal dependencies. In that regard, having an implementation that spans all infrastructure needs is a must-have. So my take would be: let's implement everything with PG: blob storage, search, messaging and various data storage like Event Sourcing and plain data. For a user, it will always be possible to plug in another piece of infrastructure if need be (like having better search or storing more blobs, etc). The only nice-to-have to me would be the multi-tenant goal as you can always spawn another James instance per domain (and you can use the same PG if you want by using several databases). To answer the last questions: I would definitely be interested in using this implementation (I use JPA for now). I could marginally contribute to it as I have experience with PG but my time is very limited (unless someone wants to sponsor my work, of course). I can donate some code related to Event Sourcing as I have an implementation of an Event Store on top of PG and some code around messaging. Let me know if you are interested in that contribution. In terms of strategy, I think that would help James gain popularity among hobbyists and small businesses, so I think it is worth trying. Cheers, -- Matthieu Baechler --- Original Message --- On Friday, October 6th, 2023 at 23:48, Benoit TELLIER wrote: > > > Hey there! > > The goal: deliver James "stateless email server" concept to smaller > deployments than those addressable with the Distributed server. > > Why Postgres? Rock solid. And more options than other SQL stores (see below) > > The requirements would be: > - Leverage the blobStore for binary storage (email bodies + attachments). > Those big binaries are not meant to be stored into SQL rows - blaming you, > JPA! 
> - Bring choice on blob store : PGSQL native solution ( > https://www.postgresql.org/docs/7.4/jdbc-binary-data.html ) for small > deployments OR S3 > - Bring choice on search: PGSQL native solution ( > https://www.postgresql.org/docs/current/textsearch.html ) for small > deployments OR OpenSearch > - Bring choice on PubSub: PGSQL native solution ( > https://www.postgresql.org/docs/current/plpgsql-trigger.html ) OR RabbitMQ > - Enforce strict tenant isolation: domain A won't access domain B data even > if we screw up James access control layer. This can be done with Row security > https://www.postgresql.org/docs/current/ddl-rowsecurity.html . > - Be reactive. This can be achieved by using a reactive firendly driver like > r2dbc... > - Ensure that we can easily run on some largely scaling postgres... CitusData > ? > > An other outcome might be to drop JPA implementation, ideally... (we provide > something similar but wy better) > > Ideally I would like to deliver this before september 2024... > > Thoughts? > Would this be something interesting people in here? > Would some people be interested contributing to this effort? > Would some people desire sponsoring this effort? > > If this is non consensual, I can also contribute this into > https://github.com/linagora/tmail-backend/ without annoying people in here... > > > > -- > > Best regards, > > Benoit TELLIER > > General manager of Linagora VIETNAM. > Product owner for Team-Mail product. > Chairman of the Apache James project. > > Mail: btell...@linagora.com > Tel: (0033) 6 77 26 04 58 (WhatsApp, Signal) > > - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Created] (JAMES-3943) Assemble a scaling SMTP server backed by pulsar (and PG)
Matthieu Baechler created JAMES-3943: Summary: Assemble a scaling SMTP server backed by pulsar (and PG) Key: JAMES-3943 URL: https://issues.apache.org/jira/browse/JAMES-3943 Project: James Server Issue Type: Sub-task Components: app Reporter: Matthieu Baechler
[jira] [Commented] (JAMES-3687) Implements Apache Pulsar based Mailqueue
[ https://issues.apache.org/jira/browse/JAMES-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770237#comment-17770237 ] Matthieu Baechler commented on JAMES-3687: -- I would like to treat this last issue in another task. There's already https://issues.apache.org/jira/browse/JAMES-3696 to track that if I understand correctly. Could we close this ticket? > Implements Apache Pulsar based Mailqueue > > > Key: JAMES-3687 > URL: https://issues.apache.org/jira/browse/JAMES-3687 > Project: James Server > Issue Type: Sub-task > Components: Queue >Reporter: Jean Helou >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > An apache pulsar based mailqueue offers a different set of compromises over > the existing mailqueue implementations: > pros: > * pulsar is a distributed queue > * pulsar offers scheduling facilities making it easier to implement delayed > queues > cons: > * being fully distributed some consistency guarantees cannot be honored for > flush and filter since the flushing and filtering commands take time to > propagate in the cluster -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3906) Add hot reloading/updating without restart of the certificate
[ https://issues.apache.org/jira/browse/JAMES-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728364#comment-17728364 ] Matthieu Baechler commented on JAMES-3906: -- The letsencrypt part deserves its own ticket anyway, you can close this one IMO. > Add hot reloading/updating without restart of the certificate > - > > Key: JAMES-3906 > URL: https://issues.apache.org/jira/browse/JAMES-3906 > Project: James Server > Issue Type: New Feature >Reporter: Wojtek >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > It would be great to be able to update the certificate without restarting the > server, reloading the certificate from the file and/or updating it via REST > API > > Mailing list thread: > https://www.mail-archive.com/server-user@james.apache.org/msg16722.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3906) Add hot reloading/updating without restart of the certificate
[ https://issues.apache.org/jira/browse/JAMES-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720800#comment-17720800 ] Matthieu Baechler commented on JAMES-3906: -- I guess we have to think about what happens when a certificate is updated for a James cluster. It would most likely be updated from a single node but would require propagation to other nodes. That being said, I think that instead of trying to write ad hoc code to replace the certificate in the right object, we should rather think about leveraging the event bus to dispatch such information and then see how it would impact the various building blocks. What do you think? > Add hot reloading/updating without restart of the certificate > --- > > Key: JAMES-3906 > URL: https://issues.apache.org/jira/browse/JAMES-3906 > Project: James Server > Issue Type: New Feature >Reporter: Wojtek >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > It would be great to be able to update the certificate without restarting the > server, reloading the certificate from the file and/or updating it via REST > API > > Mailing list thread: > https://www.mail-archive.com/server-user@james.apache.org/msg16722.html
[jira] [Commented] (JAMES-3906) Add hot reloading/updating without restart of the certificate
[ https://issues.apache.org/jira/browse/JAMES-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719849#comment-17719849 ] Matthieu Baechler commented on JAMES-3906: -- We'd like to implement the Let's Encrypt ACME protocol in James and I don't see any reason to restart a server when you can replace the certificate at runtime. > Add hot reloading/updating without restart of the certificate > --- > > Key: JAMES-3906 > URL: https://issues.apache.org/jira/browse/JAMES-3906 > Project: James Server > Issue Type: New Feature >Reporter: Wojtek >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > It would be great to be able to update the certificate without restarting the > server, reloading the certificate from the file and/or updating it via REST > API > > Mailing list thread: > https://www.mail-archive.com/server-user@james.apache.org/msg16722.html
[jira] [Commented] (JAMES-3906) Add hot reloading/updating without restart of the certificate
[ https://issues.apache.org/jira/browse/JAMES-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719815#comment-17719815 ] Matthieu Baechler commented on JAMES-3906: -- Hi there, We worked a bit on that topic yesterday with [~jeantil] and found that we can probably update the current certificate without restarting anything (which would avoid dropping ongoing connections). For every new connection, `AbstractSSLAwareChannelPipelineFactory` creates a new sslHandler, so if we can update the SSLContext in `Encryption`, a new certificate would be picked up for new connections. > Add hot reloading/updating without restart of the certificate > --- > > Key: JAMES-3906 > URL: https://issues.apache.org/jira/browse/JAMES-3906 > Project: James Server > Issue Type: New Feature >Reporter: Wojtek >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > It would be great to be able to update the certificate without restarting the > server, reloading the certificate from the file and/or updating it via REST > API > > Mailing list thread: > https://www.mail-archive.com/server-user@james.apache.org/msg16722.html
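The idea in this comment (swap the SSLContext, and let each new connection pick up whatever is current) can be sketched with the JDK alone. The holder below is hypothetical: `ReloadableSslContext` and its methods are names invented for illustration, and it does not reproduce how `Encryption` or `AbstractSSLAwareChannelPipelineFactory` are actually written in James. It assumes the per-connection handler is built from a fresh `SSLEngine`.

```java
import java.security.GeneralSecurityException;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical holder for a swappable SSLContext; not James code.
final class ReloadableSslContext {
    private final AtomicReference<SSLContext> current;

    ReloadableSslContext(SSLContext initial) {
        this.current = new AtomicReference<>(Objects.requireNonNull(initial, "initial"));
    }

    // Called when a new certificate is available (file reload, REST call...).
    // Engines already handed out keep the context they were created from,
    // so ongoing connections are not interrupted.
    void replace(SSLContext updated) {
        current.set(Objects.requireNonNull(updated, "updated"));
    }

    SSLContext current() {
        return current.get();
    }

    // Called once per new connection: the engine is created from whichever
    // context is current at that moment.
    SSLEngine newServerEngine() {
        SSLEngine engine = current.get().createSSLEngine();
        engine.setUseClientMode(false);
        return engine;
    }

    // Convenience for the sketch: a context with default key and trust
    // material. A real server would load its own certificate instead.
    static SSLContext newTlsContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, null, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLS is not available", e);
        }
    }
}
```

In a cluster, each node would still need to be told to call `replace`, which is where the event bus suggestion from the later comment on this ticket comes in.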
[jira] [Assigned] (JAMES-3805) PulsarMailQueueTest.dequeueShouldBeConcurrent is unstable
[ https://issues.apache.org/jira/browse/JAMES-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthieu Baechler reassigned JAMES-3805: Assignee: Matthieu Baechler > PulsarMailQueueTest.dequeueShouldBeConcurrent is unstable > - > > Key: JAMES-3805 > URL: https://issues.apache.org/jira/browse/JAMES-3805 > Project: James Server > Issue Type: Bug > Components: pulsar, Queue >Affects Versions: master >Reporter: Benoit Tellier >Assignee: Matthieu Baechler >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > https://ci-builds.apache.org/job/james/job/ApacheJames/job/PR-1099/16/testReport/junit/org.apache.james.queue.pulsar/PulsarMailQueueTest/dequeueShouldBeConcurrent/ > {code:java} > Error Message > Too many concurrent offers. Specified maximum is 1. You have to wait for one > previous future to be resolved to send another request > Stacktrace > java.lang.IllegalStateException: Too many concurrent offers. Specified > maximum is 1. You have to wait for one previous future to be resolved to send > another request > Standard Output > 10:19:36.780 [ERROR] r.c.p.Operators - Operator called default onErrorDropped > java.lang.IllegalStateException: Too many concurrent offers. Specified > maximum is 1. 
You have to wait for one previous future to be resolved to send > another request > at > akka.stream.impl.QueueSource$$anon$1.bufferElem(QueueSource.scala:115) > at > akka.stream.impl.QueueSource$$anon$1.$anonfun$callback$1(QueueSource.scala:126) > at > akka.stream.impl.QueueSource$$anon$1.$anonfun$callback$1$adapted(QueueSource.scala:120) > at > akka.stream.impl.fusing.GraphInterpreter.runAsyncInput(GraphInterpreter.scala:467) > at > akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:517) > at > akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:625) > at > akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:800) > at > akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:818) > at akka.actor.Actor.aroundReceive(Actor.scala:537) > at akka.actor.Actor.aroundReceive$(Actor.scala:535) > at > akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:716) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) > at akka.actor.ActorCell.invoke(ActorCell.scala:548) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) > at akka.dispatch.Mailbox.run(Mailbox.scala:231) > at akka.dispatch.Mailbox.exec(Mailbox.scala:243) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > 10:19:36.780 [ERROR] r.c.p.Operators - Operator called default onErrorDropped > java.lang.IllegalStateException: Too many concurrent offers. Specified > maximum is 1. 
You have to wait for one previous future to be resolved to send > another request > at > akka.stream.impl.QueueSource$$anon$1.bufferElem(QueueSource.scala:115) > at > akka.stream.impl.QueueSource$$anon$1.$anonfun$callback$1(QueueSource.scala:126) > at > akka.stream.impl.QueueSource$$anon$1.$anonfun$callback$1$adapted(QueueSource.scala:120) > at > akka.stream.impl.fusing.GraphInterpreter.runAsyncInput(GraphInterpreter.scala:467) > at > akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:517) > at > akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:625) > at > akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:800) > at > akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:818) > at akka.actor.Actor.aroundReceive(Actor.scala:537) > at akka.actor.Actor.aroundReceive$(Actor.scala:535) > at > akka.stream.impl.fusing.ActorGraph
Re: Upgrade regarding ElasticSearch
Hi, On Fri, 2022-06-17 at 10:19 +0700, Rene Cordier wrote: [...] > > > > > Instead of deleting the support for ES 7, we could just keep it and run > > tests against OpenSearch. > > > > Investment is very low and people don't have to switch to non- > > opensource ES 8 if they don't want to. > > I'm sorry you are mixing up things... It's not question anymore to move > up to ES 8 at all, > Sorry, I must have missed that part from your email. > none of the 3 solutions I proposed above are implying > this. > Well, I was thinking that, given you already have a ES 8 impl almost done, you would like to keep it. It looks like I was wrong. > We agree it's better to just switch to OpenSearch and not continue > with ES versions after 7.10. > Definitely something I missed in the discussion. > > > > If ever somebody has interest in migrating to OpenSearch 2, it can be > > migrated at this point. > > Well as said as well... I did spend a lot of time on it personally to > migrate to their new java client, that looks very similar to the one on > es8 (that's why the POC is starting from the ES work: > https://github.com/apache/james-project/pull/1051... but I can clean it > up to make all the es 8 stuff disappear?) > > However I remain an issue with the sort to which we had a workaround > with Benoit (but I personally don't like it) and I proposed a fix on > their client: > https://github.com/opensearch-project/opensearch-java/pull/169 . > > But maybe a bit early to fully migrate to that yet... > > > > > I would personally postpone the migration and go for solution 1. > > I understand why not solution 3... but what about solution 2? It's just > change the one dependency es client to opensearch high level one and > change all the imports and... nothing else. Effort needed is not much > more time consuming IMO? And you can use OpenSearch 2. Would like to > know what makes you afraid on this? 
> > To be honest, if you tackle this issue (I mean, if you have time to spend on the task), I don't really care how, we have a good test suite, we can always change later if needed. Sorry for my previous email, I got the main assumption wrong. Regards, -- Matthieu - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Upgrade regarding ElasticSearch
Hi René, Thank you for starting this thread. On Wed, 2022-06-15 at 16:37 +0700, Rene Cordier wrote: > Hello James community! > > I would like to start a discussion regarding the upgrade of > ElasticSearch, hoping we can reach a consensus, as I spent quite a lot > of time on this already. > > As you know, the version 7.10 has reached EOL already, so we need to > migrate from it. > > Thus a while ago I started a very painful migration to ES 8.2 here: > https://github.com/apache/james-project/pull/1018 > > Before being rightfully reminded by Matthieu Baechler that ES 7.10 is > the last OSI-compliant version of ElasticSearch, before you know they > switched to a new license that's not really open source anymore... > > OpenSearch is indeed a fork of ES 7.10 using the Apache License, which > is definitely more in favor for adoption for the migration than ES 8. On > that, I totally agree. > > From then, with Benoit Tellier, we did our little extra research then. > > If we want to migrate from ES to OpenSearch, there is a few options on > the table actually: > > - solution 1: not modifying the ES7 code. Well it's possible, but you > can only use versions 1.x of OpenSearch (1.3.3 atm). However, from 2.x > version of OpenSearch, the support for es7 client has been dropped in > favor of their own clients. The best part for this solution is: it should not require anything else than changing the docker image reference. > > - solution 2: using the java high level rest client from OpenSearch > (https://opensearch.org/docs/latest/clients/java-rest-high-level/): That > client is a basic fork of the java high level rest client from ES. As > this client has been dropped in upper version of ES for a new client > (that you can see in the PR I did before: > https://github.com/apache/james-project/pull/1018), the fork is thus > identical. 
> Benoit did a little POC on it and it seems you only need to change the > imports and it works with OpenSearch 2.0 without issues (also said here > in their doc: > https://opensearch.org/docs/latest/clients/java-rest-high-level/#migrating-to-the-opensearch-java-high-level-rest-client) > > - solution 3: using the new java client > (https://opensearch.org/docs/latest/clients/java/). That client has been > forked from the new java client from ES probably at its beginnings, > before the change of license. In the POC I did here: > https://github.com/apache/james-project/pull/1051, you can see the > structure is very similar to the java client from ES, but with obviously > some changes or bugs as the fork went its way since when from the > original one. That migration is complicated honestly, but because I did > the one to ES then the remaining work is minimal as proven in the POC. > Just a few issues though encountered... (in the POC you can see them) > > I think solution 1 is IMO, not an option, as we probably want to migrate > to the latest version of OpenSearch as we are at it now. > > Solution 2 is very easy (replace dependencies and imports... nothing > more) and allows to use OpenSearch 2.0. > > I would say let's go with it if I didn't invest so much time migrating > to the new java client. Because this is the issue actually. Amazon > states that the java client is supposed to replace the high level one at > some point (like on forums, or the page of the github project > (https://github.com/opensearch-project/opensearch-java). It's a bit > blurry on really when the high level client would be dropped but I > wouldn't be surprised to see it on next major upgrade for example. > > So at some point we will have to migrate the client eventually... do we > try to do it now (solution 3) or do we do things simple for now > (solution 2) and keep the work done under the hood for the day the > migration is necessary? (cause a big chunk of it has been done I think). 
> > I'm honestly fine either way, but would love to hear what the community > has to say on the topic. > > Sorry it was long! But I hope I gave all the keys necessary to > understand our options here regarding this migration. > My point is: upgrading to ES 8 was triggered by "ES 7 is EOL". OpenSearch 1.x is basically a supported ES 7 and the code for ES 7 is supposed to work perfectly with OpenSearch 1.x. Instead of deleting the support for ES 7, we could just keep it and run tests against OpenSearch. Investment is very low and people don't have to switch to non- opensource ES 8 if they don't want to. If ever somebody has interest in migrating to OpenSearch 2, it can be migrated at this point. I would personally postpone the migration and go for solution 1. Regards, -- Matthieu - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Fate of cassandra-app artifact?
Hi Benoit, I agree this artifact should be removed as there's no good use-case for it. Cheers, -- Matthieu On Mon, 2022-05-09 at 18:15 +0700, Benoit TELLIER wrote: > Hello all, > > That is quite some time I actively militate against the cassandra- > app. > > Here are issues I have with this artifact: > - It is not targeting distributed deployment. Because it do not > leverages messaging technologies, it only supports one application > server in order to be able to implement connected protocols like > IMAP. > - Cassandra + ElasticSearch are difficult/expensive to operate > systems. Having them with only one application server is overkill. > - The two above points creates confusion and some of my customers > missed the "limitation section" and run into troubles. > - The Cassandra blob store sucks. Period. > -> Everything is written to one huge SSTable - beware of > exceeding > 50% of node storage! > -> Cassandra compaction takes forever and depletes cassandra node > ressources (off-heap memory) > -> Cassandra storage is expensive > -> The horror story further develops but I am uneasy to share > this > publicly as some fixes were clear Cassandra anti-patterns. > - Some emails can get stuck in ActiveMQ - I was unable to come up > with > a proper diagnostic for this and some oldish issues already tend to > refer to this behaviour. > - This artifact is accidental: merely a step toward the distributed > server that we never got rid of. > - Too many artifact is complexity, build time... I would be happy > to > remove this one to let the room for other artifacts to shine. > > My proposal is thus to deprecate / remove it. > > Migration to the distributed server can be done with only one added > dependency (RabbitMQ) and no data loss given that the mail queue is > empty. > > Some project members have expressed concerns regarding the current > distributed application and its rabbitMQ mail queue. 
The "management" > part of this mail queue was implemented in conjunction with Cassandra > and led to complex code that is hard to maintain and hard to > operate. > The "delay" feature (that makes management features needed!) is not > supported. In the long term the project expressed its interest in > having a Pulsar based distributed server. However in the short term, > and > in order to support this deprecation, I propose to allow simplifying > the > RabbitMQ mail queue by adding an option not to activate "management features". > > Given that it behaves as a pure message queue (no delays), for an MDA > looking into and interfering with a mail queue makes little sense > (and I > never did). Adding an option to drop this complexity and disable > browse/flush/size/remove/clear would allow us to have a simple yet > reliable > mail queue suited to an MDA while waiting for the Pulsar alternative. > > Thoughts? > > Best regards, > > Benoit TELLIER > > > - > To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org > For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Created] (JAMES-3763) Implement a blobstore backed MailRepository
Matthieu Baechler created JAMES-3763:

Summary: Implement a blobstore backed MailRepository
Key: JAMES-3763
URL: https://issues.apache.org/jira/browse/JAMES-3763
Project: James Server
Issue Type: New Feature
Components: Blob, MailStore & MailRepository
Affects Versions: master
Reporter: Matthieu Baechler

Currently, only Cassandra is a well maintained MailRepository implementation. We propose to implement a blobstore backed MailRepository to be able to deploy James without using Cassandra.

--
This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (JAMES-3762) change MailRepository contract to require a MailKey on delete
Matthieu Baechler created JAMES-3762:

Summary: change MailRepository contract to require a MailKey on delete
Key: JAMES-3762
URL: https://issues.apache.org/jira/browse/JAMES-3762
Project: James Server
Issue Type: Improvement
Components: MailStore & MailRepository
Affects Versions: master
Reporter: Matthieu Baechler

Right now, MailRepository implementations (and contract tests) rely on the Mail name property to generate a stable MailKey. We propose to change this API so that implementations can generate unique ids (such as UUIDs) instead, which makes it possible to back a MailRepository with some key/value store technologies.
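To illustrate the proposal in JAMES-3762, here is a minimal sketch of a repository that generates its own key at store time and requires that key on delete. Everything in it is hypothetical: the `MailKey` and `InMemoryKeyValueRepository` classes below are simplified stand-ins written for this example, not the actual James `MailRepository` API, and the mail is reduced to a plain `String`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a mail key; NOT the James MailKey class.
final class MailKey {
    private final String value;

    MailKey(String value) {
        this.value = value;
    }

    String asString() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof MailKey && ((MailKey) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

// Hypothetical key/value backed repository; the mail is modelled as a String.
final class InMemoryKeyValueRepository {
    private final Map<MailKey, String> mails = new ConcurrentHashMap<>();

    // The repository picks the key (a random UUID) instead of deriving it
    // from the mail name, and hands it back to the caller.
    MailKey store(String mailContent) {
        MailKey key = new MailKey(UUID.randomUUID().toString());
        mails.put(key, mailContent);
        return key;
    }

    Optional<String> retrieve(MailKey key) {
        return Optional.ofNullable(mails.get(key));
    }

    // Deleting requires the key returned by store(); the mail name is not enough.
    void remove(MailKey key) {
        mails.remove(key);
    }

    int size() {
        return mails.size();
    }
}
```

With such a contract, storing the same mail twice yields two distinct entries, so callers (and contract tests) have to keep the returned key around in order to delete an entry later.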
[jira] [Created] (JAMES-3761) MimeMessageStore should be able to store emails in a specific bucket
Matthieu Baechler created JAMES-3761:

Summary: MimeMessageStore should be able to store emails in a specific bucket
Key: JAMES-3761
URL: https://issues.apache.org/jira/browse/JAMES-3761
Project: James Server
Issue Type: Improvement
Components: Blob, MailStore & MailRepository
Affects Versions: master
Reporter: Matthieu Baechler

When using MimeMessageStore in several contexts in the same James instance, it can be useful to be able to choose the target bucket.
[jira] [Created] (JAMES-3760) MailRepositoryContract is not cleaning the created Mails
Matthieu Baechler created JAMES-3760:

Summary: MailRepositoryContract is not cleaning the created Mails
Key: JAMES-3760
URL: https://issues.apache.org/jira/browse/JAMES-3760
Project: James Server
Issue Type: Bug
Components: tests
Affects Versions: master
Reporter: Matthieu Baechler

Once https://issues.apache.org/jira/browse/JAMES-3759 is fixed, a lot of leaks are detected when running {code:java}MailRepositoryContract{code}

The root cause is that {noformat}createMail{noformat} generates MailImpl instances which need to be disposed.
[jira] [Created] (JAMES-3759) Leak Detector is not running in some tests
Matthieu Baechler created JAMES-3759:

Summary: Leak Detector is not running in some tests
Key: JAMES-3759
URL: https://issues.apache.org/jira/browse/JAMES-3759
Project: James Server
Issue Type: Bug
Components: tests
Affects Versions: master
Reporter: Matthieu Baechler

In Maven pom.xml files, there's a typo replicated in several files using {code:java}james.ligecycle.leak.detection.mode{code} instead of {code:java}james.lifecycle.leak.detection.mode{code}

It prevents the awesome detection of leaks from actually reporting them.
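The reason such a typo goes unnoticed can be shown with a small sketch: a property lookup with a fallback never fails on an unknown key, it just returns the default. The class name, the `"simple"` default and the `"testing"` value below are assumptions made for this illustration only; they are not taken from the James source.

```java
import java.util.Properties;

// Hypothetical lookup helper, written only to illustrate the failure mode.
final class LeakDetectionModeLookup {
    static final String PROPERTY = "james.lifecycle.leak.detection.mode";
    // Assumed default, for the purpose of this sketch.
    static final String DEFAULT_MODE = "simple";

    // A misspelled key is simply never read, so the default applies silently.
    static String resolve(Properties properties) {
        return properties.getProperty(PROPERTY, DEFAULT_MODE);
    }

    static String resolveWithKey(String key, String value) {
        Properties properties = new Properties();
        properties.setProperty(key, value);
        return resolve(properties);
    }
}
```

Calling `resolveWithKey("james.ligecycle.leak.detection.mode", "testing")` yields the default, whereas the correctly spelled key yields `"testing"`. No error is raised in either case, which is why the misconfiguration could be copied across several pom.xml files.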
[jira] [Comment Edited] (JAMES-3740) IMAP UID <-> MSN mapping occupies too much memory
[ https://issues.apache.org/jira/browse/JAMES-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517056#comment-17517056 ]

Matthieu Baechler edited comment on JAMES-3740 at 4/4/22 7:15 PM:
--
What about building a sorted list containing ranges of UIDs? Most mailboxes probably have contiguous UIDs, so it could greatly reduce the size of the data structure.

For example (numbers are UIDs): 0-15 / 20-30 / 32-45 / 50-100 / 102-178 / 180-222 / 230-260

If I need to find the MSN for UID 35, I iterate over the ranges and add their sizes until I reach the range containing the UID. Here: 16 + 11 + 3 (= 30, counting from 0).

If I need to find the UID for MSN 110, I iterate over the ranges and add their sizes until I reach the number I want. Here 16 + 11 + 14 + 51 (= 92), then I take the element at offset 18 in the 102-178 range (because 110 - 92 = 18), so the UID is 120.

It means the lookup is {noformat}O(n){noformat} with n being the number of ranges.

was (Author: matthieub):
What about building a sorted list containing ranges of UIDs? Most mailboxes probably have contiguous UIDs, so it could greatly reduce the size of the data structure.

For example (numbers are UIDs): 0-15 / 20-30 / 32-45 / 50-100 / 102-178 / 180-222 / 230-260

If I need to find the MSN for UID 35, I iterate over the ranges and add their sizes until I reach the range containing the UID. Here: 16 + 11 + 3 (= 30, counting from 0).

If I need to find the UID for MSN 110, I iterate over the ranges and add their sizes until I reach the number I want. Here 16 + 11 + 14 + 51 (= 92), then I take the element at offset 18 in the 102-178 range (because 110 - 92 = 18), so the UID is 120.

It means the lookup is O(n) with n being the number of ranges.
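The range idea from the comment can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `UidRangeIndex` is a hypothetical class, not the James `UidMsnConverter`; ranges must be added in ascending order without overlap; and positions are counted from 0, as in the arithmetic of the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical sketch of a UID <-> position index backed by sorted UID ranges.
final class UidRangeIndex {
    // Each entry is {firstUid, lastUid}, both inclusive; entries are sorted and disjoint.
    private final List<long[]> ranges = new ArrayList<>();

    // Callers are expected to add ranges in ascending order, without overlap.
    UidRangeIndex addRange(long firstUid, long lastUid) {
        ranges.add(new long[] {firstUid, lastUid});
        return this;
    }

    // Zero-based position of the UID, or empty if the UID is not in any range.
    OptionalLong positionOf(long uid) {
        long skipped = 0;
        for (long[] range : ranges) {
            if (uid < range[0]) {
                break; // ranges are sorted, so the UID falls into a gap
            }
            if (uid <= range[1]) {
                return OptionalLong.of(skipped + (uid - range[0]));
            }
            skipped += range[1] - range[0] + 1;
        }
        return OptionalLong.empty();
    }

    // UID found at the given zero-based position, or empty if out of bounds.
    OptionalLong uidAt(long position) {
        if (position < 0) {
            return OptionalLong.empty();
        }
        long remaining = position;
        for (long[] range : ranges) {
            long size = range[1] - range[0] + 1;
            if (remaining < size) {
                return OptionalLong.of(range[0] + remaining);
            }
            remaining -= size;
        }
        return OptionalLong.empty();
    }
}
```

With the ranges from the example, `positionOf(35)` gives 30 and `uidAt(110)` gives 120, matching the numbers above. IMAP message sequence numbers start at 1, so a real implementation would add 1 when converting a position to an MSN. Both lookups walk the list linearly, hence the O(n) cost in the number of ranges.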
> IMAP UID <-> MSN mapping occupies too much memory > - > > Key: JAMES-3740 > URL: https://issues.apache.org/jira/browse/JAMES-3740 > Project: James Server > Issue Type: Improvement > Components: IMAPServer >Affects Versions: 3.7.0 >Reporter: Benoit Tellier >Priority: Major > Attachments: Screenshot from 2022-03-30 17-39-37.png > > Time Spent: 20m > Remaining Estimate: 0h > > h3. What is UID <-> MSN mapping ? > In IMAP RFC-3501 there is two ways one addresses a message: > - By its UID (Unique ID) that is unique (until UID_VALIDITY changes...) > - By its MSN (Message Sequence Number) which is the (mutable) position of a > message in the mailbox. > We then need: > - Given a UID return its MSN which is for instance compulsory upon EXPUNGED > notifications when QRESYNCH is not enabled. > - Given a MSN based request we need to convert it back to a UID (rare). > We do store the list of UIDs, sorted, in RAM and perform binarysearches to > resolve those. > h3. What is the impact on heap? > Each uid is wrapped in a MessageUID object. This object wrapping comes with > an overhead of at least 12 bytes in addition to the 8 bytes payload (long). > Quick benchmarks shows it's actually worse: 10 million uids did take up to > 275 MB. > {code:java} > @Test > void measureHeapUsage() throws InterruptedException { > int count =1000; > testee.addAll(IntStream.range(0, count) > .mapToObj(i -> MessageUid.of(i + 1)) > .collect(Collectors.toList())); > Thread.sleep(1000); > System.out.println("GCing"); > System.gc(); > Thread.sleep(1000); > > System.out.println(ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed()); > } > {code} > Now, from let's take a classical production deployment I get: > - Some users have up to 2.5 million messages in their INBOX > - I can get an average of 100.000 messages for each user > So for a small scale deployment, we are already "consuming" ~300 MB of memory > just for the UID <-> mapping. 
> Scaling to 1.000 users on a single James instance we clearly see that HEAP > consumption will start being a problem (~3GB) without even speaking of target > of 10.000 users per James I do have in mind. > It's worth mentioning that IMAP being statefull, and UID <-> MSN mapping > attached to a selected mailbox, such a mapping is long lived: > - Multiple small objects would need to be copied individually by the GC, > putting pressure during long gen > - Those long lived object will eventually be promoted to old gen, thus the > more there is the longer the resulting stop-the-world GC pauses will be. > h3. Temporary fix ? > We can get rid of the object boxing in UidMsnConverter by using primitive > type collections for instance provided by fastutils pr
[jira] [Commented] (JAMES-3740) IMAP UID <-> MSN mapping occupies too much memory
[ https://issues.apache.org/jira/browse/JAMES-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517056#comment-17517056 ]

Matthieu Baechler commented on JAMES-3740:
--
What about building a sorted list containing ranges of UIDs? Most mailboxes probably have contiguous UIDs, so it could greatly reduce the size of the data structure.

For example (numbers are UIDs): 0-15 / 20-30 / 32-45 / 50-100 / 102-178 / 180-222 / 230-260

If I need to find the MSN for UID 35, I iterate over the ranges and add their sizes until I reach the range containing the UID. Here: 16 + 11 + 3 (= 30, counting from 0).

If I need to find the UID for MSN 110, I iterate over the ranges and add their sizes until I reach the number I want. Here 16 + 11 + 14 + 51 (= 92), then I take the element at offset 18 in the 102-178 range (because 110 - 92 = 18), so the UID is 120.

It means the lookup is O(n) with n being the number of ranges.

> IMAP UID <-> MSN mapping occupies too much memory > - > > Key: JAMES-3740 > URL: https://issues.apache.org/jira/browse/JAMES-3740 > Project: James Server > Issue Type: Improvement > Components: IMAPServer >Affects Versions: 3.7.0 >Reporter: Benoit Tellier >Priority: Major > Attachments: Screenshot from 2022-03-30 17-39-37.png > > Time Spent: 20m > Remaining Estimate: 0h > > h3. What is UID <-> MSN mapping ? > In IMAP RFC-3501 there is two ways one addresses a message: > - By its UID (Unique ID) that is unique (until UID_VALIDITY changes...) > - By its MSN (Message Sequence Number) which is the (mutable) position of a > message in the mailbox. > We then need: > - Given a UID return its MSN which is for instance compulsory upon EXPUNGED > notifications when QRESYNCH is not enabled. > - Given a MSN based request we need to convert it back to a UID (rare). > We do store the list of UIDs, sorted, in RAM and perform binarysearches to > resolve those. > h3. What is the impact on heap? > Each uid is wrapped in a MessageUID object.
This object wrapping comes with > an overhead of at least 12 bytes in addition to the 8 bytes payload (long). > Quick benchmarks shows it's actually worse: 10 million uids did take up to > 275 MB. > {code:java} > @Test > void measureHeapUsage() throws InterruptedException { > int count =1000; > testee.addAll(IntStream.range(0, count) > .mapToObj(i -> MessageUid.of(i + 1)) > .collect(Collectors.toList())); > Thread.sleep(1000); > System.out.println("GCing"); > System.gc(); > Thread.sleep(1000); > > System.out.println(ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed()); > } > {code} > Now, from let's take a classical production deployment I get: > - Some users have up to 2.5 million messages in their INBOX > - I can get an average of 100.000 messages for each user > So for a small scale deployment, we are already "consuming" ~300 MB of memory > just for the UID <-> mapping. > Scaling to 1.000 users on a single James instance we clearly see that HEAP > consumption will start being a problem (~3GB) without even speaking of target > of 10.000 users per James I do have in mind. > It's worth mentioning that IMAP being statefull, and UID <-> MSN mapping > attached to a selected mailbox, such a mapping is long lived: > - Multiple small objects would need to be copied individually by the GC, > putting pressure during long gen > - Those long lived object will eventually be promoted to old gen, thus the > more there is the longer the resulting stop-the-world GC pauses will be. > h3. Temporary fix ? > We can get rid of the object boxing in UidMsnConverter by using primitive > type collections for instance provided by fastutils project. > The same bench was down to 84MB. > Also, we could get things more compact by using an INT representation of > UIDs. (Those are most of the case below 2 billions, to be above this there > need to be more than 2 billion emails transiting through one's mailbox which > is highly unlikely). 
A fallback to "long" storage can be setted up if a UID > above 2 billion is observed. > This such a compact int storage we are down to 46MB. > So taking the prior mentioned numbers we could expect a 1.000 people > deployment to require ~400 MB and a larger scale 10.000 people deployment on > a single James to consume up to 4GB. Not that enjoyable but definitly more > manageable. > Please note that primitive collections are more GC frien
[jira] [Commented] (JAMES-3487) Configure MimeMessageInputStreamSource THRESHOLD
[ https://issues.apache.org/jira/browse/JAMES-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411186#comment-17411186 ] Matthieu Baechler commented on JAMES-3487: -- Why can't it be a config file parameter like everything else? Config file parameters are easier to discover than properties or env vars, don't you think? > Configure MimeMessageInputStreamSource THRESHOLD > > > Key: JAMES-3487 > URL: https://issues.apache.org/jira/browse/JAMES-3487 > Project: James Server > Issue Type: Improvement > Components: James Core >Reporter: Benoit Tellier >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > This represents the point at which we should switch from a memory storage > into a file storage. > Default is 100 KB. > Obviously this parameter is important: > - Higher values will operate mostly in-memory thus will have low latencies > but will trash the heap and might trigger a GC hell. > - Lower values will defensively operate on files. Higher latencies but > predictable throughput. Modern SSDs and FS cache should enable to keep up > with high rates. > Optimally we should have some bench showing the impact of this parameter. > Related to JAMES-3477. -- This message was sent by Atlassian Jira (v8.3.4#803005)
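For readers unfamiliar with what this threshold controls, here is a minimal sketch of the memory-then-file switch described in the issue. It is not the James `MimeMessageInputStreamSource` code: `ThresholdBuffer` and its constructor parameter are hypothetical names, and the threshold is passed in explicitly so that it could come from a configuration file, a system property or an environment variable alike.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: buffers in memory until the threshold is exceeded,
// then moves everything to a temporary file and keeps writing there.
final class ThresholdBuffer extends OutputStream {
    private final int thresholdBytes;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream file; // non-null once the content has spilled to disk
    private Path path;
    private long written;

    ThresholdBuffer(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] data, int off, int len) throws IOException {
        if (file == null && written + len > thresholdBytes) {
            spillToFile();
        }
        OutputStream target = file != null ? file : memory;
        target.write(data, off, len);
        written += len;
    }

    private void spillToFile() throws IOException {
        path = Files.createTempFile("threshold-buffer-", ".tmp");
        file = Files.newOutputStream(path);
        memory.writeTo(file); // copy what was buffered so far
        memory = null;        // release the heap copy
    }

    boolean isInMemory() {
        return file == null;
    }

    @Override
    public void close() throws IOException {
        if (file != null) {
            file.close();
            Files.deleteIfExists(path);
        }
    }

    // Convenience helper for experiments: does a payload of this size stay on the heap?
    static boolean staysInMemory(int thresholdBytes, int payloadSize) {
        try (ThresholdBuffer buffer = new ThresholdBuffer(thresholdBytes)) {
            buffer.write(new byte[payloadSize]);
            return buffer.isInMemory();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A higher threshold keeps more messages on the heap (lower latency, more GC pressure), while a lower one sends more of them to disk, which is the trade-off listed in the issue description.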
[jira] [Commented] (JAMES-3639) Allow to configure SSL from PEM keys (without a keystore)
[ https://issues.apache.org/jira/browse/JAMES-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406836#comment-17406836 ] Matthieu Baechler commented on JAMES-3639: -- I actually read the PR's code and it looks like you found the same thing as me (you just didn't update this issue). Everything's good then. > Allow to configure SSL from PEM keys (without a keystore) > - > > Key: JAMES-3639 > URL: https://issues.apache.org/jira/browse/JAMES-3639 > Project: James Server > Issue Type: Improvement > Components: IMAPServer, JMAP, POP3Server, SMTPServer >Reporter: Benoit Tellier >Assignee: Antoine Duprat >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > This gives the opportunity to inter-operate directly with OpenSSL formats and > avoids some potentially tricky configuration steps (importing the keys in a > keystore). > Read related thread on the mailing list: > https://www.mail-archive.com/server-dev@james.apache.org/msg70772.html > How this looks like: > {code:java} > > file://conf/private.nopass.key > file://conf/certs.self-signed.csr > > {code} > Tested manually with self signed certificates: > {code:java} > # Generating your private key > openssl genrsa -des3 -out private.key 2048 > # Creating your certificates > openssl req -new -key private.key -out certs.csr > # Signing the certificate yourself > openssl x509 -req -days 365 -in certs.csr -signkey private.key -out > certs.self-signed.csr > # Removing the password from the private key > # Not necessary if you supply the secret in the configuration > openssl rsa -in private.key -out private.nopass.key > {code}
[jira] [Commented] (JAMES-3639) Allow to configure SSL from PEM keys (without a keystore)
[ https://issues.apache.org/jira/browse/JAMES-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406834#comment-17406834 ] Matthieu Baechler commented on JAMES-3639: -- I learnt something funny this summer: starting with jdk 9, the default keystore is pkcs#12 (https://blogs.oracle.com/jtc/jdk9-keytool-transitions-default-keystore-to-pkcs12) which is a standard format that anybody can build from openssl or any other compliant tool. That being said, PKCS#12 also has the good idea of being able to lock the private key with a passphrase, making it less vulnerable to secret stealing. What would you think about making PKCS#12 the default format? > Allow to configure SSL from PEM keys (without a keystore) > - > > Key: JAMES-3639 > URL: https://issues.apache.org/jira/browse/JAMES-3639 > Project: James Server > Issue Type: Improvement > Components: IMAPServer, JMAP, POP3Server, SMTPServer >Reporter: Benoit Tellier >Assignee: Antoine Duprat >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > This gives the opportunity to inter-operate directly with OpenSSL formats and > avoids some potentially tricky configuration steps (importing the keys in a > keystore). 
> Read related thread on the mailing list: > https://www.mail-archive.com/server-dev@james.apache.org/msg70772.html > How this looks like: > {code:java} > > file://conf/private.nopass.key > file://conf/certs.self-signed.csr > > {code} > Tested manually with self signed certificates: > {code:java} > # Generating your private key > openssl genrsa -des3 -out private.key 2048 > # Creating your certificates > openssl req -new -key private.key -out certs.csr > # Signing the certificate yourself > openssl x509 -req -days 365 -in certs.csr -signkey private.key -out > certs.self-signed.csr > # Removing the password from the private key > # Not necessary if you supply the secret in the configuration > openssl rsa -in private.key -out private.nopass.key > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3569) RecipientRewriteTable sometimes drops attributes from emails
[ https://issues.apache.org/jira/browse/JAMES-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331829#comment-17331829 ] Matthieu Baechler commented on JAMES-3569: -- I don't really know much about the reasons why we "fork" the email and handle both copies very differently. However, expecting RRT to be early in the mail handling process is probably too strong an assumption, and we can't really build our reasoning on it. I understand why the copying bothers you, but feel free to open another ticket if you want that fixed. For now, we are just fixing a bug: one of the emails generated by this mailet is losing its attributes. Whether we should have a config option ("keepAttributes") or not could be discussed, but we definitely need that behavior for what we do. > RecipientRewriteTable sometimes drops attributes from emails > > > Key: JAMES-3569 > URL: https://issues.apache.org/jira/browse/JAMES-3569 > Project: James Server > Issue Type: Bug > Components: SMTPServer >Affects Versions: 3.6.0 >Reporter: Jean Helou >Priority: Major > > When a mail has a recipient with a mapping to a remote email address, > RecipientRewriteTable creates a new mail and copies over a few fields. > Unfortunately it doesn't copy all the fields and in particular it drops the > mail attributes that have been computed by the pipeline up to this point. > For recipients which are rewritten to a local address there is no information > loss.
Re: Remove functionality that saves James metrics into Elasticsearch
On Fri, 2021-03-12 at 21:50 +0700, Tellier Benoit wrote: > Hello Matthieu, > > Le 12/03/2021 à 21:14, Matthieu Baechler a écrit : > > Hi, > > > > My answers below. > > > > On Thu, 2021-03-11 at 15:17 +0200, Juhan Aasaru wrote: > > > Hi! > > > > > > There is currently ongoing work to upgrade James to Elasticsearch > > > 7.x > > > See https://github.com/apache/james-project/pull/328 > > > > > > Current James can be configured to save James metrics to a > > > separate > > > index > > > in Elasticsearch. > > > And then Grafana dashboards can be configured to display the > > > history > > > of > > > these saved metrics. > > > For more detailed documentation refer the configuration parameter > > > "elasticsearch.metrics.reports.enabled" here: > > > https://james.apache.org/server/config-elasticsearch.html > > > > > > This functionality that gathers and publish metrics is provided > > > by an > > > unmaintained library: > > > https://github.com/linagora/elasticsearch-metrics-reporter-java > > > This library is not compatible with Elasticsearch 7.x > > > > It doesn't support ES7, that's true. However, this support could be > > added. > > If that is the way to go we then need to find someone devoting > himself > for this maintenance... I agree, if nobody cares, we'll drop it anyway. > > > > > > > > We are proposing to remove this functionality from James as > > > the industry standard is to use external tools that are purpose > > > built > > > for > > > pulling and storing the metrics. > > > > I don't think it's the industry standard, it's prometheus standard. > > The > > debate never settled as far as I know. > > > > > > > > Users currently relying on this functionality would have to > > > configure > > > some > > > monitoring tool (like Prometheus) to regularly pull and store > > > these > > > metrics > > > if they want to continue displaying history of various James > > > related > > > metrics over time. > > > > I don't like that idea. 
> > > > If we talk about the root of the problem: dropwizard-metrics is > > quite > > rigid and we decided to include ElasticSearch appender in the > > product > > to have it working out of the box > > > > Now, micrometer is doing much better: it works with an SPI much > > like > > slf4j is doing for logging and lets you choose your implementation > > by > > dropping the right jar. > > Reading the documentation... > > ...about Prometheus integration we still end up with some custom > code, > that really looks like the one we had written on top of dropwizard: > https://micrometer.io/docs/registry/prometheus For sure, pull-based means exposing an endpoint and you have to integrate that with your existing routes. > > ...looking at the Elastic integration, we still need to start a > metric > registry https://micrometer.io/docs/registry/elastic > > I fail at seeing the benefits compared to dropwizard metrics > reporters... > > A ctrl+f in https://micrometer.io/docs for the term "SPI" yields no > result, what makes you say "we just need to drop a JAR"? SPI is mentioned here: https://micrometer.io/docs/installing But you are right that they didn't implement the ServiceLoader mechanism that would allow the implementation jar to be picked dynamically. It's a bit surprising to me because they claim to be "the slf4j for metrics" and slf4j does handle the dynamic loading. So in the end you are right: dynamic loading would require some work from us. Yet, as many modern frameworks use this metrics library, at least we would have up-to-date plugins. > > > > I suggest we just port James to micrometer and let people choose > > how > > they want their metrics exposed. > > > > WDYT? > > More flexibility on the metric side would be IMO nice to have. > > I would be happy to see such contributions happen. > > (Can you give a precise link explaining how and why you think > micrometer > is more flexible than dropwizard?)
In fact, projects including micrometer (like spring boot) implement their own dynamic loading. That would make sense to contribute that to micrometer but it's probably too much work for me. Sorry, it probably didn't help much. -- Matthieu Baechler - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
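Editor's sketch: the "dynamic loading" discussed above is what the JDK's `java.util.ServiceLoader` provides. The example below is only an illustration under stated assumptions: `MetricsReporter` and `NoopReporter` are hypothetical names, not micrometer or James types, and nothing here reflects how either project is actually wired.

```java
import java.util.ServiceLoader;

public class MetricsReporterLoader {

    /** Hypothetical SPI; an assumption for illustration, not a micrometer or James type. */
    public interface MetricsReporter {
        String name();
    }

    /** Fallback used when no provider is registered on the classpath. */
    public static final class NoopReporter implements MetricsReporter {
        @Override
        public String name() {
            return "noop";
        }
    }

    /**
     * Picks the first implementation registered through the standard
     * META-INF/services mechanism (a file named after the interface's
     * binary name, listing implementation classes). Falls back to the
     * no-op reporter when no provider jar has been dropped in.
     */
    public static MetricsReporter load() {
        return ServiceLoader.load(MetricsReporter.class)
            .findFirst()
            .orElseGet(NoopReporter::new);
    }

    public static void main(String[] args) {
        System.out.println("Selected reporter: " + load().name());
    }
}
```

With no provider registered, `load()` returns the fallback; adding a jar that declares a provider would change the selection without touching this code, which is the "drop the right jar" behaviour the thread is asking about.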
Re: Remove functionality that saves James metrics into Elasticsearch
Hi, My answers below. On Thu, 2021-03-11 at 15:17 +0200, Juhan Aasaru wrote: > Hi! > > There is currently ongoing work to upgrade James to Elasticsearch 7.x > See https://github.com/apache/james-project/pull/328 > > Current James can be configured to save James metrics to a separate > index > in Elasticsearch. > And then Grafana dashboards can be configured to display the history > of > these saved metrics. > For more detailed documentation refer the configuration parameter > "elasticsearch.metrics.reports.enabled" here: > https://james.apache.org/server/config-elasticsearch.html > > This functionality that gathers and publish metrics is provided by an > unmaintained library: > https://github.com/linagora/elasticsearch-metrics-reporter-java > This library is not compatible with Elasticsearch 7.x It doesn't support ES7, that's true. However, this support could be added. > > We are proposing to remove this functionality from James as > the industry standard is to use external tools that are purpose built > for > pulling and storing the metrics. I don't think it's the industry standard, it's the Prometheus standard. The debate was never settled as far as I know. > > Users currently relying on this functionality would have to configure > some > monitoring tool (like Prometheus) to regularly pull and store these > metrics > if they want to continue displaying history of various James related > metrics over time. I don't like that idea. If we talk about the root of the problem: dropwizard-metrics is quite rigid and we decided to include the ElasticSearch appender in the product to have it working out of the box. Now, micrometer is doing much better: it works with an SPI, much like slf4j does for logging, and lets you choose your implementation by dropping in the right jar. I suggest we just port James to micrometer and let people choose how they want their metrics exposed. WDYT?
-- Matthieu Baechler
[jira] [Commented] (JAMES-3510) Automate release procedure to speed up releases
[ https://issues.apache.org/jira/browse/JAMES-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293823#comment-17293823 ] Matthieu Baechler commented on JAMES-3510: -- Just saying but https://infra.apache.org/release-signing.html states that: > Do not store your private key on any ASF machine. Do not create signatures on > ASF machines. Not sure it still stands > Automate release procedure to speed up releases > --- > > Key: JAMES-3510 > URL: https://issues.apache.org/jira/browse/JAMES-3510 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Could we collect here the steps that could be automated to fasten the process > of creating a new release. > Me (or my colleague Andreas) would be willing to work on some of the > automation tasks. > I propose automating publishing to Maven Central (building artifacts and PGP > signing them) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Time for a 3.6.0 release
Hi Juhan, On Tue, 2021-03-02 at 11:10 +0200, Juhan Aasaru wrote: [...] > I think (sadly) ElasticSearch V7 migration (and ES V6 backport) will not > be ready on time for this release, unless additional efforts are > committed to this issue. Since James guice version currently uses Elasticsearch 6.x that has reached end of life and our system cannot go live with old Elasticsearch then we would be very interested in getting this upgrade into a release (preferably into 3.6.0) and put in work in this. I know my colleague Andreas has made an effort to add Elasticsearch 7 support in James and as I understand it currently the problem is to get all the tests to pass in Elasticsearch 7 related modules. But it is unclear to me what is the plan to continue supporting Elasticsearch 6 in parallel. We probably won't be able to support both versions in the same product at a reasonable cost, so I propose to drop ElasticSearch 6 support entirely. Would it be possible to have a quick recap of the remaining efforts needed. One place for this could be the Jira task: https://issues.apache.org/jira/browse/JAMES-3492 If the work required to finish is not too large I could get Andreas to come back and work on this starting this Friday. Hopefully this way we have a chance to reach the release deadline (or at least have a second release shortly after the current one) Let's define a deadline. That way, either it's ready on time and included, or it's not. If you need some help to make it happen, you may find some people offering consulting: several of us on the mailing list are able to do that. > the last release taking me over 3 months! Benoit, could you please list the main problems why creating a release is time consuming so we could think about how some of this could be automated. For example if PGP signing and publishing artifacts to Maven Central is time consuming then this could be automated to a great extent.
I created a JIRA issue for this automation initiative: https://issues.apache.org/jira/browse/JAMES-3510 Thanks, good idea. Regarding a release I have planned to raise a new topic that we as a community could think about a "long-term-support" release of James. Currently any James release is more like just a point in time marker but probably many of us have a vision that for a release we could create a separate branch and later only merge important security fixes there and then we could release patched versions like 3.6.1, 3.6.2 etc coming out in parallel with new main releases 3.7.0, 3.8.0 etc I'm interested in getting this set up and working on getting the patches identified and released but for this we would need to dramatically shorten the time and effort it takes to create a (minor/major) release. So this is why I would come back to "long-term-support-version" a bit later. If you want to handle that burden, that's awesome. I think nobody would be against having an LTS release for James. Cheers, -- Matthieu Baechler
Re: Time for a 3.6.0 release
Hi Benoit, Hi James devs, Thank you for your email. On Sun, 2021-02-28 at 14:45 +0700, Tellier Benoit wrote: > Hey there! > > It's been almost a year (!) since we shipped some master code out in > a > release (the last release taking me over 3 months!). We definitely need to release more frequently. > As such, I think it would be important, as a project, to make such a > release happen before the next Apache board meeting (mid. April). I > devote myself to perform such a release (let's hope I will be more > efficient this time!). :+1: > I would like to propose the following suggestion for this 3.6.0 release: > > - Claim experimental support for JMAP RFC-8621 and RFC-8887 (websocket > transport) > - As such deprecate JMAP Draft as of 3.6.0, to be removed in 3.7.0. (or > later based on community usage) in order to encourage the migration. > - Deprecate `server/container/guice/cassandra-guice` product. > Rationale: it does not achieve distribution, this James server cannot be > scaled and maintaining a high project cardinality is hard - at least to > me. The more servers, the harder releasing is. :+1: > > I would also love to see the following work integrated in it: > > - https://issues.apache.org/jira/browse/JAMES-3506 SMTP stack is slow > and generates high GC when under high traffic (because it is such a nice > enhancement counter-weighting drawbacks of JAMES-3477) > - https://github.com/apache/james-project/pull/303 JAMES-3499 Use a > module chooser for LDAP users repository > <https://github.com/apache/james-project/pull/303> (as it eases > maintenance efforts) I don't care how many additional things we integrate as long as we define a deadline for cutting the release. Could we agree to cut the release on 2021-03-15? > I think (sadly) ElasticSearch V7 migration (and ES V6 backport) will not > be ready on time for this release, unless additional efforts are > committed to this issue.
It's sad but also it will be a good incentive to release a 3.8 release in a not so distant future. Cheers, -- Matthieu Baechler - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Comment Edited] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271977#comment-17271977 ] Matthieu Baechler edited comment on JAMES-3492 at 2/8/21, 8:17 PM: --- Hi [~juhan], your plan is correct. I would refine it a little: 1. duplicate the ElasticSearch modules (for example mailbox/elasticsearch) and pick the last driver 2. duplicate every test-suite that involves an ElasticSearch container to change ElasticSearch version and james related elasticsearch modules 3. run new testsuites and mark failing tests as @Disabled 4. push upstream ! 5. fix every single @Disabled test 6. implement something to bind the version you want in the main products or in your own product 7. provide benchmark results with ES7 driver It's very likely that once you reach this point 6, every thing will just work. The coverage is quite good on search. The "manual testing" phase is probably a short one. Step 4 will help a lot at getting feedback and maybe help from the community. BTW, it's an exciting! was (Author: matthieub): Hi [~juhan], your plan is correct. I would refine it a little: 1. duplicate the ElasticSearch modules (for example mailbox/elasticsearch) and pick the last driver 2. duplicate every test-suite that involves an ElasticSearch container to change ElasticSearch version and james related elasticsearch modules 3. run new testsuites and mark failing tests as @Disabled 4. push upstream ! 5. fix every single @Disabled test 6. implement something to bind the version you want in the main products or in your own product It's very likely that once you reach this point 6, every thing will just work. The coverage is quite good on search. The "manual testing" phase is probably a short one. Step 4 will help a lot at getting feedback and maybe help from the community. BTW, it's an exciting! 
> Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281067#comment-17281067 ] Matthieu Baechler commented on JAMES-3492: -- > Maybe we could also try, in another JIRA, to test ES6 with the ES7 driver. If > we can make it work quickly, it would allow to choose the ES version at > runtime. The Elasticsearch documentation says explicitly that the drivers don't support previous server versions. We can forget this idea. > Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281046#comment-17281046 ] Matthieu Baechler commented on JAMES-3492: -- I'm not sure it's a good idea to drop es6 support as soon as we have es7 support. I would really prefer having both implementations. I agree with splitting guice modules some more (it's always a good refactoring). Maybe we could also try, in another JIRA, to test ES6 with the ES7 driver. If we can make it work quickly, it would allow to choose the ES version at runtime. > Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278843#comment-17278843 ] Matthieu Baechler commented on JAMES-3492: -- > Do you mean to duplicate the tests in specific elasticsearch modules or every > test that in any way uses Elasticsearch Docker container The tests in specific ElasticSearch modules. > Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276331#comment-17276331 ] Matthieu Baechler commented on JAMES-3492: -- Hi Andreas, I can offer an online pair-programming session if you're interested. > Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3492) Elasticsearch 6->7 upgrade for guice version
[ https://issues.apache.org/jira/browse/JAMES-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271977#comment-17271977 ] Matthieu Baechler commented on JAMES-3492: -- Hi [~juhan], your plan is correct. I would refine it a little: 1. duplicate the ElasticSearch modules (for example mailbox/elasticsearch) and pick the last driver 2. duplicate every test-suite that involves an ElasticSearch container to change ElasticSearch version and james related elasticsearch modules 3. run new testsuites and mark failing tests as @Disabled 4. push upstream ! 5. fix every single @Disabled test 6. implement something to bind the version you want in the main products or in your own product It's very likely that once you reach this point 6, every thing will just work. The coverage is quite good on search. The "manual testing" phase is probably a short one. Step 4 will help a lot at getting feedback and maybe help from the community. BTW, it's an exciting! > Elasticsearch 6->7 upgrade for guice version > > > Key: JAMES-3492 > URL: https://issues.apache.org/jira/browse/JAMES-3492 > Project: James Server > Issue Type: Improvement >Reporter: Juhan Aasaru >Priority: Major > > Guice versions use Elasticsearch 6 that has reached end of life. > We are thinking about starting to work on this issue but first we need to > estimate the effort required. If anyone has any input on this please add a > comment. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Jenkins CI setup
Hi Jean, Thank you so much for leading this awesome fight against the red CI. In the past, such unstable builds have always been linked to resource leaks during tests. At this point, we usually stopped implementing new things for a while and focused on plugging leaks to bring back build stability. I guess it's this time of the year again and you are the one paying the price for now. The first solution is to reuse forked JVMs less to mitigate the effects of leaks (with surefire's `reuseForks` option). If that's not enough, I'll have a look at plugging some leaks myself, if nobody else cares enough. Keep me posted about the outcome of this change, I do care very much about having a CI for James. Cheers, -- Matthieu Baechler On Wed, 2021-01-13 at 12:50 +0100, Jean Helou wrote: > Happy new year fellow jamers ! > > In this thrilling new episode you might learn if 2021 will be the > year the > james project gets a public ci rolling again ! > > CI wars > Episode 49e^55 > The Memory errors strike back > The CI Resistance succeeded in configuring jenkins, fixed some tests, > exposed some bugs and tagged a lot of unstable tests as being > unstable. > After such a striking defeat the empire of bugs reacted in the most > vicious > way ever, it deployed "Direct buffer memory" errors throughout the > galaxy > to find contributors to the CI effort and tear down their hope and > motivation. They found the apache jenkins and it will need help from > all > the CI resistance members to fight them off ! > > On a bit more serious note, > I am at a loss as to how to fix this issue. My last four builds have > failed > because a `java.lang.OutOfMemoryError: Direct buffer memory` caused > the > forked jvm to crash, crashing the surefire plugin and the build with > it. > and that has been a build failure cause for a lot of the 63 builds on > the > apache CI.
until now I updated the pom files of the corresponding > projects > to increase heap to 2G but the last failure occured in a project > where the > heap was already increased. > > Looking at a specific log > https://builds.apache.org/blue/organizations/jenkins/james%2FApacheJames/detail/PR-264/12/pipeline > I get a more classical `java.lang.OutOfMemoryError: Java heap space` > a bit > before (12:14:09.563 vs 12:45:57.988). > The last non error line before the fatal Direct buffer memory error > is > [INFO] Running > org.apache.james.webadmin.integration.rabbitmq.RabbitMQReindexingWith > EventDeadLettersTest > The last non error line before the nonfatal heap memory error is > [INFO] Running > org.apache.james.jmap.memory.cucumber.MemoryDownloadCucumberTest > > I will try to increase surefire's heap for the > memory-jmap-draft-integration-testing project too in case the inital > heap > space OOM triggered the other one. > stackoverflow is not very helpful either > https://stackoverflow.com/search?q=java.lang.OutOfMemoryError%3A+Direct+buffer+memory > or I have not been able to comprehend how the solutions there could > help > > I have gone through the files in /dockerfiles without finding > anything that > looked related to memory configuration of maven itself, If people who > run > the build locally with success or on their own CI could check the > MVN_OPTS > and let me know if they override maven's Xmx itself I would > appreciate it. > > thanks for your help > jean > > > > On Tue, Dec 29, 2020 at 9:10 AM Jean Helou > wrote: > > > Hi Benoit, > > > > As someone operating another CI, I want to play even unstable test > > on > > > every runs. Is there some adaptation needed to do this? 
> > > > > > > Yes you will have to change your CI, > > > mvn -B -e -fae test > > now only runs stable tests, to run unstable tests you need an > > additional > > step > > > mvn -B -e -fae test -Punstable-tests > > > > I believe your CI is also based on jenkins (because of the stress > > test > > jenkinsfile at the root of the project) in which case you could > > configure > > your jenkins to pick up the jenkinsfile and use the same pipeline > > as we use > > on apache CI > > > > cheers, > > jean > > - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: IMAP Fetch behavior on not found messages
On Tue, 2021-01-12 at 12:42 +0100, Jean Helou wrote: > > > When doing a fetch with some > non existing messages Cyrus will do a best effort and return the > existing messages whereas James will return a BAD response. I would preserve Cyrus's behavior as a de facto standard: not honoring this incurs the risk of breaking existing client software which relies on this behavior. If the preference is for a stricter behavior, then BAD is correct here. I would definitely suggest trying the stricter behaviour with an Outlook client to make sure it doesn't break the UX too badly. > And in case > of a fetch on an empty mailbox Cyrus will return a NO response where > James will return a BAD one. > After reading the discussion I feel that NO is more appropriate. - NO feels more like HTTP 404 NOT FOUND - BAD feels more like HTTP 400 BAD REQUEST I fully agree with Jean. -- Matthieu
Re: The case of javax.mail MimeMessage CopyOnWrite optimization
Hi there, Thank you for bringing this topic to the mailing-list. To me safety and correctness are much more important than raw performance so I would like the always-copy implementation to replace the COW version. However, keep in mind that the JMH benchmark figures do not tell the full story about the consequences of this change and be ready to experience slower real-world performance. Cheers, -- Matthieu Baechler On Tue, 2020-12-29 at 12:54 +0700, Tellier Benoit wrote: > Hello there! > > We had been discussing on GitHub recently about an optimization in > james > core around the usage of MimeMessage. > > Javax.mail MimeMessage is currently used to represent a message of an > email as part of the mail processing in James. It is part of the Mail > interface (mailet-api). > > As Mail envelope is composed of several recipients, mail related > operations are performed once for all these recipients (we enqueue > the > mail one time, we strip bcc one time etc...). Troubles arise when we > need different behaviors as part of mail processing across recipients > (think remote recipients, that need their mail to be relayed, versus > local recipients that need to be locally delivered). The email gets > duplicated (in MatcherSplitter) and the processing will then be > distinct > for both entities. The underlying MimeMessage may - or may not be > modified. > > In order to prevent MimeMessage duplication in the event the > underlying > MimeMessage is not modified, a Copy On Write mechanism was introduced > (I > guess... Sorry, I was not there yet). > > Upon his CI effort, Jean Helou with the help of Matthieu Baechler > made > the unpleasant finding that this was not thread safe, which was leading > to > build instabilities. The mailet processing happens in Camel, which is > multi-threaded. Concurrency issues arose between modifications, and > message disposal, when the same MimeMessage instance was shared.
[1] > > A first effort was to try to achieve thread-safety, which led to a > brittle double reentrant read-write lock in order to govern data > access. > However, another performance enhancement bypassed this lock > mechanism > (MimeMessageWrapper allows accessing the data as an InputStream > instead > of requiring to copy it). The effort seemed overwhelming, not to > mention > possible risks of dead-locks. [2] > > We then came up with an always copy implementation [3]. Simpler, > safer... The underlying logic is to avoid trying to be smarter than > mutability, and leverage immutability to achieve thread safety, which > is > a classic functional programming idiom. > > JMH benchmarks were conducted. We highlighted little performance > difference for small messages, in the percent realm for both memory > allocation and compute time. Differences are however higher for > bigger > messages (~10%) for both metrics. > > Please note that above 100KB the MimeMessage would be stored on disk, > thus limiting memory impact (see MimeMessageInputStreamSource). Maybe > we > should make the threshold configurable, via a system property for > instance? > > I just want to further mention that I encountered that very issue on a > production instance: the underlying email had been corrupted by the > above mentioned COW bug and kept throwing NullPointerExceptions every > time the content was accessed. This resulted (on top of > distributed-james) in a RabbitMQ nack of the message, that ended up > in a > dead-letter queue. Replaying its processing required admin > intervention > and had been interpreted by the user as an email loss... > > To conclude this effort we (Jean and I) would like to merge the > "Always > copy" pull request. > > Also, would it be beneficial to write an ADR about this topic? > > Thoughts?
> > Cheers, > > Benoit > > [1] The infamous COW bug: > https://github.com/apache/james-project/pull/280/commits/09b5554bbcbbb98757910d59bac54f97ee1f8b4f > > [2] The double nested reentrant read-write lock fix attempt: > https://github.com/apache/james-project/pull/280 > [3] The always copy fix: > https://github.com/apache/james-project/pull/282 > [4] Benchmarks: > https://github.com/apache/james-project/pull/280#issuecomment-745211736 > & > https://github.com/apache/james-project/pull/280#issuecomment-745701937 > [5] The JIRA ticket: https://issues.apache.org/jira/browse/JAMES-3477
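Editor's sketch: the "always copy" idea in this thread can be illustrated in a few lines. This is not the code from the pull requests above and does not use javax.mail; `Message` is a hypothetical stand-in for a mutable message, and `split` only mimics the idea of giving each processing branch its own instance.

```java
import java.util.ArrayList;
import java.util.List;

public class AlwaysCopySketch {

    /** Hypothetical mutable message; an assumption, not javax.mail.MimeMessage. */
    public static final class Message {
        private final List<String> headers;

        public Message(List<String> headers) {
            // Defensive copy: the new instance never shares its list with the caller.
            this.headers = new ArrayList<>(headers);
        }

        public Message copy() {
            return new Message(headers);
        }

        public void addHeader(String header) {
            headers.add(header);
        }

        public List<String> headers() {
            return List.copyOf(headers);
        }
    }

    /**
     * Always-copy split: every branch receives an independent instance, so a
     * modification (or disposal) in one branch cannot affect another branch.
     */
    public static List<Message> split(Message original, int branches) {
        List<Message> result = new ArrayList<>();
        for (int i = 0; i < branches; i++) {
            result.add(original.copy());
        }
        return result;
    }

    public static void main(String[] args) {
        Message original = new Message(List.of("Subject: hello"));
        List<Message> parts = split(original, 2);
        parts.get(0).addHeader("X-Relayed: true");
        System.out.println("branch 0: " + parts.get(0).headers());
        System.out.println("branch 1: " + parts.get(1).headers());
    }
}
```

The cost is one copy per branch even when nothing is modified, which is the trade-off the benchmarks in the thread try to quantify; the benefit is that no locking is needed because no instance is shared between branches.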
Re: James requires administrative rights on RabbitMQ (!!!)
Hi, On Fri, 2020-12-11 at 12:42 +0700, Tellier Benoit wrote: > Hello James DEVs !!! > > I want to start a discussion around > https://issues.apache.org/jira/browse/JAMES-3475 > > Our issue is that James so far require administrative rights on > RabbitMQ > server. > > This of course means that sharing this RabbitMQ with other apps / > James > servers of other tenant represent a data isolation / security issue, > that we leverage by giving James his own dedicated RabbitMQ server, > which don't help mutualizing costs. > > Thus, I would like to leverage Cassandra to keep track of created > queues. > > This is a task that could be quickly tackled by Quan our intern, who > wants to learn about NoSQL. This could be a very good sandbox issue > for him. > > Feedback? What about using RabbitMQ virtualhost feature instead? https://www.rabbitmq.com/vhosts.html Cheers, -- Matthieu Baechler - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: About our usage of LWT in Cassandra related code
On Sun, 2020-12-06 at 22:33 +0100, Jean Helou wrote: > Hello, > > I'm currently trying to increase overall efficiency of the > Distributed > > James server. > > > > I have some concerns but i feel imposterish for posting them as they > most > likely come from my own lack of knowledge, i'll still try just in > case some > of points are valid :) > > - `users` we rely on LWT for throwing "AlreadyExist" exceptions. LWT > > are likely unnecessary as the webadmin > > presentation layer is offering an idempotent API (and silences the > > AlreadyExist exceptions). Only the CLI > > (soon to be deprecated for Guice products) makes this distinction. > > > So from a user perspective adding a user would always succeed. But > would it > succeed by doing nothing (the current behaviour in silencing the > AlreadyExist exception) or would it succeed by effectively > overwriting the > user (in a last write wins manner) ? This is a completely different > behaviour which is not necessarily desirable. > this can be further divided into 2 different cases : > - there are concurrent attempts to create the same user (in which > case the > user data is very likely the same or very close, and has possibly > never > been exposed to a human) in which case the LWW behaviour may be > acceptable > - A user has existed for a long time (definition of long to be > defined but > I would say above a few seconds :) ) in which case overwriting is most > likely not acceptable > Fully agree: being idempotent for a command is not the same thing as "having unpredictable things happening without complaining". I almost never want to overwrite a user without explicitly asking for it: as a user, creating a resource is not the same intention as modifying it. > > > - `domains` we rely on LWT for throwing "AlreadyExist" exceptions. > > LWT > > are likely unnecessary as the webadmin > > presentation layer is offering an idempotent API (and silences the > > AlreadyExist exceptions).
> > Only the CLI
> > (soon to be deprecated for Guice products) makes this distinction.
> > Discussions have started on the topic and a proof of
> > concept is available.
>
> same as above
>
> Why it would be ok to drop LWT for ACL updates only to replace it by
> eventsourcing when you write:
> > LWT are required for `eventSourcing`. As event sourcing usage is limited
> > to low-usage use cases, the performance degradations are not an issue.
> Doesn't that mean that ACLs would still rely on LWT but within an
> additional layer ?

Yes, that's the proposed solution, AFAIU.

> Also for ACLs isn't eventual consistency acceptable ? using transactions to
> avoid non serial writes but accepting stale reads ?

I would say ACL changes could take effect in an eventually consistent way.

> That's the limit of my understanding : all the flags/UID/IMAP concerns are
> beyond my current knowledge but I'll enjoy reading the comments :)

Cheers,

--
Matthieu Baechler

- To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
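To make the distinction discussed in this thread concrete, here is a minimal sketch of the two behaviours hiding behind "adding a user always succeeds". It is an in-memory model only: `UserStoreSketch` and its method names are hypothetical, it is not James code, and the map merely stands in for a table where a conditional write (in CQL, `INSERT ... IF NOT EXISTS`) would be replaced by a plain insert.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model; not James code and not a Cassandra client.
class UserStoreSketch {
    private final Map<String, String> passwordHashByUser = new ConcurrentHashMap<>();

    // Models a conditional write (CQL: INSERT ... IF NOT EXISTS):
    // returns true only when the user was actually created.
    boolean createIfAbsent(String user, String passwordHash) {
        return passwordHashByUser.putIfAbsent(user, passwordHash) == null;
    }

    // Models an unconditional write: the last writer silently wins.
    void upsert(String user, String passwordHash) {
        passwordHashByUser.put(user, passwordHash);
    }

    String passwordHashOf(String user) {
        return passwordHashByUser.get(user);
    }

    public static void main(String[] args) {
        UserStoreSketch store = new UserStoreSketch();
        store.createIfAbsent("bob@example.org", "hash-1");

        // Idempotent creation: the second attempt is a no-op, existing data is kept.
        boolean created = store.createIfAbsent("bob@example.org", "hash-2");
        System.out.println(created + " " + store.passwordHashOf("bob@example.org"));

        // Unconditional write: it also "succeeds", but the previous value is gone.
        store.upsert("bob@example.org", "hash-3");
        System.out.println(store.passwordHashOf("bob@example.org"));
    }
}
```

Both calls can be hidden behind an API that never reports "AlreadyExist", but only the first one keeps a long-lived user intact, which is the case Jean singles out as most likely not acceptable to overwrite.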
Re: About our usage of LWT in Cassandra related code
On Tue, 2020-12-08 at 10:12 +0700, Tellier Benoit wrote:
> Hello Matthieu,
>
> Sadly, I'm unable to see what you did write in the email you sent due to
> the absence of quote.
>
> Can you review your email client settings, in order to get a readable
> output we can start discussing on?
>
> This time, I made the effort, but I would greatly appreciate a better
> display.

I don't know what happened: I have used the same mailer for years and never had this issue before. This morning, replying to the original mail with the same mailer and the same settings quotes things correctly. I guess it's a bug.

> Best regards,
>
> Benoit
>
> On 07/12/2020 at 14:47, Matthieu Baechler wrote:
> > Hi Benoit,
> >
> > On Fri, 2020-12-04 at 14:22 +0700, btell...@linagora.com (OpenPaaS) wrote:
> > Hi,
> >
> > I'm currently trying to increase overall efficiency of the Distributed
> > James server.
> >
> > As such, I'm pocking around for improvement areas and found a huge topic
> > around LWT.
> >
> > My conclusions so far are that we should keep LWT and SERIAL consistency
> > level out of the most common use cases.
> >
> > I know that this is a massive change in regard of the way the project
> > had been working with Cassandra in the past few years. I would
> > definitely, in the middle term, would like to reach LWT free reads on
> > the Cassandra Mailbox to scale the deployments I am responsible of as
> > part of my Linagora job (my long term goal being to decrease the total
> > cost of ownership of a "Distributed James" based solution). While I am
> > not opposed to diverge from the Apache James project on this point, if
> > needed, I do believe an efficient distributed server (with the
> > consequences it implies in term of eventual consistency) might be a
> > strong asset for the Apache project as well, and would prefer to see
> > this work lending on the James project.
> > > > I've been ambitious on the ADR writing, especially in the > > complementary > > work section. Let's see which consensual ground we find on that! > > (the > > ML > > version here below serving as a public, immutable reference of my > > thinking!) > > > > > > I doubt we can model IMAP without serializability somewhere but > > let's > > read your proposal as I have LWT as much as you are. > > s/have/hate/ ? Yes, typo > > > > > > > --- > > > > ## Context > > > > As any kind of server James needs to provide some level of > > consistencies. > > > > Strong consistency can be achieved with Cassandra by relying on > > LightWeight transactions. This enables > > optimistic transactions on a single partition key. > > > > Under the hood, Cassandra relies on the PAXOS algorithm to achieve > > consensus across replica allowing us > > to achieve linearizable consistency at the entry level. To do so, > > Cassandra tracks consensus in a system.paxos > > table. This `system.paxos` table needs to be checked upon reads as > > well > > in order to ensure the latest state of the ongoing > > consensus is known. This can be achieved by using the SERIAL > > consistency > > level. > > > > Experiments on a distributed James cluster (4 James nodes, having 4 > > CPU > > and 8 GB of RAM each, and a 3 node Cassandra > > cluster of 32 GB of RAM, 8 CPUs, and SSD disks) demonstrated that > > the > > system.paxos table was by far the most read > > and compacted table (ratio 5). > > The table triggering the most reads to the `system.paxos` table was > > the > > `acl` table. Deactivating LWT on this table alone > > (lightweight transactions & SERIAL consistency level) enabled an > > instant > > 80% throughput, latencies reductions > > as well as softer degradations when load breaking point is > > exceeded. > > > > > > Do you mean that Cassandra is the bottleneck in this setup? > > What is the effect of having more Cassandra nodes? > > Yes, it is. 
> > The effect of adding more Cassandra nodes means more costs.

You didn't answer the question I asked, did you?

> Our ownership cost is so far of 5€/user/year which is around 25 time
> more than our competitors. The goal is to lower such costs, in order to
> have a viable commercial solution built on top of James.

Do yo
Re: About our usage of LWT in Cassandra related code
> and a proof of concept is available.
>
> - `mailboxes` relies on LWT to enforce name unicity. We hit the same pitfalls than for ACLs as this is a very often read table (however mailboxes of a given user being grouped together, primary key read are more limited hence this is less critical). Similar results could be expected. Discussions on this topic have not been started yet. Further impact studies on performance needs to be conducted.

Well, lagging on ACLs is not really a problem, but for mailboxes, don't you fear race conditions and thus name collisions?

> - `messages` as flags update is so far transactional. However, by better relying on the table structure used to store flags we could be relying on Cassandra to solve data race issues for us. Note also that IMAP CONDSTORE extension is not implemented, and might be a non-viable option performance-wise. We might choose to favor performance other transactionality on this topic. Discussions on this topic have not started yet.

I think that modern IMAP extensions are important for the user experience: they can make email handling faster by themselves. I would not make a choice that prevents the implementation of such extensions in the future.

> LWT are required for `eventSourcing`. As event sourcing usage is limited to low-usage use cases, the performance degradations are not an issue.

I think I understand, but I'll ask anyway: the performance gain does not really come from the removal of LWT but from the CQRS nature of Event Sourcing, since you read from a view that doesn't use LWT. Can't you achieve the same with a "simpler" CQRS architecture, without using Event Sourcing?

> LWT usage is required to generate `UIDs`. As append message operations tend to be limited compared to message update operations, this is likely less critical. UID generation could be handled via alternative systems, past implementations have been conducted on ZooKeeper. If not implementing IMAP CONDSTORE, generation of IMAP `MODSEQ` likely no longer makes sense.
> As such the fate of `MODSEQ` is linked to decisions on the `message` topic.

Oh, here we are: we need yet another system. Note that I'm in favor of it, but that's the reason why we use LWT in the first place: to avoid this additional dependency. It's either LWT or some other transactional system, as we can't find a way to work around the need for a monotonic distributed counter (for example).

You listed several problems and, in my opinion, each one may have a different solution. What about debating each one separately? Could we start from here: what's the best solution to implement a monotonic distributed counter?

Cheers,

--
Matthieu Baechler
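As a starting point for that question, here is a minimal sketch of the compare-and-set retry loop that any such counter boils down to. It rests on loud assumptions: `UidAllocatorSketch` is a hypothetical name, and the `AtomicLong` is only a single-process stand-in for the shared row that a conditional write (an LWT such as `UPDATE ... IF lastUid = ?`, or a ZooKeeper versioned write) would protect. It shows the shape of the algorithm, not a distributed implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: the AtomicLong stands in for a shared, conditionally updated row.
class UidAllocatorSketch {
    private final AtomicLong lastUid = new AtomicLong(0);

    long nextUid() {
        while (true) {
            long current = lastUid.get();
            long candidate = current + 1;
            // Models the conditional write: it only applies if nobody moved the counter.
            if (lastUid.compareAndSet(current, candidate)) {
                return candidate;
            }
            // Another writer won the race: re-read and retry.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        UidAllocatorSketch allocator = new UidAllocatorSketch();
        Set<Long> allocated = ConcurrentHashMap.newKeySet();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    allocated.add(allocator.nextUid());
                }
            });
            writers[i].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        // Every UID was handed out exactly once: the set holds as many entries as allocations.
        System.out.println(allocated.size());
    }
}
```

The cost discussed in this thread sits in the conditional write itself: whatever system provides it has to reach consensus on each allocation, which is why the choice is between LWT and another transactional dependency rather than between LWT and nothing.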
Re: Jenkins CI setup
Hi,

On Fri, 2020-12-04 at 14:59 +0700, btell...@linagora.com (OpenPaaS) wrote:

[...]

> > Here is what I would like to do at this stage :
> > - Isolate the unstable tests under with an unstable tag (akin to "feature
> > tags")
> I'd advocate a @Disabled tag, referencing both a JIRA ticket specific to
> the bugfix needed, and the JIRA of the CI build.
>
> Having a list of such issues in the JIRA (CI setup) ticket would be
> valuable. I'd even advise doing subtickets to have a nice checklist.

Let's say there are 10 unstable tests that prevent the CI PR from being green: do you expect Jean to open 10 tickets with an explanation of each problem? That would be a very high expectation.

> > - exclude these tests from the default surefire execution profile,
> > - add a parallel pipeline step for these tests where the step failure
> > doesn't fail the pipeline [2]
> > - ensure that the build is green
> > - merge so the project finally has a working public CI
> >
> > I intend to start working on this quickly so we can all enjoy a functional
> > public CI.
> +1

I agree on the approach. I think we can even skip the "add a parallel pipeline step" part entirely. The simpler the better.

[...]

Cheers,

--
Matthieu Baechler
Re: JMS & File mail queue still rely on serializable.
Hi Benoit,

On Thu, 2020-12-03 at 11:54 +0700, Benoit Tellier wrote:
> While working on https://issues.apache.org/jira/browse/JAMES-3431 I
> discovered that JMS & File mailqueue do still rely on serialization.
>
> This is what motivates to re-open this ticket:
> https://issues.apache.org/jira/browse/JAMES-2578
>
> Please kindly note that all of the MailRepository implementation no
> longer uses Java serialization.
>
> Our AttributeValue adoption is partial; I would like to finish the job.
>
> Here are the options we have:
>
> - Accept DSN feature do not work on tese implementations (not my
> prefered at all!)
> - Re-implement DSNParameters attribute mapping to not use collection
> attributeValues. This work around the main issue for this specific use
> case of attribute values. (I feel okay with that)
> - Try to fix collection attributeValue java serialization is likely
> hard to do, but also keeps java serialization around for longer in the
> code base. Likely a dead-end.
> - No longer rely on Java serialization for "JMS" & "File" mail queues.
> This means either smart fallback code, or at worst an upgrade path with
> an empty mail queue. That is by far my preferred option, and I will
> start community discussions in that direction.
>
> Do we got consensus around this topic?

This Java serialization compatibility has been crippling our development for too long. I'm in favor of complete removal and a migration strategy.

Cheers,

--
Matthieu
[jira] [Commented] (JAMES-3400) Develope new James CLI based on WebAdmin API
[ https://issues.apache.org/jira/browse/JAMES-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214860#comment-17214860 ] Matthieu Baechler commented on JAMES-3400: -- Forget my comment, I read the doc and it requires graalvm download. Let’s drop that requirement > Develope new James CLI based on WebAdmin API > > > Key: JAMES-3400 > URL: https://issues.apache.org/jira/browse/JAMES-3400 > Project: James Server > Issue Type: Improvement > Components: CLI >Reporter: Tran Hong Quan >Priority: Major > > Webadmin command-line interface is an upcoming replacement for the outdated, > security-vulnerable JMX command-line interface. It also aims at providing a > more modern and intuitive interface. > For now, objective for the new CLI is interact with Domains, Users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-3400) Develope new James CLI based on WebAdmin API
[ https://issues.apache.org/jira/browse/JAMES-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214853#comment-17214853 ]

Matthieu Baechler commented on JAMES-3400:
--

Did you try https://www.graalvm.org/reference-manual/native-image/NativeImageMavenPlugin/ ?
[jira] [Commented] (JAMES-3406) Documentation page - distributed James consistency model
[ https://issues.apache.org/jira/browse/JAMES-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212881#comment-17212881 ]

Matthieu Baechler commented on JAMES-3406:
--

> To be honest, documentation should describe what we have, not what we could
> possibly have.

I completely agree.

> As such, I think it's premature to advertise a Multi DC Cassandra based IMAP
> server.

That's not my intent either. My point is that describing where the limitation lies helps people understand why we don't have X or Y yet.

> Documentation page - distributed James consistency model
>
> Key: JAMES-3406
> URL: https://issues.apache.org/jira/browse/JAMES-3406
> Project: James Server
> Issue Type: Improvement
> Components: cassandra, Documentation, elasticsearch, guice, rabbitmq
> Reporter: Benoit Tellier
> Priority: Major
> Fix For: 3.6.0
>
> Document, in a dedicated section of the new documentation website the
> consistency model
> (`/docs/modules/servers/pages/distributed/architecture/consistency-model.md`)
> - Data Replication
> - Words about Cassandra consistency model
> - Words about ElasticSearch consistency model
> - Discourage General usage Cassandra MultiDC set-up (because of
> Lightweight Transaction)
> - De-normalization
> - Which data is denormalized ?
> - What can go wrong (denormalization inconsistencies) ?
> - `Solve Inconsistency tasks`
> - Applicative read repairs
> - Consistency across data stores
> - Write to object storage first, then position Cassandra meta-data
> - Cassandra <=> ElasticSearch: point to the EventBus (async, retries,
> dead-letter) + reIndex
> - Recovering RabbitMQ mailQueue from the Cassandra projection
> Don't forget to point/reuse existing ADRs !
[jira] [Commented] (JAMES-3427) definitely delete mailbox-store module
[ https://issues.apache.org/jira/browse/JAMES-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212875#comment-17212875 ]

Matthieu Baechler commented on JAMES-3427:
--

> I believe a lot of efforts will have to go on writing tests for the manager
> logic - as a common behaviour is enforced via inheritance. Removing this
> inheritance will lead to diverging behaviors unless we test it.

With the usual rule "don't change code that is not covered by tests", it has no reason to diverge. By the way, given the number of methods in MailboxManager, for instance, we'd better remove a lot of them before expecting good coverage.

> Did you evaluate the parts of the code that today relies directly on the
> mappers? I'm thinking to all these mailbox modules needing low level access
> like reindexing, quotaroot, indexing, many mailbox listener. What's your plan
> regarding them?

No. I know that some plans were drawn up regarding some of these topics in the past. I don't have anything to propose regarding this yet. Let's see what comes out of this initiative.

By the way, thank you for taking the time to comment.
[jira] [Created] (JAMES-3427) definitely delete mailbox-store module
Matthieu Baechler created JAMES-3427:
--

Summary: definitely delete mailbox-store module
Key: JAMES-3427
URL: https://issues.apache.org/jira/browse/JAMES-3427
Project: James Server
Issue Type: Improvement
Components: mailbox
Reporter: Matthieu Baechler

The mailbox-store module aims at sharing code between mailbox implementations.
To achieve that, it relies on inheritance and a level of abstraction, namely Mappers.
This design promotes code sharing as a way to share behaviors between Manager implementations.
This strategy is brittle because there's no way to ensure that inheriting a class will define its behavior.
In James, we usually define Contract TestSuites to ensure that the implementation of an interface behaves as expected; Managers are an exception to this rule.
Also, the Mapper layer offers very weak guarantees and at the same time constrains the way we implement Managers.
Dropping the Mapper layer would help simplify the codebase.
So the plan is to push methods down into all implementations, remove abstract classes, inline Mappers in Managers and port the relevant Mapper tests to the Manager level.
Some efforts to reduce the duplication generated by pushing methods down the hierarchy will lead to new helper classes shared between classes, or to enriching existing APIs for a higher level of abstraction.
[jira] [Commented] (JAMES-3406) Documentation page - distributed James consistency model
[ https://issues.apache.org/jira/browse/JAMES-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212363#comment-17212363 ]

Matthieu Baechler commented on JAMES-3406:
--

>> Could you explain why multi-dc + LOCAL_QUORUM + LOCAL_SERIAL is a bad idea?
> Because the transaction (and strong consistency) only happens within a DC
> with such a solution, as far as I understand.
> Maybe I need to add more details here?

Yes, I have the same analysis. However, it's probably the only scalable setup. If we want to have a scalable mail server, we should dig into that solution.

>> Actually I think it's because we really want UID generation to be monotonic
>> to avoid re-assigning a mail uid but writing it in the doc could help to
>> workaround that particular topic, with for example a very different uid
>> generation implementation that would rely on in-memory datastructure.
> Got the point. However change to Uid Validity means full resynch.

I don't get the link between changing the implementation of UID generation and invalidating some UIDVALIDITY. "In-memory" in my sentence was not about "transient information" but rather about "fast to synchronize vs Cassandra LWT". Yes, I'm being nostalgic about the ZooKeeper UID generator.

> From what I saw it results to a Fetch of the MessageId headers.
> While doable in theory, in practice it would likely result in an inefficient
> solution.
> Maybe a shortcut here is stating that IMAP don't scale multi-DC but the JMAP
> can (as it is a more modern protocol) without major data loss (other than
> flags).
> What do you think?

I think that we could first ensure that we are really scalable without IMAP, then tackle the IMAP scalability issue.
[jira] [Commented] (JAMES-3406) Documentation page - distributed James consistency model
[ https://issues.apache.org/jira/browse/JAMES-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212283#comment-17212283 ]

Matthieu Baechler commented on JAMES-3406:
--

I'm late to this but I'm struck by

> Running the Distributed Server in a multi datacenter setup will likely result
> either in data loss, or very slow operations.

Isn't it the goal of a distributed server to work with redundant DCs?

Could you explain why multi-dc + LOCAL_QUORUM + LOCAL_SERIAL is a bad idea?

Actually, I think it's because we really want UID generation to be monotonic, to avoid re-assigning a mail UID. Writing that in the doc could help to work around that particular topic, for example with a very different UID generation implementation that would rely on an in-memory data structure.
[jira] [Created] (JAMES-3425) ElasticSearchIntegrationTest is failing
Matthieu Baechler created JAMES-3425:
--

Summary: ElasticSearchIntegrationTest is failing
Key: JAMES-3425
URL: https://issues.apache.org/jira/browse/JAMES-3425
Project: James Server
Issue Type: Bug
Components: elasticsearch
Reporter: Matthieu Baechler

I pulled today's master branch and found that `ElasticSearchIntegrationTest` is failing on `internalDateAfterShouldReturnMessagesAfterAGivenDate`.
[jira] [Commented] (JAMES-3405) Expose metrics over HTTP
[ https://issues.apache.org/jira/browse/JAMES-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210125#comment-17210125 ]

Matthieu Baechler commented on JAMES-3405:
--

To be honest, the fact that I don't find this relevant doesn't mean that we, as a community, should not implement it: you don't need my approval.

Go ahead with your plan; after all, code changes can be undone if needed.

Please explain in the DoD what access control you expect for this endpoint.

> Expose metrics over HTTP
>
> Key: JAMES-3405
> URL: https://issues.apache.org/jira/browse/JAMES-3405
> Project: James Server
> Issue Type: Improvement
> Components: guice, Metrics, webadmin
> Reporter: Benoit Tellier
> Priority: Major
> Fix For: 3.6.0
>
> = Why ?
> I want to export James metrics to prometheus.
> This modern metric stack relies on pulling (prometheus collects the metrics)
> instead of pushing.
> This need had been expressed by [~ieugen] here:
> https://github.com/ieugen/james-self-hosting-sandbox/issues/20
> = How ?
> With the https://github.com/prometheus/client_java dependency expose that as
> part of a `webadmin-metrics` Unauthentictated endpoint.
> Adapt content at
> https://github.com/prometheus/client_java/blob/master/simpleclient_servlet/src/main/java/io/prometheus/client/exporter/MetricsServlet.java
> Be aware that there is some conversion format to be done here:
> {code:java}
> CollectorRegistry collectorRegistry = CollectorRegistry.defaultRegistry;
> new DropwizardExports(environment.metrics()).register(collectorRegistry);
> new MetricsServlet(collectorRegistry)
> {code}
> Source
> https://stackoverflow.com/questions/53408121/which-metrics-are-exported-by-dropwizardmetrics-prometheus-client
> = Definition of done
> {code:java}
> As an admin,
> I have an HTTP endpoint
> Exposing metrics with the prometheus format
> {code}
> Scope: all guice products
[jira] [Commented] (JAMES-3405) Expose metrics over HTTP
[ https://issues.apache.org/jira/browse/JAMES-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210043#comment-17210043 ]

Matthieu Baechler commented on JAMES-3405:
--

> I don't see how it would hurt: people not willing to use it will simply not
> call the endpoint.

I think this argument is weak: when we build things, we try to provide a service to users with the minimal amount of effort (code, maintenance, etc.). When something looks like a bad idea, we usually don't implement it; we don't put it in the code and let users figure out whether to use it or not.

Every single endpoint means more work regarding maintenance, security, DoS attacks, etc.

By the way, the definition of done of this issue doesn't explain what is expected regarding security.
[jira] [Commented] (JAMES-3405) Expose metrics over HTTP
[ https://issues.apache.org/jira/browse/JAMES-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209889#comment-17209889 ]

Matthieu Baechler commented on JAMES-3405:
--

Before implementing a pull-based metrics thing, just be aware that Prometheus is basically the only monitoring implementation doing that, and that even at Google (whose Borgmon inspired Prometheus in the first place) they dropped that concept entirely.
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208072#comment-17208072 ]

Matthieu Baechler commented on JAMES-3403:
--

I have a mitigation patch here: https://github.com/linagora/james-project/pull/3865

It will very likely be too slow to adopt. I didn't find any persistent data structure that matches this need, and by the way it didn't prevent locking on write.

I think that we need to tackle this problem another way: we should keep the internals of a Session mutable but rely on an external mechanism to avoid concurrent access to it (an actor-like pattern could do).

> IMAP Fetch: UidMSNConverter :: get MSN is slow
> --
>
> Key: JAMES-3403
> URL: https://issues.apache.org/jira/browse/JAMES-3403
> Project: James Server
> Issue Type: Bug
> Components: IMAPServer
> Affects Versions: master
> Reporter: Benoit Tellier
> Priority: Major
> Fix For: master, 3.6.0
>
>
> # What?
> Following openpaas-1.7.15-rc2 deployment, and the use of VAVR immutable
> data structures in UidMsnConverter, I notice some offending slow FETCH
> requests.
> A closer look shows Cassandra is not to blame here.
> Flame graphs corroborate the analysis.
> On our prod setup, 40k messages are enough to trigger endless FETCHs.
> # Why ?
> Fetch is calling getMSN(uid) one by one.
> Previous code was "getMSN optimized" as it uses an array as the underlying
> data structure. O(1) upon reads * n messages.
> The latter code uses a red-black tree as an underlying data structure. I'm
> unsure about the complexity here (because I am not a VAVR expert) but it is at
> least O(log n) * n messages.
> # Short term reaction
> Revert unneeded changes added in UidMsnConverter as part of work on JAMES-3177.
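The actor-like idea mentioned in this comment can be sketched as follows. The assumption is that all access to the mutable state is funnelled through a single-threaded executor, so the state itself needs no locks. `SessionActorSketch` and its methods are hypothetical and do not reflect the actual James session classes; lookup cost is deliberately ignored here, since the point is only thread confinement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical class: mutable state confined to one thread, actor-style.
final class SessionActorSketch implements AutoCloseable {
    // The single thread plays the role of the actor's mailbox: tasks run one at a time, in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    // Plain mutable list, only ever touched from the mailbox thread.
    private final List<Long> uids = new ArrayList<>();

    // Appends the uid unless it is already known.
    CompletableFuture<Void> addUid(long uid) {
        return CompletableFuture.runAsync(() -> {
            if (!uids.contains(uid)) {
                uids.add(uid);
            }
        }, mailbox);
    }

    // Returns the 1-based position of the uid, or -1 when it is unknown.
    CompletableFuture<Integer> getMsn(long uid) {
        return CompletableFuture.supplyAsync(() -> {
            int pos = uids.indexOf(uid);
            return pos >= 0 ? pos + 1 : -1;
        }, mailbox);
    }

    @Override
    public void close() {
        mailbox.shutdown();
    }
}
```

Callers never see the list; they only get futures, which is what removes the need for locking on write.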
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207926#comment-17207926 ]

Matthieu Baechler commented on JAMES-3403:
--

Well, to detail the problem: we need a Set (to ensure there are no duplicates) that also provides an "indexOf" method. TreeSet doesn't provide that, and Vector (the closest collection to an array) doesn't handle duplicate removal.
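The requirement stated in this comment (set semantics plus an "indexOf") can be met by a sorted primitive array with binary search. The sketch below is mutable and makes no attempt at thread safety; `SortedUidSet` is a hypothetical name and this is not the structure James actually uses.

```java
import java.util.Arrays;

// Hypothetical structure, not the James UidMsnConverter.
final class SortedUidSet {
    private long[] uids = new long[0]; // sorted ascending, no duplicates

    // Returns the zero-based position of uid, or -1 when absent. O(log n).
    int indexOf(long uid) {
        int pos = Arrays.binarySearch(uids, uid);
        return pos >= 0 ? pos : -1;
    }

    // Inserts uid at its sorted position; returns false if it was already present.
    // The copy makes insertion O(n), which is the price paid for cheap reads.
    boolean add(long uid) {
        int pos = Arrays.binarySearch(uids, uid);
        if (pos >= 0) {
            return false;
        }
        int insertAt = -pos - 1; // binarySearch encodes the insertion point as -(point) - 1
        long[] grown = new long[uids.length + 1];
        System.arraycopy(uids, 0, grown, 0, insertAt);
        grown[insertAt] = uid;
        System.arraycopy(uids, insertAt, grown, insertAt + 1, uids.length - insertAt);
        uids = grown;
        return true;
    }

    int size() {
        return uids.length;
    }
}
```

A message sequence number would then be `indexOf(uid) + 1` whenever the uid is present.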
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207922#comment-17207922 ]

Matthieu Baechler commented on JAMES-3403:
--

I tried to fix that, but vavr doesn't seem to provide the right collections for this requirement.

Let's merge the revert for now.

Thank you for answering.
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207909#comment-17207909 ]

Matthieu Baechler commented on JAMES-3403:
--

Well, the problem comes from

{code}
public NullableMessageSequenceNumber getMsn(MessageUid uid) {
    return uids
        .zipWithIndex()
        .toMap(Function.identity())
        .get(uid)
        .map(position -> NullableMessageSequenceNumber.of(position + 1))
        .getOrElse(NullableMessageSequenceNumber.noMessage());
}
{code}

It should not be hard to write it in a more efficient way.
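The method quoted in this comment zips and rebuilds a map from `uids` on every call, so each lookup costs at least time linear in the number of messages, and a FETCH over n messages pays that n times. One possible rewrite, sketched under the assumption that the uid list changes far less often than it is read, builds the index once and reuses it. `MsnIndexSketch` is a hypothetical class that works on plain `long` uids rather than the James `MessageUid` type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical index; names are illustrative and not taken from James.
final class MsnIndexSketch {
    private final Map<Long, Integer> msnByUid = new HashMap<>();

    // Builds the uid -> MSN map once, O(n), instead of once per lookup.
    MsnIndexSketch(List<Long> sortedUids) {
        for (int i = 0; i < sortedUids.size(); i++) {
            msnByUid.put(sortedUids.get(i), i + 1); // MSNs are 1-based
        }
    }

    // Expected O(1) per lookup; empty when the uid is unknown.
    OptionalInt getMsn(long uid) {
        Integer msn = msnByUid.get(uid);
        return msn == null ? OptionalInt.empty() : OptionalInt.of(msn);
    }
}
```

The trade-off is that the index has to be rebuilt or patched whenever a message is added or expunged, which this sketch leaves out.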
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207901#comment-17207901 ]

Matthieu Baechler commented on JAMES-3403:
--

Well, it's exchanging a correctness fix for a performance fix.

You didn't answer my question: do 4 lookups in this TreeSet really take several seconds? And how about grabbing 4 locks on the array?

Can you share figures about that, please?
[jira] [Commented] (JAMES-3403) IMAP Fetch: UidMSNConverter :: get MSN is slow
[ https://issues.apache.org/jira/browse/JAMES-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207879#comment-17207879 ]

Matthieu Baechler commented on JAMES-3403:
--

Do you mean that 40k lookups in this vavr data structure take several seconds?
[jira] [Commented] (JAMES-3397) Set up travis-ci continuous integration to build the project and all pull requests
[ https://issues.apache.org/jira/browse/JAMES-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206085#comment-17206085 ]

Matthieu Baechler commented on JAMES-3397:
--

AFAIU it's just a matter of dropping a Jenkinsfile in the master branch.

> Set up travis-ci continuous integration to build the project and all pull
> requests
> --
>
> Key: JAMES-3397
> URL: https://issues.apache.org/jira/browse/JAMES-3397
> Project: James Server
> Issue Type: Improvement
> Reporter: Juhan Aasaru
> Priority: Major
>
> continuous integration increases quality and reduces chances that some PR can
> accidentally break the build
> I would like to work on this.
[jira] [Commented] (JAMES-3397) Set up travis-ci continuous integration to build the project and all pull requests
[ https://issues.apache.org/jira/browse/JAMES-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206040#comment-17206040 ]

Matthieu Baechler commented on JAMES-3397:
--

Well, each person trying to get a CI for James @ Apache ends up proposing a new solution.

I know that [~ieugen] worked on getting James to build at Apache in https://issues.apache.org/jira/browse/JAMES-3397, so I would rather people tried to help with this initiative than start a new one.

WDYT?
Re: [Discussion] Road to 4.0
On Tue, 2020-09-29 at 08:59 +0700, Tellier Benoit wrote:
[...]
> Spring deprecation could be seen as this big event for most users ?

You are not very good at public relations, are you? (:

I don't see feature deprecation as a good opportunity to increment the major version number.

> > 4. About defining a vision
> >
> > [...]
> >
> > What I would really like is to break things! Let's remove all
> > these anachronic modules, or even better: let's build James 4 by
> > adopting only well selected modules, ones that are here for a purpose.
>
> I do think that such an effort will end up with similar effects for the
> community than the 3.0 release effort.

I agree to some extent.

> What I am looking for is a scalable, modern mail server, we mostly have
> that already (after 5 years of development which is already a huge
> commitment).

I know that it's Linagora's intent. However, you are still slow to reach that goal, partly because of the size of the James codebase. Also, you will keep having a cluttered product that most people won't see as a "scalable, modern mail server".

> Note that I am also convinced that we need better documentation, and
> that we need to simplify the project structure.
>
> With my Linagora hat on now, I see no way to convince my management to
> dedicate any effort toward a completely reworked 4.0 with a several
> years ETA.

I think your disagreement is mainly about what you assume 4.0 would be. Let's dig into this topic.

A plan could be:

1. create a 3.99 branch which would be a new James-from-scratch project
2. define progressive goals to implement
3. progressively import modules into 3.99 following the defined goals, breaking as many things as we want in order to simplify the project
4. release 3.99.x versions along the way with no guarantee about API and operational stability
5. once 3.99 becomes a product we like and want to support, switch from 3.99 to 4.0

3.0 was stuck in alpha/beta stage for years. People started using it by forking it or using a fixed snapshot. Given the work left to release 3.0, nobody cared enough because they had a working project locally. By the way, 3.0 wanted to keep API stability on some topics while never providing any decent upgrade path.

Now our components are basically working: we don't have to write much to build this 3.99, we have to combine things and get rid of the weight preventing us from having a leaner product. If it takes years, it will be because there's no workforce on it, not because it's a huge amount of work.

The good thing about this strategy is that we can go ahead with the 3.x branch at the same time and can drop this 3.99 strategy at any point if we conclude it's a dead end. What Linagora wants or not is not relevant in this case.

> > People could either jump to this fresh version of James or keep
> > maintaining the 3.x branch. If they lack some modules that were not
> > selected for James 4, they could just port these modules to the new
> > APIs.
> >
> > By doing such a move, we could focus to finally solve our longstanding
> > problems like: developer experience, newcomers welcoming, having a
> > decent and up-to-date documentation, very easy first deployment, etc.
> >
> > What would you think of such a move ?
>
> I would prefer a more pragmatic alternative.
>
> We as a community could be identifying features / modules that should
> not have made it in the 3.x release line. We could decide to deprecate
> then remove these modules, hopefully letting time for third party to
> backport alternatives.

How is it different from what we did for years? Did it solve velocity issues? Is the project fun now?

I may start this 3.99 thing on my fork to see how it goes and will keep you posted about that.

Cheers,

--
Matthieu Baechler
Re: [Discussion] Road to 4.0
in
> > enterprise context or paid to work on it.
> >
> > Apache Foundation projects should not be led by a single company
> > so
> > Linagora is not taking the lead on this side.
>
> Why do you state this? Is there some rule? Or you are worried about
> appearance?

It's a rule as far as I know: projects have to be neutral enough (not tied to a single vendor).

> There is indeed a really heavy Linagora presence here. I think it’s
> great, and I really appreciate your caution.

Just to be clear, I'm not working on Linagora's projects anymore. I'm here as an individual contributor.

> Without yet agreeing to this point, I don’t mind acting like a
> moderator to capture the vision. As a moderator I can stamp my name
> on it to assure the community that it is not too biased in favor of a
> single company.
>
> > It means to me that the best we can do as contributors is to
> > contribute to the project in a way that is useful to us and try
> > to
> > welcome newcomers as well as we can.
> >
> > If the project is valuable to us, it will eventually be valuable
> > to
> > others. We have to stay commited to it for the time being,
> > continue
> > as we did for the past years.
> >
> > I can't see any vision that the community would be able to commit
> > to, so let's say it's just a project that wait for its hour of
> > glory.
>
> 😀
>
> Well, I think we’ll have to be a bit more proactive, but let’s
> continue this discussion to see where it takes us. I think from what
> we experienced over the past few months that for a slowly evolving
> project like this, the Documentation actually seems to be a great way
> to capture all these ideas. We can debate the contents of the docs
> and as a nice side-effect resolve issues like: should there be a
> roadmap, and if so, what should it contain?
>
> Thanks a lot for always putting in so much effort and thought into
> this. I hope we can have another live chat soon.

Thanks for your feedback too.

--
Matthieu Baechler
Re: [Discussion] Road to 4.0
Hi,

I'm not sure which message in the thread to answer, so I'm starting a new thread to summarize my thoughts on the various topics discussed.

1. About Roadmap

Having a roadmap is a very good way to deceive users. It's OK for a company, because you can somehow foresee what you'll be able to achieve in the future if you care about it, but when it comes to people and their spare time, well, life is unpredictable.

The less we say about the future, the better. People interested in the future can always get in touch with the community and ask.

2. About documentation

You definitely deserve much praise for what you did already. We know that it's the missing piece for James to shine.

I would just like to list what I think we lack the most:

* examples of working setups
* easy-to-search documentation for details like configuration or mailets
* guides for the most usual things: configure some mailets, write my own, how to debug a mailet pipeline, how to plug my app into James efficiently

James already works for a lot of use cases and there's nothing really lacking in terms of code; we should stop thinking the software should be better before being happy with it. A lot of successful software is less mature than James.

TL;DR we have to explain how it already works

3. About versioning (3.x vs 4.0)

Like roadmaps, major version numbers give people expectations: there must be something very new and/or very different because the community decided to increment the major version number.

Last time the community tried that (3.0), the project almost died because too many things had to ship at the same time, so it was never ready. It took years to finally release it after Linagora started to invest a lot in it and, by the way, we never finished what we were supposed to; we just decided that, no software being perfect, we had to release this much-better-than-2.x version.

I would like to take the Linux kernel path:

* only increment the minor version for the time being
* don't build a backlog or any list of things we want to achieve before incrementing
* release 2 to 4 times a year with what is ready
* increment the major version when what will be ready deserves it or when the minor number gets too big

4. About defining a vision

You defined different kinds of software project styles and it's interesting to try to define where James is. Let me define how I see it.

The community and the project don't have a vision yet. The community is small, mostly composed of people using James in an enterprise context or paid to work on it.

Apache Foundation projects should not be led by a single company, so Linagora is not taking the lead on this side.

It means to me that the best we can do as contributors is to contribute to the project in a way that is useful to us and try to welcome newcomers as well as we can.

If the project is valuable to us, it will eventually be valuable to others. We have to stay committed to it for the time being and continue as we did for the past years.

I can't see any vision that the community would be able to commit to, so let's say it's just a project that is waiting for its hour of glory.

My conclusion is that this project is a very valuable one, written by some very talented developers over something like 20 years. This legacy is putting us in a very difficult situation: the codebase is huge, the test suite takes ages to execute, and a lot of things are here for historical reasons, but as we are careful about not breaking too many people's deployments, we don't remove enough things.

In short: we are not able to move fast, to simplify the codebase, or to implement new things easily, and in the end it's hard to have fun, for newcomers and even for James veterans.

What I would really like is to break things! Let's remove all these anachronic modules, or even better: let's build James 4 by adopting only well-selected modules, ones that are here for a purpose.

People could either jump to this fresh version of James or keep maintaining the 3.x branch. If they lack some modules that were not selected for James 4, they could just port these modules to the new APIs.

By making such a move, we could focus on finally solving our longstanding problems: developer experience, welcoming newcomers, having decent and up-to-date documentation, a very easy first deployment, etc.

What would you think of such a move?

Cheers,

--
Matthieu Baechler
[jira] [Reopened] (JAMES-3265) Investigate Slow IMAP SELECT (26 minutes +)
[ https://issues.apache.org/jira/browse/JAMES-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthieu Baechler reopened JAMES-3265:
--

It looks like you ignored my last comment: at least https://github.com/linagora/james-project/pull/3555 requires a more subtle solution to avoid API duplication.

I prefer to keep this ticket open to track the need for a better solution.

> Investigate Slow IMAP SELECT (26 minutes +)
> ---
>
> Key: JAMES-3265
> URL: https://issues.apache.org/jira/browse/JAMES-3265
> Project: James Server
> Issue Type: Task
> Components: cassandra, IMAPServer
> Affects Versions: master
> Reporter: Benoit Tellier
> Priority: Major
> Labels: perf
> Fix For: 3.6.0
>
> Attachments: Capture_d_écran_de_2020-06-22_11-42-01.png,
> Capture_d_écran_de_2020-06-22_11-48-28.png
>
>
> Using glowroot APM on Linagora-run instances, I noticed some SELECT commands
> take around 20 minutes.
> A performance review shows thousands of MODSEQ updates undermine the
> performance.
> {code:java}
> Transaction type: IMAP
> Transaction name: IMAP processor :
> org.apache.james.imap.processor.SelectProcessor
> Start: 2020-06-22 2:28:04.433 am (+07:00)
> Duration: 1,618,718.3 milliseconds
> {code}
> I noticed a high allocation of new ModSeq (28,000 instead of 1) due to uid
> set disjunction.
> I believe a solution would be to implement a new MessageMapper method:
> Mono removeRecentFlags(Mailbox mbox);
> That would enable some Cassandra query optimizations...
> # DOD
> - unlock significant performance improvements for such queries (x100)
> Attached you will find the query stats and flame graph backing up the
> analysis.
Re: SMTP Relay (Was: Queuing vs. spooling)
On Wed, 2020-07-08 at 16:21 +0900, David Leangen wrote:
> Still on the topic of SMTP relay and the original image I posted at
> the beginning of this thread:
>
> --> https://james.apache.org/images/james-smtp-relay.png
>
> Would the attached (lousy) image be a reasonable representation of
> the general concept of an SMTP Relay?

The only wrong thing about this picture is the SMTP Service before "Outgoing". As weird as it is, the delivery of messages to a remote server is done by a Mailet called RemoteDelivery and is not handled by the SMTP Service. As far as I know, a lot of people actually fork this Mailet because they want specific delivery behaviors, so I think this design makes sense.

--
Matthieu Baechler
Re: Queuing vs. spooling
On Tue, 2020-07-07 at 22:58 +0900, David Leangen wrote:
> > Hope it helps.
>
> Yes, quite a lot!!
>
> A few clarifications, please. 😀
>
> > SMTP Service is talking TCP with the client. When it is asked to
> > deliver a message, it simply calls `enqueue` on the MailQueue.
>
> Can you be more precise about what you mean by “client”?

Yes, the client (either an MUA or an MTA) is the remote process talking SMTP to our SMTP server via a TCP connection.

> > As everything happens in the Java process, you don't have a
> > protocol,
> > just a method call.
>
> By the way, is the message communicated via TCP or via a method call.

The SMTP server talks TCP with the outside world. Then it calls the other parts of James with method calls (the SMTP server lives in the same process as the other services James is running).

> Sorry, I am a bit confused about the two statements you made above.
>
> Once the mail is received by the mail queue, does all the rest of the
> process happen via method calls, too?

Yes.

> > The spooler is the thing taking messages from the queue for
> > processing.
> > The MailQueue allows to decouple the reception from the handling.
> > A spooler usually is able to concurrently process several mails.
>
> Thanks.
>
> Are there any other important “parts” that I should be aware of?

I don't know.

--
Matthieu Baechler
[jira] [Reopened] (JAMES-3265) Investigate Slow IMAP SELECT (26 minutes +)
[ https://issues.apache.org/jira/browse/JAMES-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthieu Baechler reopened JAMES-3265:
--

It may not be 26 minutes long anymore, but it's still slow, and some refactoring is probably needed for the quick-and-dirty fixes that happened today.
Re: Queuing vs. spooling
Hi

On Tue, 2020-07-07 at 21:42 +0900, David Leangen wrote:
> Hi!
>
> Please take a look at this image:
>
> --> https://james.apache.org/images/james-smtp-relay.png
>
> I have several questions. 😀
>
> First, is it correct to say that in this image, the “SMTP Service” is
> an MTA? Furthermore, is it correct to say that it’s the “terminal”
> MTA?

The SMTP service on this diagram is just the server you talk to on port 25. The full picture is an MTA. And in this case it's just a relay with some rules.

> How does the mail get forwarded from the SMTP server to the Mail
> Queue? Would that be via LMTP? Or something else?

The SMTP Service talks TCP with the client. When it is asked to deliver a message, it simply calls `enqueue` on the MailQueue. As everything happens in the Java process, you don't have a protocol, just a method call.

> How does the mail get transferred from the queue to the spooler?

The spooler basically pulls messages from the queue and then handles them.

> And by the way, what the heck is the difference between the queue
> and the spooler??

The spooler is the thing taking messages from the queue for processing. The MailQueue decouples reception from handling. A spooler is usually able to process several mails concurrently.

> After that, the part about the Mailet Container I think makes sense
> to me, but everything up to the point is not clear at all, at least
> not to me.
>
> Thanks as always for helping me understand!

Hope it helps.

Cheers,

--
Matthieu Baechler
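The queue/spooler split explained in this reply can be sketched with a plain in-memory queue. This is only an illustration under stated assumptions: mails are modeled as strings, `enqueue` mirrors the method name mentioned in the message, and every other name (`QueueAndSpoolerSketch`, `spool`, `awaitProcessed`) is hypothetical rather than the James MailQueue API.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; this is not the James MailQueue API.
final class QueueAndSpoolerSketch {
    private final BlockingQueue<String> mailQueue = new LinkedBlockingQueue<>();
    final List<String> processed = new CopyOnWriteArrayList<>();

    // What the SMTP service does once a message is accepted: a plain method call.
    void enqueue(String mail) {
        mailQueue.add(mail);
    }

    // The spooler: worker threads pull mails from the queue and process them concurrently.
    void spool(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // do not keep the JVM alive for idle workers
            return t;
        });
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String mail = mailQueue.take(); // blocks until a mail is available
                        processed.add(mail);            // stand-in for the mailet container
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    // Waits until `expected` mails were processed or the timeout elapses.
    boolean awaitProcessed(int expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (processed.size() < expected) {
                if (System.currentTimeMillis() > deadline) {
                    return false;
                }
                Thread.sleep(5);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Reception (`enqueue`) returns immediately regardless of how busy the workers are, which is the decoupling the message describes.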
[jira] [Created] (JAMES-3297) Publish the number of items currently in the mailet pipeline as a metric
Matthieu Baechler created JAMES-3297:

Summary: Publish the number of items currently in the mailet pipeline as a metric
Key: JAMES-3297
URL: https://issues.apache.org/jira/browse/JAMES-3297
Project: James Server
Issue Type: Improvement
Reporter: Matthieu Baechler

Some MailQueue implementations have a limit on the number of elements that can be dequeued but not yet acknowledged (RabbitMQ has that feature).

In some cases this can prevent new mails from being dequeued, and it's not easy to diagnose.

By exposing that number as a metric and/or issuing a warning when this number reaches the limit, we could help the operator understand what's going wrong.
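The bookkeeping this ticket asks for can be sketched as a counter of mails that were dequeued but not yet acknowledged. `InFlightTracker`, its methods and the limit value are hypothetical; they are not James or RabbitMQ APIs, and wiring the value into a real metrics registry is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical tracker; the names and the limit are illustrative only.
final class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final int limit;

    InFlightTracker(int limit) {
        this.limit = limit;
    }

    // Call when a mail is dequeued. Returns true once the limit is reached,
    // which is the moment an operator would want a warning.
    boolean onDequeue() {
        return inFlight.incrementAndGet() >= limit;
    }

    // Call when processing is done and the mail is acknowledged.
    void onAck() {
        inFlight.decrementAndGet();
    }

    // Current value, suitable for publishing as a gauge.
    int current() {
        return inFlight.get();
    }
}
```

A gauge that stays pinned at the limit would then point directly at the situation the ticket describes.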
[jira] [Commented] (JAMES-3259) Reorganize source code
[ https://issues.apache.org/jira/browse/JAMES-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151835#comment-17151835 ] Matthieu Baechler commented on JAMES-3259: -- > We can and should provide a distribution with all of them in the same app, > but that should be a side-effect and a bonus of how we organize. It can be your goal if you wish, but it's not mine. I'm focusing my efforts on an all-in-one server right now. That doesn't mean I don't want separate servers to work, but my goal is not exactly a side effect. That being said, I don't have anything against reorganizing around domain concepts. Would you mind taking some current submodules and proposing a way to organize them in this hierarchy? > Reorganize source code > -- > > Key: JAMES-3259 > URL: https://issues.apache.org/jira/browse/JAMES-3259 > Project: James Server > Issue Type: Improvement >Affects Versions: 3.5.0 >Reporter: Matthieu Baechler >Priority: Major > > I want to suggest a new organization of the source-code (I won't handle every > concerns but some important ones I have about the current state). > I would like the first level to be: > {code} > core (domain code) > data (that we should rename) > docs > extensions (containing mdn and third-party for example) > infrastructure (containing backends-common, event-sourcing, json, metrics) > mailbox > mailet > products (containing server/container/cli > server/container/guice/cassandra-rabbitmq-guice) > protocols > server > testing (containing mpt) > {code} > I'm not sure it's the best organization but: > * it allows to see easily what james most important concepts are > * put technical details into a common sub-tree > * have products a top level thing instead of a hidden one > * group what we think are extensions somewhere > * put functional testing sources somewhere that is easy to find (because a > lot of people starts by reading functional tests) > What do you think? 
[jira] [Commented] (JAMES-3291) Badly formatted mailqueue causes RabbitMQMailQueue to crash
[ https://issues.apache.org/jira/browse/JAMES-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149341#comment-17149341 ] Matthieu Baechler commented on JAMES-3291: -- Every component consuming messages from Rabbit should rely on the dead-letter queue feature to put messages triggering bugs in it. `nack` with requeue = false does that. We can also allow some retries to prevent transient failures by setting a header on the message on failure to count retries. > Badly formatted mailqueue causes RabbitMQMailQueue to crash > --- > > Key: JAMES-3291 > URL: https://issues.apache.org/jira/browse/JAMES-3291 > Project: James Server > Issue Type: New Feature > Components: Queue, rabbitmq >Affects Versions: master, 3.5.0 >Reporter: Benoit Tellier >Priority: Major > > ## Reproduction steps: > Given a bad payload published on the mailQueue exchange > Then the dequeuer will crash and stop any following dequeuing processing > ## Consequences: > This can be leveraged to knock down mail reception given only the right to > publish messages to RabbitMQ. > This can generate problems to users when upgrading with non-empty mailqueue > upon MailReferenceDTO changes > ## Alternatives > To not be crashing, we actually need to handle the deserialization exception. > Dropping the message would be a quick fix, but could result in data loss. > A better alternative would be to leverage a dead-letter queue in order to > enable to not abort processing, while keeping track of the failure, and > allowing to resume its processing. > ## Related issues > We are considering improving the reliability of the distributed mailqueue > component, and allow to drop all RabbitMQ content. To recover from such a > situation, non-dequeued emails would be tracked using the Cassandra browsing > projection, and requeued in a newly provisionned rabbitMQ. 
> Given the ability to re-generate non - dequeued entries, dropping invalid > rabbitMQ messages could be acceptable, as the admins will have the right > tools to re-generate legitimate traffic. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
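The retry-then-dead-letter policy suggested in the comment above can be sketched without any broker at all. The code below is a plain-Java simulation under stated assumptions: the header name `x-sketch-retry-count`, the `Outcome` enum and the `Handler` interface are hypothetical, and a `Map` stands in for message headers. With RabbitMQ, the `NACK_WITHOUT_REQUEUE` outcome would correspond to a `nack` with requeue = false, which dead-letters the message when a dead-letter exchange is configured on the queue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the decision logic only; no broker client is involved.
class RetryOrDeadLetterSketch {

    enum Outcome { ACK, REPUBLISH_FOR_RETRY, NACK_WITHOUT_REQUEUE }

    // Assumed header name, chosen for this example.
    static final String RETRY_HEADER = "x-sketch-retry-count";

    interface Handler {
        void handle(String payload) throws Exception;
    }

    // Decides what to do with one delivery, updating the retry header on failure.
    static Outcome consume(String payload, Map<String, Object> headers, int maxRetries, Handler handler) {
        try {
            handler.handle(payload);
            return Outcome.ACK;
        } catch (Exception e) {
            int retries = (Integer) headers.getOrDefault(RETRY_HEADER, 0);
            if (retries < maxRetries) {
                headers.put(RETRY_HEADER, retries + 1); // count this failed attempt
                return Outcome.REPUBLISH_FOR_RETRY;     // may be a transient failure
            }
            return Outcome.NACK_WITHOUT_REQUEUE;        // give up: route to the dead-letter queue
        }
    }

    public static void main(String[] args) {
        Map<String, Object> headers = new HashMap<>();
        Handler alwaysFails = payload -> {
            throw new IllegalStateException("cannot deserialize " + payload);
        };
        for (int attempt = 1; attempt <= 3; attempt++) {
            System.out.println("attempt " + attempt + ": " + consume("bad-payload", headers, 2, alwaysFails));
        }
    }
}
```

The consumer never crashes on a bad payload: it either retries a bounded number of times or parks the message where an operator can inspect it.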
[jira] [Commented] (JAMES-2335) Modernize James configuration
[ https://issues.apache.org/jira/browse/JAMES-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146189#comment-17146189 ] Matthieu Baechler commented on JAMES-2335: -- Commons-configuration is a pain: old API, doesn't provide Optional types on reads, doesn't handle modern formats like HOCON, etc. I don't care if we keep it in the source code as long as it's in an optional module. BTW, having commons-configuration types everywhere in the code has been a pain for a long time: let's just pass parsed configuration as POJOs and then we'll choose the library we prefer, whatever it is. > Modernize James configuration > - > > Key: JAMES-2335 > URL: https://issues.apache.org/jira/browse/JAMES-2335 > Project: James Server > Issue Type: Improvement > Components: configuration >Reporter: Benoit Tellier >Priority: Major > Labels: feature, refactoring > > Apache James currently relies on commons-configuration, and thus on XML > configuration files. > As such the configuration process has several problems: > - Working with XML is boiler plate > - Working with file leads to a real lack of flexibility. > - For instance, in a cluster environment, we would like all the James > server to share the same configurations. > - Also, in tests, we need to test the different configuration values. > We can not do this without overwriting files, which is dangerous, and > boilerplate. > What we need is: > - To represent all possible configuration via java objects. > - Configuration providers should be able to convert the configuration stored > into the java configuration object. > - We should be able to inject different configuration providers from > guice/spring. > It would allow to specify alternative configuration backends (different > formats, different storage techniques) and allow direct injection (for tests > for instance). > Here would be the steps for this work: > - Add a *Initializable* class in *lifecycle-api*. 
This should be called by > Guice and Sprint at initialization > - *configure* in Configurable will save a Java object (parse the > HierachicalConfiguration into a java object representing it's content). > Initialization will then be done by *Initializable*. > - Then we can move away, object by object, from the *Configurable* > interface: We need to move the configuration parsing in a separated class > (behind an interface). We can register *ConfigurationProviders*, with an > XML/commons-configuration default implementation. > - Deprecate *Configurable*. > - Provide alternative configuration providers, for example, a Cassandra > stored configuration provider -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
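The "pass parsed configuration as POJOs" idea from the comment above could look like the following sketch. All names here (`SmtpSettings`, `parse`, the `port` and `heloName` keys) are hypothetical and exist only for the example; the raw source is modelled as a `Map<String, String>` so that the parsing library, whichever one is chosen, stays confined to one place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: field names and keys are assumptions, not James configuration.
class PojoConfigurationSketch {

    // Immutable parsed configuration; components depend on this type only.
    static final class SmtpSettings {
        private final int port;
        private final Optional<String> heloName;

        SmtpSettings(int port, Optional<String> heloName) {
            this.port = port;
            this.heloName = heloName;
        }

        int port() {
            return port;
        }

        Optional<String> heloName() {
            return heloName; // absent values are explicit instead of null
        }
    }

    // The only place that knows about the raw format (XML, HOCON, a database row...).
    static SmtpSettings parse(Map<String, String> raw) {
        int port = Integer.parseInt(raw.getOrDefault("port", "25"));
        Optional<String> heloName = Optional.ofNullable(raw.get("heloName"));
        return new SmtpSettings(port, heloName);
    }

    // A component that never sees the configuration library.
    static String describe(SmtpSettings settings) {
        return "port=" + settings.port() + ", helo=" + settings.heloName().orElse("<default>");
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("port", "2525");
        System.out.println(describe(parse(raw)));
    }
}
```

In tests, a component can then be given `new SmtpSettings(...)` directly, with no file to overwrite.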
Re: Call for vote: Apache James 3.5.0
Hi Benoit, Sorry for the delay. I retrieved the artifacts from staging, unzipped james-project-3.5.0-source-release.zip and tried to build it without running tests. The build fails because a plugin looks for the .git directory: Failed to execute goal pl.project13.maven:git-commit-id-plugin:3.0.1:revision (get-the-git-infos) on project testing-base: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1] As the primary distribution of any Apache project is supposed to be zipped source code, I guess we have to fix that before the release. Also, the PR requires a rebase and the two linked documents have the following issues: * we need to remove the "Unreleased" section from them * we have to update the date of the release to match the tentative release date * some SHA1s are missing from upgrade-instructions Otherwise, I checked the signatures and they are OK. So I vote -1 for now. -- Matthieu Baechler On Fri, 2020-06-19 at 10:44 +0700, Tellier Benoit wrote: > Hi, > > I would like to propose the 3.5.0 release of the Apache James server. 
> > Here are the changes since the previous proposal: > > ``` > JAMES-3197 Matcher processing should handle NoClassDefFoundError > f0c6576760 JAMES-3197 Mailet processing should handle > NoClassDefFoundError > 5ad068e2da JAMES-3195 Avoid loosing stacktrace when fails to delete a > file > 0894fab28e JAMES-3192 Upgrade Apache configuration to 2.7 > ba532d0d5b Remove leading line breaks > 2207d8b3d8 JAMES-3187 Provisionned docker: add a cli helper script > 3b4828c069 JAMES-3187 Add webadmin enabled by default to jpa-guice > docker image > 537af6ed32 JAMES-1541 Fix a typo in MailPriorityHandler in > configuration > 04e8ec7906 JAMES-1541 Remove no longer existing > SuppressDuplicateRcptHandler from configuration > 492e458cda JAMES-1541 Remove no longer existing TarpitHandler from > configuration > ``` > > You can see changes proposed to the website at the occasion > of that release on this GitHub pull > request: https://github.com/apache/james-project/pull/187 > > You can find: > > - The maven release staged in repository.apache.org as the > artifact #1047: > https://repository.apache.org/content/repositories/orgapachejames-1047/ > - The changelog for > 3.5.0: > https://github.com/chibenwa/james-project/blob/website-3.5.0/CHANGELOG.md > - The compatibility instructions/upgrade > recommendation: > https://github.com/chibenwa/james-project/blob/website-3.5.0/upgrade-instructions.md > > Voting rules: > - This is a majority approval: this release may not be vetoed. > - A quorum of 3 binding votes is required > - The vote starts at Monday 19th of June 2020, 10am45 UTC > - The vote ends at Sunday 26th of June 2020, 10am45 UTC > > You can answer to it just with +1 and -1. Down-votes may be > motivated. 
> > Cheers, > > Benoit Tellier > > - > To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org > For additional commands, e-mail: server-dev-h...@james.apache.org > - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Created] (JAMES-3279) install link is broken on homepage
Matthieu Baechler created JAMES-3279: Summary: install link is broken on homepage Key: JAMES-3279 URL: https://issues.apache.org/jira/browse/JAMES-3279 Project: James Server Issue Type: Bug Reporter: Matthieu Baechler Currently, on homepage, the link at "Instructions that do not imply docker are also available here. " pointing to http://james.apache.org/install.html is not working. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Distributed James: make ElasticSearch indexing optional?
Hi, On Wed, 2020-06-17 at 17:53 +0200, Raphaël Ouazana-Sustowski wrote: > Hi Matthieu, > > I don't see much new arguments in your last answer. That's odd, because I don't think I repeated what I said previously. > I can answer your > questions one by one, but I would like to go forward. That's your choice, of course. It's a bit sad that such a debate doesn't end in consensus, but it happens. > Would it make a consensus for you if we work on merging the current > PR, > with always the option to revert it and go to a product if needed? No, that's not what I call consensus, as it doesn't take my opinion into account. > Or do you prefer that we ask for a vote? Feel free to do it or not. I won't stand in your way. Cheers, -- Matthieu Baechler
[jira] [Updated] (JAMES-3260) Explore building Apache James with Gradle
[ https://issues.apache.org/jira/browse/JAMES-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthieu Baechler updated JAMES-3260: - Description: Creating an issue to track the process of using Gradle for building Apache James. There have been a few discussions on this topic from multiple parties. The main benefit is having faster builds which Maven is unable to provide because of it's limitations on how it approaches build life-cycle and caching. We should take care of: * all that is related to release and deploy (but this can be taken from other Apache projects already using Gradlle) * the site building (but this should disappear with the migration to Antora) * the mailets plugin * checking Spring build * adding partial tests on JMAP integration (allowing to run only some smoke tests on some big integration tests suite) * adding and configuration the checkstyle plugin * updating the Jenkins build * documenting the migration for all the users that are building James themselves was: Creating an issue to track the process of using Gradle for building Apache James. There have been a few discussions on this topic from multiple parties. The main benefit is having faster builds which Maven is unable to provide because of it's limitations on how it approaches build life-cycle and caching. 
We should take care of: * all that is related to release and deploy (but this can be taken from other Apache projects already using Gradlle) * the site building (but this should disappear with the migration to Antora) * the mailets plugin * checking Spring build * adding partial tests on JMAP integration (allowing to run only some smoke tests on some big integration tests suite) * adding and configuration the checkstyle plugin * updating the Jenkins build * documenting the migration for all the users that are building James themselves > Explore building Apache James with Gradle > - > > Key: JAMES-3260 > URL: https://issues.apache.org/jira/browse/JAMES-3260 > Project: James Server > Issue Type: Improvement >Reporter: Ioan Eugen Stan >Assignee: Ioan Eugen Stan >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Creating an issue to track the process of using Gradle for building Apache > James. > There have been a few discussions on this topic from multiple parties. > The main benefit is having faster builds which Maven is unable to provide > because of it's limitations on how it approaches build life-cycle and > caching. > We should take care of: > * all that is related to release and deploy (but this can be taken from other > Apache projects already using Gradlle) > * the site building (but this should disappear with the migration to Antora) > * the mailets plugin > * checking Spring build > * adding partial tests on JMAP integration (allowing to run only some smoke > tests on some big integration tests suite) > * adding and configuration the checkstyle plugin > * updating the Jenkins build > * documenting the migration for all the users that are building James > themselves -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-1766) Design a IMAP search test with gatling
[ https://issues.apache.org/jira/browse/JAMES-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141956#comment-17141956 ] Matthieu Baechler commented on JAMES-1766: -- The title is really gatling specific but the description is James specific. I would put things about test runs on James project and things specific to gatling testsuite in the relevant github issue > Design a IMAP search test with gatling > -- > > Key: JAMES-1766 > URL: https://issues.apache.org/jira/browse/JAMES-1766 > Project: James Server > Issue Type: Sub-task > Components: IMAPServer >Reporter: btell...@apache.org >Priority: Major > Fix For: 3.0.0 > > > Run the tests for a constant number of users, with ~1000 mails in there inbox. > You will demonstrate the number of supported users (during 4 hours at > constant latency) between : > - the simpleMessageSearchIndex implementation > - the ElasticSearch implementation -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Support and service levels
On Sat, 2020-06-20 at 00:03 +0900, David Leangen wrote: > > Don't be afraid, debate is sane > > Thanks for the reassurance. It’s sometimes difficult in a “new” > community to know how people will take things. > > > > Commitment is just text and I don't have much trust in text. > > Well, if most people here think this way, then there really is no way > forward with this idea. > > I find this a bit cynical, but maybe I am too naive. Or perhaps I > have been living in Japan for way too long. > > > > I still think it's not a good idea unless you want to handle the > > SLA > > yourself or with some other people from the community because I > > don't > > expect to have a commitment with virtually anybody at anytime. > > But I think again you miss the point. > > Saying something like “you will never, ever, ever get any support > whatsoever” is also an SLA. So to some extent, what you just wrote > above ironically **IS** the SLA you are willing to commit to (at no > time to anybody). I'm not saying that nobody will ever get any support. I said that I don't want to commit to anything. The reality is that people do receive support; they just can't complain that we don't respect a specific commitment. It's a best-effort initiative, and that's the spirit of most free software licenses, BTW. > It’s all about setting expectations so that people can make > reasonable plans, not about committing to volunteer time. It’s about > communicating what the community is willing (or not) to offer, so > people know how to approach us. > > But then if this is the thinking: > > > Commitment is just text and I don't have much trust in text. > > Then it’s all pretty much moot. 😀 > What we say is: "you are all welcome, come join us, we'll try to help as much as we can". It's very different from both "we commit to help you" and "you won't get any support". 
Cheers, -- Matthieu Baechler
[jira] [Commented] (JAMES-1766) Design a IMAP search test with gatling
[ https://issues.apache.org/jira/browse/JAMES-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141773#comment-17141773 ] Matthieu Baechler commented on JAMES-1766: -- That doesn't make the issue invalid in my opinion. Where would you put such an issue? > Design a IMAP search test with gatling > -- > > Key: JAMES-1766 > URL: https://issues.apache.org/jira/browse/JAMES-1766 > Project: James Server > Issue Type: Sub-task > Components: IMAPServer >Reporter: btell...@apache.org >Priority: Major > Fix For: 3.0.0 > > > Run the tests for a constant number of users, with ~1000 mails in there inbox. > You will demonstrate the number of supported users (during 4 hours at > constant latency) between : > - the simpleMessageSearchIndex implementation > - the ElasticSearch implementation
[jira] [Commented] (JAMES-3259) Reorganize source code
[ https://issues.apache.org/jira/browse/JAMES-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141755#comment-17141755 ] Matthieu Baechler commented on JAMES-3259: -- > Is there any reason to separate data from core? you can setup a SMTP server without any user: for example you can setup a relay at the edge of your infrastructure that actually forward mails to other servers after applying a mailet pipeline. > Reorganize source code > -- > > Key: JAMES-3259 > URL: https://issues.apache.org/jira/browse/JAMES-3259 > Project: James Server > Issue Type: Improvement >Affects Versions: 3.5.0 >Reporter: Matthieu Baechler >Priority: Major > > I want to suggest a new organization of the source-code (I won't handle every > concerns but some important ones I have about the current state). > I would like the first level to be: > {code} > core (domain code) > data (that we should rename) > docs > extensions (containing mdn and third-party for example) > infrastructure (containing backends-common, event-sourcing, json, metrics) > mailbox > mailet > products (containing server/container/cli > server/container/guice/cassandra-rabbitmq-guice) > protocols > server > testing (containing mpt) > {code} > I'm not sure it's the best organization but: > * it allows to see easily what james most important concepts are > * put technical details into a common sub-tree > * have products a top level thing instead of a hidden one > * group what we think are extensions somewhere > * put functional testing sources somewhere that is easy to find (because a > lot of people starts by reading functional tests) > What do you think? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Support and service levels
part, I should point out that > you are stating exactly the opposite. 😂 I don't. I am both a volunteer and a paid contributor. > It’s all ok. You are just being honest about your intentions: you are > in it for your own gain, not to help others. If others can benefit > from your work, great, but that is not your primary objective. Fine. Not my opinion at all. > Of course there is nothing wrong with that. My efforts to help with > documentation are not entirely altruistic, either. But you can’t have > your cake and eat it, too. You can’t call yourself a “volunteer" if > really you’re just in it for yourself and others just happen to maybe > benefit. And if you really are in it to help others, then you would > be thinking of them first. There are more than two reasons to be here. I care about being paid, I care about users, but I also have other motivations. I don't feel that helping others should come "first". > > So yeah, let’s just be honest and set the right expectations so > everybody is happy. > > Having an SLA will help do that. We should only commit to what we are > willing to commit to, but there **must** be a clear definition of the > service level that is being offered (even if essentially says "f*** > you stupid user”, at least that’s clear and honest.) > > > Ok, all that was very abstract and waay off-topic, but I wasn’t > expecting a reaction like the one you had so I had to have a little > fun with it. 😂 > > If you reread my initial message in this new light, I hope you’ll > find that it was not intended to sound unreasonable. It is about > honesty and expectation setting, not free work. > > > So if we agree that an SLA is necessary (and I think we are > agreeing), then most of what you wrote (i.e. what you are not willing > to commit to) relates to what the contents of the SLA ought to be. > And what you write makes good sense in that context. 
> I still think it's not a good idea unless you want to handle the SLA yourself or with some other people from the community, because I don't expect to have a commitment with virtually anybody at any time. Cheers, -- Matthieu Baechler
[jira] [Commented] (JAMES-2335) Modernize James configuration
[ https://issues.apache.org/jira/browse/JAMES-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140583#comment-17140583 ] Matthieu Baechler commented on JAMES-2335: -- Partial response: > I do see it as a way to dynamically load / unload extensions like mailets and > change the processing pipeline at runtime. We don't need OSGi to load/unload things at runtime. It's not hard to load things into a sub-classloader. > OSGi really makes dependency management so much easier and is enforced by the > classloader (which does not make available any "private" packages). Do you mean Java modules are not providing that? I'm very curious, I thought it does. > Email is well-known. It seems to me that we ought to be able to provide good > Java APIs with interchangeable implementations. Yet there's no good implementation out there. The idea that email is a "done" topic is completely wrong: just search an IMAP client library and you'll see what I mean. Mime handling is not really better. I would not be writing email code if it already existed. > Using Declarative Services seems to me to be a lot cleaner than Guice. There's overlap in features but they don't solve the same problem. > It has a really advanced build system (BND) that works awesome with Gradle. > Resolving dependencies and packaging up distributions is really clean and > easy. Sounds so cool. But you have to write code now or it doesn't exist, right? > Modernize James configuration > - > > Key: JAMES-2335 > URL: https://issues.apache.org/jira/browse/JAMES-2335 > Project: James Server > Issue Type: Improvement > Components: configuration >Reporter: Benoit Tellier >Priority: Major > Labels: feature, refactoring > > Apache James currently relies on commons-configuration, and thus on XML > configuration files. > As such the configuration process has several problems: > - Working with XML is boiler plate > - Working with file leads to a real lack of flexibility. 
> - For instance, in a cluster environment, we would like all the James > server to share the same configurations. > - Also, in tests, we need to test the different configuration values. > We can not do this without overwriting files, which is dangerous, and > boilerplate. > What we need is: > - To represent all possible configuration via java objects. > - Configuration providers should be able to convert the configuration stored > into the java configuration object. > - We should be able to inject different configuration providers from > guice/spring. > It would allow to specify alternative configuration backends (different > formats, different storage techniques) and allow direct injection (for tests > for instance). > Here would be the steps for this work: > - Add a *Initializable* class in *lifecycle-api*. This should be called by > Guice and Sprint at initialization > - *configure* in Configurable will save a Java object (parse the > HierachicalConfiguration into a java object representing it's content). > Initialization will then be done by *Initializable*. > - Then we can move away, object by object, from the *Configurable* > interface: We need to move the configuration parsing in a separated class > (behind an interface). We can register *ConfigurationProviders*, with an > XML/commons-configuration default implementation. > - Deprecate *Configurable*. > - Provide alternative configuration providers, for example, a Cassandra > stored configuration provider -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
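The remark in the comment above, that loading things into a sub-classloader is not hard, can be illustrated with the JDK's `URLClassLoader`. This is only a sketch: `loadAndDescribe` and the idea of passing extension jar URLs are hypothetical, and nothing here reflects how James actually loads mailets. The child loader delegates to its parent first, then looks in the given URLs; once the loader is closed and nothing references it or its classes any more, those classes become eligible for garbage collection, which is what "unloading" amounts to in plain Java.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch: not the James extension-loading mechanism.
class SubClassLoaderSketch {

    // Loads className through a child class loader scoped to the given jars, then closes it.
    static String loadAndDescribe(URL[] extensionJars, String className) {
        ClassLoader parent = SubClassLoaderSketch.class.getClassLoader();
        try (URLClassLoader loader = new URLClassLoader(extensionJars, parent)) {
            Class<?> loaded = loader.loadClass(className); // parent first, then extensionJars
            return loaded.getName();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        // With no extra jars, the lookup simply delegates to the parent loader.
        System.out.println(loadAndDescribe(new URL[0], "java.util.ArrayList"));
    }
}
```

In a real setup the `URL[]` would point at extension jars, and the loaded class would be instantiated through an interface shared with the parent loader.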
[jira] [Commented] (JAMES-3260) Explore building Apache James with Gradle
[ https://issues.apache.org/jira/browse/JAMES-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140503#comment-17140503 ] Matthieu Baechler commented on JAMES-3260: -- I suggest you use the migration feature of gradle to start. Then you could look at individual commits from my branch as they probably each fix something specific (I hope) > Explore building Apache James with Gradle > - > > Key: JAMES-3260 > URL: https://issues.apache.org/jira/browse/JAMES-3260 > Project: James Server > Issue Type: Improvement >Reporter: Ioan Eugen Stan >Priority: Major > > Creating an issue to track the process of using Gradle for building Apache > James. > There have been a few discussions on this topic from multiple parties. > The main benefit is having faster builds which Maven is unable to provide > because of it's limitations on how it approaches build life-cycle and > caching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
Re: Support and service levels
Hi, On Fri, 2020-06-19 at 12:39 +0300, Eugen Stan wrote: > > > [... snipped everything ...] I agree with everything you wrote. Also, I get paid for working on free software, and I think it would not make sense to expect free software to be funded exclusively by unpaid volunteers' time. It's a healthy thing that people can move free software forward using companies' money, by the way. That's not to say I won't help people in the community on my own time, but I won't make any commitment to it, and I think it would not help the community to offer such commitments. Cheers, -- Matthieu Baechler
[jira] [Created] (JAMES-3259) Reorganize source code
Matthieu Baechler created JAMES-3259: Summary: Reorganize source code Key: JAMES-3259 URL: https://issues.apache.org/jira/browse/JAMES-3259 Project: James Server Issue Type: Improvement Affects Versions: 3.5.0 Reporter: Matthieu Baechler I want to suggest a new organization of the source code (I won't address every concern, only some important ones I have about the current state). I would like the first level to be: {code} core (domain code) data (that we should rename) docs extensions (containing mdn and third-party for example) infrastructure (containing backends-common, event-sourcing, json, metrics) mailbox mailet products (containing server/container/cli server/container/guice/cassandra-rabbitmq-guice) protocols server testing (containing mpt) {code} I'm not sure it's the best organization but: * it makes it easy to see what James's most important concepts are * it puts technical details into a common sub-tree * it makes products a top-level thing instead of a hidden one * it groups what we think are extensions in one place * it puts functional testing sources somewhere that is easy to find (because a lot of people start by reading functional tests) What do you think?
[jira] [Commented] (JAMES-2510) use a consistent artifact naming scheme
[ https://issues.apache.org/jira/browse/JAMES-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140295#comment-17140295 ] Matthieu Baechler commented on JAMES-2510: -- See https://issues.apache.org/jira/browse/JAMES-3259 > use a consistent artifact naming scheme > --- > > Key: JAMES-2510 > URL: https://issues.apache.org/jira/browse/JAMES-2510 > Project: James Server > Issue Type: Task > Components: Build System >Affects Versions: 3.1.0 >Reporter: Matthieu Baechler >Priority: Major > > Maven modules have different naming scheme, for example : > * apache-james-* > * james-server-* > * protocols-* > it makes everything hard to read and to find and it's redundant because > groupId already gives context about an artifact. > I'll make a proposal to reduce the noise in our maven files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org For additional commands, e-mail: server-dev-h...@james.apache.org
[jira] [Commented] (JAMES-2510) use a consistent artifact naming scheme
[ https://issues.apache.org/jira/browse/JAMES-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140263#comment-17140263 ]

Matthieu Baechler commented on JAMES-2510:
--

Let me explain what I had in mind (and failed to implement):
* Use groupId to put things into categories: org.apache.james.server for server stuff, org.apache.james.mailbox for mailbox stuff, etc.
* Use artifactId to describe things without any prefix: cassandra in org.apache.james.mailbox, for example.

Would people agree?

I also have some refactoring of the source tree in mind, but I'll put that in another ticket.

> use a consistent artifact naming scheme
> ---
>
> Key: JAMES-2510
> URL: https://issues.apache.org/jira/browse/JAMES-2510
> Project: James Server
> Issue Type: Task
> Components: Build System
> Affects Versions: 3.1.0
> Reporter: Matthieu Baechler
> Priority: Major
>
> Maven modules have different naming schemes, for example:
> * apache-james-*
> * james-server-*
> * protocols-*
> This makes everything hard to read and to find, and it is redundant because groupId already gives context about an artifact.
> I'll make a proposal to reduce the noise in our Maven files.
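To make the naming proposal in the comment above more concrete, here is a minimal Java sketch of how legacy artifact names could map to the proposed coordinates. It is only an illustration: the class `ArtifactNamingSketch`, the method `propose`, and the prefix rules (including the `org.apache.james.protocols` group and the sample artifact names) are assumptions made for the example, not an agreed mapping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: hypothetical rules, not an agreed James naming scheme.
final class ArtifactNamingSketch {
    // Legacy artifactId prefix -> proposed groupId (assumed for the example).
    private static final Map<String, String> RULES = new LinkedHashMap<>();

    static {
        RULES.put("apache-james-mailbox-", "org.apache.james.mailbox");
        RULES.put("james-server-", "org.apache.james.server");
        RULES.put("protocols-", "org.apache.james.protocols");
    }

    // Returns "groupId:artifactId" with the redundant prefix removed from the artifactId.
    static String propose(String legacyArtifactId) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (legacyArtifactId.startsWith(rule.getKey())) {
                String artifactId = legacyArtifactId.substring(rule.getKey().length());
                return rule.getValue() + ":" + artifactId;
            }
        }
        throw new IllegalArgumentException("No naming rule for " + legacyArtifactId);
    }

    public static void main(String[] args) {
        System.out.println(propose("apache-james-mailbox-cassandra"));
        System.out.println(propose("james-server-data-api"));
    }
}
```

With these assumed rules, the groupId carries the category and the artifactId is left with only the distinguishing part, which is the point of the proposal.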
[jira] [Closed] (JAMES-2679) Some Mail instance can have a null name
[ https://issues.apache.org/jira/browse/JAMES-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthieu Baechler closed JAMES-2679.

Resolution: Fixed

> Some Mail instance can have a null name
> ---
>
> Key: JAMES-2679
> URL: https://issues.apache.org/jira/browse/JAMES-2679
> Project: James Server
> Issue Type: Bug
> Affects Versions: 3.2.0
> Reporter: Matthieu Baechler
> Priority: Major
> Fix For: 3.3.0
>
> During a stress test, I encountered this kind of error:
> {code}
> java.lang.NullPointerException: null
>   at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877)
>   at org.apache.james.queue.rabbitmq.view.api.DeleteCondition.withName(DeleteCondition.java:170)
>   at org.apache.james.queue.rabbitmq.Dequeuer.lambda$ack$1(Dequeuer.java:106)
>   at com.github.fge.lambdas.consumers.ThrowingConsumer.accept(ThrowingConsumer.java:22)
>   at org.apache.james.queue.rabbitmq.Dequeuer$RabbitMQMailQueueItem.done(Dequeuer.java:62)
>   at org.apache.james.jmap.send.PostDequeueDecorator.done(PostDequeueDecorator.java:80)
>   at org.apache.james.mailetcontainer.impl.JamesMailSpooler.lambda$run$0(JamesMailSpooler.java:164)
> {code}
> It looks like there's some code path that allows a Mail to have a null name.
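In the stack trace quoted in JAMES-2679, the null name is only detected late, when the dequeued item is acknowledged and `DeleteCondition.withName` rejects it. A minimal sketch of the alternative, rejecting the null when the name is assigned, could look like the following. `NamedMail`, `MailNameSketch` and `ackCondition` are hypothetical names introduced for illustration and are not James classes; the sketch uses `java.util.Objects.requireNonNull` rather than Guava's `Preconditions.checkNotNull`, and it does not claim to be the fix that was actually applied.

```java
import java.util.Objects;

// Hypothetical stand-in for a Mail-like object; not the James Mail API.
final class NamedMail {
    private final String name;

    NamedMail(String name) {
        // Fail fast: reject a null name at construction instead of at ack time.
        this.name = Objects.requireNonNull(name, "mail name must not be null");
    }

    String getName() {
        return name;
    }
}

final class MailNameSketch {
    // Hypothetical equivalent of building a delete condition from a mail name.
    static String ackCondition(NamedMail mail) {
        return "delete-with-name:" + mail.getName();
    }

    public static void main(String[] args) {
        System.out.println(ackCondition(new NamedMail("mail-1")));
        try {
            new NamedMail(null);
        } catch (NullPointerException e) {
            // The error now points at the code path that produced the bad Mail.
            System.out.println("rejected early: " + e.getMessage());
        }
    }
}
```

Failing at construction makes the stack trace point at the code path that created the nameless Mail, which is the information the report was missing.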
[jira] [Commented] (JAMES-2335) Modernize James configuration
[ https://issues.apache.org/jira/browse/JAMES-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139972#comment-17139972 ]

Matthieu Baechler commented on JAMES-2335:
--

OSGi is a good idea, and I like it a lot in theory, but I never found a decent implementation. Java modules bring some of the good parts of OSGi directly into the Java platform. Gradle also helps enforce good isolation between modules.

If you want to try an OSGi implementation, I'm very curious to see how it looks.

BTW, this ticket maybe bundles too much. The important part for me is to split the configuration parsing from the rest of the code, so that we can use a better configuration format and simplify testing.

> Modernize James configuration
> -
>
> Key: JAMES-2335
> URL: https://issues.apache.org/jira/browse/JAMES-2335
> Project: James Server
> Issue Type: Improvement
> Components: configuration
> Reporter: Benoit Tellier
> Priority: Major
> Labels: feature, refactoring
>
> Apache James currently relies on commons-configuration, and thus on XML configuration files.
> As such the configuration process has several problems:
> - Working with XML is boilerplate
> - Working with files leads to a real lack of flexibility.
>   - For instance, in a cluster environment, we would like all the James servers to share the same configuration.
>   - Also, in tests, we need to test the different configuration values. We cannot do this without overwriting files, which is dangerous, and boilerplate.
> What we need is:
> - To represent all possible configuration via Java objects.
> - Configuration providers should be able to convert the stored configuration into the Java configuration object.
> - We should be able to inject different configuration providers from guice/spring.
> It would allow us to specify alternative configuration backends (different formats, different storage techniques) and allow direct injection (for tests for instance).
> Here would be the steps for this work:
> - Add an *Initializable* class in *lifecycle-api*. This should be called by Guice and Spring at initialization.
> - *configure* in Configurable will save a Java object (parse the HierarchicalConfiguration into a Java object representing its content). Initialization will then be done by *Initializable*.
> - Then we can move away, object by object, from the *Configurable* interface: we need to move the configuration parsing into a separate class (behind an interface). We can register *ConfigurationProviders*, with an XML/commons-configuration default implementation.
> - Deprecate *Configurable*.
> - Provide alternative configuration providers, for example a Cassandra stored configuration provider.
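The steps listed in the JAMES-2335 description can be sketched roughly as follows. This is a hedged illustration of the idea (typed configuration object, parsing behind an interface, initialization as a separate step), not the James API: only the names *Initializable* and *ConfigurationProviders* come from the ticket, and their signatures here are assumptions. `SmtpConfiguration`, `MapSmtpConfigurationProvider` and `SmtpComponent` are hypothetical.

```java
import java.util.Map;

// Typed configuration object; hypothetical example, not a James class.
final class SmtpConfiguration {
    final String host;
    final int port;

    SmtpConfiguration(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

// Parsing lives behind an interface so XML, Cassandra or test providers can be swapped.
interface ConfigurationProvider<T> {
    T load();
}

// Lifecycle hook named after the ticket's proposal; the signature is an assumption.
interface Initializable {
    void init();
}

// Provider backed by a plain map, standing in for any storage backend.
final class MapSmtpConfigurationProvider implements ConfigurationProvider<SmtpConfiguration> {
    private final Map<String, String> values;

    MapSmtpConfigurationProvider(Map<String, String> values) {
        this.values = values;
    }

    @Override
    public SmtpConfiguration load() {
        return new SmtpConfiguration(
            values.getOrDefault("host", "localhost"),
            Integer.parseInt(values.getOrDefault("port", "25")));
    }
}

// The component receives a provider by injection and never parses anything itself.
final class SmtpComponent implements Initializable {
    private final ConfigurationProvider<SmtpConfiguration> provider;
    private SmtpConfiguration configuration;

    SmtpComponent(ConfigurationProvider<SmtpConfiguration> provider) {
        this.provider = provider;
    }

    @Override
    public void init() {
        configuration = provider.load();
    }

    String describe() {
        return configuration.host + ":" + configuration.port;
    }
}

final class ConfigurationSketch {
    public static void main(String[] args) {
        SmtpComponent component = new SmtpComponent(
            new MapSmtpConfigurationProvider(Map.of("host", "mail.example.org", "port", "2525")));
        component.init();
        System.out.println(component.describe());
    }
}
```

Because the provider is a single-method interface, a test can inject a configuration directly, for example `new SmtpComponent(() -> new SmtpConfiguration("test", 1))`, without touching any file.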
[jira] [Commented] (JAMES-3225) Provide automated builds for Apache James - (restore builds.apache.org ?)
[ https://issues.apache.org/jira/browse/JAMES-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139970#comment-17139970 ]

Matthieu Baechler commented on JAMES-3225:
--

Our Gradle work is here: https://github.com/mbaechler/james-project/tree/gradle-2

It is not very clean as far as I remember, and quite old now.

I want to comment a bit on Linagora's involvement in James: most of the developers paid by Linagora are also personally involved with the James project. I have spent quite some time contributing to the project on my personal time, and I know at least [~btellier] is also in this situation. We should not be viewed exclusively as a means of exercising Linagora leadership on the project.

Regarding donation of computing power, if Linagora doesn't donate I can also ask some other organizations to do so, but let's first explore the most obvious solutions.

> Provide automated builds for Apache James - (restore builds.apache.org ?)
> --
>
> Key: JAMES-3225
> URL: https://issues.apache.org/jira/browse/JAMES-3225
> Project: James Server
> Issue Type: Task
> Reporter: Ioan Eugen Stan
> Assignee: Ioan Eugen Stan
> Priority: Major
>
> For a long time we had builds that ran on the Apache Infrastructure https://builds.apache.org/view/All/job/james-mailet/ .
> The build infrastructure has not been running for ~3 years now.
> I believe it is important for us to have automated builds.
> This ticket should gather the work needed to make this a reality.
> There are lots of things to take into consideration.
> My ( [~ieugen] ) opinions on how to handle this.
> * builds should run automatically
> * builds should run fast < 10 min
> * there are several things they should do (not exhaustive)
> ** verify the source code
> ** compile the source code
> ** run the unit tests
> ** run the integration tests
> ** publish SNAPSHOTS (only from master or develop ?!)
> ** run code analytics
> ** publish reports relating to build
> ** provide build status for other services
> For smaller projects this is a no-brainer.
> For the current state of Apache James this is a challenge, especially in the context of
> - multiple git branches and PRs
> - the distributed integration tests which take a long time
> Given the limited resources available for us on the Apache infrastructure we will have to be selective about what we do.
> Personally I don't see how we can run the current (40 min+) integration suite on each push / build. I'm pretty sure we will get banned :) or throttled.
> So a discussion should be in order on how to solve these issues, but here are some options regarding what we can do:
> - make integration tests OPT-IN
> - run (distributed) integration tests once a day or once every 6h / 12h
> - have build profiles that build a common subset all the time and run
> The nuclear option: prune some of the components we have in James and don't want to support, or move them out of the common project.
> This is something we should consider especially for buggy components or for components that don't have a maintainer.
> We have limited time and resources.
> We can't maintain everything for everybody.
> We should be mindful of this.
> We can take inspiration from the OFBiz project https://builds.apache.org/view/All/job/Apache%20OFBiz/ .
[jira] [Closed] (JAMES-2796) enhance Docker usage during tests
[ https://issues.apache.org/jira/browse/JAMES-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthieu Baechler closed JAMES-2796.

Resolution: Fixed

The description is a bit too vague to consider it valuable to keep. Let's close it.

> enhance Docker usage during tests
> -
>
> Key: JAMES-2796
> URL: https://issues.apache.org/jira/browse/JAMES-2796
> Project: James Server
> Issue Type: Improvement
> Reporter: Matthieu Baechler
> Priority: Major
>
> Docker is used extensively for testing. However, it grew organically and the codebase can be enhanced/simplified at this point.