[GitHub] [james-project] chibenwa commented on a diff in pull request #1115: JAMES-3775 RSpamD ADR

GitBox Wed, 10 Aug 2022 18:45:18 -0700


chibenwa commented on code in PR #1115:
URL: https://github.com/apache/james-project/pull/1115#discussion_r943039163



##########
src/adr/0055-rspamd-spam-filtering.md:
##########
@@ -0,0 +1,66 @@
+# 55. RSPAMD
+
+Date: 2022-08-10
+
+## Status
+
+Accepted (lazy consensus).
+
+Implemented. 
+
+## Context
+
+In order to anti-spam, James provide 3 mechanism: 

Review Comment:
   ```suggestion
   In order to detect spam, James provides 3 mechanisms: 
   ```



##########
src/adr/0055-rspamd-spam-filtering.md:
##########
@@ -0,0 +1,66 @@
+# 55. RSPAMD
+
+Date: 2022-08-10
+
+## Status
+
+Accepted (lazy consensus).
+
+Implemented. 
+
+## Context
+
+In order to anti-spam, James provide 3 mechanism: 
+
+- SMTP Hook: decide to reject or not before spooling on the SMTP level
+- Mailet: custom the mailet pipeline for changing the mail processing when 
detect spam mail.
+- Mailbox listener:  based on user action, eg: move in/out the message to 
Inbox/Spam mailbox, then decide report spam or not to separate Spam filtering 
system.
+
+For basic, we can base on username, DSN filter, domains, IP... for configuring 
black or grey list.
+If we want to analysis deeper in each message content more, we need a more 
complex system.
+The current, James supports the 
[SpamAssassin](https://spamassassin.apache.org/) for this problem.
+
+For more selection, James's repository provides the RSpamD extension, which 
provides the same way as SpamAssassin but another system - 
[RSpamD](https://github.com/rspamd/rspamd)
+
+A quick introduce about RSpamD:
+
+```
+Rspamd is an advanced spam filtering system and email processing framework 
that allows evaluation of messages by a number of rules including regular 
expressions, statistical analysis and custom services such as URL black lists. 
Each message is analysed by Rspamd and given a verdict that might be used by 
MTA for further processing (e.g. to reject a message, or add a special header 
indicating spam) along with other information, such as possible DKIM signature 
or modifications suggested for a message.
+
+Rspamd can act as a Milter allowing direct interaction with popular MTA 
systems, such as Postfix or Sendmail.
+
+Rspamd is designed to process hundreds of messages per second simultaneously, 
and provides a number of useful features including a comprehensive Lua API that 
allows access to messages processing in various aspects as well as asynchronous 
network API to access external resources, such as DNS, HTTP or even generic 
TCP/UDP services.
+```
+
+## Decision 
+
+Set up a new maven project dedicated to rspamd extension. This allows to be 
embedded in a James server as a soft dependency
+using the external-jar loading mechanism. With this way, the extension could 
be dropped in one's James installation, and not a runtime dependency.
+
+Based on James' support customize mailets, listeners, web admin routes. RSpamD 
extension can be done via:
+
+- `RSpamDScanner` mailet: with each mail income, this mailet will query to 
RSpamD for get result spam or ham. Then append new headers to mail with 
status/flag spam.
+By setup with `IsMarkedAsSpam` matcher, the mail will be action rejected or 
not.
+This mailet will be configured in 
[mailetcontainer.xml](/server/apps/distributed-app/sample-configuration/mailetcontainer.xml).
+
+- Web admin route: to create feeding ham/spam messages task (batch mechanism). 
It helps spam classify learning.
+
+- `RSpamDListener`: the listener will handler mailbox event, based on 
`MessageMoveEvent`, `MailboxEvents.Added` to detecting the mail is spam or ham, 
then report to RSpamD,
+en rich data to RSpamD, thus we will get more exact results in the next query.
+This listener will be configured in 
[mailetcontainer.xml](/server/apps/distributed-app/sample-configuration/listeners.xml).
+
+For connect to RSpamD, we use http protocol with reactor http client. 
+
+## Consequences
+
+- For higher performance, lower latency, the RSpamD should run in same network 
with James.
+- The query to RSpamD will get different score for same message. 
+
+
+## Alternatives
+
+- Rspamd can act as a [milter](https://en.wikipedia.org/wiki/Milter), we can 
use it to replace HTTP call, but more complicated

Review Comment:
   ```suggestion
   - Rspamd can act as a [milter](https://en.wikipedia.org/wiki/Milter), so we could 
use it to replace the HTTP call, but implementing a milter client in James is 
harder.
   ```



##########
src/adr/0055-rspamd-spam-filtering.md:
##########
@@ -0,0 +1,66 @@
+# 55. RSPAMD
+
+Date: 2022-08-10
+
+## Status
+
+Accepted (lazy consensus).
+
+Implemented. 
+
+## Context
+
+In order to anti-spam, James provide 3 mechanism: 
+
+- SMTP Hook: decide to reject or not before spooling on the SMTP level
+- Mailet: custom the mailet pipeline for changing the mail processing when 
detect spam mail.
+- Mailbox listener:  based on user action, eg: move in/out the message to 
Inbox/Spam mailbox, then decide report spam or not to separate Spam filtering 
system.
+
+For basic, we can base on username, DSN filter, domains, IP... for configuring 
black or grey list.
+If we want to analysis deeper in each message content more, we need a more 
complex system.
+The current, James supports the 
[SpamAssassin](https://spamassassin.apache.org/) for this problem.
+
+For more selection, James's repository provides the RSpamD extension, which 
provides the same way as SpamAssassin but another system - 
[RSpamD](https://github.com/rspamd/rspamd)

Review Comment:
   We should explain that we found SpamAssassin hard to operate and that we had 
performance issues with it.



##########
src/adr/0055-rspamd-spam-filtering.md:
##########
@@ -0,0 +1,66 @@
+# 55. RSPAMD
+
+Date: 2022-08-10
+
+## Status
+
+Accepted (lazy consensus).
+
+Implemented. 
+
+## Context
+
+In order to anti-spam, James provide 3 mechanism: 
+
+- SMTP Hook: decide to reject or not before spooling on the SMTP level
+- Mailet: custom the mailet pipeline for changing the mail processing when 
detect spam mail.
+- Mailbox listener:  based on user action, eg: move in/out the message to 
Inbox/Spam mailbox, then decide report spam or not to separate Spam filtering 
system.
+
+For basic, we can base on username, DSN filter, domains, IP... for configuring 
black or grey list.
+If we want to analysis deeper in each message content more, we need a more 
complex system.
+The current, James supports the 
[SpamAssassin](https://spamassassin.apache.org/) for this problem.

Review Comment:
   ```suggestion
   Currently, James is integrated with 
[SpamAssassin](https://spamassassin.apache.org/) to tackle this problem.
   ```



##########
src/adr/0055-rspamd-spam-filtering.md:
##########
@@ -0,0 +1,66 @@
+# 55. RSPAMD
+
+Date: 2022-08-10
+
+## Status
+
+Accepted (lazy consensus).
+
+Implemented. 
+
+## Context
+
+In order to anti-spam, James provide 3 mechanism: 
+
+- SMTP Hook: decide to reject or not before spooling on the SMTP level
+- Mailet: custom the mailet pipeline for changing the mail processing when 
detect spam mail.
+- Mailbox listener:  based on user action, eg: move in/out the message to 
Inbox/Spam mailbox, then decide report spam or not to separate Spam filtering 
system.
+
+For basic, we can base on username, DSN filter, domains, IP... for configuring 
black or grey list.
+If we want to analysis deeper in each message content more, we need a more 
complex system.
+The current, James supports the 
[SpamAssassin](https://spamassassin.apache.org/) for this problem.
+
+For more selection, James's repository provides the RSpamD extension, which 
provides the same way as SpamAssassin but another system - 
[RSpamD](https://github.com/rspamd/rspamd)
+
+A quick introduce about RSpamD:
+
+```
+Rspamd is an advanced spam filtering system and email processing framework 
that allows evaluation of messages by a number of rules including regular 
expressions, statistical analysis and custom services such as URL black lists. 
Each message is analysed by Rspamd and given a verdict that might be used by 
MTA for further processing (e.g. to reject a message, or add a special header 
indicating spam) along with other information, such as possible DKIM signature 
or modifications suggested for a message.
+
+Rspamd can act as a Milter allowing direct interaction with popular MTA 
systems, such as Postfix or Sendmail.
+
+Rspamd is designed to process hundreds of messages per second simultaneously, 
and provides a number of useful features including a comprehensive Lua API that 
allows access to messages processing in various aspects as well as asynchronous 
network API to access external resources, such as DNS, HTTP or even generic 
TCP/UDP services.
+```
+
+## Decision 
+
+Set up a new maven project dedicated to rspamd extension. This allows to be 
embedded in a James server as a soft dependency
+using the external-jar loading mechanism. With this way, the extension could 
be dropped in one's James installation, and not a runtime dependency.
+
+Based on James' support customize mailets, listeners, web admin routes. RSpamD 
extension can be done via:
+
+- `RSpamDScanner` mailet: with each mail income, this mailet will query to 
RSpamD for get result spam or ham. Then append new headers to mail with 
status/flag spam.
+By setup with `IsMarkedAsSpam` matcher, the mail will be action rejected or 
not.
+This mailet will be configured in 
[mailetcontainer.xml](/server/apps/distributed-app/sample-configuration/mailetcontainer.xml).
+
+- Web admin route: to create feeding ham/spam messages task (batch mechanism). 
It helps spam classify learning.
+
+- `RSpamDListener`: the listener will handler mailbox event, based on 
`MessageMoveEvent`, `MailboxEvents.Added` to detecting the mail is spam or ham, 
then report to RSpamD,
+en rich data to RSpamD, thus we will get more exact results in the next query.
+This listener will be configured in 
[mailetcontainer.xml](/server/apps/distributed-app/sample-configuration/listeners.xml).
+
+For connect to RSpamD, we use http protocol with reactor http client. 
+
+## Consequences
+
+- For higher performance, lower latency, the RSpamD should run in same network 
with James.
+- The query to RSpamD will get different score for same message. 

Review Comment:
   Please mention that a distributed mode for RSpamD is possible through the use of Redis.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


