Re[2]: spamassassin with gmail

2024-04-15 Thread Michael Grant via users






> https://isbg.gitlab.io/isbg/index.html
>
> support gmail and spamassassin
>
> other then that i tryed to make a gentoo ebuild for it, have to retry now :)


Yes that's kinda similar!  I'll have to try that!  Thanks.


Re[2]: spamassassin with gmail

2024-04-15 Thread Michael Grant via users

Matija

Sorry, you have misunderstood what I posted.  I am not at all advocating 
people use gmail.  Something like 68% of the planet already uses it and 
few people like you and me have the skills to host our own email.  It's 
not crazy for the people who use gmail or yahoo or other providers, they 
use it, they're used to it, and they apparently like it enough not to 
leave.


It's not easy for people to run their gmail acct through spamassassin.  
Maybe some hack with forwarding and adding headers and a check for 
looping might work.  This isn't what I was really talking about.  But it 
doesn't matter.


Michael Grant


Re[2]: spamassassin with gmail

2024-04-15 Thread Michael Grant via users

From "Marc":
> You can add to this, that gmail actually is also losing email and annoying is 
> that you can't send zip files. I am constantly asking people to give me a 
> different email address.

Yup!  And it's not too difficult to pull messages out of the Spam folder 
and put them back into Inbox.  That is, if the message makes it into the 
Spam folder and isn't rejected.


I don't know if it still works but I have had people send me zip files 
to my gmail acct by renaming them to something like .tip or .zap.  
Frankly it's better to share such potentially large files via a link 
from Dropbox, OneDrive, or any one of a number of similar services.



> I don't like any daemon connecting to my mail storage. Can you imagine if your 
> solution gets hacked, how much data would be compromised? I prefer messages 
> being scanned/marked before stored. I wonder if this is even gdpr compliant, 
> because you can access private data constantly.

First, for people like yourself, you would want to run such a daemon 
yourself on your own infrastructure, which is why I am thinking this 
could be useful to other people as open source.


Second, there are plenty of people who don't run their own email, as in, 
gmail users, who entrust their email to google.  Though GDPR probably 
has something to say about such a service, I doubt it would be 
impossible under GDPR, especially if EU users used a suitable EU server 
and whatever rules were necessary were followed.



> Why not just forward messages? Register a domain put some mx servers in front 
> of gmails mx. I recently was testing with such relay/forward, works perfectly, 
> I am only changing the envelope nothing else. DKIM, spf everyting perfectly 
> working.

I'd be interested to know if anyone runs spamassassin forwarding from 
gmail back into gmail, how does this work?  How to get it so mail isn't 
in a loop?  You can't do what I'm talking about just by forwarding.  
More below on that.



> So for the whole of Europe you need data processing agreement for accessing the 
> mail storage as a 3rd party.

Probably, yes.  Is it any different with a mail server that uses a back 
end scanner as a service?  I know there are several such services for 
corporate email that work with a google workspace account that allows 
you to modify the mail routing which you can't do with a free gmail 
account.



> I think this design is just wrong from the start. I have sometimes that we see 
> that clients mailboxes are accessed from the digitalocean cloud because they 
> granted access via their phone. Especially IOS is really insecure/bad with such 
> privacy. It is just crazy giving access to your whole mailbox for maybe a 1 
> time action on a incoming email.

I wouldn't say the design is ideal but I haven't seen any better way.  I 
didn't find a way to do it by forwarding myself, maybe I missed 
something obvious?  There's no way in consumer gmail to tell gmail to 
loop messages through some external service.  I guess you could forward 
all messages and then use POP to "import" them back in.  You wouldn't be 
able to manipulate folders like the Spam folder or set up spam-training 
and ham-training messages.  I remain unconvinced just forwarding is the 
best way to do this.


You can argue that it's really crazy giving access to your whole mailbox 
to your email provider too.  I guess I don't see the difference here.  
Your mail service provider could be broken into as well.  Read about 
Microsoft's recent break-in?


I'm just wondering if there's enough interest in this to do the work to 
make it open source.  If there were a lot of people mailing me saying 
"Yes!  I've been looking for something like this but I don't want to run 
it myself!", then I'd consider making it into a service, as well as 
probably open sourcing it.  Thing is, such a service has to be minimally 
viable.  So far, you're the only response I've seen to this and your 
response appears to be overwhelmingly negative.


In my own testing of this, my gmail Spam folder varies between 1500 and 
5000 messages at any given time.  Sometimes there's a false positive 
that no matter how many times I tell gmail it's not spam, mail from that 
user ends up in Spam.  I also find gmail is not perfect and it misses 
1-2 spams roughly every day that end up in my inbox.  I have already 
pressed the spam button once this morning.  I've spent quite a bit of 
time pulling down individual false negative messages and running them 
through spamassassin on my server and they almost always get scored 
highly as spam.  So I personally find such a plumbing to be useful.


What I have is a plumbing that does the message manipulation and a bunch 
of other things which are not pertinent.  Some of the hard work is done, 
it would still need some work to release to the world.  Pulling messages 
out and putting them back in is not as easy as it sounds and I can 
honestly say the devil is in the details, but the good news is that part 
now works well.  I am just trying to figure 

spamassassin with gmail

2024-04-15 Thread Michael Grant via users
Do any of you use spamassassin with a gmail account, and if so, how are 
people doing it?  The reason to do this is gmail's spam filtering isn't 
perfect and you don't have the control you have with spamassassin.


We built some plumbing to do this using gmail's API, and also IMAP which 
can work with other services such as yahoo or outlook.  I'm wondering if 
this is of any use to anyone other than myself.


Essentially, it's a daemon that connects to the account and acts as a 
mail client (an MUA).  When messages arrive in a mailbox (could be any 
folder really), sucks out the message, runs it through spamassassin, and 
puts the result either into the Spam folder or Inbox.
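
The scan-and-file step can be sketched roughly as follows.  This is only
an illustration in Python, not the daemon described here: it assumes
spamc is installed and talking to a running spamd, and the function
names and the folder names "Spam" and "INBOX" are placeholders.
Fetching and moving messages over IMAP or the gmail API is left out.

```python
import email
import subprocess
from typing import Callable


def run_spamc(raw: bytes) -> bytes:
    """Pipe one raw message through spamc and return the rewritten message."""
    result = subprocess.run(["spamc"], input=raw, stdout=subprocess.PIPE, check=True)
    return result.stdout


def destination(raw: bytes, scan: Callable[[bytes], bytes] = run_spamc) -> str:
    """Return the (placeholder) folder name a message should be filed in."""
    scanned = email.message_from_bytes(scan(raw))
    # SpamAssassin's default markup adds "X-Spam-Flag: YES" to mail it scores as spam.
    flag = (scanned.get("X-Spam-Flag") or "").strip().upper()
    return "Spam" if flag == "YES" else "INBOX"
```

The scanner is passed in as a function so the filing decision can be
tried without a running spamd.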


I'm just wondering what to do with this plumbing software, if it should 
be open sourced or run as a service.  Running it as a service couldn't 
be free as I don't have access to free servers.  The daemon in its 
current state is a bit complicated to set up on its own but it could 
definitely be cleaned up, especially if there was sufficient interest.


I bet this could also be put together using getmail5 instead of this 
special built daemon but that would imply polling instead of push.  
Several ways to do this.


Michael Grant

Re: Question about forwarding email (not specifically SA, pointers greatly appreciated)

2024-01-03 Thread Michael Grant via users
Here's what I have done in the past from my server to get around this
situation you are having:

1. In my .procmailrc file

:0 c
! exam...@gmail.com

This sends a copy (the c flag in first line) of the message to the
gmail account and leaves a copy in your inbox.

2. From your exam...@gmail.com acct, go to Settings -> Accounts and
Import.  Under the section 'Check email from other accounts', Add an
email account.  Then add your server's account and use POP to suck
over emails as they arrive.  Have it delete the emails once they are
sucked over.

What this does is it causes messages to be forwarded to gmail, but
some small number of them bounce because of whatever decision gmail
makes.  But those messages are popped in later, so there's no lost
mail.  Gmail de-duplicates the messages so you don't get messages
twice, and it never refuses to pop the messages in.  Popping in
messages is slow, so when the forward works (which seems to be most of
the time), mail comes in quick, unless it bounces, in which case, it's
popped in a few minutes, sometimes 10s of minutes, later.

If you are concerned about the bounce messages going back into your
mailbox (gmail doesn't loop here fortunately), you can write a
procmail rule to siphon those off into another folder or into
/dev/null.  (Left as exercise for the reader...)

3. You *may* need to do one further thing: go back into gmail's
Accounts and Import settings, set up 'Send mail as', and configure it
to send mail as your email address on your server.  I can't remember
if gmail does this automatically for you in step 2 above or not.

4. You probably want to then click the radio button "Reply from the
same address to which the message was sent".  Otherwise, when you
reply, it'll come from your gmail address and not your server's email
address. These radio buttons only appear once you have at least one
Send As address set up.
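
For the bounce messages mentioned under step 2, one way to recognise
them before siphoning them off is to check for the standard delivery
status notification content type.  The sketch below is in Python rather
than procmail and is only a heuristic: the function name is made up, and
bounces that are not formatted as multipart/report would be missed.

```python
import email
from email.message import Message


def is_bounce(msg: Message) -> bool:
    """Heuristic: treat RFC 3464 delivery status notifications as bounces."""
    if msg.get_content_type() != "multipart/report":
        return False
    # The report-type parameter distinguishes DSNs from other report kinds.
    report_type = msg.get_param("report-type", "")
    return str(report_type).lower() == "delivery-status"
```

A message for which is_bounce() returns True could then be filed into a
separate folder instead of the inbox.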

Michael Grant


signature.asc
Description: PGP signature


Re: Beginner Setting up Spam Assassin

2023-12-30 Thread Michael Grant via users
Can you ban this user in whatever your equivalent of the access file is so 
instead of putting the messages into a spam folder, you reject messages from 
that address at delivery time (SMTP)?



On 30 December 2023 04:08:17 CET, FalconChristopher wrote:
>Anyone know how I can check and setup SpamAssassin so that I can 
>eliminate some spam from coming in from a email account ?
>
>
>On 12/28/2023 2:24 AM, Matus UHLAR - fantomas wrote:
>> On 27.12.23 16:53, FalconChristopher wrote:
>>> Hi, I want to setup Spam Assassin so that any email that Spam 
>>> Assassin flags as spam
>>
>> this is spamassassin's job
>>
>>> gets placed into a folder for a specific SMTP or IMAP email account.
>>
>> this is not spamassassin's job.
>> It's job of mail delivery agent - procmail, maildrop, sieve
>>
>>> Then if Spam Assassin flags emails that are not spam I can tell it 
>>> which of those emails to not place into the spam folder for the 
>>> specific email client. Until it gradually learns which emails are 
>>> spam and which are not.
>>
>> dovecot (imap/pop3 server) has plugins that support training of 
>> spam/ham, if you move the mail from/to spam folder.
>>
>> https://doc.dovecot.org/configuration_manual/spam_reporting/
>>
>>> I've done a little research and I have access with my distribution to 
>>> a mail directory as well as the local.cf file for which 
>>> configurations are for Spam Assassin but I don't know how to setup 
>>> what I mentioned above ?
>>
>


spamd with mix of real and virtual users

2023-11-04 Thread Michael Grant via users
I'm in the process of setting up virtual users on my mail server.  It
looks like I may have a mix of both real and virtual users.

The flow when scanning a message is:

sendmail -> spamass-milter -> spamc -> spamd

spamass-milter looks at the To: header and passes just the user part.
I see a -e option which causes the whole address (user@domainname) to
be passed to spamc.  cool.

spamc then will pass that verbatim to spamd.

and here's where my problem begins...

If the user exists locally, I want spamd to use that, but if not, I
want it to use the virtual-config-dir.

but to use --virtual-config-dir option requires I specify a -u option
(pin spamd to run as a specific user).

but there's a -U option which causes spamd to fall back to a specific
user.  It would seem like I should be able to specify something like
'-U dovecot-virtual', but no, spamd doesn't allow the -U and
--virtual-config-dir options together.  That seems like an oversight.

I'm wondering if the better solution here is to pull the problem back
a level and have spamass-milter try to look up the local user and fall
back to a fallback user (dovecot-virtual in my case).
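
The lookup-with-fallback idea would amount to something like the
following.  This is a Python sketch of the logic only, not
spamass-milter code: the function names are hypothetical, and it
assumes "local user" means an entry in the system password database.

```python
import pwd
from typing import Callable


def local_user_exists(name: str) -> bool:
    """True if the name is a real account in the system password database."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def spamd_user(recipient: str,
               exists: Callable[[str], bool] = local_user_exists,
               fallback: str = "dovecot-virtual") -> str:
    """Pick the user name to hand to spamc -u for this recipient."""
    local_part = recipient.split("@", 1)[0]
    # Real users are scanned as themselves; everyone else as the fallback user.
    return local_part if exists(local_part) else fallback
```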

Has anyone else tackled this issue?

Michael Grant




Re: check_rbl question

2023-07-07 Thread Michael Grant via users
On Fri, Jul 07, 2023 at 04:50:18PM +0200, giova...@paclan.it wrote:
> if can(Mail::SpamAssassin::Conf::has_tflags_nolog)
>   tflags URIBL_IVMURI net nolog
> else
>   tflags URIBL_IVMURI net
> endif

and Benny Pedersen's idea of using a rule like:

header __FOO eval:check_rbl('ivmSIP-lastexternal', 'my_key.inv-sip.')
meta IVMSIP __FOO
describe IVMSIP listed at dnsbl.invaluement.com/ivmsip
score IVMSIP 5

Neither of these is ideal.  I really need to see what ip address is
being looked up.  Perhaps yes, I'll need to do a feature request.







check_rbl question

2023-07-07 Thread Michael Grant via users
I'm using check_rbl with some paid lists, for example invaluement.  I
don't want to put my license key into the rule or it ends up in the
spamassassin X-Spam-Report header.  On one server, I've configured
bind9 with DNAME records to hide the key.  But what do others do?  Is
there some easier way to do this?

Michael Grant




Re: installing spamassassin plugins on debian

2023-03-17 Thread Michael Grant via users
> you dont need this

I see, I stand corrected!

> maybe ask how to configure extracttext ?

Sure, I'd be happy to see some examples.  The man page looks pretty
straightforward.

I see it depends on some external tools like tesseract and odt2txt so
I had better install those first.

I have not had good luck with tesseract out of the box, I wonder if
there are some options to tune it to make it work better.  Is there
anything better?

To see how well this is working, I am hoping to be able to see the
output of these tools with -D so I can write some rules.

Similarly, is there a way to see the 'body' text that is fed into the
rules?  I don't see that in the output of -D.  By 'body', I mean the
text with the html cleaned out of it plus the subject line.  I have a
message and I want to write a new body rule, I want to see what
spamassassin is using as the 'body' so I can write the regex.  I don't
see the body text in -D.






Re: installing spamassassin plugins on debian

2023-03-17 Thread Michael Grant via users
> I guess you didn't notice that you are actually installing SpamAssassin
> 4.0.0, since that's what you are looking at from CPAN?  It's part of the
> official SA package starting from 4.0.0, not a standalone plugin.

Thank you!  I did not notice that, now I see it's there.  I know why, I
have 2 boxes, one with the older 3.4 and a newer one with 4.0.0.  So
that little problem is now a non-issue!






Re: installing spamassassin plugins on debian

2023-03-17 Thread Michael Grant via users
On Fri, Mar 17, 2023 at 04:03:03PM +0100, Benny Pedersen wrote:
> Michael Grant via users skrev den 2023-03-17 09:52:
> 
> > What do people do to keep things up to date easily?
> 
> i just use gentoo, or freebsd, not a precompiled problems (hehe)
> 
> but what plugin do you need with spamassassin 4 now ?
> 
> are you willing to apt maintain a custom plugin in debian ?, i see no
> problem if you do this :)

I want to try the ExtractText plugin.

What if I just install this from CPAN?  It installs in
/usr/share/perl5/Mail/SpamAssassin/Plugin/ which looks correct.

It was also recommended to me to maybe use cpan2deb and install that, but
then I'm maintaining my own private debian package which I really did
not want to do.  What's wrong with just installing from CPAN in this case?





Re: installing spamassassin plugins on debian

2023-03-17 Thread Michael Grant via users
On Fri, Mar 17, 2023 at 11:26:21AM +0200, Henrik K wrote:
> On Fri, Mar 17, 2023 at 04:52:41AM -0400, Michael Grant via users wrote:
> > Is there a recommended way of installing a spamassassin plugin on
> > debian (or ubuntu) such that the plugin gets updated via say apt?  I'm
> > guessing no because I don't see many spamassassin plugins when I do an
> > "apt search".
> > 
> > Up to now, I have been manually putting things in /etc/spamassassin/
> > but I feel like there has to be a better way to manage these.
> > 
> > What do people do to keep things up to date easily? 
> 
> There is no automated handling of third party plugins.  It's up the
> maintainers to provide or not provide any support.  Which usually just means
> monitoring some github repo etc.

What about CPAN?  Do people use that?  It seems like there are quite a
few modules in CPAN already.  I will admit that if I see a debian
package, I go for that, I rarely if ever install stuff from CPAN but I
could be convinced to use it more if this created some order out of
the chaos.





installing spamassassin plugins on debian

2023-03-17 Thread Michael Grant via users
Is there a recommended way of installing a spamassassin plugin on
debian (or ubuntu) such that the plugin gets updated via say apt?  I'm
guessing no because I don't see many spamassassin plugins when I do an
"apt search".

Up to now, I have been manually putting things in /etc/spamassassin/
but I feel like there has to be a better way to manage these.

What do people do to keep things up to date easily? 




Re: Strange findings debugging bayes results

2023-02-21 Thread Michael Grant via users
On Mon, Feb 20, 2023 at 01:30:15PM -0800, Loren Wilton wrote:
> This is a home system with only a few users. All users have "Spam" and "Ham"
> folders showing up in their email program of choice, and they just drag
> messages they do or don't like into the appropriate folders. There are 
> "Oldham"
> and "Oldspam" mboxes, and the new spam and ham (respectively) get merged into
> these folders after learning, and removed from the current Spam and Ham
> folders.

I had a similar idea but never implemented it because I felt it was
too difficult for users to deal with.  I was considering 2 folders:
'Spam Training Set' and 'Ham Training Set' which would always
represent the set of messages that Spamassassin was currently trained
with.  If you changed the contents of these mboxes, a cron job would
delete the old bayes tokens and retrain with the current set.

The difference between these folders and the Spam folder (or Junk or
whatever you call it locally) is that messages older than 30 days get
auto-deleted.  After 30 days, those messages would no longer represent
the training set.

Having 2 spam folders is confusing and not easy to manage.

Neither of these 2 extra folders is a folder where users would look for
messages, so they really do have to copy messages into them, which is
more work than just dragging them.  That for me was the main issue I
faced.

So I abandoned this line of thinking.

You mentioned harvesting ham and spam from mboxes as in from the inbox
directly.  This got me wondering more about this.

Clearly, messages that the user dragged to Spam that spamassassin did
not mark as spam can be used to train as spam.  Easy.

And use messages that the user left in their mailbox or deleted or
archived as ham.  Could be ok but less sure.

And lastly, messages that were in Spam (since Spamassassin marked them
as spam), that a user moved out of Spam.  Just look through all their
folders (except Spam) for messages that Spamassassin marked as spam
and retrain on those as ham.  Again, maybe a bad assumption, could
work though.
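
The three cases above boil down to a small decision function.  The
following Python sketch only illustrates that logic: the function name
and the folder name "Spam" are placeholders, and returning nothing for
mail that spamassassin marked and the user left in Spam is my
assumption, since that case teaches bayes nothing new.

```python
from typing import Optional


def training_label(folder: str, marked_spam: bool,
                   spam_folder: str = "Spam") -> Optional[str]:
    """Guess how a message should be fed to bayes, or None to skip it."""
    in_spam = folder == spam_folder
    if in_spam and not marked_spam:
        return "spam"  # user dragged a missed spam into Spam
    if not in_spam and marked_spam:
        return "ham"   # user moved a false positive out of Spam
    if not in_spam:
        return "ham"   # left in a mailbox or archived: weaker evidence
    return None        # marked as spam and still in Spam: nothing new to learn
```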

I was really just curious to know if other people had workable ideas
how to get bayes trained with the least amount of friction.




Re: Strange findings debugging bayes results

2023-02-20 Thread Michael Grant via users
On 20 February 2023 12:28:00 CET, Loren Wilton wrote:
>
> A cron job that will harvest Spam and Ham mboxes and feed them to sa-learn 
> once a day, then archive the learned messages. Per-user bayes and learning. 
> Mail is hand-moved into the spam and ham learning folders, and for my  
> personal account, I do this rarely, generally only when a message is 
> mis-categorized. Although messages being mis-categorized as spam is often the 
> result of a lot of quite aggressive local rules I have rather than a Bayes 
> mis-classification.

When you "harvest" ham from mboxes, what do you consider ham?

You also, additionally, have a Ham folder for your users then? Interesting. Did 
you manage to train your users to use it easily? Does it grow unbounded or are 
old messages removed from it?  If so, how do you know when they can be deleted, 
as with the Spam folder?

It's an interesting idea, just wondering about the details.  Getting my users 
to train spamassassin has always been impossible for me.

Re: URIDNSBL full message checking

2023-02-08 Thread Michael Grant via users
> You can test with:
> 
> header SURBL_MULTI_HDR eval:check_hashbl_emails('multi.surbl.org',
> 'raw/max=10/shuffle/host', 'ALLFROM/Reply-To', '^127\.0\.0\.\d+$')
> priority   SURBL_MULTI_HDR   -100
> describe   SURBL_MULTI_HDR   Domain in email headers found in
> surbl multi

Raymond, thank you!  This works.

But I'm having an issue using this with multi.surbl.org and
multi.uribl.com.  The response addr needs to be bit-masked.  The \d+
in 127.0.0.\d+ is in fact a bitmap.

If I want to assign different scores for different entries in their
databases, I'd need to mask the \d+.  Is there an easier way to do
this than the following?

header URIBL_BLACK eval:check_hashbl_emails('multi.uribl.com', 
'raw/max=10/shuffle/host', 'ALLFROM/Reply-To', 
'^127\.0\.0\.(2|3|6|7|10|11|14|15|18|19|22|23|26|27|30|31|34|35|38|39|42|43|46|47|50|51|54|55|58|59|62|63|66|67|70|71|74|75|78|79|82|83|86|87|90|91|94|95|98|99|102|103|106|107|110|111|114|115|118|119|122|123|126|127|130|131|134|135|138|139|142|143|146|147|150|151|154|155|158|159|162|163|166|167|170|171|174|175|178|179|182|183|186|187|190|191|194|195|198|199|202|203|206|207|210|211|214|215|218|219|222|223|226|227|230|231|234|235|238|239|242|243|246|247|250|251|254)$')

check_uridnsbl() handles bitmaps with the urirhssub parameter (the "2" below):

urirhssub   URIBL_BLACK multi.uribl.com.    A   2

Is there something like the mask arg in urirhssub with check_hashbl?
I did have a look at the source of check_hashbl but I couldn't spot it
right off.  I get the feeling there's got to be a more straightforward
way than above!
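
Failing a mask argument, the long alternation can at least be generated
rather than typed.  The Python sketch below builds the regex for any
bit mask; the function name is made up, and it assumes a match should
fire when any masked bit is set in the last octet, which is how I
understand the urirhssub numeric subtest to behave.

```python
import re


def mask_pattern(mask: int, prefix: str = "127.0.0.") -> str:
    """Build a regex matching return codes whose last octet has any mask bit set."""
    octets = [str(n) for n in range(256) if n & mask]
    return "^" + re.escape(prefix) + "(" + "|".join(octets) + ")$"


if __name__ == "__main__":
    # Paste the printed pattern into the check_hashbl_emails() rule.
    print(mask_pattern(2))
```

Unlike the hand-written list above, the generated pattern also includes
255.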

Michael Grant




Re: URIDNSBL full message checking

2023-02-06 Thread Michael Grant via users
On Mon, Feb 06, 2023 at 04:16:46PM -0500, Bill Cole wrote:
> On 2023-02-06 at 12:50:29 UTC-0500 (Mon, 6 Feb 2023 17:50:29 +)
> Michael Grant via users 
> is rumored to have said:
> 
> > I’m noticing that check_uridnsbl() seems only to check the message body.
> > Is there some way to make it check the headers as well?
> 
> No. Which is fine, because there are usually no URIs in headers, and when
> there are, they are likely to be standard List-* headers, which are unlikely
> to be useful.

It's actually just a domain name.  This uridnsbl keys off domain names
in the body too, I was kinda hoping it would look at the domain names
in the headers like the body, guess not.

> You can obviously use 'full' or the 'all' pseudo-header and look for
> specific domains, but identifying everything in the header that COULD be a
> domain and just testing that against a DNSBL designed for domains found in
> URIs could have very bad failure modes.

How about just, say, the From or Received headers?  Is there something
like check_rbl that looks up a domain name rather than an ip address,
so that I could look up the domain in that URIBL list?

I played with check_rbl() but this seems only to look up numeric ip
addresses.

Michael Grant




URIDNSBL full message checking

2023-02-06 Thread Michael Grant via users
I’m noticing that check_uridnsbl() seems only to check the message body.  Is 
there some way to make it check the headers as well?

In 25_uribl.cf, I have:

urirhssub   URIBL_BLACK multi.uribl.com.    A   2
body        URIBL_BLACK eval:check_uridnsbl('URIBL_BLACK')
describe    URIBL_BLACK Contains an URL listed in the URIBL blacklist
tflags      URIBL_BLACK net
reuse       URIBL_BLACK

First obvious thing I tried was changing ‘body’ to ‘full’ in the above.  It 
continues to check only the body.  In fact, changing it to ‘header’, it 
continues to check the body.  I then read through the man page on URIDNSBL and 
it does clearly state a ‘body’ rule.

Is there some clever way to have a URIDNSBL rule check the header of a message 
as well?  Or is there something else I can use separately that would look up a 
domainname in the header section of an email?

Michael Grant


Providing my own body text parts function

2023-01-20 Thread Michael Grant via users
In a body rule, SA uses the textual body of the message. 

From the docs: "The 'body' in this case is the textual parts of the message 
body; any non-text MIME parts are stripped, and the message decoded from 
Quoted-Printable or Base-64-encoded format if necessary.
The message Subject header is considered part of the body and becomes the first 
paragraph"

Is there a way I could provide my own function (override SA's internal 
function) to produce this textual representation myself?

Michael Grant


Re: subscribe to blacklist for domains

2022-08-14 Thread Michael Grant via users
> WTF, that has been a terrible idea since the 90s, given most spam is 
> spoofed, the end result of this will be your mail server getting the 
> poor reputation as source of backscatter and going into blacklists :)

If you reject, you should reject on their SMTP connection.  If you
return a DSN later, there's a high chance you are causing back-scatter
spam to the wrong place.

When you reject on the initial connection, if the spammer is abusing
someone else's infrastructure, you may cause errors to go back to the
owner of that infrastructure which will clue them into a problem they
need to clean up.  Not always though.

Some ESPs track DSNs they get back and remove those addresses from
future mailouts.  If the spammer reuses that ESP, your address may not
be used again with that account.  This is really more useful for
fringe spam like things you didn't realize you signed up for or things
that weren't meant for you.

On the other hand, some ESPs let you report the account as spam, but
to do that you'd have to have received the message first to click on
some link in it.  Mailchimp for example lets you click a box to be
removed and tell them you consider it spam and if they get sufficient
complaints, the account is blocked.

In short, I don't think it's bad to reject spam.  Care needs to be
taken when blanket-blocking mail from ESPs, though.


