Re: ChatGPT > Spamassassin? :)

2024-06-28 Thread Marcin Mirosław

On 2024-06-25 15:55, John Hardin wrote:
> On Mon, 24 Jun 2024, Mark London wrote:
>
>> I received a spam email with the text below that wasn't caught by
>> SpamAssassin (at least mine). The text actually looks like something
>> that was generated using ChatGPT. In any event, I put the text
>> through ChatGPT and asked if it looked like spam. At the bottom of
>> this email is its analysis. I've not been fully reading this
>> group. Has there been any work to allow SpamAssassin to use AI?
>> Thanks. - Mark
>
> In a very limited manner. There is code in the repo that allows you to
> set up ham and spam corpora and scan the spam corpora to pick out
> common phrases and filter them via the ham corpora, then create
> rules based on the phrases and (IIRC) combinations of them.
>
> This was being used to generate dynamic fraud rulesets (the "sought"
> rules, still somewhat there as ADVANCE_FEE rules which I occasionally
> manually update) until Justin Mason left the project. It's been
> languishing since, as he was providing the resources (infra and
> maintenance) to run it for those rules. I was feeding those corpora for
> a long time.
>
> Take a look in the repo at the stuff under:
>
>   https://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-dev/
>   https://svn.apache.org/viewvc/spamassassin/trunk/masses/evolve_metarule/
>
> I don't know whether the project would be willing to set up infra to
> revive dynamic advance fee fraud (or more general) rule generation, but
> it's possible if someone was willing to bring that code up-to-date and
> figure out what was needed and corpora providers were available.



This code still works, at least for me. I'm using my own corpora.


Re: txrep - why it inserts some strange values to db?

2020-12-13 Thread Marcin Mirosław
On 2020-12-12 at 17:56, RW wrote:
> On Thu, 10 Dec 2020 17:40:42 +0100
> Marcin Mirosław wrote:
> 
>> Hi!
>> I use spamassassin 3.4.4 and I try txrep. When I run sa-learn --spam 
>>  it put six tuples to database (postgres):
>> # select * from txrep ;
> ...
>> this doesn't look correctly, none of this tuple is 100% correct. 
> 
> Hopefully this version isn't wrapped:
> 
>  username | email                                                 | ip        | msgcount | totscore | signedby   | last_hit
> ----------+-------------------------------------------------------+-----------+----------+----------+------------+----------------------------
>  nobody   | y...@multifinansowanie.com.pl                         | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.830758
>  nobody   | multifinansowanie.com.pl                              | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.83376
>  nobody   | slot10.multifinansowanie.com.pl                       | none      |        1 |       20 | helo       | 2020-12-10 17:34:25.836672
>  nobody   | y...@multifinansowanie.com.pl                         | none      |        1 |       20 |            | 2020-12-10 17:34:25.840392
>  nobody   | 5.199.143.45                                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.843831
>  nobody   | f2c484bc9daccb07db6497f57fd18a7f0a1e29fa@sa_generated | none      |        2 |       40 | 1606432596 | 2020-12-10 17:34:25.850803
> 
> 
> I don't use TxRep and I've not looked at a TxRep database of either version.
> However my concerns would be: 
> 
> 1. msgcount=2 in the last line (assuming the database was previously empty)

The database was empty and I ran `sa-learn --spam ` only once.

> 2. The absence of either DKIM or SPF entries. 
> 
> The truncated IP address "5.199.143" is correct if you have
> "txrep_ipv4_mask_len 24". The helo and epoch time in the 
> signedby column look OK. 

IMHO the "signedby" column should hold the domain (d=) or selector (s=)
from DKIM.


> The use of the username nobody suggests you are running sa-learn as root.

The username is forced in the configuration to be the same as the one sent by the MTA.

> Part of the reason I don't use TxRep is that I have no confidence 
> in its correctness.

In my opinion this looks completely useless. How does it look in other databases?
Marcin
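
The truncated "5.199.143" that RW attributes to "txrep_ipv4_mask_len 24" can be reproduced with a few lines of Python. This is only a sketch of the masking arithmetic, not TxRep's actual code; the function name mask_ipv4 and the choice to drop the fully masked trailing octets are assumptions made to match the value seen in the table.

```python
import ipaddress


def mask_ipv4(address: str, mask_len: int = 24) -> str:
    """Apply a prefix length to an IPv4 address and keep only the covered octets.

    Hypothetical helper: it mimics the "5.199.143" form seen in the txrep
    table, it is not taken from the TxRep plugin.
    """
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{address}/{mask_len}", strict=False)
    octets = str(network.network_address).split(".")
    # Number of octets that are at least partly covered by the mask.
    kept = (mask_len + 7) // 8
    return ".".join(octets[:kept])


print(mask_ipv4("5.199.143.45"))  # 5.199.143
```

With a mask length of 24 every address in 5.199.143.0/24 maps to the same key, which is why the reputation rows for the sender carry the shortened address.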



txrep - why it inserts some strange values to db?

2020-12-10 Thread Marcin Mirosław

Hi!
I use SpamAssassin 3.4.4 and am trying TxRep. When I run sa-learn --spam 
 it puts six tuples into the database (Postgres):

# select * from txrep ;
 username | email                                                 | ip        | msgcount | totscore | signedby   | last_hit
----------+-------------------------------------------------------+-----------+----------+----------+------------+----------------------------
 nobody   | y...@multifinansowanie.com.pl                         | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.830758
 nobody   | multifinansowanie.com.pl                              | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.83376
 nobody   | slot10.multifinansowanie.com.pl                       | none      |        1 |       20 | helo       | 2020-12-10 17:34:25.836672
 nobody   | y...@multifinansowanie.com.pl                         | none      |        1 |       20 |            | 2020-12-10 17:34:25.840392
 nobody   | 5.199.143.45                                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.843831
 nobody   | f2c484bc9daccb07db6497f57fd18a7f0a1e29fa@sa_generated | none      |        2 |       40 | 1606432596 | 2020-12-10 17:34:25.850803
(6 rows)

This doesn't look correct; none of these tuples is 100% correct.
The original message's headers, with the destination address redacted:
https://pastebin.com/JxfNm7LE


Regards,
Marcin


Re: SA 3.4.3 and mimeheader check

2019-12-29 Thread Marcin Mirosław
On 2019-12-28 at 21:07, RW wrote:
> On Sat, 28 Dec 2019 20:06:04 +0100
> Marcin Mirosław wrote:
> 
>> Hello,
>> it looks something changed in mimeheader check. After upgrade to 3.4.3
>> this rule throws error:
>>
>> warn: config: SpamAssassin failed to parse line, "__LRM_BAD_ZIPSRULE
>> Content-Type:raw =~
>> /^application\/zip;.*xls\s\.zip|^application/zip;.*\.zip\"/" is not
>> valid for "mimeheader", skipping: mimeheader __LRM_BAD_ZIPSRULE
>> Content-Type:raw =~
>> /^application\/zip;.*xls\s\.zip|^application/zip;.*\.zip\"/
>>
>> What changed in the way how mimeheader works?
> 
> It has better validation. You are missing a backslash in
> "^application/zip".
> 

Indeed, thank you!
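
The missing backslash RW points out can be found mechanically: inside a /.../ rule, any "/" that is not preceded by a backslash ends the pattern early. The following is a minimal sketch of such a check, not SpamAssassin's own validation code; the helper name unescaped_delimiters is made up for illustration.

```python
from typing import List


def unescaped_delimiters(pattern: str, delimiter: str = "/") -> List[int]:
    """Return the positions of delimiter characters that are not backslash-escaped."""
    positions = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if pattern[i] == delimiter:
            positions.append(i)
        i += 1
    return positions


# Pattern bodies (without the surrounding slashes) from the rule in this thread.
broken = r'^application\/zip;.*xls\s\.zip|^application/zip;.*\.zip\"'
fixed = r'^application\/zip;.*xls\s\.zip|^application\/zip;.*\.zip\"'

print(unescaped_delimiters(broken))  # one stray "/" in the second alternative
print(unescaped_delimiters(fixed))   # empty list
```

Running a check like this over local rule files before an upgrade is one way to catch patterns that older versions accepted silently.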



SA 3.4.3 and mimeheader check

2019-12-28 Thread Marcin Mirosław
Hello,
it looks like something changed in the mimeheader check. After upgrading to 3.4.3
this rule throws an error:

warn: config: SpamAssassin failed to parse line, "__LRM_BAD_ZIPSRULE
Content-Type:raw =~
/^application\/zip;.*xls\s\.zip|^application/zip;.*\.zip\"/" is not
valid for "mimeheader", skipping: mimeheader __LRM_BAD_ZIPSRULE
Content-Type:raw =~
/^application\/zip;.*xls\s\.zip|^application/zip;.*\.zip\"/

What changed in how mimeheader works?

Marcin


Re: running a private SA-Mirror

2019-05-01 Thread Marcin Mirosław
On 2019-05-01 at 10:05, A. Schulze wrote:
> Hello,
> 
> we've a number of SA instances that need rule updates. For now we configured 
> them to use a proxy. Works...
> But there are also instances that can't us a proxy at all.
> 
> My idea was to setup a private SA-Mirror (apache+rsync) but, I've to manage
> DNS-Data for mirrors.spamassassin-mirror.example and 
> 2.3.4.spamassassin-mirror.example.
> :-/
> 
> Are there other methods to distribute current ruleset to SA-instances using 
> sa-update?

Hi,
I'm using rbldnsd as the DNS server (because I have my own RBL), so my
script can't be used 1:1. Maybe you can adapt it to your environment.


unbound-control flush_zone sa.mejor.pl # flush the resolver cache
# read the currently published rules version from DNS
current_version="$(dnsget -q -t txt 0.4.3.sa.mejor.pl)" || { echo "Error: can't get current rules version" ; exit 1; }

set -e
# refuse to publish rules that do not lint
spamassassin --lint
cd /etc/spamassassin/sa.mejor.pl
new_version="$((${current_version}+1))"
# build the new tarball and its checksums under temporary names
tar --owner=spamassassin --group=spamassassin -czf "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.new *cf
sha1sum "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.new >> "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha1.new
sha256sum "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.new >> "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha256.new
# remove old versions
rm -f //sa.mejor.pl/htdocs/sa-updates/*.tar.gz
rm -f //sa.mejor.pl/htdocs/sa-updates/*.tar.gz.sha1
rm -f //sa.mejor.pl/htdocs/sa-updates/*.tar.gz.sha256
# move the new files into place
mv "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.new "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz
mv "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha1.new "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha1
mv "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha256.new "//sa.mejor.pl/htdocs/sa-updates/${new_version}".tar.gz.sha256

echo "Updating DNS"
cat << EOF > /var/db/rbldnsd/sa.mejor.pl.zone
\$TTL 60
\$NS 7200 rb.mejor.pl.

*.4.3   ${new_version}
*.3.3   ${new_version}

mirrors http://update.sa.mejor.pl/MIRRORED.BY
:193.33.111.90:
update
EOF



(You can use rndc to update bind, if you use bind)
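
The zone above answers the version lookup through the wildcard entries "*.4.3" and "*.3.3". Judging from the sa-update output elsewhere in this archive ("DNS TXT query: 2.4.3.sa.mejor.pl" for SA 3.4.2), the query name is the reversed version number prepended to the channel name. A small sketch of that naming, with made-up helper names and no claim to match sa-update's internals:

```python
from fnmatch import fnmatch


def update_query_name(sa_version: str, channel: str) -> str:
    """Build the TXT query name: reversed version labels followed by the channel."""
    reversed_version = ".".join(reversed(sa_version.split(".")))
    return f"{reversed_version}.{channel}"


def covered_by_wildcard(query_name: str, channel: str, wildcard: str) -> bool:
    """Check whether a zone entry such as '*.4.3' would answer the query name."""
    prefix = query_name[: -(len(channel) + 1)]  # strip ".<channel>"
    return fnmatch(prefix, wildcard)


name = update_query_name("3.4.2", "sa.mejor.pl")
print(name)                                               # 2.4.3.sa.mejor.pl
print(covered_by_wildcard(name, "sa.mejor.pl", "*.4.3"))  # True
```

The wildcards mean a single zone line serves every 3.4.x (or 3.3.x) client, so only the version number has to be rewritten on each publish.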


Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Marcin Mirosław
On 20.03.2019 at 15:27, Dominic Raferd wrote:
> On Wed, 20 Mar 2019 at 13:14, piecka  wrote:
>>
>> Hello
>>
>> We've encountered a high false positive rate with MIXED_ES rule for emails
>> written in Czech language. Czech naturally uses all of the e,ě and é.
>>
>> The situation is similar for Slovak language, which includes e and é.
>>
>> It seems the same with Greek
>> (https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7691).
>>
>> Email messages written in one of the above mentioned (probably even other)
>> languages have a much higher false positive rate than I would consider
>> acceptable.
>>
>> Additionally, the default score for the rule is 3.999 which is quite high.
>>
>> I don't think the rule is suitable for the default ruleset in the current
>> form.
> 
> I have seen similar problems and agree. I reduced its score with this
> line in /etc/spamassassin/local.cf:
> score MIXED_ES 0.499
> 


MIXED_ES has ham hits in masscheck:
https://ruleqa.spamassassin.org/20190317-r1855682-n/MIXED_ES/detail
Some of the ham mails in the corpus that trigger MIXED_ES are in Polish.






Re: sa-update not properly parsing urls in MIRRORED.BY files?

2019-01-10 Thread Marcin Mirosław
On 2019-01-10 at 12:05, Kevin A. McGrail wrote:
> I believe this is a known issue fixed in svn.  We need to get 3.4.3 out
> the door for this.  Are you able to test with the 3.4 branch from svn?

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7623




Re: [SA 3.4.2] sa-update doesn't see custom channel

2018-12-20 Thread Marcin Mirosław
On 19.12.2018 at 16:16, Kris Deugau wrote:
> RW wrote:
>> It looks like sa-update has lost support for paths in mirror URLs. The
>> SA mirrors don't currently have paths, but the commented-out dostech
>> entry suggests that they have been supported in the past.
> 
> I came across this myself since my local channels also use
> subdirectories.  It's fixed for the pending 3.4.3 (I think) and in trunk
> as per https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7623.

Hi,
thank you for the link to the bug. I'll test the fix.
Marcin
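
The symptom in this thread (a DNS lookup for "update.sa.mejor.pl/sa-updates") is what happens when the path of a mirror URL is not separated from the host before resolving it. The sketch below only illustrates the intended split using Python's standard library; it is not the code from the bug fix, and mirror_host_and_path is a made-up name.

```python
from urllib.parse import urlsplit


def mirror_host_and_path(mirror_url: str):
    """Split a MIRRORED.BY entry into the host to resolve and the path to fetch from."""
    parts = urlsplit(mirror_url)
    return parts.hostname, parts.path


host, path = mirror_host_and_path("http://update.sa.mejor.pl/sa-updates/")
print(host)  # update.sa.mejor.pl
print(path)  # /sa-updates/
```

Only the host part should ever reach the resolver; the path belongs to the later HTTP request.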


Re: [SA 3.4.2] sa-update doesn't see custom channel

2018-12-19 Thread Marcin Mirosław
On 03.12.2018 at 15:42, Marcin Mirosław wrote:
> Hi!
> I have problem with sa-update and my own channel. sa-update queries for
> A record of strange domain:
> 
> # /usr/bin/sa-update --channel sa.mejor.pl --no-gpg -vv
> DNS TXT query: 2.4.3.sa.mejor.pl -> 3209
> Update available for channel sa.mejor.pl: -1 -> 3209
> DNS A query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
> DNS  query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
> channel: could not find working mirror, channel failed
> Update failed, exiting with code 4
> 
> and this is what logged local resolver:
> 2018-12-03T15:35:42.613624+01:00 jowisz unbound: [8540:0] info:
> 127.0.0.1 update.sa.mejor.pl?sa-updates. A IN
> 2018-12-03T15:35:42.617145+01:00 jowisz unbound: [8540:0] info:
> 127.0.0.1 update.sa.mejor.pl?sa-updates.  IN
> 
> Why sa-update queries for update.sa.mejor.pl?sa-updates (or
> update.sa.mejor.pl/sa-updates) domain?
> 
> I just run sa-update in debug mode, I paste relevant parts:
> [...]
> Dec  3 15:40:10.955 [24739] dbg: channel: attempting channel sa.mejor.pl
> Dec  3 15:40:10.955 [24739] dbg: channel: using existing directory
> /var/lib/spamassassin/3.004002/sa_mejor_pl
> Dec  3 15:40:10.955 [24739] dbg: channel: channel cf file
> /var/lib/spamassassin/3.004002/sa_mejor_pl.cf
> Dec  3 15:40:10.955 [24739] dbg: channel: channel pre file
> /var/lib/spamassassin/3.004002/sa_mejor_pl.pre
> DNS TXT query: 2.4.3.sa.mejor.pl -> 3209
> Dec  3 15:40:10.966 [24739] dbg: dns: 2.4.3.sa.mejor.pl => 3209, parsed
> as 3209
> Update available for channel sa.mejor.pl: -1 -> 3209
> Dec  3 15:40:10.967 [24739] dbg: channel: preparing temp directory for
> new channel
> Dec  3 15:40:10.967 [24739] dbg: channel: created tmp directory
> /tmp/.spamassassin24739FTCF1ttmp
> Dec  3 15:40:10.967 [24739] dbg: generic: lint checking site pre files
> once before attempting channel updates
> [...]
> Dec  3 15:40:11.189 [24739] dbg: channel: protocol family available:
> inet,inet6
> Dec  3 15:40:11.189 [24739] dbg: channel: reading MIRRORED.BY file
> /var/lib/spamassassin/3.004002/sa_mejor_pl/MIRRORED.BY
> Dec  3 15:40:11.189 [24739] dbg: channel: parsing MIRRORED.BY file for
> channel sa.mejor.pl
> Dec  3 15:40:11.189 [24739] dbg: channel: found mirror
> http://update.sa.mejor.pl/sa-updates/
> Dec  3 15:40:11.193 [24739] dbg: dns: query failed:
> update.sa.mejor.pl/sa-updates => NXDOMAIN
> DNS A query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
> Dec  3 15:40:11.194 [24739] dbg: dns: query failed:
> update.sa.mejor.pl/sa-updates => NXDOMAIN
> DNS  query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
> Dec  3 15:40:11.195 [24739] dbg: generic: reject mirror
> http://update.sa.mejor.pl/sa-updates: no common address family (IPv4 IPv6)
> channel: could not find working mirror, channel failed
> 
> # cat /var/lib/spamassassin/3.004002/sa_mejor_pl/MIRRORED.BY
> http://update.sa.mejor.pl/sa-updates/
> 
> Something changed how channel should be configured beetwen 3.4.1 and 3.4.2?
> 


Hi,
any ideas what could be wrong?
Marcin




[SA 3.4.2] sa-update doesn't see custom channel

2018-12-03 Thread Marcin Mirosław
Hi!
I have a problem with sa-update and my own channel. sa-update queries for
the A record of a strange domain:

# /usr/bin/sa-update --channel sa.mejor.pl --no-gpg -vv
DNS TXT query: 2.4.3.sa.mejor.pl -> 3209
Update available for channel sa.mejor.pl: -1 -> 3209
DNS A query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
DNS  query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
channel: could not find working mirror, channel failed
Update failed, exiting with code 4

and this is what the local resolver logged:
2018-12-03T15:35:42.613624+01:00 jowisz unbound: [8540:0] info:
127.0.0.1 update.sa.mejor.pl?sa-updates. A IN
2018-12-03T15:35:42.617145+01:00 jowisz unbound: [8540:0] info:
127.0.0.1 update.sa.mejor.pl?sa-updates.  IN

Why does sa-update query for the update.sa.mejor.pl?sa-updates (or
update.sa.mejor.pl/sa-updates) domain?

I just ran sa-update in debug mode; here are the relevant parts:
[...]
Dec  3 15:40:10.955 [24739] dbg: channel: attempting channel sa.mejor.pl
Dec  3 15:40:10.955 [24739] dbg: channel: using existing directory
/var/lib/spamassassin/3.004002/sa_mejor_pl
Dec  3 15:40:10.955 [24739] dbg: channel: channel cf file
/var/lib/spamassassin/3.004002/sa_mejor_pl.cf
Dec  3 15:40:10.955 [24739] dbg: channel: channel pre file
/var/lib/spamassassin/3.004002/sa_mejor_pl.pre
DNS TXT query: 2.4.3.sa.mejor.pl -> 3209
Dec  3 15:40:10.966 [24739] dbg: dns: 2.4.3.sa.mejor.pl => 3209, parsed
as 3209
Update available for channel sa.mejor.pl: -1 -> 3209
Dec  3 15:40:10.967 [24739] dbg: channel: preparing temp directory for
new channel
Dec  3 15:40:10.967 [24739] dbg: channel: created tmp directory
/tmp/.spamassassin24739FTCF1ttmp
Dec  3 15:40:10.967 [24739] dbg: generic: lint checking site pre files
once before attempting channel updates
[...]
Dec  3 15:40:11.189 [24739] dbg: channel: protocol family available:
inet,inet6
Dec  3 15:40:11.189 [24739] dbg: channel: reading MIRRORED.BY file
/var/lib/spamassassin/3.004002/sa_mejor_pl/MIRRORED.BY
Dec  3 15:40:11.189 [24739] dbg: channel: parsing MIRRORED.BY file for
channel sa.mejor.pl
Dec  3 15:40:11.189 [24739] dbg: channel: found mirror
http://update.sa.mejor.pl/sa-updates/
Dec  3 15:40:11.193 [24739] dbg: dns: query failed:
update.sa.mejor.pl/sa-updates => NXDOMAIN
DNS A query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
Dec  3 15:40:11.194 [24739] dbg: dns: query failed:
update.sa.mejor.pl/sa-updates => NXDOMAIN
DNS  query update.sa.mejor.pl/sa-updates failed: NXDOMAIN
Dec  3 15:40:11.195 [24739] dbg: generic: reject mirror
http://update.sa.mejor.pl/sa-updates: no common address family (IPv4 IPv6)
channel: could not find working mirror, channel failed

# cat /var/lib/spamassassin/3.004002/sa_mejor_pl/MIRRORED.BY
http://update.sa.mejor.pl/sa-updates/

Did something change in how a channel should be configured between 3.4.1 and 3.4.2?

Marcin



Re: (was: FORGED_HOTMAIL_RCVD2 false positive) Can't locate object method "check_for_forged_gmail_received_headers" via package "Mail::SpamAssassin::PerMsgStatus" at (eval 1360) line 1587.

2018-02-01 Thread Marcin Mirosław
On 30.01.2018 at 14:51, Kevin A. McGrail wrote:
> On 1/30/2018 4:11 AM, Marcin Mirosław wrote:
>> Can error pasted below be related to this commit?
> 
> Yes, without a doubt the same bug.

Hi!
I'm answering with one email: thanks for your answers, sa-update works
fine now.

Have a nice day



Re: (was: FORGED_HOTMAIL_RCVD2 false positive) Can't locate object method "check_for_forged_gmail_received_headers" via package "Mail::SpamAssassin::PerMsgStatus" at (eval 1360) line 1587.

2018-01-30 Thread Marcin Mirosław
On 29.01.2018 at 08:26, Giovanni Bechis wrote:
> On 01/29/18 06:00, Alex wrote:
>> Hi,
>>
>>> FORGED_HOTMAIL_RCVD2 (hotmail.com 'From' address, but no 'Received:')
>>> triggers for valid hotmail messages...  (SA 3.4.1)
>>>
>>> This small change solves the problem but i do not know whether it is the
>>> correct way...maybe "hotmail" string should be changed widelly to
>>> "outlook|hotmail"...
>>>
>>> /usr/local/share/perl/5.14.2/Mail/SpamAssassin/Plugin/HeaderEval.pm.orig
>>> 357c357
>>> <   if ($rcvd =~ /from \S*\.hotmail.com \(\[$IP_ADDRESS\][ \):]/ && $ip)
>>> ---
>>> >   if ($rcvd =~ /from \S*\.(?:outlook|hotmail)\.com \(\[$IP_ADDRESS\][ \):]/ && $ip)
>>
>> Any status on this? I believe you were going to open a bug report? It
>> doesn't appear this fix (or any fix) has been included to address the
>> hotmail fp's.
>>
> Committed yesterday by davej@
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7534


Hi!
Can the error pasted below be related to this commit?
# sa-update -vD
[...]
sty 30 10:10:00.540 [3276] dbg: FreeMail: RULE (__freemail_reply)
check_freemail_replyto
sty 30 10:10:00.540 [3276] dbg: FreeMail: From address:
ign...@compiling.spamassassin.taint.org
sty 30 10:10:00.540 [3276] dbg: FreeMail: No Reply-To and From is not
freemail, skipping check
rules: failed to run FORGED_GMAIL_RCVD test, skipping:
(Can't locate object method
"check_for_forged_gmail_received_headers" via package
"Mail::SpamAssassin::PerMsgStatus" at (eval 1360) line 1587.
)
sty 30 10:10:00.540 [3276] dbg: rules: running body tests; score so
far=0.914
[...]
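
For reference, the effect of the one-line HeaderEval.pm patch quoted earlier in this message can be tried out in isolation. The sketch below translates the two Perl patterns to Python; the IPV4 expression is a simplified stand-in for SpamAssassin's $IP_ADDRESS, and both Received fragments are invented samples, not real headers.

```python
import re

# Simplified stand-in for SpamAssassin's $IP_ADDRESS (assumption: IPv4 only).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Pattern before the patch: only *.hotmail.com relays are recognised.
OLD_PATTERN = re.compile(r"from \S*\.hotmail.com \(\[" + IPV4 + r"\][ \):]")
# Pattern after the patch: *.outlook.com relays are recognised as well.
NEW_PATTERN = re.compile(r"from \S*\.(?:outlook|hotmail)\.com \(\[" + IPV4 + r"\][ \):]")

# Hypothetical Received fragments using documentation IP ranges.
hotmail_hop = "from bay004-omc1s1.hotmail.com ([198.51.100.7]) by mx.example.org"
outlook_hop = "from nam01-obe.outbound.protection.outlook.com ([203.0.113.9]) by mx.example.org"

for label, hop in (("hotmail", hotmail_hop), ("outlook", outlook_hop)):
    print(label, bool(OLD_PATTERN.search(hop)), bool(NEW_PATTERN.search(hop)))
```

The old pattern misses the outlook.com relay, which is the case that produced the FORGED_HOTMAIL_RCVD2 false positives discussed above.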



Re: mk_meta_rule_scores - does it work correctly?:)

2017-03-15 Thread Marcin Mirosław

On 2017-03-14 16:23, Kris Deugau wrote:
> mar...@mejor.pl wrote:
>> Hi!
>> Thanks to AXB seek-phrases-in-log works OK. Now I'm on the next
>> step with automated rule creation.
>> I suspect that mk_meta_rule_scores doesn't assign scores correctly. I
>> set in mk_meta_rule_scores:
>> my %scoremap = (
>>   '70' => '1.5',
>>   '4' => '2.0',
>>   '0.01' => '3.0',
>> );
>
> If I understand correctly (quite possible I don't;  I haven't dug in
> to the internals of this stage), the %scoremap hash above indicates
> the percentage of the messages that need to hit on a subrule for that
> subrule to be included in one of the meta rules.
>
> So, 70% of the mail in the set would need to hit on a subrule for it
> to be included in the first group, 4% in the second, and 0.01% in the
> third.
>
> If I read the information flow correctly, this is actually decided by
> seek-phrases-in-log, which spits out subrules that reached a certain
> hit rate in blocks, followed by the "# passed hit-rate threshold nnn"
> line. mk_meta_rule_scores just takes that in, collects the rule names
> in each block, and spits out the meta.


I ran some tests and watched the output to understand how some
parameters work. It seems to me that "--reqhitrate" works this way:
a) if --reqhitrate contains only one value, the output of
seek-phrases-in-log contains only rules that hit more often than the
value passed to --reqhitrate. So this cuts off rules that hit rarely.

b) if --reqhitrate contains more than one value, the values act as
boundaries and the rules are grouped into the bands between them.

example:
--reqhitrate "70 10 1" gives:
<100%> - no rules here - <70%> - rules that match less than 70% of
spam - <10%> - rules that match less than 10% of spam and more than 1% -
<1%> - cut off, no rules here
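
The banding described in the example above can be written down as a small function. This only models the behaviour as observed in this message, it is not taken from seek-phrases-in-log; the function name hit_rate_band and the treatment of values that sit exactly on a boundary are assumptions.

```python
from typing import List, Optional


def hit_rate_band(hit_rate: float, thresholds: List[float]) -> Optional[int]:
    """Return the index of the band a hit rate (in percent) falls into.

    Band 0 is at or above the highest threshold, band 1 lies between the
    first and second threshold, and so on. Rates below the lowest
    threshold are cut off (None).
    """
    ordered = sorted(thresholds, reverse=True)
    for index, threshold in enumerate(ordered):
        if hit_rate >= threshold:
            return index
    return None


for rate in (80.0, 35.0, 5.0, 0.5):
    print(rate, hit_rate_band(rate, [70, 10, 1]))
```

With "70 10 1" a rule hitting 35% of the spam lands in the middle band and a rule hitting 0.5% is dropped, which matches the output described above.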



> (In fact the stock setup refers to mk_meta_rule, although it's nearly
> identical to mk_meta_rule_scores.)
>
> By raising the hit percentage to 70%, you're requiring that 70% of the
> spam you're using must hit on one of the subrules.  TBH, by that
> point, you may as well hand-extract a couple of the subrules and make
> them static, standalone rules.


Have you tried mk_meta_rule_scores, and did you get more than two score
values? I get only the default and the value from the middle range. I
suspect that mk_meta_rule_scores doesn't play well with ranges. I can
live with it, but if there is a bug somewhere I would try to report it.
Even if it doesn't get fixed, a report can save some time for other
users trying to use this script.




> My experience has been that you need lots of mail to generate multiple
> metas in any case;  I've taken a different tack and separated out
> different groups of mail for different generated rule sets instead.



Thanks,
Marcin


Re: new powerful plugin

2016-08-17 Thread Marcin Mirosław
Hi!
On 16.08.2016 at 17:36, Nicola Piazzi wrote:
> It is difficoult to write a doc of what this plugin that I wrote do
> But here is the ow.cf file, so you can see what this plugin do

You didn't attach ow.(cf|pm).

> It can be used ONLY when box is the same for send and receive emails
> What do you think about it ?someone want to have to try ?

And when outgoing emails are going through SA. It looks like it is enough
for the outgoing MTA to insert into the database; it doesn't have to be
the same box.
I like this idea, but I have never had the skills to write a Perl plugin.
I don't use MySQL, so I can't use your plugin without modification, but it
probably wouldn't be hard to change the DBI setup and queries for PostgreSQL.
Marcin




Re: How to make uridnsbl to not stripping subdomains?

2015-06-28 Thread Marcin Mirosław
On 2015-06-28 at 15:57, Axb wrote:
> On 28.06.2015 15:17, Marcin Mirosław wrote:
>> Hi!
>> I've got simple rule with eval:check_uridnsbl to make check against own
>> uribl. And notice that uribl strips subdomains from uri so instead
>> querying for sub4.sub3.sub2.sub1.org.myuribl spamassassin makes query
>> for sub1.org.myuribl. But I prefer to query for full domain, without any
>> striping. Doc says:
>> " Note that hostnames are stripped from the domain used in the URIBL
>> lookup"
>> How can I change this behavior in my rule?
> Using the URIBL plugin you can't unless you use
> util_rb_2tld / util_rb_3tld to create pseudo TLDs.
> 
>> And why URIBL strips not only hostname but all subdomains level above
> foo.tld?
> 
> efficiency - there are probablky few cases where you want to list the
> subdomain chain and not the parent domain and spend time chasing the
> subdomains.
> 
> if theres's a domain which you don't want to list but only its
> subdomains you can add it as "util_rb_2tld example.net" to a local rules
> file. Your URI bl will then contain subdomains.
> 
> or: look into the ASKDns plugin which allows you do all kinds of magic
> around DNS queries.


Thanks for the tips. ASKDns looks good to me, but I have no idea how to
build a rule that queries for all URIs inside an email.
askdns LR_TEST   A /127\.0\.0\.2/
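
To make the stripping discussed in this thread concrete, here is a rough model of how the lookup domain is derived and how a util_rb_2tld entry changes it. It is a simplification written for this archive: real SpamAssassin consults its own TLD lists, while this sketch assumes a single-label TLD unless the suffix is listed in pseudo_tlds (a made-up parameter name).

```python
from typing import FrozenSet


def uribl_lookup_domain(hostname: str, pseudo_tlds: FrozenSet[str] = frozenset()) -> str:
    """Return the domain that would be looked up for a URI hostname.

    By default only the last two labels are kept. If a longer suffix is
    listed as a pseudo TLD (as with util_rb_2tld), one more label is kept
    in front of that suffix.
    """
    labels = hostname.lower().rstrip(".").split(".")
    keep = 2  # assumption: plain single-label TLD such as .org
    for size in range(len(labels) - 1, 0, -1):
        if ".".join(labels[-size:]) in pseudo_tlds:
            keep = size + 1
            break
    return ".".join(labels[-keep:])


host = "sub4.sub3.sub2.sub1.org"
print(uribl_lookup_domain(host))                           # sub1.org
print(uribl_lookup_domain(host, frozenset({"sub1.org"})))  # sub2.sub1.org
```

Note that even with the pseudo TLD only one extra label survives, so this does not give the full sub4.sub3.sub2.sub1.org lookup asked for in the original question.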




How to make uridnsbl to not stripping subdomains?

2015-06-28 Thread Marcin Mirosław
Hi!
I've got a simple rule with eval:check_uridnsbl to check against my own
URIBL. I noticed that URIBL strips subdomains from the URI, so instead of
querying for sub4.sub3.sub2.sub1.org.myuribl SpamAssassin queries for
sub1.org.myuribl. But I prefer to query for the full domain, without any
stripping. The doc says:
" Note that hostnames are stripped from the domain used in the URIBL lookup"
How can I change this behavior in my rule? And why does URIBL strip not only
the hostname but all subdomain levels above foo.tld?
Thanks



Re: Ignoring Received: header added by real MTA

2015-05-11 Thread Marcin Mirosław
On 06.05.2015 at 14:46, Kevin A. McGrail wrote:
> On 5/5/2015 3:56 PM, Marcin Mirosław wrote:
>> On 2015-05-05 at 21:47, Kevin A. McGrail wrote:
>>> On 5/5/2015 3:38 PM, Marcin Mirosław wrote:
>>>> I'm thinking about removing all Received headers from email except
>>>> added
>>>> by my MTA, storing it, sending email to spamd and restoring headers.
>>>> But
>>>> it looks like using a sledgehammer to crack a nut:)
>>> What RBL are you concerned about specifically because some RBLs do deep
>>> header parsing which you can change with lastexternal?
>> RCVD_IN_SBL_CSS. It looks that "lastexternal" is what I'm looking for.
>> Is it possible to add '-lastexternal' to all RBL?
> Different RBLs are designed differently so it's not a one size fits all
> question & answer.
> 
> CSS is designed for ISPs...  http://www.spamhaus.org/css/
> 
> As such, a deep header parsing might be appropriate.


Hi!
Thank you all for the explanations; "deep header inspection" is the reason
for this behavior. The good news is that my SA configuration is correct:)
Marcin
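
The difference between deep header parsing and "lastexternal" that this thread keeps returning to can be shown on the two relays from the X-RelaysUntrusted header in the original post. This is a sketch of the idea only, not SpamAssassin's implementation; the function name is made up and the internal network 192.0.2.0/24 is a placeholder.

```python
import ipaddress
from typing import Iterable, List, Optional


def last_external_relay(relay_ips: List[str], internal_networks: Iterable[str]) -> Optional[str]:
    """Return the first relay (newest hop first) that is outside the internal networks.

    That is the host which handed the message to our own infrastructure,
    the only address a "lastexternal" style check looks at.
    """
    networks = [ipaddress.ip_network(net) for net in internal_networks]
    for ip in relay_ips:
        address = ipaddress.ip_address(ip)
        if not any(address in net for net in networks):
            return ip
    return None


# Relays in the order they appear in X-RelaysUntrusted (newest first).
relays = ["89.161.182.208", "31.61.129.221"]

print(last_external_relay(relays, ["192.0.2.0/24"]))  # 89.161.182.208
print(relays)  # deep parsing would consider every address in this list
```

A deep-parsing list such as SBL-CSS is checked against both addresses, which is why 31.61.129.221 showed up in the report even though it never connected to poczta.cibet.pl directly.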


Re: Ignoring Received: header added by real MTA

2015-05-05 Thread Marcin Mirosław
On 2015-05-05 at 22:07, Benny Pedersen wrote:
> Marcin Mirosław wrote on 2015-05-05 21:21:
> 
>>>> My goal is to configure SA to not check IP of client (in this example
>>>> 31.61.129.221).
>>>
>>> Can you elaborate about what's going on here? What do the two hand-overs
>>> represent? What do you mean by "real MTA"?
>>
>> Thanks for both answers. I'll try to describe it using ascii art:
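>> +-----------------------------+             +--------------------------+
>> | random user sending email   | sends email | 89.161.182.208 from this |
>> | (in my case: 31.61.129.221) |------------>| MTA I'm getting email    |
>> +-----------------------------+             +--------------------------+
>>
>>      +--------------------------+
>> ---->| my MTA - poczta.cibet.pl |
>>      +--------------------------+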
>>
>>
>> So it's not important for my if address 31.61.129.221 is on any rbl
>> because I'm not getting email directly from this ip. It's important for
>> me if server 89.161.182.208 (which directly connects to my mta) is in
>> any RBL. I'd like SA to check only ip which diectly connects to my
>> server against RBL.
> 
> please show the problem in spamassassin
> 
> are 31.61.129.221 a smtp auth user ?, in this case you should NOT add
> this ip to trusted_networks since the client ip would be your server ip
> in spamassassin

In 99% of cases, yes. The header with IP 31.61.129.221 was added by an
external MTA, so I can't trust it 100%.

> spamassassin -D -t sample-msg-file 2>&1 | less
> 
> in less press s to save test results, post this results headers so we
> can help solve it, what mta are you using ?, and how is spamassassin
> used in mta ?

In my first email I sent the report from SA. 31.61.129.221 is no longer
listed by Spamhaus SBL-CSS, so I suspect that pasting another report from
SA would only cause more confusion.
I'm thinking about how to describe my problem in a different way...








Re: Ignoring Received: header added by real MTA

2015-05-05 Thread Marcin Mirosław
On 2015-05-05 at 21:47, Kevin A. McGrail wrote:
> On 5/5/2015 3:38 PM, Marcin Mirosław wrote:
>> I'm thinking about removing all Received headers from email except added
>> by my MTA, storing it, sending email to spamd and restoring headers. But
>> it looks like using a sledgehammer to crack a nut:)
> What RBL are you concerned about specifically because some RBLs do deep
> header parsing which you can change with lastexternal?

RCVD_IN_SBL_CSS. It looks like "lastexternal" is what I'm looking for.
Is it possible to add '-lastexternal' to all RBLs?





Re: Ignoring Received: header added by real MTA

2015-05-05 Thread Marcin Mirosław
On 2015-05-05 at 21:28, Reindl Harald wrote:
> 
> 
> On 05.05.2015 at 21:21, Marcin Mirosław wrote:
>> Thanks for both answers. I'll try to describe it using ascii art:
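>> +-----------------------------+             +--------------------------+
>> | random user sending email   | sends email | 89.161.182.208 from this |
>> | (in my case: 31.61.129.221) |------------>| MTA I'm getting email    |
>> +-----------------------------+             +--------------------------+
>>
>>      +--------------------------+
>> ---->| my MTA - poczta.cibet.pl |
>>      +--------------------------+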
>>
>>
>> So it's not important for my if address 31.61.129.221 is on any rbl
>> because I'm not getting email directly from this ip. It's important for
>> me if server 89.161.182.208 (which directly connects to my mta) is in
>> any RBL
> 
> and who's MTA is 89.161.182.208?

It's not my MTA. It's an MTA used by someone out in the world.

> if it's a known machine realying mail for you it *is* important if
> 31.61.129.221 is on a RBL - hence put 89.161.182.208 in trusted_networks

I'm thinking about removing all Received headers from email except added
by my MTA, storing it, sending email to spamd and restoring headers. But
it looks like using a sledgehammer to crack a nut:)



Marcin




Re: Ignoring Received: header added by real MTA

2015-05-05 Thread Marcin Mirosław
On 2015-05-05 at 20:29, RW wrote:
> On Tue, 05 May 2015 10:51:22 +0200
> Marcin Mirosław wrote:
> 
>> Hi!
>> I'm ashamed to ask because this problem is like boomerang but still
>> can't find solution. I'm reading e.g.:
>> http://spamassassin.1065346.n5.nabble.com/How-to-ignore-multiple-Received-headers-td52450.html
>>
>> I set trusted_networks but I can't see how can it helps (and it
>> doesn't work as I wish). I'm using Exim and I'm connecting to spamd
>> to check mail status. Headers of problematic email are:
>>
>> Delivery-date: Tue, 05 May 2015 08:22:49 +0200
>> Received: from v034244.home.net.pl ([89.161.182.208])
>> by poczta.cibet.pl with smtp (Exim 4.84)
>> (envelope-from )
>> id 1YpWFk-0001BY-EC
>> for spamt...@cibet.pl; Tue, 05 May 2015 08:22:49 +0200
>> Received: from public-gprs514716.centertel.pl (31.61.129.221) (HELO
>> Tos) by yyy.home.pl (89.161.182.208) with SMTP
>> (IdeaSmtpServer v0.80) id ea61105d60d70d9d; Tue, 5 May 2015 08:22:44
>> +0200
>>
>> ...
>> My goal is to configure SA to not check IP of client (in this example
>> 31.61.129.221).
> 
> Can you elaborate about what's going on here? What do the two hand-overs
> represent? What do you mean by "real MTA"?

Thanks for both answers. I'll try to describe it using ascii art:
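+-----------------------------+             +--------------------------+
| random user sending email   | sends email | 89.161.182.208 from this |
| (in my case: 31.61.129.221) |------------>| MTA I'm getting email    |
+-----------------------------+             +--------------------------+

     +--------------------------+
---->| my MTA - poczta.cibet.pl |
     +--------------------------+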


So it's not important to me whether address 31.61.129.221 is on any RBL,
because I'm not getting email directly from this IP. What matters to me
is whether server 89.161.182.208 (which connects directly to my MTA) is
on any RBL. I'd like SA to check against RBLs only the IP that connects
directly to my server.
Marcin






Ignoring Received: header added by real MTA

2015-05-05 Thread Marcin Mirosław
Hi!
I'm ashamed to ask because this problem keeps coming back like a boomerang,
but I still can't find a solution. I'm reading, e.g.:
http://spamassassin.1065346.n5.nabble.com/How-to-ignore-multiple-Received-headers-td52450.html

I set trusted_networks but I can't see how it can help (and it doesn't
work as I wish). I'm using Exim and I'm connecting to spamd to check
mail status. The headers of the problematic email are:

Delivery-date: Tue, 05 May 2015 08:22:49 +0200
Received: from v034244.home.net.pl ([89.161.182.208])
by poczta.cibet.pl with smtp (Exim 4.84)
(envelope-from )
id 1YpWFk-0001BY-EC
for spamt...@cibet.pl; Tue, 05 May 2015 08:22:49 +0200
Received: from public-gprs514716.centertel.pl (31.61.129.221) (HELO Tos)
 by yyy.home.pl (89.161.182.208) with SMTP (IdeaSmtpServer v0.80)
 id ea61105d60d70d9d; Tue, 5 May 2015 08:22:44 +0200

and SA report for this:
X-Spam-Report: X-Spam-ASN:  AS12824 89.161.128.0/17
 X-Szczegoly:(mohikanin.in.cibet.pl)(6.9 points)
  pts rule name                description
 ---- ------------------------ --------------------------------------------------
  0.8 RCVD_IN_SORBS_WEB        RBL: SORBS: sender has an abused web server
                               [31.61.129.221 listed in dnsbl.sorbs.net]
 -1.9 BAYES_00                 BODY: Bayes spam probability is 0 to 1%
                               [score: 0.0011]
  2.1 HTML_IMAGE_ONLY_12       BODY: HTML: image and 1000-1200 bytes of words
  0.0 HTML_MESSAGE             BODY: Message contains HTML code
  3.3 RCVD_IN_SBL_CSS          RBL: Received via a relay in Spamhaus SBL-CSS
                               [31.61.129.221 listed in zen.spamhaus.org]
  0.0 HTML_SHORT_LINK_IMG_2    HTML is very short with a linked image
  1.0 KAM_HTMLNOISE            Spam containing useless HTML padding
  0.0 LR_RCVD_NOT_IN_IPREPDNS  Sender not listed at
                               http://www.chaosreigns.com/iprep/
  0.6 LR_SHORT                 Has URI and short body
  1.0 KAM_LAZY_DOMAIN_SECURITY Sending domain does not have any
                               anti-forgery methods
  0.0 T_REMOTE_IMAGE           Message contains an external image


My goal is to configure SA so that it does not check the IP of the client
(in this example 31.61.129.221).
I've been reading https://wiki.apache.org/spamassassin/TrustPath ;
_RELAYSUNTRUSTED_ gives: X-RelaysUntrusted [ ip=89.161.182.208
rdns=v034244.home.net.pl helo=v034244.home.net.pl by=poczta.cibet.pl
ident= envfrom=x...@y.pl intl=0 id=1YpWFk-0001BY-EC auth= msa=0 ] [
ip=31.61.129.221 rdns=public-gprs514716.centertel.pl
helo=public-gprs514716.centertel.pl by=hosttelekom.home.pl ident=
envfrom= intl=0 id=ea61105d60d70d9d auth= msa=0 ]

but I still can't find which configuration would give me the behavior I need.
Thanks for any advice.
Marcin
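
To make the TrustPath output above easier to read, here is a minimal Python
sketch (not part of SpamAssassin) that splits a _RELAYSUNTRUSTED_ string into
hops. It assumes the format shown above: one "[ key=value ... ]" group per
relay, with the host that connected to poczta.cibet.pl listed first. The
function name parse_relays is made up for illustration.

```python
import re

# The X-RelaysUntrusted value quoted in the message above.
RELAYS = (
    "[ ip=89.161.182.208 rdns=v034244.home.net.pl helo=v034244.home.net.pl "
    "by=poczta.cibet.pl ident= envfrom=x...@y.pl intl=0 id=1YpWFk-0001BY-EC "
    "auth= msa=0 ] [ ip=31.61.129.221 rdns=public-gprs514716.centertel.pl "
    "helo=public-gprs514716.centertel.pl by=hosttelekom.home.pl ident= "
    "envfrom= intl=0 id=ea61105d60d70d9d auth= msa=0 ]"
)


def parse_relays(value):
    """Split a relays string into one dict per [ ... ] group (hypothetical helper)."""
    hops = []
    for group in re.findall(r"\[([^\]]*)\]", value):
        fields = {}
        for token in group.split():
            key, _, val = token.partition("=")
            fields[key] = val
        hops.append(fields)
    return hops


hops = parse_relays(RELAYS)
direct = hops[0]  # assumption: the first group is the directly connecting host
print("directly connecting relay:", direct["ip"], "seen by", direct["by"])
print("earlier hops:", [hop["ip"] for hop in hops[1:]])
```

In this example the RBL hits in the report are for the second hop
(31.61.129.221), not for the host that actually connected.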



Re: Forex spam from botnet

2015-02-25 Thread Marcin Mirosław
On 2015-02-25 at 19:17, Benny Pedersen wrote:
> On February 25, 2015 2:55:16 PM Marcin Mirosław  wrote:
> 
>> http://pastebin.com/bAm2yk8z , http://pastebin.com/6zLjMtM8 .
> 
> blacklist_uri_host businessanalyse.be
> blacklist_uri_host 143businesssecrets.com
> 
> and blacklist_from domains that have spf-pass
> 

The spam is sent from compromised hosting accounts and the domains change
frequently, so it takes too much time to add them to a blacklist. And it
takes time to remove them from the blacklist once someone fixes the PHP site.


Forex spam from botnet

2015-02-25 Thread Marcin Mirosław
Hi!
As I mentioned earlier, I'm (and not only me, but also other users and
postmasters in Poland) getting a lot of spam from a botnet. Usually it gets
high scores, but from time to time a spam is delivered to the mailbox.
Because this spam is sent to many mail servers, I think it could be worth
creating an official or half-official ;) rule for other users. Two samples:
http://pastebin.com/bAm2yk8z , http://pastebin.com/6zLjMtM8 . If more
samples are needed I can send them offlist. For now, this rule:

body   LR_PL_INNE2_B /\[.{1,10}http:\/\/.{1,80}php\?.{1,30}\=.{1,30}\].{0,20}(klikaj|odwiedz|wchodz)/i

catches it.
Marcin
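
For readers who want to experiment with the pattern outside SpamAssassin,
here is a rough Python sketch of the same regular expression. The two test
strings are made up (they are not taken from the pastebin samples), and
Python's re module is only an approximation of how SA applies body rules to
rendered message text.

```python
import re

# Same pattern as the body rule above; Python needs no "/" delimiters,
# so the slashes and "=" are left unescaped.
LR_PL_INNE2_B = re.compile(
    r"\[.{1,10}http://.{1,80}php\?.{1,30}=.{1,30}\].{0,20}(klikaj|odwiedz|wchodz)",
    re.IGNORECASE,
)

# Hypothetical lines, invented for this illustration.
spammy = "[ zobacz http://example.com/index.php?id=123 ] Klikaj tutaj"
hammy = "Zobacz http://example.com/index.php?id=123 i odwiedz nas"

print(bool(LR_PL_INNE2_B.search(spammy)))  # bracketed link followed by a keyword
print(bool(LR_PL_INNE2_B.search(hammy)))   # no square brackets around the link
```

The square brackets around the link are what keeps the rule narrow: a plain
link followed by "odwiedz" does not match.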


Re: Lots of Polish spam

2015-02-24 Thread Marcin Mirosław
On 2015-02-24 at 21:28, Yves Goergen wrote:
> On 24.02.2015 at 19:56, Axb wrote:
>> - Please post missed spam samples in pastebin.com - do not post samples
>> to mailing lists
> 
> It's too many to process them individually in pastebin. Here's an
> archive with ~60 messages in files:
> 
> https://drive.google.com/file/d/0B8CN0ghdY1SdSzBqdkswRUdOb0U/view
> 
> ZIP password: spam
> (Google thinks there's a virus in it so I needed to encrypt it.)

--- SCAN SUMMARY ---
Known viruses: 4360435
Engine version: 0.98.5
Scanned directories: 0
Scanned files: 58
Infected files: 30

This is with a bunch of unofficial databases for clamav, without foxhole
mentioned by Axb.
With foxhole rules:
--- SCAN SUMMARY ---
Known viruses: 4360690
Engine version: 0.98.5
Scanned directories: 0
Scanned files: 58
Infected files: 50

Imho you should take a look at clamav configuration to reject such emails.
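
A small Python sketch to put the two summaries side by side. It only parses
the text quoted above; the helper names are made up for this example.

```python
def parse_summary(text):
    """Turn the 'Key: number' lines of a clamscan summary into a dict of ints."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            result[key.strip()] = int(value)
    return result


def detection_rate(summary):
    """Percentage of scanned files that were reported as infected."""
    return 100.0 * summary["Infected files"] / summary["Scanned files"]


without_foxhole = parse_summary("""--- SCAN SUMMARY ---
Known viruses: 4360435
Engine version: 0.98.5
Scanned directories: 0
Scanned files: 58
Infected files: 30
""")

with_foxhole = parse_summary("""--- SCAN SUMMARY ---
Known viruses: 4360690
Engine version: 0.98.5
Scanned directories: 0
Scanned files: 58
Infected files: 50
""")

for name, summary in (("without foxhole", without_foxhole),
                      ("with foxhole", with_foxhole)):
    print(f"{name}: {detection_rate(summary):.1f}% of files flagged")
```

With the numbers above this is 30 of 58 files without the foxhole databases
and 50 of 58 with them.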




Re: Lots of Polish spam

2015-02-24 Thread Marcin Mirosław
On 2015-02-24 at 19:22, Yves Goergen wrote:
> On 24.02.2015 at 19:00, Jeremy McSpadden wrote:
>> You're better off implementing RBL at SMTP time, not SA. IMO
>> Which MTA are you using ?
> 
> Exim. But why should I do that? See my other message in this thread.
> RBLs make mistakes. But then, only one of them makes the mistake, not all.
> 
> Are RBLs the only measure to fight spam today? How do these lists learn
> spam quickly if there is no other way to detect it?
> 
> I'm not sure whether RBLs help here. These are some of the reports of
> recent messages:


I'm guessing that you are getting botnet spam. I've been getting thousands
of them per day for a couple of weeks.
http://pastebin.com/6zLjMtM8




Re: Lots of Polish spam

2015-02-24 Thread Marcin Mirosław
On 2015-02-24 at 19:56, Axb wrote:
[...]
> - Please post missed spam samples in pastebin.com - do not post samples
> to mailing lists

Yes, please share it, I'll take a look at what kind of spam it is.


Re: Malware Patrol SA Rules

2015-01-11 Thread Marcin Mirosław
P.S.2. All grepping was done in a directory that contains only spam.


Re: Malware Patrol SA Rules

2015-01-11 Thread Marcin Mirosław
On 2015-01-11 at 04:49, Reindl Harald wrote:
> 
> On 10.01.2015 at 22:07, Marcin Mirosław wrote:
>> On 2015-01-10 at 15:27, Reindl Harald wrote:
>>>
>>> On 10.01.2015 at 15:19, David Flanigan wrote:
>>>> Is anyone using the Malware Patrol 3rd party Spamassassin Rules
>>>> (https://www.malwarepatrol.net/index.shtml)?
>>>>
>>>> I have downloaded and looked them over and, in concept, they look
>>>> pretty
>>>> good.
>>>>
>>>> However the cf file is over 8.5megs (yes megs) in size. By far the
>>>> biggest ruleset I have. I cannot think this would do good things for
>>>> performance.
>>>>
>>>> Any experience, comments, etc?
>>>
>>> 8.5 MB SA rules is crazy
>>>
>>> that really belongs to clamav directly after SA because SA eats more
>>> http://sanesecurity.com/usage/signatures/
>>
>> Imho clamav needs less CPU power than SA (and needs less time to scan
>> email) so I think it's better to use clamav before SA
> 
> that is true *but* after a few months it turned out that ClamAV don't
> catch that much mail which was killed by the SA milter after it and so
> 90% of all messages need to pass both - for the overall system so it
> makes more sense to have SA in front

I forgot one important thing: I'm using unofficial rules for ClamAV.
My stats, counted since 2014-12, are:
$ grep -r "X-ACL-Warn: Virus found" 201412* 201501*|wc -l
18314
$ find 201412* 2015* -type f|wc -l
33170
$ grep -hr "X-ACL-Warn: Virus found" 201412* 201501*|grep -c UNOFFICIAL
18288
$ grep -hr "X-ACL-Warn: Virus found" 201412* 201501*|grep -vc UNOFFICIAL
26
$ grep -hr "X-ACL-Warn: Virus found" 201412* 201501*|grep -v UNOFFICIAL|sort|uniq
X-ACL-Warn: Virus found / znaleziono wirusa :Heuristics.Phishing.Email.SpoofedDomain
X-ACL-Warn: Virus found / znaleziono wirusa :Heuristics.Safebrowsing.Suspected-malware_safebrowsing.clamav.net
X-ACL-Warn: Virus found / znaleziono wirusa :Heuristics.Safebrowsing.Suspected-phishing_safebrowsing.clamav.net
X-ACL-Warn: Virus found / znaleziono wirusa :Zip.Suspect.ExecutablePhoto-zippwd-2
$ grep -hr "X-ACL-Warn: Virus found" 201412* 201501*|grep UNOFFICIAL|grep -viE "(Junk|url|Spam)"|sort|uniq -c
  1 X-ACL-Warn: Virus found / znaleziono wirusa :BofhlandMWFile1302.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :BofhlandMWFile1306.UNOFFICIAL
  7 X-ACL-Warn: Virus found / znaleziono wirusa :BofhlandMWFile1356.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Porcupine.Malware.29046.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Porcupine.Malware.29327.UNOFFICIAL
  2 X-ACL-Warn: Virus found / znaleziono wirusa :Porcupine.Phishing.20003.UNOFFICIAL
724 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Foxhole.Zip_doc.UNOFFICIAL
715 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Foxhole.Zip_docx.UNOFFICIAL
 94 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Foxhole.Zip_jpeg.UNOFFICIAL
970 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Foxhole.Zip_pdf.UNOFFICIAL
 80 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Foxhole.Zip_xml.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.19736.UNOFFICIAL
  4 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.21933.ZipHeur.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24212.ZipHeur.UNOFFICIAL
  9 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24273.ZipHeur.UNOFFICIAL
  2 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24306.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24423.ZipHeur.UNOFFICIAL
  2 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24488.UNOFFICIAL
  9 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24594.ZipHeur.UNOFFICIAL
 15 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24595.ZipHeur.UNOFFICIAL
 13 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24639.ZipHeur.UNOFFICIAL
  2 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24646.DocHeur.UNOFFICIAL
  9 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24647.ZipHeur.UNOFFICIAL
  8 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24648.ZipHeur.UNOFFICIAL
  5 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Malware.24675.XlsHeur.UNOFFICIAL
  1 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Rogue.0hr.20141201-1850.UNOFFICIAL
 11 X-ACL-Warn: Virus found / znaleziono wirusa :Sanesecurity.Rogue.0hr.20141202-0850.
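
The headline numbers from the first four commands can be turned into shares
with a few lines of Python. This is only arithmetic on the counts quoted
above; it assumes that the find and grep globs cover the same set of files,
which the slightly different patterns (2015* vs 201501*) do not guarantee.

```python
total_messages = 33170   # find 201412* 2015* -type f | wc -l
virus_hits = 18314       # messages with "X-ACL-Warn: Virus found"
unofficial_hits = 18288  # hits whose signature name contains UNOFFICIAL
official_hits = 26       # hits without UNOFFICIAL

# The two subsets should add up to the total number of hits.
assert unofficial_hits + official_hits == virus_hits

share_flagged = 100.0 * virus_hits / total_messages
share_unofficial = 100.0 * unofficial_hits / virus_hits

print(f"flagged by ClamAV: {share_flagged:.1f}% of all messages")
print(f"caught only by unofficial signatures: {share_unofficial:.2f}% of hits")
```

So a bit more than half of the spam directory was flagged by ClamAV, and
almost all of those hits came from the unofficial signature databases.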

Re: Malware Patrol SA Rules

2015-01-10 Thread Marcin Mirosław
On 2015-01-10 at 15:27, Reindl Harald wrote:
> 
> On 10.01.2015 at 15:19, David Flanigan wrote:
>> Is anyone using the Malware Patrol 3rd party Spamassassin Rules
>> (https://www.malwarepatrol.net/index.shtml)?
>>
>> I have downloaded and looked them over and, in concept, they look pretty
>> good.
>>
>> However the cf file is over 8.5megs (yes megs) in size. By far the
>> biggest ruleset I have. I cannot think this would do good things for
>> performance.
>>
>> Any experience, comments, etc?
> 
> 8.5 MB SA rules is crazy
> 
> that really belongs to clamav directly after SA because SA eats more
> http://sanesecurity.com/usage/signatures/

Hi!
Imho clamav needs less CPU power than SA (and needs less time to scan an
email), so I think it's better to use clamav before SA.
Marcin


Re: [SPAM] Re: False positive in rule: FUZZY_XPILL

2014-09-29 Thread Marcin Mirosław
On 10.09.2014 at 06:57, John Hardin wrote:
> On Tue, 9 Sep 2014, Marcin Mirosław wrote:
> 
>> On 09.09.2014 at 15:19, John Hardin wrote:
>>> On Tue, 9 Sep 2014, Marcin Mirosław wrote:
>>>
>>>> Hi again,
>>>> I noticed FP on mentioned rule when checking ham email. Due to
>>>> confidential content I don't want to share it on ML. Is somebody
>>>> willing
>>>> to improve mentioned rule or one case is not enough to look at it? If
>>>> somebody would like to look insight it I can send such email offlist.
>>>
>>> I'll take a look.
>>
>> Hi!
>> Thank you. FUZZY_PILL has a high score, so it would be great to lower
>> the chance of an FP.
>> The attached email has its pdf attachment partially, manually removed. I hope
>> I didn't break the mime parts too much. The attached email still triggers
>> FUZZY_XPILL.
>> Regards,
>> Marcin

Hi!
I'm sorry for the huge delay in answering.

> Is that email supposed to have an image attached to it? I note one of
> the MIME parts has this:
> 
>Content-Type: text/plain; name="mpanic.png"
> 
> The content-type is wrong for a binary data attachment.
> 
> That attachment also doesn't appear to be a valid .PNG image file. Are
> you actually able to view that as an image?

$ file mpanic.png
mpanic.png: PNG image data, 684 x 750, 8-bit/color RGBA, non-interlaced

Okular doesn't have a problem with this image, and Thunderbird also
displays it in the message.

> The FUZZY_XPILL hit is on what appears to be binary data in the message
> body, likely due to that attachment being interpreted as body text due
> to the MIME type. I can find what appears to be the matched string
> within the mpanic.png file, but not anywhere in the actual text part of
> the message.
> 
> I think that you should contact whoever sent that message and have them
> review how they are generating it. I'm reluctant to call this SA's fault
> for trusting the MIME content type.


I'll try to contact them, but this is an automatically generated email with
an invoice. I expect that they can't modify the software they bought.

Thanks,
Marcin
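
The mismatch John describes (PNG data declared as text/plain) can be spotted
mechanically. Below is a minimal Python sketch, not SpamAssassin code: it
compares the declared Content-Type with the standard 8-byte PNG file
signature. The helper names and the sample bytes are made up for this
illustration.

```python
# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data: bytes) -> bool:
    """True when the decoded attachment starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def content_type_mismatch(declared_type: str, data: bytes) -> bool:
    """True when PNG data is labelled as some text/* type (hypothetical check)."""
    return declared_type.lower().startswith("text/") and looks_like_png(data)


# Made-up stand-in for the decoded mpanic.png attachment.
sample = PNG_SIGNATURE + b"\x00\x00\x00\rIHDR"

print(content_type_mismatch("text/plain", sample))  # declared as text
print(content_type_mismatch("image/png", sample))   # declared correctly
```

A check like this on the sending side would catch the wrong MIME type before
the invoice leaves the generator.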




Re: How to report spam to mailspike

2014-09-11 Thread Marcin Mirosław
On 09.09.2014 at 23:53, Jose Borges Ferreira wrote:
> Hi Marcin,
> 
> I'm affiliated with Mailspike and just want to say that we have
> changed the contact form so you now have a option specify that you are
> contacting as Self-employed or private.
> 
> We have also configured ab...@mailspike.org if you need to contact directly.
> 
> Regarding your original problem, we don't use user feedback as a rule
> but you can send some samples ( arf format prefered ) along with your
> complain and we will take a look.

Hi!
Thank you for the email and for changing the contact form. And it's nice
that abuse@ has also started to work. For now I can't send samples as ARF,
because I can't find an easy-to-use, working ARF report generator.

agradecido,
Marcin
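
For reference, an ARF report is an ordinary MIME message, so one can be put
together with Python's standard email package. The sketch below follows the
three-part layout of RFC 5965 (human-readable text, message/feedback-report,
the original message). The function name, the User-Agent string and all
addresses are made up, and what mailspike actually expects in a report is not
known here.

```python
from email import message_from_string
from email.mime.base import MIMEBase
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_arf(original_text, reporter, recipient):
    """Wrap a raw message in a multipart/report abuse report (illustrative)."""
    report = MIMEMultipart("report", report_type="feedback-report")
    report["From"] = reporter
    report["To"] = recipient
    report["Subject"] = "Abuse report"

    # Part 1: human-readable explanation.
    report.attach(MIMEText("This is an abuse report for a message "
                           "received at a spamtrap.\n"))

    # Part 2: machine-readable fields; these three are the required ones.
    fields = MIMEBase("message", "feedback-report")
    fields.set_payload("Feedback-Type: abuse\n"
                       "User-Agent: arf-sketch/0.1\n"
                       "Version: 1\n")
    report.attach(fields)

    # Part 3: the reported message itself.
    report.attach(MIMEMessage(message_from_string(original_text)))
    return report


# Hypothetical sample message and addresses.
original = "From: spammer@example.com\nSubject: test\n\nbody\n"
arf = build_arf(original, "reporter@example.org", "abuse@example.net")
print(arf.as_string())
```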



False positive in rule: FUZZY_XPILL

2014-09-09 Thread Marcin Mirosław
Hi again,
I noticed an FP on the mentioned rule when checking a ham email. Due to its
confidential content I don't want to share it on the ML. Is somebody willing
to improve the mentioned rule, or is one case not enough to look at it? If
somebody would like to look into it I can send the email offlist.

Regards


Re: How to report spam to mailspike

2014-09-09 Thread Marcin Mirosław
On 29.08.2014 at 23:36, Dave Warren wrote:
> On 2014-08-29 02:38, Marcin Mirosław wrote:
>> So what should I do in your opinion? I'm getting spam to my private
>> spamtrap so I can't fill fields about company - it doesn't matter where
>> I'm hired for reporting spam. What if I would be unemployed? Then I
>> would have to lie about company? IMHO it is the way to hinder sending
>> complaints from users.
> 
> If you're not willing to provide the information they request, and they
> won't accept an inquiry without it, then you're left with a different
> choice: 1) Do nothing, 2) Cease using the service.
> 
> From their perspective, either the policy will increase the quality of
> reports they get by reducing the noise, allowing them to focus on real
> queries, and ultimately increasing the quality of the list, or it will
> discourage people from reporting, decreasing the quality of the list,
> resulting in less users and less relevance.
> 
> They've made their choice, now you get to make yours. Personally, I'm
> quite pleased with their performance, and I have no problem identifying
> myself when I contact a company. If I'm acting on my own behalf, I'd put
> "Personal" or "None" or "N/A" into a form, and if it's not accepted, oh
> well.

Hi!
In the middle of last week I asked them how I should report spam to
them. I haven't got any answer yet, and I don't expect to get one in the
future. For me they are unreliable as an RBL provider.



Re: How to report spam to mailspike

2014-08-29 Thread Marcin Mirosław
On 28.08.2014 at 11:20, Reindl Harald wrote:
> 
> On 28.08.2014 at 11:11, Marcin Mirosław wrote:
>> I've noticed growing volume of emails listed by mailspike. Usually it's
>> spam listed as "good reputation". On his webpage I can see only page
>> http://mailspike.org/contact.html , they want to fill many personal
>> information, I don't want to send it to them and I don't want to lie
> 
> i would say that's one part why they are somehow trustable
> because require that personal information makes a little
> barrier (you have proven) that any random guy with one
> single and maybe careless click can have impact in both
> directions (maybe bad - intentionally or unintentionally)

So what should I do, in your opinion? I'm getting spam to my private
spamtrap, so I can't fill in the fields about a company; where I'm employed
doesn't matter for reporting spam. What if I were unemployed? Would I then
have to lie about a company? IMHO this is a way to hinder users from
sending complaints.

Regards,
Marcin


How to report spam to mailspike

2014-08-28 Thread Marcin Mirosław
Hi!
I've noticed a growing volume of emails listed by mailspike. Usually it's
spam listed as "good reputation". On their website I can only find the page
http://mailspike.org/contact.html , where they want me to fill in a lot of
personal information; I don't want to send it to them and I don't want to lie.
abuse@ doesn't work:
: host zimbra.anubisnetworks.com[195.22.26.196] said: 550 5.1.1
: Recipient address rejected: mailspike.org (in reply to RCPT TO command)

Thanks,
Marcin


Re: BayesStore::Redis can't do AUTH when Redis is <=2.6 (was: sa-learn site-wide bayes on Redis)

2014-08-21 Thread Marcin Mirosław
On 21.08.2014 at 15:20, Matteo Dessalvi wrote:
> Which version of Redis are you using? I did have some
> problems with the 2.4 version packaged by Debian and
> I did solve a similar problem using a more recent
> version, like the 2.7 or 2.8.

And you fixed my problem! Indeed, upgrading from redis-2.6.15 to 2.8.13
fixed the problem with AUTH not working.
Thanks Matteo!



Re: sa-learn site-wide bayes on Redis

2014-08-21 Thread Marcin Mirosław
On 21.08.2014 at 13:45, Matteo Dessalvi wrote:
> I am pretty sure SA support the Redis authentication mechanism.
> For my tests I have used the following line:
> 
> bayes_sql_dsn  server=127.0.0.1:6379;password=MySecretPWD;database=2

Thanks Matteo,
I should have tried first and written to the ML afterwards :) So now I did
my own check. It looks like SA doesn't authenticate when it connects to
Redis. It didn't work for me with your example, nor when I used
bayes_sql_password   password

When Redis requires a password, SA throws "bayes: Redis failed: Redis
error: ERR operation not permitted", and tcpdump also confirms that SA
doesn't do AUTH.
It's strange, because in Redis.pm I can see that authentication is
supported. Now I'm wondering where I could have made a mistake in the
configuration...

Thanks,
Marcin
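
When looking at a tcpdump capture for an AUTH request, it helps to know what
one looks like on the wire. Redis clients normally send commands as RESP
arrays of bulk strings; the Python sketch below builds those bytes for the
password from Matteo's example. Whether SA's Redis client sends the command
in exactly this form is an assumption, and resp_command is a made-up helper
name.

```python
def resp_command(*args):
    """Encode a Redis command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]          # array header: number of elements
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then bytes
    return b"".join(out)


auth = resp_command("AUTH", "MySecretPWD")
print(auth)
```

If no such sequence shows up at the start of the connection, the client is
not attempting to authenticate at all.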


Re: sa-learn site-wide bayes on Redis

2014-08-21 Thread Marcin Mirosław
On 20.08.2014 at 14:42, Axb wrote:
> On 08/20/2014 02:25 PM, Matteo Dessalvi wrote:
>> Hi all.
>>
>>
>> I am managing a bunch of Linux MTAs which are placed in
>> front of some Exchange servers. In such a configuration
>> the Bayes filter is deployed site-wide.
>>
>> For a new deployment of these servers I am planning
>> to use Redis as a centralized backend (previously
>> the bayes db were just files saved on the disk).
>>
>> My question is: do I have to use a specific option
>> to tell sa-learn that the bayes db is now hosted on
>> Redis? Or sa-learn will use the info from the
>> bayes_sql_dsn directive in my local.cf?
>>
>> Looking into the wiki:
>> http://wiki.apache.org/spamassassin/SiteWideBayesSetup
>>
>> or into the sa-learn docs:
>> http://spamassassin.apache.org/full/3.4.x/doc/sa-learn.html
>>
>> did not give me any clues.
> 
> see
> 
> http://svn.apache.org/repos/asf/spamassassin/trunk/contrib/HOWTO.Bayes-Redis/
> 
> 
> hope that helps.
> This is not an official doc, so if you see anything that needs to be
> added/changed, pls let me know.

Hi!
I'm reading bayes_redis.cf and I can see:
"
#NOTE: We're not using authentication assuming the Redis server/port
should not be reachable form the "outside"
# You can add authentication once you've seen it work.
"

Does it mean that this example config just doesn't include the authentication
options, or does it mean that SA doesn't support auth for Redis?

Marcin






Re: How the rules __TO_EQ_FROM_1 __TO_EQ_FROM_2 work?

2014-03-27 Thread Marcin Mirosław
On 24.02.2014 16:24, John Hardin wrote:
Hi!

> On Mon, 24 Feb 2014, Marcin Mirosław wrote:
> 
>> Sorry for silly question. I'd like to know if mentioned rules catches
>> all email address or only user part?
> 
> It's not a silly question. All of the TO_EQ_FROM rules compare the full
> email address.

In theory I could do it myself, it's only one line of code :) But I'm not
good enough at regexps to do it.

Regards,
Marcin


How the rules __TO_EQ_FROM_1 __TO_EQ_FROM_2 work?

2014-02-24 Thread Marcin Mirosław

Hi!
Sorry for the silly question. I'd like to know whether the mentioned rules
compare the whole email address or only the user part?

I'd like to catch spam like this:

From: Charlene Torres 
To: zawodynn 

where the user part in the From: and To: headers is the same (and SPF_FAIL
and SPF_SOFTFAIL etc.). My tests show that __TO_EQ_FROM doesn't catch it.
If the rules compare the whole address, I'd like to vote for creating a rule
for catching the pasted example :)


Marcin
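
The comparison asked for here (same user part, regardless of domain) is easy
to state in code. The Python sketch below is only an illustration of the
logic, not an SA rule; the addresses in the archive copy of the example were
stripped, so the ones used here are made up.

```python
from email.utils import parseaddr


def same_local_part(from_header, to_header):
    """True when From: and To: share the same user part (case-insensitive)."""
    from_addr = parseaddr(from_header)[1]
    to_addr = parseaddr(to_header)[1]
    from_local = from_addr.rpartition("@")[0]
    to_local = to_addr.rpartition("@")[0]
    if not from_local or not to_local:
        return False  # missing or malformed address
    return from_local.lower() == to_local.lower()


# Hypothetical headers modelled on the example above.
print(same_local_part("Charlene Torres <zawodynn@example.net>",
                      "zawodynn <zawodynn@example.org>"))
print(same_local_part("Charlene Torres <charlene@example.net>",
                      "zawodynn <zawodynn@example.org>"))
```

A rule comparing full addresses would miss the first pair because the domains
differ; comparing only the part before "@" catches it.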


Re: Detecting very recently registered domain names

2013-12-19 Thread Marcin Mirosław
On 19.12.2013 16:13, Alex wrote:
> Hi,

Hi,

> On Thu, Dec 19, 2013 at 10:02 AM, Joe Quinn  wrote:
> 
> Isn't that where Kevin works too? Couldn't you just walk down the hall
> and ask him? lol
> 
>> We are noticing a lot of spam coming from domains that are less than two
>> months old. Is there a good way to detect this automatically?
> 
> Two months? That's already ancient.
> 
> Check out the URIBL_RHS_DOB (day old bread) rule. Your domains should
> be hitting that.

I've noticed false positives from this rule in the last few days.

  1.5 URIBL_RHS_DOB  Contains an URI of a new domain (Day Old Bread)
 [URIs: imageshack.us]




Re: False positive in FB_CIALIS_LEO3 rule

2013-10-18 Thread Marcin Mirosław
On 18.10.2013 15:23, Axb wrote:
> On 10/18/2013 03:07 PM, Marcin Mirosław wrote:
>> Hi!
>> I'm not sure if false positives should be reported here or in bugzilla.
>> If I choosen wrong place please let me know.
>> Innocent phrase in Polish language "brakuje Ci aliasów"[1] triggers
>> rules mentioned above.
>>
>> [1] - it means: "[...] you are missing aliases [...]"
>> Regards,
>> Marcin
>>
> 
> Please post the full SA report you go on this msg.

Is this what you are asking for?

Content analysis details:   (7.7 points, 5.0 required)

 pts rule name                 description
---- ------------------------- --------------------------------------------------
 3.1 FB_CIALIS_LEO3            BODY: Uses a mis-spelled version of cialis.
 2.4 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                               [cf: 100]
 0.4 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 1.7 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 0.1 MISSING_MID               Missing Message-Id: header

A very similar email was sent to many recipients on one mail server; this
is why the email was scored so high by the Razor rules.

Thanks,
Marcin



False positive in FB_CIALIS_LEO3 rule

2013-10-18 Thread Marcin Mirosław
Hi!
I'm not sure if false positives should be reported here or in bugzilla.
If I've chosen the wrong place, please let me know.
The innocent phrase "brakuje Ci aliasów"[1] in Polish triggers the
rule mentioned above.

[1] - it means: "[...] you are missing aliases [...]"
Regards,
Marcin


Re: [solved] SA-3.3.2 options max-spare and max-children doesn't work as i expect

2012-10-24 Thread Marcin Mirosław
On 23.10.2012 22:24, RW wrote:

Hi,

> On reading you your question more thoroughly I see that your main
> point was that you aren't getting as many processes as expected.
> 
> The number of child processes isn't adjusted immediately, it's
> incremented or decremented when a child announces that it is idle.
> Testing with only six calls isn't enough to expect sensible results.
> 
> What you need to do is hammer spamd with lots more spamc calls  and
> watch the number of child processes evolve in real time - maybe have
> the background processes log the child count as each spamc process
> completes. 

Indeed, I've flooded spamd with many connections. As a result I got as
many spamd processes as I defined using the "-m" option.
Thanks for the tip, now everything is clear to me.

Regards,
Marcin


Re: SA-3.3.2 options max-spare and max-children doesn't work as i expect

2012-10-23 Thread Marcin Mirosław
On 23.10.2012 15:52, Bowie Bailey wrote:
> On 10/23/2012 7:30 AM, Marcin Mirosław wrote:
>> On 23.10.2012 12:03, Arthur Dent wrote:
>> [...]
>>> Just thought I'd ask...
>>>
>>> You did restart SA after you made the changes?
>> Yes I did. In the meantime I've found bug
>> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6843 . But it
>> doesn't give me an answer as to why max-spare limits the max number of
>> children. (And honestly, how the min,max/children,spare options work is
>> strange to me.)
>> Regards,
>> Marcin
> 
> spare = idle child process
> 
> So...
> 
> min-spare = minimum number of idle child processes
> max-spare = maximum number of idle child processes
> 
> min-children = minimum number of children (busy or idle)
> max-children = maximum number of children (busy or idle)
> 
> Usually, max-children is the one you want to adjust.  Make it high
> enough to handle your load, but make sure you don't over-commit your
> RAM.  If you start swapping, everything slows to a crawl.
> 

This is what I thought, but as you can see in my test case spamd doesn't
behave this way (e.g. I can't start more children than set with the
max-spare option).


Re: SA-3.3.2 options max-spare and max-children doesn't work as i expect

2012-10-23 Thread Marcin Mirosław
On 23.10.2012 12:03, Arthur Dent wrote:
[...]
> Just thought I'd ask...
> 
> You did restart SA after you made the changes?

Yes I did. In the meantime I've found bug
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6843 . But it
doesn't give me an answer as to why max-spare limits the max number of
children. (And honestly, how the min,max/children,spare options work is
strange to me.)
Regards,
Marcin


SA-3.3.2 options max-spare and max-children doesn't work as i expect

2012-10-23 Thread Marcin Mirosław
Hello list!
I'm playing with the options min-spare, max-spare, min-children and
max-children, because I'd like to save memory on my VPS. I'd like to have
one child waiting for connections from the MTA, and when the MTA receives
more emails in a short time I'd like SA to spawn more children
(max-children=6). I think it's enough to have zero (or one) spare children
in my case.
I'm starting spamd with these parameters:
... --min-spare=0 --max-spare=1 -m 6
I expect to be able to check 6 emails at the same time, because the option
"-m 6" suggests that six children should be spawned.
OK, so I've got 2 processes:
# pgrep -fc spamd
2
It's OK so far. Now I'm scanning 6 mails at once:

# (for x in $(seq 1 6); do spamc -c /dev/null &
done) ; pgrep -fc spamd ; sleep 1; pgrep -fc spamd;sleep 1;pgrep -fc spamd
2
2
2
Hmm, still I've got 2 processes (parent+one child).

Let me change start option for spamd:
... --min-spare=0 --max-spare=3 -m 6
# pgrep -fc spamd
4
(How much time of inactivity is needed to kill a spare, unused child?)
And I'm launching the one-liner:
# (for x in $(seq 1 6); do spamc -c /dev/null &
done) ; pgrep -fc spamd ; sleep 1; pgrep -fc spamd;sleep 1;pgrep -fc spamd
4
4
4
So to me it looks like max-spare limits the maximum number of children,
which doesn't seem to be the desired behavior.
In the log I can find:
 spamd[21140]: prefork: child states: BBB
So spamd really didn't spawn more children.
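
To follow how the pool changes over time, the "child states" log lines can be
tallied with a few lines of Python. This sketch assumes that each letter
stands for one child and that "B" means busy and "I" means idle; that reading
of the letters is an assumption here, and the helper name is made up.

```python
import re
from collections import Counter

STATE_LINE = re.compile(r"prefork: child states: ([A-Z]+)")


def child_states(log_line):
    """Return a Counter of state letters, or None if the line has no states."""
    match = STATE_LINE.search(log_line)
    if match is None:
        return None
    return Counter(match.group(1))


counts = child_states(" spamd[21140]: prefork: child states: BBB")
print("children:", sum(counts.values()),
      "busy:", counts["B"], "idle:", counts["I"])
```

Running this over the whole log while spamc calls are in flight shows whether
the number of children ever grows beyond the spare limit.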

Am I doing something wrong?

I've found the answer to one question:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6754



Re: use_bayes=0 completly disables report function

2012-04-22 Thread Marcin Mirosław
On 2012-04-21 15:29, Axb wrote:
Hello,
> http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.txt
> 
> You want:
> 
> use_bayes 1
> use_bayes_rules 0
> use_learner 1

I found that use_learner 0 is better for me: the bayes engine isn't called
while a message is being reported.
Thanks for the tip!
Marcin


Re: use_bayes=0 completly disables report function

2012-04-20 Thread Marcin Mirosław
On 2012-04-21 07:38, Marcin Mirosław wrote:
> I'm guessing, as for now, it's the best to set
> connection to non existent database (disadvantage is high number of
> error messages in log).

I tried it; it looks like no error messages appear in the logs. Fine for me :)
Regards,
Marcin


Re: use_bayes=0 completly disables report function

2012-04-20 Thread Marcin Mirosław
On 2012-04-21 04:58, dar...@chaosreigns.com wrote:
> On 04/20, Marcin Mirosław wrote:
>> Hello,
>> i've notice when i set use_bayes 0 then spamc -C report stops to work.
>> I've got in log:  spamd: Can't call method "learn" on an undefined value
> 
> bayes_learn_during_report 0

Hi darxus,
thanks for the reply. It solves half of my problem. I'd like to turn off
bayes completely. I'm guessing that, for now, the best option is to point
the connection at a non-existent database (the disadvantage is a high
number of error messages in the log).
Thanks,
Marcin




use_bayes=0 completly disables report function

2012-04-20 Thread Marcin Mirosław
Hello,
I've noticed that when I set use_bayes 0, spamc -C report stops working.
I get this in the log:  spamd: Can't call method "learn" on an undefined value
at /usr/lib64/perl5/vendor_perl/5.12.4/Mail/SpamAssassin/PerMsgLearner.pm line 111,  line 117.

Why do I have to use bayes when I want to report spam to
spamcop/dcc/pyzor/razor/etc.? Couldn't it be independent?
Regards,
Marcin


Re: [OT] Disable a Rule

2011-10-30 Thread Marcin Mirosław
On 2011-10-30 21:36, Jeremy McSpadden wrote:
> Yes, that is in place. (not a newbie here)

Only a newbie can say "I'm not a newbie".


Re: Disable a Rule

2011-10-30 Thread Marcin Mirosław
On 2011-10-30 18:37, Jeremy McSpadden wrote:
> I have several MS boxes and it seems that the RCVD_IN_DNSWL_HI rule in
> 72_active is allowing way to much through. Running at a score of 5 for
> spam, and it -5 on score is pushing it as clean. How do i disable the
> rule completely, even on sa-updates. It seems nightly the rule is
> re-enabled. 

Hi!
Maybe rescore would be enough?
score RCVD_IN_DNSWL_HI 0
in your local conf.



Re:[SOLVED] Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

On 26.09.2011 15:53, Bowie Bailey wrote:

> There is nothing in that sample that would cause the rule to fire.  I
> downloaded it and ran it against my SA and did not get a match for
> MISSING_SUBJECT.  The only thing I can think of is that the headers end
> at the first blank line.  If there is a blank line somewhere in the
> headers, that will cause SA to treat everything below that line as part
> of the body rather than the header.
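
The effect of a stray blank line inside the headers can be reproduced with
any mail parser. The sketch below uses Python's standard email package rather
than SpamAssassin, on the assumption that both follow the same convention
that the first blank line ends the header block; the two sample messages are
made up.

```python
from email import message_from_string

# Well-formed message: Subject sits in the header block.
good = "From: a@example.com\nSubject: hello\n\nbody text\n"

# Same message with a blank line slipped in before Subject.
broken = "From: a@example.com\n\nSubject: hello\n\nbody text\n"

good_msg = message_from_string(good)
broken_msg = message_from_string(broken)

print(good_msg["Subject"])           # the header is found
print(broken_msg["Subject"])         # None: the header block ended early
print(repr(broken_msg.get_payload()))  # the "Subject:" line is now body text
```

A message parsed like the second one would look to a filter as if it had no
Subject header at all.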


The email doesn't contain a body, so there is nothing after the headers.


> Download your sample from pastebin and run it through SA to see if it
> still matches the rule for you.  You may have inadvertently fixed the
> problem when you munged the recipient address prior to uploading the sample.


I've downloaded the email from pastebin and nothing changed.

Reason: PEBKAC, I had a redundant backslash in my own rules. I should use
--lint more often.


Sorry for noise and thanks for your time!
Regards.


Re: Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

On 26.09.2011 15:52, Matus UHLAR - fantomas wrote:

> I don't see other X-Spam headers there. How are you running
> spamassassin? Aren't you using amavis or other software using just
> spamassassin libraries?
>
> Are you sure some 3rd party does not modify mail headers?


No, I don't use any 3rd-party packages, I'm using exim+spamd.
Sorry, I didn't start spamd with an en_US locale, so some headers are
translated. All headers from SA are in X-Spam-Report; the X-Szczegoly
header contains the report of which rules hit the email.


Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

Hello!
I'd like to ask you whether this rule works correctly. I've sent an email
from Thunderbird and from Roundcube, and in both cases this rule scores the
email. Here is a sample email: http://pastebin.com/rVTwNp5X (with a slightly
munged recipient).

Rules are in version: 1162027, spamassassin-3.3.2
Thanks for help.
Regards.


Re: translations of SpamAssassin descriptions?

2011-09-12 Thread Marcin Mirosław

W dniu 12.09.2011 13:59, Tomasz Chmielewski pisze:

On 12.09.2011 13:25, Marcin Mirosław wrote:

Hello,
short answer: yes.


What's the long answer?


I wanted to say "as a user i know there are available translations, but 
i don't know informations about how many translations are available or 
if are them up to date".


[...]

So, rule descriptions are not translated. Do I have to set anything else?


I've found the Polish translations in the file /var/lib/[..]/30_text_pl.cf; you 
should have it. I've tested on my box what is needed to get untranslated 
descriptions ;) The environment variable "LANGUAGE" determines which language 
version of the descriptions is used in the report:


$ LANGUAGE=de ;spamassassin -xtL < q1R35OW-38036633
 3.8 BAYES_99   BODY: Spamwahrscheinlichkeit nach Bayes-Test: 99-100%


$ LANGUAGE=nl ;spamassassin -xtL < q1R35OW-38036633
 3.8 BAYES_99   BODY: Bayesiaanse kans op spam is 99 tot 100%
[score: 1.]
 0.0 HTML_MESSAGE   BODY: HTML opgenomen in het bericht
 0.0 HTML_FONT_SIZE_LARGE   BODY: HTML font size is large
 1.8 MIME_HEADER_CTYPE_ONLY 'Content-Type' gevonden zonder de benodigde
MIME headers
 0.2 KHOP_SC_TOP_CIDR8  Relay CIDR /8 leads SpamCop in worst /8s
 2.6 SINGLE_HEADER_1K   A single header contains 1K-2K characters

Regards



Re: translations of SpamAssassin descriptions?

2011-09-12 Thread Marcin Mirosław

On 12.09.2011 13:18, Tomasz Chmielewski wrote:

Are SpamAssassin descriptions available in other languages?


For example, the following would produce SpamAssassin output below - are 
localized versions of it available anywhere?

$ cat /tmp/new.eml | spamassassin -Lt

[...]

Hello,
short answer: yes.

spamc -R< q1R2977-3240678
[...]X-Szczegoly:(poczta.cibet.pl)(16.3 points)
 pts rule name  description
 -- -
 2.6 RCVD_IN_BL_SPAMCOP_NET RBL: Odebrane od systemu klasy RELAY w/g:
bl.spamcop.net
  [Blocked - see 
]

 0.4 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
[82.91.13.8 listed in zen.spamhaus.org]
 0.8 RCVD_IN_SORBS_WEB      RBL: SORBS: nadawca posiada nadużywany serwer WWW

[...]

Did you set locale to pl?
Regards


Re: Self addressed spam

2011-08-10 Thread Marcin Mirosław

On 10.08.2011 12:00, akrohnke wrote:


Hello,

Currently one of our clients is getting spam that looks like it comes from
the sender itself. Spamassassin only occasionally catches it.


Hello!
It should be done at the SMTP level:
if ("sender domain" is "my domain") and the sender didn't authenticate, then 
reject the mail.
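
A minimal sketch of that decision in Python, assuming the MTA can tell you the envelope sender and whether the SMTP session was authenticated. The function and parameter names are hypothetical and are not part of exim or SpamAssassin; in practice the check would be written in the MTA's own ACL language.

```python
def should_reject(sender_address: str, local_domains: set[str],
                  authenticated: bool) -> bool:
    """Reject unauthenticated mail whose envelope sender claims a local domain."""
    # Everything after the last "@" is the sender domain; an empty
    # envelope sender (bounce) yields an empty domain and is not rejected.
    domain = sender_address.rpartition("@")[2].lower()
    return domain in local_domains and not authenticated


if __name__ == "__main__":
    mine = {"example.org"}  # placeholder domain, an assumption for the demo
    print(should_reject("boss@example.org", mine, authenticated=False))
    print(should_reject("boss@example.org", mine, authenticated=True))
```

Legitimate mail that arrives unauthenticated with a local sender (for example via mailing lists or forwarders) would also be rejected by such a rule, so it may need exceptions.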


Re: One thing about bug 6558

2011-04-01 Thread Marcin Mirosław
On 30.03.2011 16:27, Adam Moffett wrote:

> Your watchdog idea is a valid one, 

Should I file a bug for this idea (or for the bug described here:
http://osdir.com/ml/users-spamassassin/2011-03/msg00481.html or here:
http://www.gossamer-threads.com/lists/spamassassin/users/161931#161931 )?
Or is it enough that it was discussed here?

Regards!
Marcin


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 17:12, RW wrote:
> Are you sure that there were idle spamd children? The above is
> consistent with  ~14 seconds queuing for a child, followed by a
> timeout after  10 seconds.

I was watching the CPU time of the spamd child (I took the delta, not the
last CPU time ;) ). 14+10 is a coincidence.


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 16:34, Kris Deugau wrote:
> My experience here has been that if a spamd child is pegging a CPU core
> for an extended period, there's simply a *lot* of body text to run
> (raw)body rules against (eg, ~ >200K).
> 
> We've found that a fairly effective defense against this is to set up a
> second spamd instance with ~20 high-scoring rules (Spamhaus, local
> DNSBL, local and remote URI blacklists, Pyzor and/or Razor, plus one or
> two "normal" rules) and do two scanning passes:
> 
> -> call the second (lean) instance, skip further filtering if tagged.
> This skims off ~80% of the junk (much of which would score >20 points
> with the full ruleset) at *very* low CPU usage.
> -> call the main instance on all remaining mail

This is an interesting idea! I thought about a configuration like this, but I
was thinking of SA with and without Bayes. I'm still trying to avoid a second
instance of SA. I don't have much email traffic, but my traffic is
somewhat specific.
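
The two-pass setup Kris describes could be sketched like this in Python. This is only an illustration under assumptions: `lean_scan` and `full_scan` are hypothetical callables standing in for the two spamd instances, and the threshold of 5.0 is a placeholder, not a value from this thread.

```python
from typing import Callable

Scanner = Callable[[bytes], float]  # takes a raw message, returns a score


def two_pass_scan(message: bytes, lean_scan: Scanner, full_scan: Scanner,
                  threshold: float = 5.0) -> tuple[str, float]:
    """Try the cheap ruleset first; run the full ruleset only if needed."""
    lean_score = lean_scan(message)
    if lean_score >= threshold:
        # Already tagged as spam by the lean instance: skip the costly pass.
        return ("lean", lean_score)
    return ("full", full_scan(message))
```

The same shape would work for a "with Bayes / without Bayes" split: only the two callables change.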
Regards


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 16:27, Adam Moffett wrote:
> Your watchdog idea is a valid one, but if you can improve bayes
> performance it may become a non-issue.  Have you tried moving bayes to a
> MySQL database?  It consumes more disk space and memory, but it's
> dramatically faster than using the default database (which is Berkeley
> if I recall correctly).

I'm using PostgreSQL, but the machine isn't fast... Any DB is slow there.


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 16:21, Per Jessen wrote:
> Well, isn't the behaviour you're seeing working-as-expected then?  If it
> was an indefinite loop, setting up a time-out would be a possible 
> work-around.  If the bayes code is doing what it is supposed to do, but
> just taking long to do it, no work-around is needed. 
> Instead, you could try limiting the size of what is being processed by
> spamd. 

Border case (I don't have much time for SA, I need to know quickly):
--timeout-child=1
I start spamc, and I expect that after one second I'll get one of:
-spam
-ham
-time_limit

but now I'm getting "time limit" after (e.g.) 5 seconds.


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 16:21, Per Jessen wrote:
> Well, isn't the behaviour you're seeing working-as-expected then?  If it
> was an indefinite loop, setting up a time-out would be a possible 
> work-around.  If the bayes code is doing what it is supposed to do, but
> just taking long to do it, no work-around is needed. 
> Instead, you could try limiting the size of what is being processed by
> spamd. 

It's not so easy to limit message size. Ideally there would be a
config option like "bayes_max_msg_size". I'm getting huge mails (UCE) which
should be scanned by SA. Bayes isn't needed in this case (the remaining rules
are enough to score such mails), but I can't turn off the Bayes engine for
only some mails. To do that I would have to set up another SA instance with
the same config but with Bayes turned off; in exim I can decide which engine
to use based on mail size. But this needs two SA instances running, more
memory, etc.
I can't try now what would happen with the rules mentioned in bug #6558, but
it seems to me the spamd child wasn't killed after --timeout-child.
Regards!


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 15:47, Per Jessen wrote:
> Yes, I meant the child - obviously, it sounds as if it's a problem in
> the bayes processing.  I don't use SA bayes, but that problem ought to
> be investigated first before we look at work-arounds. IMHO.

I expect that Bayes can take a long time to do its work; I'm working on
mail with many, many words.


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 14:06, Per Jessen wrote:
> Have you looked at what spamd is doing when it so busy? 

Did you mean "spamd child"? At that moment the Bayes engine is doing very hard
work on the email.


Re: One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
On 30.03.2011 12:33, Daniel Lemke wrote:
> You mean something like --timeout-child=secs as a spamd starting option? ;)
> http://spamassassin.apache.org/full/3.3.x/doc/spamd.html

Thanks for the quick reply!
This option doesn't work as I wish ;) The spamd child isn't killed after
the time set in --timeout-child; it keeps working and working and uses
100% CPU until I send a kill signal.
It seems to me this option works as described below
(for this example --timeout-child=10 sec):
0s - I start scanning the mail with spamc
[...] - the spamd child works hard on the mail
10s - the spamd child still works and uses 100% of CPU
20s - the spamd child still works and uses 100% of CPU
30s - the spamd child finishes scanning and ends its work; at this moment
spamc gets the answer from the daemon: 0.0 TIME_LIMIT_EXCEEDED Exceeded time
limit / deadline

So, if the child doesn't finish within the time defined in --timeout-child,
the result of the spamd child's work will not be used by spamd. But spamd still
doesn't return "time_limit" once timeout-child has passed; it waits for
something (CPU is free), and spamc waits for nothing (spamc should get
"time_limit" at that point, IMHO). This raises a new question: why doesn't
spamd return "time_limit" immediately when timeout-child expires?

# ps x|grep spamd
29513 ?        SNs    0:12 /usr/sbin/spamd -d -r /var/run/spamd.pid
--min-spare=2 --max-spare=4 -m 5 -i 127.0.0.1 -x -q -u nobody -l
--timeout-child=10

# time spamc -s 334 -R <
1297332681.M942138P14806.poczta\,S\=1557593\,W\=1608807
0.0/5.9
[cut X-info]
X-Szczegoly:(poczta.cibet.pl)(0.0 points)
 pts rule name  description
 -- -
 0.0 TIME_LIMIT_EXCEEDED    Exceeded time limit / deadline


real    0m24.851s
user    0m0.000s
sys     0m0.010s

24 seconds is how long the spamd child was working on the test email.
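
The behaviour I expected (answer "time limit" as soon as the deadline passes, and stop the worker) can be sketched in Python as below. This is not how spamd is implemented; `scan_with_deadline` is a hypothetical helper that runs an arbitrary scanner command as a child process.

```python
import subprocess
import sys


def scan_with_deadline(cmd: list[str], timeout_s: float) -> str:
    """Run a scanner command; kill it and report TIME_LIMIT if it overruns."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()         # stop the child right away instead of waiting
        proc.communicate()  # reap it so no zombie process is left behind
        return "TIME_LIMIT"
    return out.decode(errors="replace").strip()


if __name__ == "__main__":
    # A stand-in "scanner" that sleeps longer than the deadline.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(scan_with_deadline(slow, timeout_s=1.0))
```

With this shape the caller gets its answer after about `timeout_s` seconds, regardless of how long the worker would have kept running.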


P.S. Yes, I'm using a new version of SA, 3.3.1 ;)


One thing about bug 6558

2011-03-30 Thread Marcin Mirosław
Hi!
Sometimes (for example bug 6558) it happens that spamd children use
100% CPU for a long time (until I kill them myself). My suggestion is to
add new functionality to SA, something like a watchdog for spamd child
processes. If a child hangs too long on an email, it should be killed by
the parent process.
Regards.
Marcin


Re: [OT] Problem with mailserver and rejects

2011-03-07 Thread Marcin Mirosław
On 07.03.2011 13:40, Michelle Konzack wrote:
> Hello,
> 
> since 2011-01-19 I have a problem because my FTTH was accidentally cut
> and now no one wants to be responsible, including my ISP.
> 
> OK, <88.168.69.36> had an rDNS to  and was working
> perfectly and never got rejects except from Hotmail, which uses a really
> weird ANTI-SPAM service/policy
> 
> Now I have changed the relay to
> 
> [michelle.konzack@michelle1:/~] dig MX tamay-dogan.net
> tamay-dogan.net.  3600IN  MX  20 vserver04.tamay-dogan.net.
> tamay-dogan.net.  3600IN  NS  dns1.tamay-dogan.net.
> tamay-dogan.net.  3600IN  NS  dns2.tamay-dogan.net.
> vserver04.tamay-dogan.net. 3600   IN  A   217.147.94.23
> dns1.tamay-dogan.net. 3600IN  A   88.168.69.36
> dns2.tamay-dogan.net. 3600IN  A   217.147.94.23
> 
> which is definitely correct, but EVEN the Debian mailing list server
> rejects my mails for an unknown reason.
> 
> Can someone tell me WHY?

Please look at this:
$ host -t mx tamay-dogan.net
tamay-dogan.net mail is handled by 20 vserver04.tamay-dogan.net.
$ host vserver04.tamay-dogan.net.
vserver04.tamay-dogan.net has address 217.147.94.23
       ^^
$ host 217.147.94.23
23.94.147.217.in-addr.arpa domain name pointer vserver4.tamay-dogan.net.
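
In the output above the MX host is vserver04 but the PTR record for its address says vserver4, so forward and reverse DNS do not agree. Some receiving servers check for this, and it may be the reason for the rejects. A small Python sketch of that comparison, working on names that have already been looked up (the function name is hypothetical):

```python
def fcrdns_consistent(forward_name: str, ptr_names: list[str]) -> bool:
    """True if one of the PTR names equals the name that resolved to the IP."""
    def norm(name: str) -> str:
        # DNS names are case-insensitive; a trailing dot is optional.
        return name.rstrip(".").lower()

    return norm(forward_name) in {norm(p) for p in ptr_names}


if __name__ == "__main__":
    # Values taken from the host(1) output above.
    print(fcrdns_consistent("vserver04.tamay-dogan.net.",
                            ["vserver4.tamay-dogan.net."]))  # False
```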

Regards!





Re: problem with custom rbl and addressess ipv6

2010-11-23 Thread Marcin Mirosław
On 23.11.2010 00:18, Byung-Hee HWANG wrote:
> Marcin Mirosław  writes:
>> I'm using SA-3.3.1, NetAddr-IP-4.033.
>> May you give any advice?
> 
> Sorry, I don't know about version 3.3.1. By the way, there are somewhat
> similar patches for IPv6. You could check them out as follows:
> 
>  http://www.imasy.or.jp/~ume/ipv6/

I get rejects when I try to patch SA. So I'm still open to
suggestions about what I should do :)
Thanks,
regards.


Re: how to create rule using CIDR

2010-11-22 Thread Marcin Mirosław
I'm going to use my own RBL to do it.


problem with custom rbl and addressess ipv6

2010-11-22 Thread Marcin Mirosław
Hi all!
I've created a custom RBL with my own DNS. I've noticed a problem when
the connection from the remote SMTP server is via IPv6. It looks like SA
doesn't query the RBL about IPv6 addresses.

Example 1:
client(address BB, ipv4)->MTA( CC, ipv4)->dest. MTA
in this case, SA checks for both addresses BB and CC

example 2:
client(address BB, ipv4)->MTA( DD, ipv6)->dest. MTA
in this case, SA checks only address BB
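
For reference, this is the query name a DNSBL lookup would conventionally use for each address family (reversed octets for IPv4, reversed nibbles for IPv6, as described in RFC 5782). The Python sketch below only builds the name; whether SA 3.3.1 sends such a query for IPv6 relays is exactly the open question here. The helper name is hypothetical and the addresses are documentation examples.

```python
import ipaddress


def dnsbl_query_name(address: str, zone: str) -> str:
    """Build the DNS name to look up for `address` in the DNSBL `zone`."""
    ip = ipaddress.ip_address(address)
    # reverse_pointer gives e.g. "1.2.0.192.in-addr.arpa" for IPv4 and a
    # nibble-reversed name ending in ".ip6.arpa" for IPv6.
    suffix = ".in-addr.arpa" if ip.version == 4 else ".ip6.arpa"
    reversed_part = ip.reverse_pointer[: -len(suffix)]
    return f"{reversed_part}.{zone.rstrip('.')}."


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.1", "rbl.mejor.pl."))
    print(dnsbl_query_name("2001:db8::1", "rbl.mejor.pl."))
```

So the zone has to contain nibble-format entries for the IPv6 relay before any lookup for it can hit.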

I'm using SA-3.3.1, NetAddr-IP-4.033.

The rules look like this:

header    __LR_NADAWCA_W_H     eval:check_rbl('XXX', 'rbl.mejor.pl.')
describe  __LR_NADAWCA_W_H     Odebrane od nadawcy w RBL XXX
tflags    __LR_NADAWCA_W_H     net
reuse     __LR_NADAWCA_W_H

header    LR_NADAWCA_W_RBL_H1  eval:check_rbl_sub('XXX', '127.0.0.1')
describe  LR_NADAWCA_W_RBL_H1  Odebrane od nadawcy w RBL XXX
tflags    LR_NADAWCA_W_RBL_H1  net
score     LR_NADAWCA_W_RBL_H1  1
reuse     LR_NADAWCA_W_RBL_H1

header    LR_NADAWCA_W_RBL_H2  eval:check_rbl_sub('XXX', '127.0.0.2')
describe  LR_NADAWCA_W_RBL_H2  Odebrane od nadawcy w RBL XXX
tflags    LR_NADAWCA_W_RBL_H2  net
score     LR_NADAWCA_W_RBL_H2  2
reuse     LR_NADAWCA_W_RBL_H2

(Sorry for the slight obfuscation, I hope it won't be a problem.)
Can you give any advice?
Regards


how to create rule using CIDR

2010-11-20 Thread Marcin Mirosław
Hello,
I'd like to write a rule for scoring mails originating from a given CIDR.
The CIDR value is in _ASNCIDR_ but I don't know how to use its value. Or
maybe I should do a lookup against asn.routeviews.org?
Thanks for help.
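
To make clear what kind of match I mean, here is the logic as a Python sketch, outside of SA. It is not a SpamAssassin rule; the function name, the networks and the scores are made-up examples.

```python
import ipaddress


def score_for_relay(relay_ip: str, scored_networks: dict[str, float]) -> float:
    """Sum the scores of all CIDR blocks that contain the relay address."""
    ip = ipaddress.ip_address(relay_ip)
    total = 0.0
    for cidr, score in scored_networks.items():
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if ip.version == net.version and ip in net:
            total += score
    return total


if __name__ == "__main__":
    networks = {"192.0.2.0/24": 2.0, "198.51.100.0/24": 1.0}
    print(score_for_relay("192.0.2.55", networks))
```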


Re: sa-learn problems and comprehension question

2010-11-09 Thread Marcin Mirosław
On 2010-11-10 07:37, Karl Meyer wrote:
> But the 15 new messages weren't learned yet.
> 
> I had 10 messages in my inbox and run sa-learn on that folder. Then I got 15
> different new messages and re-run sa-learn again. But it said that it
> learned from 0 messages.

Do you run SA from the SMTP server? Probably yes: Bayes auto-learned
this/those (15) emails earlier, when it was called from the MTA.
Regards


Re: sa-learn problems and comprehension question

2010-11-09 Thread Marcin Mirosław
On 2010-11-09 17:24, Bowie Bailey wrote:
> If you learn a message as ham, it will not learn the same message as ham
> a second time (same with spam).  However, you can change your mind and
> learn the message as spam.  Bayes will "forget" what it learned the
> first time and re-learn the message.

Agreed, I wasn't precise.
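
A toy model of the behaviour Bowie describes, in Python. It only illustrates the "seen" bookkeeping and is not SpamAssassin's actual implementation; the class and method names are hypothetical.

```python
class SeenTracker:
    """Remember which message ids were learned, and as what."""

    def __init__(self) -> None:
        self.seen: dict[str, str] = {}  # message id -> "ham" or "spam"

    def learn(self, msgid: str, label: str) -> str:
        previous = self.seen.get(msgid)
        if previous == label:
            return "skipped"    # same message, same class: nothing to do
        self.seen[msgid] = label
        # A changed label means the old learning is replaced by the new one.
        return "learned" if previous is None else "relearned"


if __name__ == "__main__":
    tracker = SeenTracker()
    print(tracker.learn("<1@example.org>", "ham"))
    print(tracker.learn("<1@example.org>", "ham"))
    print(tracker.learn("<1@example.org>", "spam"))
```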


Re: sa-learn problems and comprehension question

2010-11-09 Thread Marcin Mirosław
On 09.11.2010 17:14, Karl Meyer wrote:
> 
> Hi,
> 
> I want to configure bayes learning and still having some problems and
> questions after reading several tutorials:
> 
> 
> I executed sa-learn for my inbox
> # su -c "/usr/bin/sa-learn --dbpath /var/amavis/.spamassassin/bayes/ --ham
> --showdots /var/spool/imap/user/kmeyer/[0-9]*." amavis
> 
> and got a message that it learned from n messages. Also, two files appeared
> in the dbpath folder. After I got 15 new mails in my inbox, I executed the
> same command again. But this time it didn't learn.

sa-learn "remembers" the msgid of each message it has learned, so it will never
learn the same email twice. Until you change the msgid ;)
Regards


Re: Migrating Spamassassin Bayesian tokens to a fresh Spamassassin installation

2010-11-02 Thread Marcin Mirosław
On 02.11.2010 10:26, Sharma, Ashish wrote:

> How can I migrate this Bayesian knowledge from one installation to other.

Hello,
Do sa-learn --backup and then sa-learn --restore
Regards