Re: SORBS bites the dust

2009-06-25 Thread Res

On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:


Personally I have mixed views on charging for delisting. In some
instances it would be appropriate and I would not dismiss it out of
hand. Certainly for repeat offenders I think it would be highly
desirable.


Agreed, it's one way to make the admin team get off their ass.



I don't recall saying you were a liar anywhere and I'm glad you are not


Not you, Mouss implied it.


hissy fits, throwing their toys out of their prams and suggesting people
are 'trolls' because they don't like the opinions of others.


if you jump on a bandwagon without first-hand experience, that's *exactly*
what you are. If you had experienced it first hand, of course you become an
authority on the subject in your case, and your opinion matters as
factual, but by your own admission you have not, and last I checked
guilt by association was not a crime in modernised civil countries :)



--
Res

-Beware of programmers who carry screwdrivers


Re: Plugin extracting text from docs (was: new spam using large images)

2009-06-25 Thread Matus UHLAR - fantomas
> Jason Haar wrote:
>
>> Speaking of image/rtf/word attachment spam; is there any work going on
>> to standardize this so that the textual output of such attachments could
>> be fed back into SA?

On 24.06.09 19:33, Jonas Eckerman wrote:
> Just as a note:
>
> I'm currently working on a modular plugin for extracting text and adding it
> to SA message parts.

if possible, extract images too, so the fuzzyocr and similar plugins would
be able to look at that too.

IIRC spammers even put PDFs inside .doc files to make things harder, but
if you manage the above, it shouldn't be hard to extract PDFs too :)

(and then extract text/images from the PDFs as well)

> The plugin can use either external tools or its own simple plugin
> modules. How to extract text from parts is configurable, and based on
> MIME types and file names, so new formats can be added by simply
> configuring new external tools or creating a new plugin module.
>
> My *far* from finished module currently manages to extract text from  
> Word documents (using antiword), OpenXML text documents (using a simple  
> plugin) and RTF (using unrtf).
>
> I haven't tested where and how the extracted text is available to
> SpamAssassin yet (as noted, it's *far* from finished), but I am using the
> "set_rendered" method as in the example, so it should work. ;-)

great!
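
For anyone trying to picture the configurable dispatch described above, here is a rough Python sketch. The real plugin is a SpamAssassin (Perl) module, so this only illustrates the idea and is not Jonas's code. The EXTRACTORS table and the pick_extractor name are made up for the example; antiword and "unrtf --text" are the external tools named in the quoted text, called the way they are commonly run from a shell.

```python
import fnmatch

# Hypothetical dispatch table: (MIME type, file name pattern, command line).
# "{path}" stands for the temporary file holding the attachment.
EXTRACTORS = [
    ("application/msword", "*.doc", ["antiword", "{path}"]),
    ("application/rtf", "*.rtf", ["unrtf", "--text", "{path}"]),
    ("text/rtf", "*.rtf", ["unrtf", "--text", "{path}"]),
]


def pick_extractor(mime_type, filename, path):
    """Return the command for the first entry matching type or file name."""
    for wanted_type, pattern, argv in EXTRACTORS:
        type_hit = mime_type.lower() == wanted_type
        name_hit = fnmatch.fnmatchcase(filename.lower(), pattern)
        if type_hit or name_hit:
            return [arg.replace("{path}", path) for arg in argv]
    return None  # unknown format: leave the part alone
```

Adding a new format would then mean adding one more row to the table, which seems to be the point of making it configurable.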
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
If Barbie is so popular, why do you have to buy her friends? 


Re: SORBS bites the dust

2009-06-25 Thread rich...@buzzhost.co.uk
On Thu, 2009-06-25 at 17:41 +1000, Res wrote:

> if you jump on a bandwagon without first-hand experience, that's *exactly*
> what you are. If you had experienced it first hand, of course you become an
> authority on the subject in your case, and your opinion matters as
> factual, but by your own admission you have not, and last I checked
> guilt by association was not a crime in modernised civil countries :)

Indeed. I can only apologise for any offence or 'trolling'.



URLs with Spaces

2009-06-25 Thread Andrew Hearn
Hello,

I'm wondering if I'm missing some rules that would have given this
message more points - I know it's missing bayes (I'm not sure why as our
servers should use bayes, but it seems not to have been run for this
message.)

http://www.pastebin.ca/1473975

Thanks

-- 
Andrew.


Re: SORBS bites the dust

2009-06-25 Thread Res

On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:


On Thu, 2009-06-25 at 17:41 +1000, Res wrote:


if you jump on a bandwagon without first-hand experience, that's *exactly*
what you are. If you had experienced it first hand, of course you become an
authority on the subject in your case, and your opinion matters as
factual, but by your own admission you have not, and last I checked
guilt by association was not a crime in modernised civil countries :)


Indeed. I can only apologise for any offence or 'trolling'.


LOL your a joke, you send this on list, yet send me a private email
calling me a wanker..  LOL dont bother replying :)


--
Res

-Beware of programmers who carry screwdrivers


Re: URLs with Spaces

2009-06-25 Thread Kasper Sacharias Eenberg
There's been a rule circulating this mailing list for a couple of weeks.
This is the latest edition to catch those med-things (afaik).

--
body AE_MEDS35 /\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
describe AE_MEDS35 obfuscated domain in message
score AE_MEDS35 5.0
--

It works well for me.
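
If you want to try the pattern outside SA first, here is a minimal Python sketch. It assumes the wrapped rule is meant to be one line, and that Python's re module treats \w, \W and \s closely enough to Perl for plain ASCII text. The sample strings in the usage note are invented, and looks_obfuscated is just a helper name for the example.

```python
import re

# The AE_MEDS35 body pattern, joined onto one logical line.
AE_MEDS35 = re.compile(
    r"\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)


def looks_obfuscated(text):
    """True if the text contains a spaced-out 'www name123 com' style domain."""
    return AE_MEDS35.search(text) is not None
```

For example, looks_obfuscated("visit www . pills123 . com today") is true, while an ordinary "www.example.com" does not match because the rule requires whitespace straight after "www".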

With regards
Kasper


On Thu, 2009-06-25 at 09:19 +0100, Andrew Hearn wrote:
> Hello,
> 
> I'm wondering if I'm missing some rules that would have given this
> message more points - I know it's missing bayes (I'm not sure why as our
> servers should use bayes, but it seems not to have been run for this
> message.)
> 
> http://www.pastebin.ca/1473975
> 
> Thanks
> 



Permissions Issues

2009-06-25 Thread rich...@buzzhost.co.uk
A routine look in the logs shows me a steady stream of warnings.
It's probably harmless - but I would like to solve it for tidiness:
 

Thu Jun 18 16:45:21 2009 [12663] warn: config: created user preferences
file: /var/lib/spamassassin/.spamassassin/user_prefs
Tue Jun 23 16:58:42 2009 [13778] warn: config: cannot write
to /root/.spamassassin/user_prefs: Permission denied
Tue Jun 23 16:58:43 2009 [13778] warn: auto-whitelist: open of
auto-whitelist file failed: locker: safe_lock: cannot create tmp
lockfile /root/.spamassassin/auto-whitelist.lock.stinger.13778
for /root/.spamassassin/auto-whitelist.lock: Permission denied
Wed Jun 24 11:46:16 2009 [4734] warn: config: cannot write
to /root/.spamassassin/user_prefs: Permission denied
Wed Jun 24 11:46:17 2009 [4734] warn: auto-whitelist: open of
auto-whitelist file failed: locker: safe_lock: cannot create tmp
lockfile /root/.spamassassin/auto-whitelist.lock.stinger.4734
for /root/.spamassassin/auto-whitelist.lock: Permission denied
Wed Jun 24 12:08:10 2009 [4734] warn: config: cannot write
to /root/.spamassassin/user_prefs: Permission denied
Wed Jun 24 12:08:11 2009 [4734] warn: auto-whitelist: open of
auto-whitelist file failed: locker: safe_lock: cannot create tmp
lockfile /root/.spamassassin/auto-whitelist.lock.stinger.4734
for /root/.spamassassin/auto-whitelist.lock: Permission denied

I'm slightly confused as I see this;
/var/lib/spamassassin/.spamassassin/user_prefs created,
but then SA seems to be trying to write to /root/.spamassassin/...

Probably my configuration - any pointers ?




How many people are still using perl 5.6.x?

2009-06-25 Thread Justin Mason
For the upcoming release, we're considering dropping support for that
interpreter version.  If you're still using 5.6.x, or know of a
(relatively recent) distro that does, please reply to highlight
this

--j.


Re: SORBS bites the dust

2009-06-25 Thread rich...@buzzhost.co.uk
On Thu, 2009-06-25 at 18:24 +1000, Res wrote:
> On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:
> 
> > On Thu, 2009-06-25 at 17:41 +1000, Res wrote:
> >
> >> if you jump on a bandwagon without first-hand experience, that's *exactly*
> >> what you are. If you had experienced it first hand, of course you become an
> >> authority on the subject in your case, and your opinion matters as
> >> factual, but by your own admission you have not, and last I checked
> >> guilt by association was not a crime in modernised civil countries :)
> >
> > Indeed. I can only apologise for any offence or 'trolling'.
> 
> LOL your a joke, you send this on list, yet send me a private email
> calling me a wanker..  LOL dont bother replying :)
> 
> 
4 things;

1. It's 'You're' a joke - not 'your' a joke
2. You could always try setting up your Mickey Mouse 'blocked using
dnsbl.lan' restriction so it works properly LOL.
3. The day I give a shit about what an Australian spammer thinks of me,
will be the day hell freezes over.
4. If that cap fits dude - wear it.

*plonk*



Re: A difficult one to weed out?

2009-06-25 Thread LuKreme

On 24-Jun-2009, at 08:20, Roger Marquis wrote:

PostConf http://www.postconf.com for example.



Looks interesting, but no FreeBSD demo :/

--
There is no Humpty Dumpty, and there is no God. None, not
one, no God, never was.



Re: A difficult one to weed out?

2009-06-25 Thread rich...@buzzhost.co.uk
On Thu, 2009-06-25 at 03:08 -0600, LuKreme wrote:
> On 24-Jun-2009, at 08:20, Roger Marquis wrote:
> > PostConf http://www.postconf.com for example.
> 
> 
> Looks interesting, but no FreeBSD demo :/
> 
Webmin?

http://www.webmin.com/



Re: SORBS bites the dust

2009-06-25 Thread Per Jessen
rich...@buzzhost.co.uk wrote:

> On Wed, 2009-06-24 at 19:00 +0200, Per Jessen wrote:
>> Benny Pedersen wrote:
> 
>> 2) I didn't include free email providers in my list of "large and
>> serious hosting providers" - I was thinking more of organisations
>> such as 1and1, hetzner, rackspace etc. etc.
> 
> My special award goes to 1and1. I get *so much* spam from their
> 'customers' that I block all of their ranges. I've come across many
> others who do the same.

Really?  Well, I can't afford that sort of thing, my customers would get
up and leave pretty quickly.

> I guess when you are bottom feeding in the Hosting marketplace
> spammers will make use of your facilities.

I think spammers will make use of whatever facilities they can get hold
of, even if it's only until they're shut down by the hosting company. 


/Per Jessen, Zürich



Re: SORBS bites the dust

2009-06-25 Thread Per Jessen
Matus UHLAR - fantomas wrote:

>> > On Wed, June 24, 2009 13:59, Per Jessen wrote:
>>
>> 3) I wouldn't refer to rfc-ignorant as a blacklist - nobody with half
>> a brain would block email just because of RFC ignorance on the part
>> of the sender.
> 
> Why not? I do that and intentionally - I don't like receiving spam
> from companies that don't accept complaints...

Why not?? - because you thereby block thousands of perfectly legitimate
and non-spamming companies and individuals who happen to have a
mail-admin who is a bit slow.  Using rfc-ignorant for scoring is fine,
but not for blocking.  


/Per Jessen, Zürich



Re: SORBS bites the dust

2009-06-25 Thread Per Jessen
Arvid Picciani wrote:

>> serious hosting providers" - I was thinking more of organisations
>> such as 1and1, hetzner, rackspace etc. etc.
> 
> what's the issue with hetzner?  I'm a customer so I'd be very
> interested in any spam issue not being processed by them.

There is no issue with Hetzner.  Read my posting:

>Blacklisting a large and serious hosting provider is just not serious
>and very bad for business.


/Per Jessen, Zürich



Re: SORBS bites the dust

2009-06-25 Thread rich...@buzzhost.co.uk
On Thu, 2009-06-25 at 11:39 +0200, Per Jessen wrote:
> rich...@buzzhost.co.uk wrote:
> 
> > On Wed, 2009-06-24 at 19:00 +0200, Per Jessen wrote:
> >> Benny Pedersen wrote:
> > 
> >> 2) I didn't include free email providers in my list of "large and
> >> serious hosting providers" - I was thinking more of organisations
> >> such as 1and1, hetzner, rackspace etc. etc.
> > 
> > My special award goes to 1and1. I get *so much* spam from their
> > 'customers' that I block all of their ranges. I've come across many
> > others who do the same.
> 
> Really?  Well, I can't afford that sort of thing, my customers would get
> up and leave pretty quickly.
I have found the opposite to be true. When I have pointed out to my
customers that using 1and1 is going to give *them* issues with
deliverability of *their* email, they are often keen to find another
provider. No small business wants the hassle of their mail getting
dropped silently on the floor because of the provider they are with and
it's a buyers market.
> 
> > I guess when you are bottom feeding in the Hosting marketplace
> > spammers will make use of your facilities.
> 
> I think spammers will make use of whatever facilities they can get hold
> of, even if it's only until they're shut down by the hosting company. 
Sure as eggs is eggs they will. It's relatively easy to block dynamic
ranges and bots with confidence - this makes it attractive to look for
'cheap' hosts that offer 'trials' to stage mailouts - and 1and1 fit that
bill nicely.
> 
> 
> /Per Jessen, Zürich
> 



Re: SORBS bites the dust

2009-06-25 Thread Per Jessen
rich...@buzzhost.co.uk wrote:

> On Thu, 2009-06-25 at 11:39 +0200, Per Jessen wrote:
>> rich...@buzzhost.co.uk wrote:
>> 
>> > On Wed, 2009-06-24 at 19:00 +0200, Per Jessen wrote:
>> >> Benny Pedersen wrote:
>> > 
>> >> 2) I didn't include free email providers in my list of "large and
>> >> serious hosting providers" - I was thinking more of organisations
>> >> such as 1and1, hetzner, rackspace etc. etc.
>> > 
>> > My special award goes to 1and1. I get *so much* spam from their
>> > 'customers' that I block all of their ranges. I've come across many
>> > others who do the same.
>> 
>> Really?  Well, I can't afford that sort of thing, my customers would
>> get up and leave pretty quickly.
>
> I have found the opposite to be true. When I have pointed out to my
> customers that using 1and1 is going to give *them* issues with
> deliverability of *their* email, they are often keen to find another
> provider. No small business wants the hassle of their mail getting
> dropped silently on the floor because of the provider they are with
> and it's a buyers market.

None of my customers _use_ 1and1 themselves (afaik), but they may very
well be communicating with other legitimate businesses hosted by 1and1
or 1und1 (same company), which is why I can't just block 1and1.


/Per Jessen, Zürich



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 02:44, Justin Mason wrote:

For the upcoming release, we're considering dropping support for that
interpreter version.  If you're still using 5.6.x, or know of a
(relatively recent) distro that does, please reply to highlight
this


If moving away from 5.6 makes SA better then do it.

5.6 is pretty ancient, isn't it? Like 10 years?


--
By the way, I think you might be the prettiest girl I've ever seen
outside the pages of a really filthy magazine



Re: A difficult one to weed out?

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 03:22, rich...@buzzhost.co.uk wrote:

On Thu, 2009-06-25 at 03:08 -0600, LuKreme wrote:

On 24-Jun-2009, at 08:20, Roger Marquis wrote:

PostConf http://www.postconf.com for example.


Looks interesting, but no FreeBSD demo :/


Webmin?

http://www.webmin.com/


I've used webmin, and have it installed. It is not luser friendly  
though.



--
Strange things are afoot at the Circle K



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Jan P. Kessler
Justin Mason schrieb:
> For the upcoming release, we're considering dropping support for that
> interpreter version.  If you're still using 5.6.x, or know of a
> (relatively recent) distro that does, please reply to highlight
> this
>
> --j.
>   

Don't know if it's still relevant: Solaris 8

# uname -a
 SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250

# perl -v
 This is perl, version 5.005_03 built for sun4-solaris



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Jan P. Kessler
Jan P. Kessler schrieb:
> Justin Mason schrieb:
>   
>> For the upcoming release, we're considering dropping support for that
>> interpreter version.  If you're still using 5.6.x, or know of a
>> (relatively recent) distro that does, please reply to highlight
>> this
>>
>> --j.
>>   
>> 
>
> Don't know if it's still relevant: Solaris 8
>
> # uname -a
>  SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250
>
> # perl -v
>  This is perl, version 5.005_03 built for sun4-solaris
>   

sorry, just missed the "relatively recent" statement ;-)



Re: SORBS bites the dust

2009-06-25 Thread Matus UHLAR - fantomas
> Matus UHLAR - fantomas wrote:
> 
> >> > On Wed, June 24, 2009 13:59, Per Jessen wrote:
> >>
> >> 3) I wouldn't refer to rfc-ignorant as a blacklist - nobody with half
> >> a brain would block email just because of RFC ignorance on the part
> >> of the sender.
> > 
> > Why not? I do that and intentionally - I don't like receiving spam
> > from companies that don't accept complaints...

On 25.06.09 11:42, Per Jessen wrote:
> Why not?? - because you thereby block thousands of perfectly legitimate

perfectly incompetent?

> and non-spamming companies and individuals who happen to have a
> mail-admin who is a bit slow.

I wouldn't call not having an abuse contact for years "a bit slow", especially
in cases where I warned the admin.

> Using rfc-ignorant for scoring is fine, but not for blocking.

I have a policy of requiring postmaster and abuse contacts, so refusing the
ignorant is fine. They can still fix their behavior.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"Two words: Windows survives." - Craig Mundie, Microsoft senior strategist
"So does syphillis. Good thing we have penicillin." - Matthew Alton


Re: SORBS bites the dust

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 03:55, rich...@buzzhost.co.uk wrote:

On Thu, 2009-06-25 at 11:39 +0200, Per Jessen wrote:

rich...@buzzhost.co.uk wrote:


On Wed, 2009-06-24 at 19:00 +0200, Per Jessen wrote:

Benny Pedersen wrote:



2) I didn't include free email providers in my list of "large and
serious hosting providers" - I was thinking more of organisations
such as 1and1, hetzner, rackspace etc. etc.


My special award goes to 1and1. I get *so much* spam from their
'customers' that I block all of their ranges. I've come across many
others who do the same.


Really?  Well, I can't afford that sort of thing, my customers would get
up and leave pretty quickly.

I have found the opposite to be true. When I have pointed out to my
customers that using 1and1 is going to give *them* issues with
deliverability of *their* email, they are often keen to find another
provider. No small business wants the hassle of their mail getting
dropped silently on the floor because of the provider they are with and
it's a buyers market.


Yep.  I'm not familiar with 1and1 specifically, but I've been in the
position of having to tell someone that if they didn't move their
domain and mail to a reliable and non-spam-friendly host they were
going to have a lot of mail not getting delivered.  The most recent
one was a friend of a friend who noticed that the volume on his mailing
lists had been dropping steadily for months. I checked and his IP
block was listed in several RBLs.


Once he moved his domain his mailing lists recovered very quickly.

It's sort of like a nice store that is in a really bad neighborhood. A
lot of people will simply not go there, no matter how great the store
is. If you want the best access, you move to a nicer neighborhood.


--
Bishops move diagonally. That's why they often turn up where the
kings don't expect them to be.



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 04:15, Jan P. Kessler wrote:

Don't know if it's still relevant: Solaris 8

# uname -a
SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250

# perl -v
This is perl, version 5.005_03 built for sun4-solaris


5.00?  

;)

--
Instant karma's going to get you!



Re: SORBS bites the dust

2009-06-25 Thread Yet Another Ninja

Could this thread be moved to spam-l ?
Seems it has little to do with SA


Re: more freemail domains: tunome.com

2009-06-25 Thread LuKreme

On 23-Jun-2009, at 06:31, McDonald, Dan wrote:

Guess I'd best make a list...


Share?


--
We all need help with our feelings. Otherwise, we bottle them up,
and before you know it powerful laxatives are involved.



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Henrik K
On Thu, Jun 25, 2009 at 12:21:25PM +0200, Jan P. Kessler wrote:
> Jan P. Kessler schrieb:
> > Justin Mason schrieb:
> >   
> >> For the upcoming release, we're considering dropping support for that
> >> interpreter version.  If you're still using 5.6.x, or know of a
> >> (relatively recent) distro that does, please reply to highlight
> >> this
> >>
> >> --j.
> >>   
> >> 
> >
> > Don't know if it's still relevant: Solaris 8
> >
> > # uname -a
> >  SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250
> >
> > # perl -v
> >  This is perl, version 5.005_03 built for sun4-solaris
> >   
> 
> sorry, just missed the "relatively recent" statement ;-)

When the system gets old enough that it's not supported officially and you
are forced to manually CPAN fresh modules (and possibly wreak havoc on the
OS), there is no reason not to compile your own perl (or upgrade system)
except laziness.

SA is trying to be too supportive for the money it receives. ;-) If you ask
me, just ditch this and all other old baggage for 3.3. If you are not happy,
you are free to keep running 3.2. Some people are even still using 3.1.



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Jan P. Kessler
Henrik K schrieb:
>> sorry, just missed the "relatively recent" statement ;-)
>> 
>
> When the system gets old enough that it's not supported officially and you
> are forced to manually CPAN fresh modules (and possibly wreak havoc on the
> OS), there is no reason not to compile your own perl (or upgrade system)
> except laziness.
>   

Full Ack - this is what I do on those few ancient boxes. Additionally
there are plenty of precompiled packages (sunfreeware, blastwave, ...).

> SA is trying to be too supportive for the money it receives. ;-) If you ask
> me, just ditch this and all other old baggage for 3.3. If you are not happy,
> you are free to keep running 3.2. Some people are even still using 3.1.
>   

Good proposal, imo.




Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Justin Mason
On Thu, Jun 25, 2009 at 11:15, Jan P. Kessler wrote:
> Justin Mason schrieb:
>> For the upcoming release, we're considering dropping support for that
>> interpreter version.  If you're still using 5.6.x, or know of a
>> (relatively recent) distro that does, please reply to highlight
>> this
>>
>> --j.
>>
>
> Don't know if it's still relevant: Solaris 8
>
> # uname -a
>  SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250
>
> # perl -v
>  This is perl, version 5.005_03 built for sun4-solaris

http://www.sun.com/software/solaris/support/sol8.xml :

'The Solaris 8 Operating System (OS) was originally released in
February 2000, and since then has been superseded by two later
releases: the Solaris 9 OS which was initially released in May 2002,
and the Solaris 10 OS which was initially released in January 2005.
The current update of this release is Solaris 10 5/09.

On August 16, 2006 Sun announced the transition of the Solaris 8 OS.
Per this transition:

* November 16, 2006 was the last date Solaris 8 media kits could be ordered
* Sun shipped Solaris 8 media up until February 16, 2007; Solaris
8 media kits are no longer available
* Solaris 8 entered retirement support mode Phase I on March 31, 2007;
* Solaris 8 will enter retirement support mode Phase II on March
31, 2009; and,
* Solaris 8 will reach the end of its service life on March 31, 2012.

The total service life of Solaris 8 will thus be slightly more than 12 years.'


So the OS itself is still supported.  However, that perl version (in
my experience) is quite broken; whenever I've used Solaris recently
I've been sure to install third-party precompiled perls from
sunfreeware/blastwave, or built my own, and used those instead.  It's
a moot point anyway, as SA 3.1.x/3.2.x doesn't support 5.005.

--j.


Re: I have an SA problem with Thunderbird.

2009-06-25 Thread Stefan
On Wednesday 24 June 2009 22:41:00 Steven W. Orr wrote:
> On 06/24/09 16:15, quoth René Berber:
> > Steven W. Orr wrote:
> >
> > [snip]
> > There is something close: have you seen the Habu plugin?
> >
> > Its used to report spam (to SpamCop for instance), it works by sending
> > anything you marked as spam as attachments in a report.  I don't know if
> > it is open source so changing it, adding the report back to your own
> > program, would be possible.
>
> I saw that but what's missing is the ability to run sa-learn to get the
> retraining to work.
>
> I'm also looking at an alias in sendmail to pipe the message to a script.
> That script can do what I already do, i.e., sa-learn plus the forward
> operation. To accomplish this, I found an addon to TB called Mail Redirect
> that will allow me to "bounce" the message to the alias (instead of
> forwarding it.

You may try sal-wrapper.pl (see:
https://po2.uni-stuttgart.de/~rusjako/sal-wrapper) as the pipe script. You
forward a message as an attachment to the alias and the script will unpack
the message and feed it to sa-learn.
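
The unpack-and-learn step can be sketched roughly as below. This is not sal-wrapper.pl itself (that is a Perl script I have not reproduced here), only a small Python illustration of the same idea. The function names are hypothetical, and the default command assumes sa-learn reads one message from standard input when no file is given, as in the usual "| sa-learn --spam" pipe.

```python
import email
import subprocess


def attached_messages(raw):
    """Return each message/rfc822 attachment found in raw (bytes) as bytes."""
    outer = email.message_from_bytes(raw)
    found = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822" and part.is_multipart():
            # The parser keeps the attached mail as a one-element payload list.
            found.append(part.get_payload(0).as_bytes())
    return found


def learn_from_forward(raw, command=("sa-learn", "--spam")):
    """Pipe every attached message to the learner; return how many were sent."""
    messages = attached_messages(raw)
    for message in messages:
        # Assumption: the command accepts a single message on stdin.
        subprocess.run(list(command), input=message, check=True)
    return len(messages)
```

Hooked to a mail alias, the script would call learn_from_forward(sys.stdin.buffer.read()) (with sys imported), and a second alias could pass ("sa-learn", "--ham") for false positives.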

Greetings
Stefan


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Karsten Bräckelmann
On Thu, 2009-06-25 at 13:20 +0200, Jan P. Kessler wrote:
> Henrik K schrieb:

> > SA is trying to be too supportive for the money it receives. ;-) If you ask
> > me, just ditch this and all other old baggage for 3.3. If you are not happy,
> > you are free to keep running 3.2. Some people are even still using 3.1.
> 
> Good proposal, imo.

Actually, that's pretty much exactly why we brought this up in the first
place. :)

  guenther

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: URLs with Spaces

2009-06-25 Thread Andrew Hearn
Kasper Sacharias Eenberg wrote:
> There's been a rule circulating this mailing list for a couple of weeks.
> This is the latest edition to catch those med-things (afaik).
> 
> --
> body AE_MEDS35 /\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
> describe AE_MEDS35 obfuscated domain in message
> score AE_MEDS35 5.0
> --
> 
> It works good for me.
> 


Thanks Kasper,

Also the Sanesecurity sigs for Clam catch it (thanks to Steve)




Re: Permissions Issues

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 10:35, rich...@buzzhost.co.uk wrote:
> A routine look in the logs shows me a steady warn in the logs.
> It's probably harmless - but I would like to solve it for tidiness:
>
>
> Thu Jun 18 16:45:21 2009 [12663] warn: config: created user preferences
> file: /var/lib/spamassassin/.spamassassin/user_prefs
> Tue Jun 23 16:58:42 2009 [13778] warn: config: cannot write
> to /root/.spamassassin/user_prefs: Permission denied
> Tue Jun 23 16:58:43 2009 [13778] warn: auto-whitelist: open of
> auto-whitelist file failed: locker: safe_lock: cannot create tmp
> lockfile /root/.spamassassin/auto-whitelist.lock.stinger.13778
> for /root/.spamassassin/auto-whitelist.lock: Permission denied
> Wed Jun 24 11:46:16 2009 [4734] warn: config: cannot write
> to /root/.spamassassin/user_prefs: Permission denied
> Wed Jun 24 11:46:17 2009 [4734] warn: auto-whitelist: open of
> auto-whitelist file failed: locker: safe_lock: cannot create tmp
> lockfile /root/.spamassassin/auto-whitelist.lock.stinger.4734
> for /root/.spamassassin/auto-whitelist.lock: Permission denied
> Wed Jun 24 12:08:10 2009 [4734] warn: config: cannot write
> to /root/.spamassassin/user_prefs: Permission denied
> Wed Jun 24 12:08:11 2009 [4734] warn: auto-whitelist: open of
> auto-whitelist file failed: locker: safe_lock: cannot create tmp
> lockfile /root/.spamassassin/auto-whitelist.lock.stinger.4734
> for /root/.spamassassin/auto-whitelist.lock: Permission denied
>
> I'm slightly confused as I see this;
> /var/lib/spamassassin/.spamassassin/user_prefs created,
> but then SA seems to be trying to write to /root/.spamassassin/...
>
> Probably my configuration - any pointers ?

Check your configs for the AWL path and unset it. When it's not set in
user_prefs, you get permission errors if SA is not run as root and the path
is not local to the user trying to write to it.

perldoc Mail::SpamAssassin::Conf
perldoc Mail::SpamAssassin::Plugin::AWL

Tell us more about which OS and which user you are testing with.

The AWL path must be local to each user, not global for all users, when SQL
is not used.
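
To see at a glance which directories the daemon is being refused on, the warn lines can be tallied. A rough Python sketch follows; it assumes the log format shown in the quoted message (including the mail-wrapped lines), and denied_dirs is just a name picked for the example.

```python
import posixpath
import re
from collections import Counter

# A path at the start of a word, followed by ": Permission denied".
DENIED = re.compile(r"(?<!\S)(/\S+): Permission denied")


def denied_dirs(log_text):
    """Count 'Permission denied' warnings per parent directory."""
    flat = " ".join(log_text.split())  # undo line wrapping
    return Counter(posixpath.dirname(path) for path in DENIED.findall(flat))
```

If everything lands in one directory such as /root/.spamassassin, that points at a single home directory or path setting rather than at several unrelated problems.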


-- 
xpoint



Re: A difficult one to weed out?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 11:08, LuKreme wrote:
> On 24-Jun-2009, at 08:20, Roger Marquis wrote:
>> PostConf http://www.postconf.com for example.
> Looks interesting, but no FreeBSD demo :/

yes freebsd does not have the above problem :)

-- 
xpoint



Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Matus UHLAR - fantomas wrote:

I'm currently working on a modular plugin for extracting text and adding it
to SA message parts.


if possible, extract images too, so the fuzzyocr and similar plugins would
be able to look at that too.


You mean extract images and add them as parts to the message?

I guess that should be doable. I know that "unrtf" can extract images 
from RTF files. I'll probably implement support for this, but I'll 
probably not implement actually doing it right away.



IIRC spammers even put PDFs inside .doc files to make things harder, but
if you manage the above, it shouldn't be hard to extract PDFs too :)


This I don't understand. Do they put PDFs inside .doc files as if the
.doc was an archive?


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: A difficult one to weed out?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 11:22, rich...@buzzhost.co.uk wrote:
> On Thu, 2009-06-25 at 03:08 -0600, LuKreme wrote:
>> On 24-Jun-2009, at 08:20, Roger Marquis wrote:
>> > PostConf http://www.postconf.com for example.
>> Looks interesting, but no FreeBSD demo :/
> Webmin?
> http://www.webmin.com/

I remember the shorewall/webmin combo working nicely in some versions, but
the webmin devs gave up on shorewall; too much has changed in each version
of shorewall lately.

So my point: make sure both are stable before use, so that it does not
screw up your hobby :)

-- 
xpoint



Re: A difficult one to weed out?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 12:14, LuKreme wrote:
> I've used webmin, and have it installed. It is not luser friendly
> though.

http://www.webmin.com/index6.html usermin is for you then :)

-- 
xpoint



Filter Backscatter via DNSBL

2009-06-25 Thread Felix Buenemann
Hi,

I'd like to filter out backscatter with the DNSBL from
ips.backscatterer.org. In order not to filter out legitimate mails, but
only NDA noise and stuff, I want to limit it to mails with blank
envelope from (MAIL FROM: <>) or envelope from postmaster (MAIL FROM:
).

I'm not confident in writing meta rules and also the EnvelopeFrom pseudo
header rule is poorly documented, so I wonder if the blank envelope from
rule will hit at all or should read /<>/ instead of //.

These are the proposed rules:

header __LOCAL_ENVELOPEFROM_BLANK EnvelopeFrom //
header __LOCAL_ENVELOPEFROM_POSTMASTER EnvelopeFrom /^postmaster/

header RCVD_IN_DNSBL_IPS_BACKSCATTERER_ORG eval:check_rbl('ips-backscatterer-org','ips.backscatterer.org.')
describe RCVD_IN_DNSBL_IPS_BACKSCATTERER_ORG Received via a relay in ips.backscatterer.org DNSBL
tflags RCVD_IN_DNSBL_IPS_BACKSCATTERER_ORG net

meta LOCAL_BACKSCATTERER_ORG ((__LOCAL_ENVELOPEFROM_BLANK || __LOCAL_ENVELOPEFROM_POSTMASTER) && RCVD_IN_DNSBL_IPS_BACKSCATTERER_ORG)
describe LOCAL_BACKSCATTERER_ORG Backscatter detected via ips.backscatterer.org DNSBL
score LOCAL_BACKSCATTERER_ORG 10.0
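
On the question of whether the blank-sender pattern discriminates at all, here is a small Python sketch of the meta logic. It is only an illustration: Python's re is not Perl's engine (Perl also treats an empty pattern specially), and what SA actually puts in EnvelopeFrom for a null sender ("" or "<>") is an assumption I have not verified. The ANCHORED pattern and the function name are hypothetical.

```python
import re

EMPTY = re.compile(r"")              # Python's closest equivalent of //
ANCHORED = re.compile(r"^(?:<>)?$")  # hypothetical stricter alternative
POSTMASTER = re.compile(r"^postmaster")


def backscatter_candidate(envelope_from, listed_in_dnsbl, blank=ANCHORED):
    """Mirror the meta rule: (blank sender OR postmaster) AND DNSBL hit."""
    null_sender = blank.search(envelope_from) is not None
    postmaster = POSTMASTER.search(envelope_from) is not None
    return (null_sender or postmaster) and listed_in_dnsbl
```

With blank=EMPTY every sender counts as "blank", so the meta collapses to the DNSBL check alone; the anchored form only fires for "" or "<>". Whether the same holds inside SA would need testing against a real bounce.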

Regards,
   Felix Buenemann



Re: SORBS bites the dust

2009-06-25 Thread Res

On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:


> 1. It's 'You're' a joke - not 'your' a joke

Ah, the classic sign of someone in defeat: has to nitpick someone's grammar.

> 2. You could always try setting up your Mickey Mouse 'blocked using
> dnsbl.lan' restriction so it works properly LOL.

Actually, you were first blocked by a milter because your SPF record
contains "junk". Get someone with a clue to set it up for you.

Your internal block list blocks this mail server's IP anyway,
so pot, kettle, black, tosser.

> 3. The day I give a shit about what an Australian spammer thinks of me,
> will be the day hell freezes over.

Oh, I'm a spammer now, am I? Awww poor widdle wicky, go cry to mummy, or tell
someone who gives a fuck.



--
Res

-Beware of programmers who carry screwdrivers


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Henrik K
On Thu, Jun 25, 2009 at 02:36:15PM +0200, Karsten Bräckelmann wrote:
> On Thu, 2009-06-25 at 13:20 +0200, Jan P. Kessler wrote:
> > Henrik K schrieb:
> 
> > > SA is trying to be too supportive for the money it receives. ;-) If you
> > > ask me, just ditch this and all other old baggage for 3.3. If you are not
> > > happy, you are free to keep running 3.2. Some people are even still using 3.1.
> > 
> > Good proposal, imo.
> 
> Actually, that's pretty much exactly why we brought this up in the first
> place. :)

I'm just not sure why ask in the first place. Perl 5.6.1 is old. Anyone
using such system most likely has no support. Anyone using such perl most
likely shouldn't be allowed to use it. You could be already fixing the code
and not waiting. ;)



Re: SORBS bites the dust

2009-06-25 Thread Jack Pepper
How long will this go before Godwin's law finally kicks in?  Now I'm  
just watching for the fun of it .



--
Simple compliance is a hacker's best friend


@fferent Security Labs:  Isolate/Insulate/Innovate  
http://www.afferentsecurity.com




Re: SORBS bites the dust

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 15:08, Res wrote:
> On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:
> Actually, you were first blocked by a milter because your SPF record
> contains "junk" get someone with a clue to set it up for you

http://old.openspf.org/wizard.html?mydomain=buzzhost.co.uk&submit=Go!

remove ptr also, doom ? :)

> -Beware of programmers who carry screwdrivers

beware of apple that did not want their phones to show commodore 64 games,
i can just say "nokia connecting people" :)

-- 
xpoint



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Ned Slider

Karsten Bräckelmann wrote:

> On Thu, 2009-06-25 at 13:20 +0200, Jan P. Kessler wrote:
>> Henrik K schrieb:
>>> SA is trying to be too supportive for the money it receives. ;-) If you ask
>>> me, just ditch this and all other old baggage for 3.3. If you are not happy,
>>> you are free to keep running 3.2. Some people are even still using 3.1.
>>
>> Good proposal, imo.
>
> Actually, that's pretty much exactly why we brought this up in the first
> place. :)
>
>   guenther




Just for info, I checked Red Hat Enterprise Linux (RHEL) and CentOS, and 
have to go back to RHEL 2 (just recently End of Life) to find perl 5.6.1.


RHEL 3-5 are all 5.8.x, and are pretty popular platforms for running SA 
I would imagine :-)




Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 14:56, Henrik K wrote:
> I'm just not sure why ask in the first place. Perl 5.6.1 is old. Anyone
> using such system most likely has no support. Anyone using such perl most
> likely shouldn't be allowed to use it. You could be already fixing the
> code and not waiting. ;)

old programs are more or less bug free too, unless some update brings the problem :)

-- 
xpoint



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Per Jessen
Henrik K wrote:

> On Thu, Jun 25, 2009 at 02:36:15PM +0200, Karsten Bräckelmann wrote:
>> On Thu, 2009-06-25 at 13:20 +0200, Jan P. Kessler wrote:
>> > Henrik K schrieb:
>> 
>> > > SA is trying to be too supportive for the money it receives. ;-)
>> > > If you ask me, just ditch this and all other old baggage for 3.3.
>> > > If you are not happy, you are free to keep running 3.2. Some
>> > > people are even still using 3.1.
>> > 
>> > Good proposal, imo.
>> 
>> Actually, that's pretty much exactly why we brought this up in the
>> first place. :)
> 
> I'm just not sure why ask in the first place. Perl 5.6.1 is old.
> Anyone using such system most likely has no support. 

And most probably doesn't need much either.  I ran a 9-10 year old
release of SuSE Linux until very recently, obviously long outdated and
out of support, but I didn't need any. 

I'm using perl 5.8 and 5.10, so upping the minimum to 5.8 would be fine
with me, but it's a very decent question to ask.  
I guess one key question is - would continued support for 5.6 hold back
development or features in SA?  If yes, it's worth upping the minimum.


/Per Jessen, Zürich



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread John Rudd
2009/6/25 Ned Slider :
> Karsten Bräckelmann wrote:
>>
>> On Thu, 2009-06-25 at 13:20 +0200, Jan P. Kessler wrote:
>>>
>>> Henrik K schrieb:
>>
>>>> SA is trying to be too supportive for the money it receives. ;-) If you
>>>> ask me, just ditch this and all other old baggage for 3.3. If you are not
>>>> happy, you are free to keep running 3.2. Some people are even still using 3.1.
>>>
>>> Good proposal, imo.
>>
>> Actually, that's pretty much exactly why we brought this up in the first
>> place. :)
>>
>>  guenther
>>
>
>
> Just for info, I checked Red Hat Enterprise Linux (RHEL) and CentOS, and
> have to go back to RHEL 2 (just recently End of Life) to find perl 5.6.1.
>
> RHEL 3-5 are all 5.8.x, and are pretty popular platforms for running SA I
> would imagine :-)

Mac OS X 10.5.x = perl 5.8.8

Mac OS X 10.4.x = perl 5.8.6

(I no longer have any 10.3.x nor older Macs to check for their perl versions)


Solaris 10 (x86 and sparc) (of some patch level) =  perl 5.8.4

Solaris 9 sparc (of some patch level) = perl 5.6.1


So, for Mac it seems like a very safe assumption... for Solaris, it
assumes that they're running current (which is not always a safe
assumption; I've seen LOTS of so-focused-on-stability "if it ain't
broke, don't upgrade it" type shops in the Solaris arena ... heck,
I still have a Solaris _7_ box somewhere, for that reason ... and in
financial circles, I've even seen "if it ain't broke, don't patch it"
type shops).  If the Solaris system is running even 1 major revision
old, it might be on 5.6.x.


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Per Jessen
John Rudd wrote:

> I've seen LOTS of so-focused-on-stability "if it ain't broke, don't
> upgrade it" type shops in the Solaris arena ... 

You'll likely find that in any production environment that is concerned
about uptime.  The less change, the more uptime. 


/Per Jessen, Zürich



Re: SORBS bites the dust

2009-06-25 Thread Matus UHLAR - fantomas
On 25.06.09 12:38, Yet Another Ninja wrote:
> Could this thread be moved to spam-l ?
> Seems it has little to do with SA

spam-l was closed iirc ;-)
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
How does cat play with mouse? cat /dev/mouse


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Matus UHLAR - fantomas
> On 25-Jun-2009, at 04:15, Jan P. Kessler wrote:
>> Don't know if it's still relevant: Solaris 8
>>
>> # uname -a
>> SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250
>>
>> # perl -v
>> This is perl, version 5.005_03 built for sun4-solaris

On 25.06.09 04:37, LuKreme wrote:
> 5.00?  

5.005 is actually 5.5... yes, older than 5.6
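
To make the numbering concrete, here is a rough Python sketch that reduces both the old-style strings (5.005_03) and the newer dotted ones (5.8.8) to a comparable pair. The function names and the (5, 8) minimum are assumptions for illustration only, taken from the 5.8 figure floated in this thread; the sketch deliberately ignores the patch level.

```python
import re
from typing import Tuple

# Hypothetical minimum, based on the 5.8 figure discussed in this thread.
MINIMUM = (5, 8)


def perl_major_minor(version: str) -> Tuple[int, int]:
    """Reduce '5.005_03', '5.006001' or '5.8.8' to (revision, version)."""
    m = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if m is None:
        raise ValueError("unrecognised perl version: %r" % version)
    revision, minor = m.group(1), m.group(2)
    # Old-style strings pad the second field to three digits (5.005, 5.006001),
    # so only the first three digits carry the version number.
    if len(minor) >= 3:
        minor = minor[:3]
    return int(revision), int(minor)


def meets_minimum(version: str) -> bool:
    """Compare against the assumed minimum, ignoring the patch level."""
    return perl_major_minor(version) >= MINIMUM
```

With that reading, the Solaris 8 perl above comes out as (5, 5), below both 5.6 and 5.8.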
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
BSE = Mad Cow Desease ... BSA = Mad Software Producents Desease


Re: SORBS bites the dust

2009-06-25 Thread DAve

Jack Pepper wrote:
> How long will this go before Godwin's law finally kicks in?  Now I'm
> just watching for the fun of it .


Yea, this is why when my bosses ask where I get my information I tell 
them from a closed forum. If they read the adolescent ramblings that got 
posted on email/spam lists they wouldn't allow us to use half the 
software we do.


DAve



--
"Posterity, you will know how much it cost the present generation to
preserve your freedom.  I hope you will make good use of it.  If you
do not, I shall repent in heaven that ever I took half the pains to
preserve it." John Quincy Adams

http://appleseedinfo.org



Re: SORBS bites the dust

2009-06-25 Thread Yet Another Ninja

On 6/25/2009 4:12 PM, Matus UHLAR - fantomas wrote:

> On 25.06.09 12:38, Yet Another Ninja wrote:
>> Could this thread be moved to spam-l ?
>> Seems it has little to do with SA
>
> spam-l was closed iirc ;-)

yes and no
it was taken over and it's nice & busy


http://spam-l.com/mailman/listinfo



Got one!

2009-06-25 Thread Diffenderfer, Randy
Seems like it's gonna cost some of the big boys a little coin...

http://detroit.fbi.gov/dojpressrel/pressrel09/de062209.htm

Let's hope there are more indictments where these came from!

rnd


rudimentary gibberish

2009-06-25 Thread Isabel Billings
stopwatch sussex trait
warmup sporadic



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread John Rudd
On Thu, Jun 25, 2009 at 07:11, Per Jessen wrote:
> John Rudd wrote:
>
>> I've seen LOTS of so-focused-on-stability "if it ain't broke, don't
>> upgrade it" type shops in the Solaris arena ...
>
> You'll likely find that in any production environment that is concerned
> about uptime.  The less change, the more uptime.

Yes, _I_ know the environment that causes it, but in these days of
lots of projects that expect upgrade-itis, I usually feel the need to
explain at least a tiny bit.

(and not just environments concerned about uptime, it can instead be
concerned about service stability.  that's not necessarily about
uptime, but can instead be about consistency of user experience)


Re: rudimentary gibberish

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 18:14, Isabel Billings wrote:
> stopwatch sussex trait
> warmup sporadic

Resent-From: "Steven W. Orr" 
Resent-To: spamassassin-users 
Resent-Date: Thu, 25 Jun 2009 10:43:41 -0400
Resent-Message-Id: <4a438d1d.3070...@syslang.net>
Resent-User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.21)
Gecko/20090320 Fedora/2.0.0.21-1.fc10 Lightning/0.9 Thunderbird/2.0.0.21

dont do this

-- 
xpoint



Apache.org spam??

2009-06-25 Thread Jeremy Morton

I recently got this spam that made its way thru SpamAssassin:

http://pastebin.ca/1474274

Looks like it was received from mail.apache.org which is in the 
DNSWL.org DB, unsurprisingly.  Why would mail.apache.org send out this 
obvious spam?


Best regards,
Jeremy Morton (Jez)


Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Jonas Eckerman wrote:


> You mean extract images and add them as parts to the message?
>
> I guess that should be doable. I know that "unrtf" can extract images
> from RTF files. I'll probably implement support for this, but I'll
> probably not implement actually doing it right away.

This'll probably have to wait. Browsing the POD and source of
Mail::SpamAssassin::Message::Node and Mail::SpamAssassin::Message I
found no obvious way of adding new parts to a message node. Especially
if the node is a leaf node (I'm guessing that singlepart messages only
have a leaf node).


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: Apache.org spam??

2009-06-25 Thread Jeremy Morton

To reply to myself

I guess that was sent to the spamassassin.apache.org list and the list 
was BCC'd so it didn't get put into my list folder.  Ah well.


Best regards,
Jeremy Morton (Jez)



Re: Apache.org spam??

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 17:10, Jeremy Morton wrote:
> Looks like it was received from mail.apache.org which is in the
> DNSWL.org DB, unsurprisingly.  Why would mail.apache.org send out this
> obvious spam?

blame mozilla thunderbird for the resent headers, the problem is that one user
released something in their quarantine and spammed
forged headers

-- 
xpoint



Re: How many people are still using perl 5.6.x?

2009-06-25 Thread jp
My oldest server has 5.8, and it's a really out of date box.
My newest out-of-date box has 5.8.8-36 (opensuse 10.2).

Antispam and email are fast-changing technologies (compared to other server 
things like file and print and http), so I see no reason why people should try 
to adapt an old system to today's needs. I don't keep email servers around for 
more than three years, and that's pushing it. A lot has changed in three 
years, in every aspect, volume of email/spam, software, antivirus, processing 
demands, storage demands, etc... If a mail server is more than three years 
old, it's likely overdue for a lot more things than just a spamassassin 
update.

On Thu, Jun 25, 2009 at 09:44:08AM +0100, Justin Mason wrote:
> For the upcoming release, we're considering dropping support for that
> interpreter version.  If you're still using 5.6.x, or know of a
> (relatively recent) distro that does, please reply to highlight
> this
> 
> --j.

-- 
/*
Jason Philbrook   |   Midcoast Internet Solutions - Wireless and DSL
KB1IOJ            |   Broadband Internet Access, Dialup, and Hosting
 http://f64.nu/   |   for Midcoast Maine    http://www.midcoast.com/
*/


Re: Apache.org spam??

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 17:20, Jeremy Morton wrote:
> I guess that was sent to the spamassassin.apache.org list and the list
> was BCC'd so it didn't get put into my list folder.  Ah well.

with sieve:

require ["fileinto"];

if header :contains "List-Id" "users.spamassassin.apache.org"
{
    fileinto "maillists.spamassassin";
    stop;
}

-- 
xpoint



Re: Plugin extracting text from docs

2009-06-25 Thread Matus UHLAR - fantomas
> Matus UHLAR - fantomas wrote:
>
>>> I'm currently working on a modular plugin for extracting text and add 
>>> it  to SA message parts.
>>
>> if possible, extract images too, so the fuzzyocr and similar plugins would
>> be able to look at that too.
>
> You meen extract images and add them as parts to the message?
>
> I guess that should be doable. I know that "unrtf" can extract images  
> from RTF files. I'll probably implement support for this, but I'll  
> probably not implement actually doing it right away.
>
>> IIRC spammers did even put PDF's to .doc files to make the stuff harder, but
>> if you manage the above, it shouldn't be hard to extract PDF's too :)

On 25.06.09 14:44, Jonas Eckerman wrote:
> This I don't understand. Do they put PDFs inside .doc files as if the  
> .doc was an archive?

I am not sure but I think something alike was done. What I mean is to have
generic chain of format converters, where at the end would be plain image
or even text, that could be processed by classic rules like bayes,
replacetags etc.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety. -- Benjamin Franklin, 1759


Re: Plugin extracting text from docs

2009-06-25 Thread Theo Van Dinter
On Thu, Jun 25, 2009 at 11:48 AM, Matus UHLAR -
fantomas wrote:
> I am not sure but I think something alike was done. What I mean is to have
> generic chain of format converters, where at the end would be plain image
> or even text, that could be processed by classic rules like bayes,
> replacetags etc.

Already exists, check recent list history for "set_rendered".
:)


Re: Apache.org spam??

2009-06-25 Thread SM

At 08:10 25-06-2009, Jeremy Morton wrote:

> I recently got this spam that made its way thru SpamAssassin:

[non-persistent information snipped]

> Looks like it was received from mail.apache.org which is in the
> DNSWL.org DB, unsurprisingly.  Why would mail.apache.org send out
> this obvious spam?

The message was sent by a mailing list subscriber to a list which
generally discusses spam.  It scored 4.0 on Apache.org.


Why is the message obvious spam?  What rules would you recommend to catch it?

Regards,
-sm 



Re: Apache.org spam??

2009-06-25 Thread Benny Pedersen
On Thu, June 25, 2009 17:56, SM wrote:

> What rules would you recommend to catch it?

something as this on apache.org:

header __RESENT1 exists:Resent-From
header __RESENT2 exists:Resent-To
header __RESENT3 exists:Resent-Date
header __RESENT4 exists:Resent-Message-Id

meta NO_RESENT_MAIL (__RESENT1 && __RESENT2 && __RESENT3 && __RESENT4)
describe NO_RESENT_MAIL Meta: please dont resend mail to maillists
score NO_RESENT_MAIL 3.0
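
What the meta rule above tests can be checked by hand on a saved message. This is a rough Python sketch using only the standard library `email` module; the function name is made up, and it only mirrors the `exists:` tests, not any scoring.

```python
from email import message_from_string

# The four headers the __RESENT1..4 sub-rules look for.
RESENT_HEADERS = ("Resent-From", "Resent-To", "Resent-Date", "Resent-Message-Id")


def has_all_resent_headers(raw_message):
    """True only when every Resent-* header is present, like the meta rule."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields None.
    return all(msg[name] is not None for name in RESENT_HEADERS)
```

Because all four headers must be present, a message carrying only some of them would not hit.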

i can't fix others' problems, but imho apache.org needs the above :)

-- 
xpoint






Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Matus UHLAR - fantomas wrote:

>> This I don't understand. Do they put PDFs inside .doc files as if the
>> .doc was an archive?
>
> I am not sure but I think something alike was done.

Considering that an OpenXML format is basically a zip file with XML 
files inside and that the actual document can contain hyperlinks I guess 
it could be possible to do something like that. Don't know enough about 
the format to know though.

> What I mean is to have
> generic chain of format converters, where at the end would be plain image
> or even text, that could be processed by classic rules like bayes,
> replacetags etc.


If I manage to figure out how to add new parts to a message from within 
the "post_message_parse" method, that should work just fine.


An extractor plugin can return a list of parts to be added to the 
message, and my module will keep looping through the message parts if 
new parts are added.
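
The looping described here is essentially a work queue. The following Python is a toy sketch of that idea only, not the plugin's code and not the SpamAssassin API: `Part`, `expand_parts`, the toy extractors and the `max_parts` safety limit are all hypothetical names, and extractors are keyed on MIME type alone.

```python
from collections import deque
from typing import Callable, Dict, List, NamedTuple


class Part(NamedTuple):
    """Hypothetical stand-in for a message part."""
    mime_type: str
    data: bytes


Extractor = Callable[[Part], List[Part]]


def expand_parts(parts: List[Part], extractors: Dict[str, Extractor],
                 max_parts: int = 50) -> List[Part]:
    """Keep feeding newly extracted parts back to the extractors."""
    queue = deque(parts)
    result: List[Part] = []
    # max_parts is an assumed guard against extractors that never stop.
    while queue and len(result) < max_parts:
        part = queue.popleft()
        result.append(part)
        extractor = extractors.get(part.mime_type)
        if extractor is not None:
            queue.extend(extractor(part))  # new parts get their own turn
    return result


def toy_word_extractor(part: Part) -> List[Part]:
    """Pretend a Word document contained an embedded PDF."""
    return [Part("application/pdf", b"%PDF toy")]


def toy_pdf_extractor(part: Part) -> List[Part]:
    """Pretend the PDF yielded plain text."""
    return [Part("text/plain", b"extracted text")]
```

Starting from a single Word part with both toy extractors registered, the result holds the Word part, the PDF pulled out of it, and the text pulled out of the PDF, in that order.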


So, if a Word extractor extracts a PDF and returns it, the PDF would be 
added to a new part, and in the next loop the PDF part will be sent to a 
PDF extractor if that exists. And so on. I'm running 
"post_message_parse" at priority -1 so any added image parts should be 
available to plugins like FuzzyOCR as well as plugins running 
"post_message_parse" at default priority.


The missing parts are:

1: How do I add a new part to a parsed message (including a singlepart 
one). This is of course the main problem.


2: The actual extractor plugin that extracts whatever files are included 
in the word document. Antiword only extracts text, and my extractor for 
OpenXML is little more than an extremely basic XML remover.


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Chris Hoogendyk



Per Jessen wrote:

> John Rudd wrote:
>
>> I've seen LOTS of so-focused-on-stability "if it ain't broke, don't
>> upgrade it" type shops in the Solaris arena ...
>
> You'll likely find that in any production environment that is concerned
> about uptime.  The less change, the more uptime.


As far as Solaris goes, I typically update my core utilities like perl 
and put them in /usr/local. I also change the $PATH in /etc/profile so 
that /usr/local/bin comes first. That gives me control over what I and 
my users see.


I replaced Solaris 7 with 8 seems like 9 or 10 years ago. Solaris 7 was 
too hackable. Now, I haven't used Solaris 8 in about 4 years and am 
currently replacing my Solaris 9 boxes with Solaris 10 boxes. However, 
even in the newest, I still typically update my core utilities like 
perl. I simply need more control over them and need them to be more 
up-to-date, whether I compile them myself or get them from sunfreeware.


As far as down time ;) , earlier this week I updated a couple of my 
Solaris 10 boxes. I went from Solaris 10 5/08 U5 to Solaris 10 5/09 U7. 
I did the update during peak hours and also applied the latest 
recommended and security patches. Since I did it using Live Upgrade, 
users were totally unaware, and services continued as though nothing 
were going on. Then after the end of the work day, I issued an `init 6`. 
When the server came back up a minute or two later, I checked all the 
services, checked the update status, and then went home myself. If there 
had been a problem, I could have reverted and booted off the original 
image, leaving me right where I had started.


Gone are the days when you totally avoided upgrades because of the time, 
hassle and risk involved.


Note also that Solaris 9 is now entering EOL. In the second stage of EOL 
(where 8 is now, I believe), they no longer provide patches. This can be 
a serious problem. If, for example, a serious bug is found in ssh that 
allows a hack through ssh, then you are simply vulnerable unless you 
upgrade your system or build and replace ssh on your own. If you are on 
a private net behind a firewall, you may still be vulnerable, especially 
if there is a flotilla of windows machines sitting around waiting to get 
infected with whatever.



--
---

Chris Hoogendyk

-
  O__   Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~ - University of Massachusetts, Amherst 




--- 


Erdös 4




cas...@snigelpost.org bounces?

2009-06-25 Thread John Hardin
Is anybody else getting bounces on mail they send to the list from 
cas...@snigelpost.org?


If so, can we get him unsubscribed?

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Phobias should not be the basis for laws.
---
 9 days until the 233rd anniversary of the Declaration of Independence


Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Theo Van Dinter wrote:


>> I am not sure but I think something alike was done. What I mean is to have
>> generic chain of format converters, where at the end would be plain image
>> or even text, that could be processed by classic rules like bayes,
>> replacetags etc.
>
> Already exists, check recent list history for "set_rendered".
> :)

I thought that was for text only.

In any case, any plugin looking for images, or a PDF, will most likely 
look at MIME type and/or file name, and then use the "decode" method to 
get the data, and AFAICT the "set_rendered" method doesn't have any 
impact on any of that.

I can't see how "set_rendered" would help in creating a functioning 
chain where one converter could put an arbitrary extracted object 
(image, pdf, whatever) where another converter could have a go at it.


Since the "set_rendered" method seems very undocumented I could of 
course be wrong here. In that case I hope to be verbosely corrected. :-)


/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Arvid Picciani

John Hardin wrote:
> Is anybody else getting bounces on mail they send to the list from
> cas...@snigelpost.org?

Yep. I wish backscatter.org had a reporting and educating form, i.e.
automatically inform the postmaster of that system of the listing,
including educational material on how to fix it.

Btw, has someone got a webpage that has information for the most common MTAs?
I started blocking some backscattering hosts and would like to inform
them how to fix the issue.


Re: SORBS bites the dust

2009-06-25 Thread Arvid Picciani

Jack Pepper wrote:
> How long will this go before Godwin's law finally kicks in? 


It already did.

> 1. It's 'You're' a joke - not 'your' a joke



> Now I'm just watching for the fun of it

Try IRC :-P




Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 19:09, John Hardin wrote:
> Is anybody else getting bounces on mail they send to the list from
> cas...@snigelpost.org?
>
> If so, can we get him unsubscribed?

here i have seen 25 of these bounces; i have added his sender ip into postfwd
client_address until it's resolved. i believe you can
make a body rule in milter.regex ? :)

-- 
xpoint



Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Karsten Bräckelmann
On Thu, 2009-06-25 at 10:09 -0700, John Hardin wrote:
> Is anybody else getting bounces on mail they send to the list from 
> cas...@snigelpost.org?

Taking care of that, already poked the almighty admins.


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: cas...@snigelpost.org bounces?

2009-06-25 Thread John Hardin

On Thu, 25 Jun 2009, Benny Pedersen wrote:



> On Thu, June 25, 2009 19:09, John Hardin wrote:
>> Is anybody else getting bounces on mail they send to the list from
>> cas...@snigelpost.org?
>>
>> If so, can we get him unsubscribed?
>
> here i have seen 25 of this bouncers, i have added his sender ip into
> postfwd client_address until its resolved, i belive you can make a body
> rule in milter.regex ? :)


Sure, but that doesn't help anybody else that posts to the list.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Phobias should not be the basis for laws.
---
 9 days until the 233rd anniversary of the Declaration of Independence


Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 19:34, John Hardin wrote:
> Sure, but that doesn't help anybody else that posts to the list.

it will if admins at remote read there logs, but yes we can only wait now

-- 
xpoint



Re: Apache.org spam??

2009-06-25 Thread SM

At 09:13 25-06-2009, Benny Pedersen wrote:

> something as this on apache.org:
>
> header __RESENT1 exists:Resent-From
> header __RESENT2 exists:Resent-To
> header __RESENT3 exists:Resent-Date
> header __RESENT4 exists:Resent-Message-Id
>
> meta NO_RESENT_MAIL (__RESENT1 && __RESENT2 && __RESENT3 && __RESENT4)
> describe NO_RESENT_MAIL Meta: please dont resend mail to maillists
> score NO_RESENT_MAIL 3.0
>
> if i cant fix others problems but imho apache.org need the above :)


Nice.  The above rules cannot be applied for all apache.org traffic 
as it's not only for mailing lists.


Regards,
-sm 



Re: cas...@snigelpost.org bounces? [RESOLVED]

2009-06-25 Thread Karsten Bräckelmann
On Thu, 2009-06-25 at 19:32 +0200, Karsten Bräckelmann wrote:
> Taking care of that, already poked the almighty admins.

FYI, they took care about this issue. Quite speedy. :)


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Apache.org spam??

2009-06-25 Thread John Hardin

On Thu, 25 Jun 2009, SM wrote:


> At 09:13 25-06-2009, Benny Pedersen wrote:
>> something as this on apache.org:
>>
>> header __RESENT1 exists:Resent-From
>> header __RESENT2 exists:Resent-To
>> header __RESENT3 exists:Resent-Date
>> header __RESENT4 exists:Resent-Message-Id
>>
>> meta NO_RESENT_MAIL (__RESENT1 && __RESENT2 && __RESENT3 && __RESENT4)
>> describe NO_RESENT_MAIL Meta: please dont resend mail to maillists
>> score NO_RESENT_MAIL 3.0
>>
>> if i cant fix others problems but imho apache.org need the above :)
>
> Nice.  The above rules cannot be applied for all apache.org traffic as it's
> not only for mailing lists.


I point out that I've had legitimate reason in the past to resend messages 
to the SA list.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Phobias should not be the basis for laws.
---
 9 days until the 233rd anniversary of the Declaration of Independence


Re: cas...@snigelpost.org bounces? [RESOLVED]

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 19:48, Karsten Bräckelmann wrote:
> On Thu, 2009-06-25 at 19:32 +0200, Karsten Bräckelmann wrote:
>> Taking care of that, already poked the almighty admins.
> FYI, they took care about this issue. Quite speedy. :)

so now they are using postfix ?, fixing valid recipient maps is dangerous :)

-- 
xpoint



Re: Apache.org spam??

2009-06-25 Thread Benny Pedersen

On Thu, June 25, 2009 19:48, John Hardin wrote:
> I point out that I've had legitimate reason in the past to resend messages
> to the SA list.

test my rules better, will it hit a resend from you ? :)

well repost is not a resend, so it might still not hit

-- 
xpoint



Re: cas...@snigelpost.org bounces? [RESOLVED]

2009-06-25 Thread Karsten Bräckelmann
> > FYI, they took care about this issue. Quite speedy. :)
> 
> so now thay using postfix ?, fixing valid recipient maps is dangerous :)

What are you talking about, Benny?  The ASF admins have removed the
offending address from the list's subscribers.

Anyway, this horse is now dead. Please stop beating it.

  guenther

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: SORBS bites the dust

2009-06-25 Thread J.D. Falk

DAve wrote:


> Jack Pepper wrote:
>> How long will this go before Godwin's law finally kicks in? Now I'm
>> just watching for the fun of it .
>
> Yea, this is why when my bosses ask where I get my information I tell
> them from a closed forum. If they read the adolescent ramblings that got
> posted on email/spam lists they wouldn't allow us to use half the
> software we do.


One of my co-workers was recently talking as if he thought SpamAssassin was 
some businesslike organization we could negotiate with.  I've been tempted 
to send him this thread.


(Not sure what he wanted to negotiate /for/, either.)

--
J.D. Falk
Return Path Inc
http://www.returnpath.net/


Re: Plugin extracting text from docs

2009-06-25 Thread Theo Van Dinter
On Thu, Jun 25, 2009 at 1:12 PM, Jonas Eckerman wrote:
>> Already exists, check recent list history for "set_rendered".
>
> I thought that was for text only.

It is only for text.

> In any case, any plugin looking for images, or a PDF, will most likely look
> at MIME type and/or file name, and then use the "decode" method to get the
> data, and AFAICT the "set_rendered" method doesn't have any impact on any of
> that.

Of course.  There are three states for the data in a Message::Node object:
  - raw: whatever the email had originally.  may be encoded, etc.
  - decoded: the raw content, decoded (ie: base64 or
    quoted-printable).  may be binary.
  - rendered: the text content.  if it was a text part, it's the same
    as decoded.  if it was an html part, the decoded data gets "rendered"
    into text.  if it's anything else, the rendered text is blank because
    nothing else is supported.
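
For illustration, the three states can be mimicked with Python's standard
email module. This is only an analogy, not SpamAssassin's Message::Node
API: the sample message is made up, the function name part_states is
hypothetical, and the regex tag stripping is a crude stand-in for real
HTML rendering.

```python
import base64
import re
from email import message_from_string

HTML = b"<b>Hello</b> world"
ENCODED = base64.b64encode(HTML).decode("ascii")

# A made-up single-part message whose body is base64-encoded HTML.
RAW_MESSAGE = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n" + ENCODED + "\n"
)


def part_states(part):
    """Return (raw, decoded, rendered) for one non-multipart MIME part."""
    raw = part.get_payload()                 # as found in the mail, still encoded
    decoded = part.get_payload(decode=True)  # bytes after base64/QP decoding
    ctype = part.get_content_type()
    if ctype == "text/plain":
        rendered = decoded.decode("ascii", "replace")
    elif ctype == "text/html":
        # Crude tag stripping stands in for a real HTML-to-text renderer.
        rendered = re.sub(r"<[^>]+>", "", decoded.decode("ascii", "replace"))
    else:
        rendered = ""                        # nothing else is supported
    return raw, decoded, rendered


msg = message_from_string(RAW_MESSAGE)
raw, decoded, rendered = part_states(msg)
```

Here raw is the base64 text, decoded is the HTML as bytes, and rendered
is the tag-free text that rules would look at.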

The goal with the plugin calls and set_rendered is to allow other
plugins to find parts that they understand how to convert into text,
and set the rendered version of the part to whatever as appropriate.
So if you want to do OCR on image/*, you can do that.  If you want to
convert PDF/DOC/whatever to text, you can do that.

I would comment that plugins should probably skip parts they want to
render that already have rendered text available.

Rules, Bayes, etc, then take all the rendered parts and use them.
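
A minimal sketch of that flow, written in Python rather than as a real
SpamAssassin plugin. The Part class, the converter table and
render_parts are all hypothetical names; the only ideas taken from the
thread are "convert parts you understand", "skip parts that already have
rendered text" and "consumers then use whatever rendered text exists".

```python
class Part:
    """Hypothetical stand-in for one MIME part of a message."""

    def __init__(self, mime_type, data, rendered=""):
        self.mime_type = mime_type
        self.data = data
        self.rendered = rendered


def fake_pdf_to_text(data):
    # Placeholder converter; a real one would call a PDF text extractor.
    return "text from pdf (%d bytes)" % len(data)


def render_parts(parts, converters):
    """Fill in rendered text for parts a converter understands."""
    for part in parts:
        if part.rendered:
            continue  # someone already rendered this part, leave it alone
        convert = converters.get(part.mime_type)
        if convert is not None:
            part.rendered = convert(part.data)
    # What rules/Bayes-like consumers would see afterwards.
    return [p.rendered for p in parts if p.rendered]


parts = [
    Part("text/plain", b"hi", rendered="hi"),
    Part("application/pdf", b"%PDF"),
    Part("image/png", b"\x89PNG"),
]
visible_text = render_parts(parts, {"application/pdf": fake_pdf_to_text})
```

The image part stays blank because no converter claims it, which is the
"nothing else is supported" case described above.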

> I can't see how "set_rendered" would help in creating a functioning chain
> where one converter could put an arbitrary extracted object (image, pdf,
> whatever) where another converter could have a go at it.

Well, you wouldn't do that because there's no point. ;)   (feel free
to disagree with me though)
If a plugin wants to get image/* parts and do something with the
contents, they can do that already.
If a plugin wants to get application/octet-stream w/ filename "*.pdf"
and do something with the contents, they can do that already.

If you want to have a plugin do some work on a part's contents, then
store that result and let another plugin pick up and continue doing
other work ...  There's no official method to do that.  You can store
data as part of the Node object.  You could potentially also write a
tempfile, though you'll want to be careful to clean up the tempfile as
necessary.

But what would be a use case for that?  I guess something like
converting a PDF to a TIFF, then OCR the TIFF?
I'd probably implement that as a single plugin w/ "ocr" as a function
that gets called from both the PDF and TIFF handlers.
Arguably, there could be multiple people developing plugins for
different types, but you'd need some coordination for the
register_method_priority calls to figure out who goes in what order.
(btw: I just found the register_method_priority() method. \o/)

Note: Do not try to add or remove parts in the tree.  The tree is
meant to represent the mime structure of the mail, and each node
relates to that specific mime part.  The tree is not meant to be a
temporary data storage mechanism.


Hope this helps.


Re: rudimentary gibberish

2009-06-25 Thread David B Funk
On Thu, 25 Jun 2009, Isabel Billings wrote:
> Received: from syslang.net (localhost.localdomain [127.0.0.1])
> by saturn.syslang.net (8.14.3/8.14.3) with ESMTP id n5PEGcRM032298;
> Thu, 25 Jun 2009 10:16:39 -0400
> Received: from domenico32832c ([217.202.8.48])
> by saturn.syslang.net (8.14.3/8.14.3) with SMTP id n5PEGPqs032265;
> Thu, 25 Jun 2009 10:16:28 -0400
> Date: Thu, 25 Jun 2009 16:14:51 +
> From: Isabel Billings 
> Subject: rudimentary gibberish
> To: , ,
> , ,
> 
> Message-id: <9768176595.2008185...@toddlertruestories.com>
> Organization: toddlertruestories.com
> MIME-version: 1.0
> Content-type: text/plain; charset=us-ascii
> Content-transfer-encoding: 7bit
> X-Priority: 3 (Normal)
> X-Virus-Scanned: ClamAV 0.94.2/9506/Thu Jun 25 07:52:03 2009 on 
> saturn.syslang.net
> X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on saturn.syslang.net
> X-Virus-Checked: Checked by ClamAV on apache.org
> X-Old-Spam-Status: No, score=4.0 required=5.0 
> tests=BAYES_50,RCVD_IN_PBL,RDNS_NONE,
> RELAYCOUNTRY_IT autolearn=no version=3.2.5 country=IT
>
> stopwatch sussex trait
> warmup sporadic
>

first of all, I agree with Benny, don't bounce, it obscures the question
you're asking.

I -assume- you're asking for help catching that kind of "rudimentary
gibberish". Suggestions:

1) botnet plugin (but adjust scoring to work with your mail stream).
2) train your bayes.
3) consider using RBLs at the MTA level, zen/pbl at the MTA level
   would outright block that garbage.

-- 
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Got one!

2009-06-25 Thread David B Funk
On Thu, 25 Jun 2009, Diffenderfer, Randy wrote:

> Seems like it's gonna cost some of the big boys a little coin...
>
> http://detroit.fbi.gov/dojpressrel/pressrel09/de062209.htm
>
> Let's hope there are more indictments where these came from!
>
> rnd

Yes, but Ralsky's been making millions for years, a $1M fine is
chicken feed for him. If they do hit him with real jail-time in
a real slammer (not the fed "country club" big-wig jail) then
it might have a deterrent effect.

-- 
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Theo Van Dinter wrote:


I would comment that plugins should probably skip parts they want to
render that already have rendered text available.


Ah. That's a good idea. Now I'll have to search for a nice way to check 
that. :-)



I can't see how "set_rendered" would help in creating a functioning chain
where one converter could put an arbitrary extracted object (image, pdf,
whatever) where another converter could have a go at it.



If a plugin wants to get image/* parts and do something with the
contents, they can do that already.


Not if the image/* parts are actually inside a document.


If you want to have a plugin do some work on a part's contents, then
store that result and let another plugin pick up and continue doing
other work ...  There's no official method to do that.


I guessed as much. This however is what me and Matus were talking about.


You can store
data as part of the Node object.



But what would be a use case for that?


Matus' example was a Word document that contained a PDF (which might in 
turn contain an image). A plugin that knows how to read Word documents 
could extract the text of the Word document and then use "set_rendered" 
to make that available to SA. It cannot currently extract the PDF and 
make it available to any plugins that know how to read PDFs though.


Matus' idea about chains would be that in this example the plugin 
reading the Word document would store any other objects somehow. In this 
case a PDF. After that, any plugin that knows how to handle PDFs will 
get to look at the PDF and extract text and other stuff from it. In case 
it extracts an image, it would then store it the same way, and any image 
handling plugins would find it.


I really don't know how common that is. I have never seen a Word 
document with a PDF inside it myself.


I have however seen many documents that contain images, and I think it 
would be a good idea to make those images available to things like 
FuzzyOCR and ImageInfo.



Arguably, there could be multiple people developing plugins for
different types, but you'd need some coordination for the
register_method_priority calls to figure out who goes in what order.


For some stuff coordination would be needed, yes. But not for what I'm 
thinking of.


The text extraction plugin I'm working on (which started this) itself 
has simple extractor plugins. These plugins will be able to return 
arbitrary objects as well as text, and my plugin will check the returned 
objects the same way it checks the original message parts. This way, all 
the extractors that are tied into my plugin will be able to extract 
stuff from objects extracted by other extractors. So far so good.
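
The loop described above could look roughly like this. It is a Python
sketch, not the actual plugin: extract_all, toy_extractor, the dict-based
"documents" and the max_depth guard are all assumptions made for the
example.

```python
from collections import deque


def extract_all(parts, extractors, max_depth=5):
    """Run extractors over parts and over anything the extractors return.

    parts:      list of (mime_type, data) tuples
    extractors: mime_type -> callable(data) -> (text, [(mime_type, data), ...])
    """
    texts = []
    queue = deque((mime, data, 0) for mime, data in parts)
    while queue:
        mime, data, depth = queue.popleft()
        extractor = extractors.get(mime)
        if extractor is None or depth > max_depth:
            continue  # nobody handles this type, or nesting is too deep
        text, children = extractor(data)
        if text:
            texts.append(text)
        # Extracted objects are queued and checked like original parts.
        for child_mime, child_data in children:
            queue.append((child_mime, child_data, depth + 1))
    return texts


def toy_extractor(data):
    # Toy "documents" are dicts holding their text and embedded objects.
    return data.get("text", ""), data.get("embedded", [])


EXTRACTORS = {
    "application/msword": toy_extractor,
    "application/pdf": toy_extractor,
}

pdf = {"text": "pdf text", "embedded": [("image/jpeg", {})]}
word_doc = {"text": "word text", "embedded": [("application/pdf", pdf)]}
texts = extract_all([("application/msword", word_doc)], EXTRACTORS)
```

The embedded image is dropped here because no image extractor is
registered, which is exactly the gap for external OCR plugins discussed
next.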


The most common thing to extract apart from text will most likely be 
images. Any OCR text extractor tied into my plugin would get to see 
those images, but any OCR SA plugins run after my plugin won't. It might 
be good to make extracted images available to those, and other image 
handling plugins.


My plugin is called after the message is parsed, which is very good for a 
text extractor. FuzzyOCR (as an example) however works by scoring OCR 
output (which may well be very different from the text in the image as we 
see it), and therefore has to be called at a later stage. The same goes 
for ImageInfo.


It might therefore be a good idea to make the extracted images and other 
objects available to scoring plugins as well.


> (btw: I just found the register_method_priority() method. \o/)

It's nice, isn't it? :-)

I'm using it in my URLRedirect plugin.


Note: Do not try to add or remove parts in the tree.  The tree is
meant to represent the mime structure of the mail, and each node
relates to that specific mime part.  The tree is not meant to be a
temporary data storage mechanism.


Ok. That makes things easier and less easy for me. I know that I'll have 
to implement my own list of stuff to loop through when extractors return 
additional parts in my plugin. That's the easy part.


The difficult part is how to make extracted stuff available to other 
plugins in a way they understand. I see two main ways to do this:


1: Invent a new way. This would require modifications of any plugins 
that should check the extracted objects.


2: Add a container part somewhere that "find_parts" would find, but which 
is not actually a member of the message tree, and then add a simple way 
to add parts to that container. This would require modification of 
Mail::SpamAssassin::Message, but not of the plugins.


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 05:20, Jan P. Kessler wrote:

Henrik K schrieb:
SA is trying to be too supportive for the money it receives. ;-) If you ask
me, just ditch this and all other old baggage for 3.3. If you are not happy,
you are free to keep running 3.2. Some people are even still using 3.1.



Good proposal, imo.


Seconded. If it's useful to drop support for older perl, I have no  
problem with requiring 5.10 for SA 3.3. or 5.10-threaded even.



--




Re: Got one!

2009-06-25 Thread LuKreme

On 25-Jun-2009, at 13:20, David B Funk wrote:

On Thu, 25 Jun 2009, Diffenderfer, Randy wrote:
Seems like it's gonna cost some of the big boys a little coin...


http://detroit.fbi.gov/dojpressrel/pressrel09/de062209.htm

Let's hope there are more indictments where these came from!


Yes, but Ralsky's been making millions for years, a $1M fine is
chicken feed for him. If they do hit him with real jail-time in
a real slammer (not the fed "country club" big-wig jail) then
it might have a deterrent effect.


No it won't. The majority of spammers are immune to US Law.


--
Just give us a kiss to celebrate here, today.



Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Charles Gregory

On Thu, 25 Jun 2009, Arvid Picciani wrote:
I started blocking some backscattering hosts and would like to inform 
them how to fix the issue.


I still welcome suggestions for handling the few remaining cases where my 
procmail chokes on a mailbox limit. Probably more of a PM question than an 
SA question, but seeing how the cause for concern is backscatter from 
'full mailbox' DSN's I'm figuring the answer is here, if anywhere


- C


Re: cas...@snigelpost.org bounces?

2009-06-25 Thread Charles Gregory

On Thu, 25 Jun 2009, Benny Pedersen wrote:

On Thu, June 25, 2009 19:34, John Hardin wrote:

Sure, but that doesn't help anybody else that posts to the list.

it will if admins at remote read their logs, but yes we can only wait now


If they do, they don't act very quickly. I've been rejecting these at my 
SMTP gate since they first appeared.


- C


Re: Plugin extracting text from docs

2009-06-25 Thread Theo Van Dinter
On Thu, Jun 25, 2009 at 3:41 PM, Jonas Eckerman wrote:
> Matus' example was a Word document that contained a PDF (which might in turn
> contain an image). A plugin that knows how to read Word documents could
> extract the text of the Word document and then use "set_rendered" to make
> that available to SA. It cannot currently extract the PDF and make it
> available to any plugins that know how to read PDFs though.

My view would be that if someone is going to try making things so
convoluted such as that, a) we've won because no one is going to go
through the trouble of opening that doc, b) the convolution is a
fingerprint that you could write a rule for and then you don't care
what the content actually is.  For example, you'd render something
like "doc_pdf_jpg", which would make an obvious Bayes token.  In the
same way for a zip file, you could do "zip_pdf zip_jpg zip_txt", etc,
and they'd all be different tokens.
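
A small sketch of how such fingerprint tokens could be generated, in
Python for illustration only. The (kind, children) tuple layout and the
name chain_tokens are hypothetical; only the token shapes ("doc_pdf_jpg",
"zip_pdf zip_jpg zip_txt") come from the paragraph above.

```python
def chain_tokens(node, path=()):
    """Return one token per innermost embedded object.

    node is a (kind, children) tuple, children being a list of nodes.
    A lone part with nothing embedded produces no token.
    """
    kind, children = node
    path = path + (kind,)
    if not children:
        # Only emit a token when there is an actual container chain.
        return ["_".join(path)] if len(path) > 1 else []
    tokens = []
    for child in children:
        tokens.extend(chain_tokens(child, path))
    return tokens


nested_doc = ("doc", [("pdf", [("jpg", [])])])
zip_file = ("zip", [("pdf", []), ("jpg", []), ("txt", [])])

doc_tokens = chain_tokens(nested_doc)
zip_tokens = chain_tokens(zip_file)
rendered_text = " ".join(zip_tokens)  # what would be fed to Bayes as text
```

Joining the tokens with spaces gives a string that could be set as the
rendered text of the part, so each chain becomes its own Bayes token.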

But yes, you're right, the Message/Message::Node stuff wasn't designed
with the idea of supporting multiple independent data objects from a
single mime part.  I can see the argument for "treat embedded files
similar to multipart", but I still lean towards mime structure only.

> For some stuff coordination would be needed, yes. But not for what I'm
> thinking of.

Why not?  If you have no coordination, you would possibly look for
images first, then pdfs, then word docs, and end up not getting
anywhere.  If it's all your plugin, you can configure the order.  If
it's not, you need coordination.  For example, as from above, if
there's a zip file with a doc which has a pdf which has a jpg, and your
plugin doesn't handle zip but another one does ...

> The most common thing to extract apart from text will most likely be images.
> Any OCR text extractor tied into my plugin would get to see those images,
> but any OCR SA plugins run after my plugin won't. It might be good to make
> extracted images available to those, and other image handling plugins.

But yours already ran, so who cares about the others?

Seriously.

If you're expending the resources to OCR the same image in an email
multiple times ...  You clearly either have a lot of hardware or not a
lot of mail.


Re: A difficult one to weed out?

2009-06-25 Thread Roger Marquis

LuKreme wrote:

PostConf http://www.postconf.com for example.


Looks interesting, but no FreeBSD demo :/


Waiting only for a postfix port with an "overwrites-base" option.

The code itself works with any postfix home directory.

Roger Marquis


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread John Rudd
On Thu, Jun 25, 2009 at 10:09, Chris Hoogendyk wrote:
>
> Gone are the days when you totally avoided upgrades because of the time,
> hassle and risk involved.


Time and hassle, maybe.  Risk, no.  Risk is not a binary, it's a
balancing act.  Live updates don't remove risk, they simply alter the
risk balance.  There will always be applications and environments
where risk is high enough that will cause you to wait.

For example, your 2 minutes of downtime... on wall street that could
cost you millions of dollars of stalled or canceled transactions.
(well, not lately, but before the crash...)  So, your CFO will ask
you: is the risk of upgrading vs not upgrading worth a couple million
dollars?  If the upgrade isn't worth it, then they will likely choose
to avoid it.  Like I said "if it isn't broken, don't upgrade", which
translates to "don't upgrade until the cost of not upgrading exceeds
the lost revenue of your outage window".

(and redundant systems may OR MAY NOT mitigate that)


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Yet Another Ninja

On 6/25/2009 11:27 PM, John Rudd wrote:

On Thu, Jun 25, 2009 at 10:09, Chris Hoogendyk wrote:

Gone are the days when you totally avoided upgrades because of the time,
hassle and risk involved.



Time and hassle, maybe.  Risk, no.  Risk is not a binary, it's a
balancing act.  Live updates don't remove risk, they simply alter the
risk balance.  There will always be applications and environments
where risk is high enough that will cause you to wait.

For example, your 2 minutes of downtime... on wall street that could
cost you millions of dollars of stalled or canceled transactions.
(well, not lately, but before the crash...)  So, your CFO will ask
you: is the risk of upgrading vs not upgrading worth a couple million
dollars?  If the upgrade isn't worth it, then they will likely choose
to avoid it.  Like I said "if it isn't broken, don't upgrade", which
translates to "don't upgrade until the cost of not upgrading exceeds
the lost revenue of your outage window".

(and redundant systems may OR MAY NOT mitigate that)


can we get back to Spamassassin and a sane update cycle context? .-)


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Theo Van Dinter
Well, the point is that if it works, don't break it.
Yes, you can totally avoid upgrades, depending on your environment.
Sometimes you have no choice and continue to run old versions of
software or firmware or ...
Get over it. :)

If you want to continue debating system administration issues, there
are several lists to do so (go to sage or lopsa, for example).  The
goal for this thread is to get a sense of how many people are still
running SA on Perl 5.6 and therefore how disruptive would it be to the
user base to require a newer version of Perl for newer versions of SA.


On Thu, Jun 25, 2009 at 5:35 PM, Yet Another Ninja wrote:
> On 6/25/2009 11:27 PM, John Rudd wrote:
>> On Thu, Jun 25, 2009 at 10:09, Chris Hoogendyk
>> wrote:
>>> Gone are the days when you totally avoided upgrades because of the time,
>>> hassle and risk involved.
>>
>> Time and hassle, maybe.  Risk, no.  Risk is not a binary, it's a
>> balancing act.  Live updates don't remove risk, they simply alter the
>> risk balance.  There will always be applications and environments
>> where risk is high enough that will cause you to wait.
> can we get back to Spamassassin and a sane update cycle context? .-)


backscatter (was Re: cas...@snigelpost.org bounces?)

2009-06-25 Thread Arvid Picciani

Charles Gregory wrote:

On Thu, 25 Jun 2009, Arvid Picciani wrote:
I started blocking some backscattering hosts and would like to inform 
them how to fix the issue.


I still welcome suggestions for handling the few remaining cases where 
my procmail chokes on a mailbox limit. Probably more of a PM question 
than an SA question, but seeing how the cause for concern is 
backscatter from 'full mailbox' DSN's I'm figuring the answer is here, 
if anywhere


- C

I didn't exactly understand which of the two possible questions you 
asked (yeah, not a native speaker :/ ) so I'll try both:


1) Your MTA bounces because your users' mailboxes are full.
Defer (temporarily reject) the message at SMTP time, so the sending MTA 
retries a few times and ultimately gives up, informing the REAL sender. 
(You could also reject permanently, if you want that.)
If you absolutely can't fix the MTA, at least check SPF before 
bouncing. If the SPF check doesn't match the sender, don't send a bounce. 
Same for DKIM. Also don't bounce spam.
Note that backscatter can actually get you blacklisted if you bounce to 
traps.


2) You're receiving backscatter and you get "mailbox full" DSNs.
I find it impossible to parse DSNs. There is no standard and it's 
supposed to be human readable.
For now I block mail from postmaster/bounce-*/MAILER-DAEMON/... from 
listed (known misconfigured) hosts. I had to firewall two very 
aggressive hosts though ("normal" hosts!)

This blocks legitimate DSNs so it might not be the solution for everyone.
Backscatter.org is far from complete, so I'm working on a trap. Thanks 
to one of our domains being joe-jobbed (and not receiving legitimate DSNs, 
since we don't use it anymore) I can get around 100 hosts per day listed.
Unfortunately I lack the infrastructure to make it useful for the 
public, and backscatter.org has no report form.
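
The advice in point 1 boils down to a small decision table. Here it is as
a Python sketch; the function name, its parameters and the returned
labels are made up for the example, and a real MTA would express this in
its own configuration rather than in code like this.

```python
def full_mailbox_action(in_smtp_session, spf_result, is_spam):
    """Decide what to do with a message for a mailbox that is full.

    in_smtp_session: True while the sending MTA is still connected
    spf_result:      e.g. "pass", "fail", "none" (as reported by an SPF check)
    is_spam:         True if the content filter flagged the message
    """
    if in_smtp_session:
        # Temporary rejection: the sending MTA retries and, if it gives
        # up, it is the one that informs the real sender.
        return "defer"
    if is_spam or spf_result != "pass":
        # The envelope sender is probably forged, so a bounce would be
        # backscatter to an innocent third party.
        return "no_bounce"
    return "bounce"


decisions = [
    full_mailbox_action(True, "none", False),
    full_mailbox_action(False, "fail", False),
    full_mailbox_action(False, "pass", True),
    full_mailbox_action(False, "pass", False),
]
```

Only the last case, accepted mail that passes SPF and is not spam, ends
in an actual bounce.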


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Chris Hoogendyk



Yet Another Ninja wrote:

On 6/25/2009 11:27 PM, John Rudd wrote:
On Thu, Jun 25, 2009 at 10:09, Chris Hoogendyk wrote:
Gone are the days when you totally avoided upgrades because of the time,
hassle and risk involved.


Time and hassle, maybe.  Risk, no.  Risk is not a binary, it's a
balancing act.  Live updates don't remove risk, they simply alter the
risk balance.  There will always be applications and environments
where risk is high enough that will cause you to wait.

For example, your 2 minutes of downtime... on wall street that could
cost you millions of dollars of stalled or canceled transactions.
(well, not lately, but before the crash...)  So, your CFO will ask
you: is the risk of upgrading vs not upgrading worth a couple million
dollars?  If the upgrade isn't worth it, then they will likely choose
to avoid it.  Like I said "if it isn't broken, don't upgrade", which
translates to "don't upgrade until the cost of not upgrading exceeds
the lost revenue of your outage window".

(and redundant systems may OR MAY NOT mitigate that)


can we get back to Spamassassin and a sane update cycle context? .-) 


nah. I think we should get back to SORBS bites, and so does res, and so 
does so and so, etc. ;-)


actually, my point was that there is not much excuse for not having a 
more up-to-date perl these days, so yeah, go ahead and boot 5.6.x.  If 
there are legacy or OS things that require the older perl, you can 
actually have your cake and eat it too. My Solaris 9 installs still have 
/usr/bin/perl, which is 5.6.1, and the OS stuff from Solaris can still 
use that. I have 5.8.7 in /usr/local/bin/perl on the Solaris 9 systems, 
and SpamAssassin uses that. It's easy to manage $PATH and the #! lines 
of scripts.


So, go for it.


--
---

Chris Hoogendyk

-
  O__   Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~ - University of Massachusetts, Amherst 




--- 


Erdös 4




Re: SORBS bites the dust

2009-06-25 Thread John Rudd
On Thu, Jun 25, 2009 at 14:41, mouss wrote:
> James Wilkinson a écrit :

>> If you mean “IP address that should not have been in the PBL but was”,
>> that’s one thing. It’s a consistent definition, but not very useful for
>> stopping spam.
>>
>
> yes, the PBL may list blocks that contain networks which "want" to send
> mail directly, and which in principle, should be able to do so. but
> whatever decision you take here is difficult. if you say, I will only
> block those who I am certain are criminals, then some criminals will get
> in.

I think part of the point, though, is that the PBL isn't _directly_
about stopping spam.  The PBL is about stopping portions of the
internet from sending email directly to hosts outside of their own
organizations.  The "policy" that is the P in PBL is (someone's)
policy about who should or shouldn't be sending email directly to the
internet at large.

The PBL indirectly fights spam by keeping botnets from being able to
spew to the internet, and creating choke-points in each organization
through which that email will/should flow.  But this is an indirect
result.  There will be plenty of things that the PBL blocks that are
NOT spam, but are also not PBL false positives (in the sense that
"they are listed in the PBL and SHOULD be listed in the PBL, by the
definition of what the PBL says it will list").

People who complain that the PBL is blocking things that aren't spam
kind of don't get the point of the PBL.  The PBL's definition means
that it will block non-spam.  It should also block a lot of spam, but
the fact that it will block ham is not an indictment of the PBL.  It
just means that people who complain about that fact don't understand
the PBL.

(and, people who block or score against PBL addresses in Received
headers, instead of only doing it against direct MTA connections,
probably also don't fully get the PBL)
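
To make the "direct MTA connections only" point concrete, here is a
short Python sketch. DNSBLs are conventionally queried by reversing the
IPv4 octets and appending the list's zone; the zone name
pbl.spamhaus.org, the helper names and the sample addresses are
assumptions for the example, and no DNS lookup is actually performed.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates ip
    return ".".join(reversed(octets)) + "." + zone


def addresses_to_check(relay_ips):
    """relay_ips is ordered newest hop first.

    Only the host that connected to our own MTA is checked against a
    policy list; older hops taken from Received headers are ignored.
    """
    return relay_ips[:1]


# Hypothetical relay path: the connecting host, then an earlier hop.
relays = ["217.202.8.48", "192.0.2.10"]
names = [
    dnsbl_query_name(ip, "pbl.spamhaus.org")
    for ip in addresses_to_check(relays)
]
```

A resolver would then be asked for an A record of each name; a listed
address conventionally answers with something in 127.0.0.x.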

Anyway, my point in reply to you is that it's not a difficult
stand/decision, as long as you understand what you're getting into.
You don't target PBL hosts to block/score spam, you block the PBL
hosts to enforce policies about who submits messages to whom.

If you agree with that policy concept, it's an easy decision (you use it).
If you don't agree with that policy concept, it's an easy decision
(you don't use it).

If you don't understand the policy concept, and you're just trying to
use it to "block spam and not block ham" then the difficulty is that
you're not using the right tool for the task at hand.  That's not a
difficult decision, that's a difficulty understanding the world in
which you operate :-)


Re: SORBS bites the dust

2009-06-25 Thread mouss
James Wilkinson a écrit :
> mouss wrote (about the PBL):
>> stop spreading FUD. if you know of false positives, show us so that we
>> see what you exactly mean.
>>
>> a lot of people, including $self, use the PBL at smtp time.
> 
> As usual, it depends on your definition of “false positive”.
> 

fully agreed.

I personally find it bad to block any "non spamming network". but
sometimes, the only reasonable way to do this is via whitelists, and
unfortunately, you can't whitelist unknown senders. so yes, I do block
some networks because I "think" they are too spammy (they may contain
"legitimate" IPs).

> If you mean “IP address that should not have been in the PBL but was”,
> that’s one thing. It’s a consistent definition, but not very useful for
> stopping spam.
> 
> If you mean “solicited and/or non-bulk email that would have been
> stopped by the PBL”, then I’ve seen a number of small Indian and Chinese
> companies who are unaware of a lot of things, including the existence of
> the PBL and that it’s a Good Thing to send email through a smart host
> with a consistent IP address and reverse DNS.¹
> 

yes, the PBL may list blocks that contain networks which "want" to send
mail directly, and which in principle, should be able to do so. but
whatever decision you take here is difficult. if you say, I will only
block those who I am certain are criminals, then some criminals will get
in.

whether you use them or not, lists that put some pressure on ISPs,
networks, .. are good, and are necessary. some time ago, open relay was
ok. now, you won't hear many people saying "but I want the freedom to
relay... ".

yes, spammers are making us crazy ;-p

> Obviously, everyone’s email stream is different. Mine includes a
> commercially-significant amount of email from small companies in those
> two countries, and probably doesn’t include email from other countries
> where this takes place.
> 

just to make things clear. while I do use zen, my setup is not what one
would call aggressive (I do complain about some networks, but I don't
block them. but I do block snowshoe spammers "too easily"). I do get
"alien" mail from some networks (and not even from Asia!), and while I
have thought of combining checks (x AND y AND z), I found solicited mail
that matches every bad thing I wanted to mix in the rule!

> But by this definition, false positives do occur, and my company’s
> SpamAssassin installation has to try to handle them.
> 
> James.
> 
> ¹ Fortunately, they’re also unaware that signatures should be removed
> when replying. That, a standard corporate signature including company
> registration data, a standard domain in each Message-ID that doesn’t
> appear in public DNS, a few negatively-scored custom rules to detect
> these, and the AWL mean that once someone has responded to one of our
> emails, they get automatically whitelisted. So at least existing
> correspondents don’t get blocked.
> 



Re: Plugin extracting text from docs

2009-06-25 Thread Jonas Eckerman

Theo Van Dinter wrote:


the convolution is a
fingerprint that you could write a rule for and then you don't care
what the content actually is.  For example, you'd render something
like "doc_pdf_jpg", which would make an obvious Bayes token.  In the
same way for a zip file, you could do "zip_pdf zip_jpg zip_txt", etc,
and they'd all be different tokens.


That's really a good idea. Put the chains of extraction in a 
pseudoheader that can be tested in rules and seen as a token by bayes.


I'm putting that in the todo for the plugin.


The most common thing to extract apart from text will most likely be images.
Any OCR text extractor tied into my plugin would get to see those images,
but any OCR SA plugins run after my plugin won't. It might be good to make
extracted images available to those, and other image handling plugins.



But yours already ran, so who cares about the others?


Because they work very differently?

An OCR plugin that adds the rendered text to the message for bayes and 
text rules is very different from one that does its own scoring based 
on the OCRed text.



If you're expending the resources to OCR the same image in an email
multiple times ...  You clearly either have a lot of hardware or not a
lot of mail.


*I* don't use any OCR at all. We don't have the resources for that 
(being a small non-profit NGO), and so far I haven't seen any need for 
OCR either since we never had much image spam slip through anyway.


So I will not implement an OCR extractor for my plugin. I'll leave that 
for others. This is actually one of the reasons I'd like to let existing 
OCR plugins have access to any images extracted by my plugin. So that 
those who already do use OCR can get a benefit from the extraction.


I'm not going to spend much time on it though. I'm happy just extracting 
text. :-) And it does extract text (currently from Word, OpenXML, 
OpenDocument and RTF documents). :-)


I actually hadn't even thought about this image/OCR etc stuff before 
Matus suggested it.


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/


Re: SORBS bites the dust

2009-06-25 Thread jdow

From: "Res" 
Sent: Thursday, 2009/June/25 06:08



On Thu, 25 Jun 2009, rich...@buzzhost.co.uk wrote:



3. The day I give a shit about what an Australian spammer thinks of me,
will be the day hell freezes over.


oh im a spammer now am I, awww poor widdle wicky, go cry to mummy, or tell 
someone who gives a fuck.


And, Res, profanity is the effort of a weak mind to express itself.

Now all of you pull your keyboard's plug.

{^_^} 


