Re: Spam or Not Spam :)

2009-02-13 Thread Matt Kettler
cnone wrote:
> I have some mails that I know are spam, but spamassassin gives scores
> below 5.0 (generally zero) for some of them. I updated the rules and changed
> the score threshold, but spamassassin still sees them as normal emails. Am I
> missing something or is this normal? I have like 1800 emails but it sees
> only 5 of them as spam.
>   
Well, you could start off by training the bayes database to know that
they are spam, making use of the sa-learn tool to do so.


$ man sa-learn


You might also want to make sure none of those messages are matching
ALL_TRUSTED. If any do, then you probably need to configure your
trusted_networks manually. (The trust-path auto-guesser gets confused if
your MX is NATed, or otherwise has a non-routable IP)
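
A quick way to tell whether the auto-guesser is likely to be confused is to check whether the address your MX listens on is non-routable. This is only a rough sketch using Python's standard ipaddress module; the function name and the sample addresses are made up for illustration and are not part of SpamAssassin.

```python
import ipaddress

def mx_is_non_routable(addr: str) -> bool:
    """Return True if addr is private or otherwise not globally routable.

    That is the situation where setting trusted_networks by hand is
    probably worth doing.
    """
    ip = ipaddress.ip_address(addr)
    return not ip.is_global

# Illustrative addresses only, not taken from anyone's real setup.
for addr in ("192.168.1.25", "10.0.0.5", "8.8.8.8"):
    print(addr, mx_is_non_routable(addr))
```

If this reports True for your MX, check the messages for ALL_TRUSTED hits before anything else.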




Spam or Not Spam :)

2009-02-13 Thread cnone

I have some mails that I know are spam, but spamassassin gives scores
below 5.0 (generally zero) for some of them. I updated the rules and changed
the score threshold, but spamassassin still sees them as normal emails. Am I
missing something or is this normal? I have like 1800 emails but it sees
only 5 of them as spam.
-- 
View this message in context: 
http://www.nabble.com/Spam-or-Not-Spam-%3A%29-tp22008849p22008849.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: Two servers, one database. A question

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 18:11 -0500, Kris Deugau wrote:
> I would bet on Bayes/userpref queries being more efficient than the 
> spamc/spamd traffic.

I think we have a consensus here :-)  I didn't get any definitive
answers here but the folks who responded made me think about the problem
a little more intelligently.

Thanks!

-- 
Lindsay Haisley       | "Everything works |  Accredited
FMP Computer Services |   if you let it"  |    by the
512-259-1190          |   (The Roadie)    | Austin Better
http://www.fmp.com    |                   | Business Bureau



Re: Two servers, one database. A question

2009-02-13 Thread Kris Deugau

John Hardin wrote:
> If I may try:
>
> The question is which is better, sending the message body (spamc <->
> spamd traffic) or database queries (spamd <-> mysql traffic) over the
> expensive link?

Yeah, after going back and forth I think I've finally got that.

I would bet on Bayes/userpref queries being more efficient than the
spamc/spamd traffic.


-kgd


Re: Two servers, one database. A question - a correction.

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 16:51 -0600, Lindsay Haisley wrote:
> Scenario 2:  spamc on box A communicates with a _local_ spamd, which
> accesses local config files but uses a MySQL connection _over the
> network_ to box A to access the Bayes/userpref database.

Sorry, this should read:

Scenario 2:  spamc on box A communicates with a _local_ spamd, which
accesses local config files but uses a MySQL connection _over the
network_ to box >>B<< to access the Bayes/userpref database.
-

My bad.




Re: Two servers, one database. A question

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 17:26 -0500, Kris Deugau wrote:
> *nod*  I don't know what kind of data size the Bayes SQL queries run, 
> but it probably averages out somewhere close to an order of magnitude 
> less than the full email.
> 
> I think I misread your original email, and I'm still not sure I 
> understand exactly what your current configuration is, and what you're 
> trying to achieve though.

Currently I have two servers, A and B.  B is the older of the two and
currently hosts _most_ of the mail accounts.  They are functionally
identical boxes.

Currently _both_ are running spamd and _both_ have AWL/Bayes/userpref
database tables on MySQL which are accessed locally and identically by
the spamd instance on each box.

My objective is only to unify the database tables supporting Bayes and
user preferences so that there's only one set of MySQL tables for the
users on both boxes.  Whether this involves the use of two spamd daemons
or one is the question.

Scenario 1:  spamc on box A communicates _over the network_ with spamd
on box B, which uses its _local_ config and Bayes/usrpref database to do
its work.

Scenario 2:  spamc on box A communicates with a _local_ spamd, which
accesses local config files but uses a MySQL connection _over the
network_ to box A to access the Bayes/userpref database.

Sorry if I wasn't entirely clear before.  I hope this clarifies the
choice, which looks at this point as if I'd be better off with #2.
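
To put rough numbers on the two scenarios, here is a back-of-the-envelope sketch. The byte counts are assumptions chosen purely for illustration (nobody in this thread has measured them), and the function names are hypothetical. Scenario 1 counts the message twice because, as noted earlier in the thread, the mail would have to transit the link between the boxes in both directions.

```python
def link_bytes_scenario1(msg_bytes: int) -> int:
    """Scenario 1: spamc sends the whole message to the remote spamd and
    gets the processed message back, so it crosses the link twice."""
    return 2 * msg_bytes

def link_bytes_scenario2(query_bytes: int) -> int:
    """Scenario 2: only the Bayes/userpref SQL traffic crosses the link."""
    return query_bytes

# Assumed figures for illustration only: a 20 kB message and 2 kB of
# SQL traffic per message. Measure your own traffic before deciding.
msg = 20_000
sql = 2_000
print("scenario 1:", link_bytes_scenario1(msg), "bytes per message")
print("scenario 2:", link_bytes_scenario2(sql), "bytes per message")
```

With those assumed figures scenario 2 moves far less data, but the real answer depends on how chatty the SQL traffic turns out to be.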




Re: Two servers, one database. A question

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 14:27 -0800, John Hardin wrote:
> If I may try:
> 
> The question is which is better, sending the message body (spamc <-> spamd 
> traffic) or database queries (spamd <-> mysql traffic) over the expensive 
> link?

Implicit point well made :-)  I think I agree with you.




Re: Two servers, one database. A question

2009-02-13 Thread John Hardin

On Fri, 13 Feb 2009, Kris Deugau wrote:

> > Although I appreciate your advice, my question here is not _whether_ I
> > should do the integration, but which of the two methods of integrating
> > the databases will be most efficient of bandwidth and other resources.
>
> I'm getting confused again.  What components do you
> have running on which systems, and what are you trying to consolidate?

If I may try:

The question is which is better, sending the message body (spamc <-> spamd
traffic) or database queries (spamd <-> mysql traffic) over the expensive
link?


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Insofar as the police deter by their presence, they are very, very
  good. Criminals take great pains not to commit a crime in front of
  them. -- Jeffrey Snyder
---
 9 days until George Washington's 277th Birthday


Re: Two servers, one database. A question

2009-02-13 Thread Kris Deugau

Lindsay Haisley wrote:
> On Fri, 2009-02-13 at 15:24 -0600, Lindsay Haisley wrote:
> > Although I appreciate your advice, my question here is not _whether_ I
> > should do the integration, but which of the two methods of integrating
> > the databases will be most efficient of bandwidth and other resources.
>
> After thinking about it, Kris, I do think you're right about the choice,
> although not for the reasons you gave.  spamc must pass an entire copy
> of each email over the Internet to spamd on the 2nd box.  If I keep the
> SA configurations synchronized between boxes, then the only thing which
> needs to be shared across the Internet is Bayes processing, plus several
> per-user choices as represented in the userpref table.  This _seems_ on
> the face of it more efficient than passing off the entire email traffic,
> which would have to transit the Internet connection between the boxes
> twice.

*nod*  I don't know what kind of data size the Bayes SQL queries run,
but it probably averages out somewhere close to an order of magnitude
less than the full email.

I think I misread your original email, and I'm still not sure I
understand exactly what your current configuration is, and what you're
trying to achieve though.


-kgd


Re: Two servers, one database. A question

2009-02-13 Thread Kris Deugau

Lindsay Haisley wrote:
> I think you misunderstand me.  If spamc on machine A is invoked with -d
>  then spamc will use whatever databases and
> configurations are in effect for spamd on machine B.  This is what the
> -d option is for.  The "actual processing" is done by spamd, whichever
> instance (machine A or B) is addressed by the spamc client, so I do have
> a choice here, and that's what I want to decide on.  spamc is basically
> just a passive client which reads and writes emails and passes off the
> job of spam processing to spamd, wherever it may be.
>
> If spamc on machine B uses its local spamd instance (the same one
> machine A is using) as a server, then the task I'm trying to do is
> accomplished since both machines are ultimately using the same database.

Ah, I think I see what you're asking.

I read that you were asking about whether/how to consolidate two
separate MySQL instances, each serving a local spamd on the same machine,
into a single MySQL instance serving both machines' spamd.

> The current load on what I've defined above as "machine B" is quite
> manageable, and this is the box that's now handling over 90% of traffic
> to probably a couple of hundred mailboxes on the system.  The MySQL
> tables used by SA are at well less than a gig on a box that has close to
> half a TB of drive space on it, and SA has been running there for over a
> year.  The system load avg runs consistently under 1 except when
> cron-initiated maintenance happens.

Ah.  "hardware status == overkill"

> Although I appreciate your advice, my question here is not _whether_ I
> should do the integration, but which of the two methods of integrating
> the databases will be most efficient of bandwidth and other resources.

I'm getting confused again.  What components do
you have running on which systems, and what are you trying to consolidate?


-kgd


Re: Two servers, one database. A question

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 15:24 -0600, Lindsay Haisley wrote:
> Although I appreciate your advice, my question here is not _whether_ I
> should do the integration, but which of the two methods of integrating
> the databases will be most efficient of bandwidth and other resources.

After thinking about it, Kris, I do think you're right about the choice,
although not for the reasons you gave.  spamc must pass an entire copy
of each email over the Internet to spamd on the 2nd box.  If I keep the
SA configurations synchronized between boxes, then the only thing which
needs to be shared across the Internet is Bayes processing, plus several
per-user choices as represented in the userpref table.  This _seems_ on
the face of it more efficient than passing off the entire email traffic,
which would have to transit the Internet connection between the boxes
twice.




Re: Two servers, one database. A question

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 15:21 -0500, Kris Deugau wrote:
> Lindsay Haisley wrote:
> > I have two servers.  Currently they're both running instances of spamd
> > with separate mysql databases, however I'd like to run both instances from
> > the same database on one of the servers. There are two ways to do this:
> > 
> > 1.  I can give the -d option to spamc where it's invoked in the mail
> > system, with the target being spamd on the master spamassassin server
> > via the VPN that connects the two boxes.  spamd is already configured to
> > listen to it.
> 
> Mm, I don't think this does what you're hoping.  spamd on any given 
> system will use the configured database (local or otherwise) - this is 
> **NOT** something the client can request.
> 
>  From man spamc:
> 
> -d host[,host2], --dest=host[,host2]
> In TCP/IP mode, connect to spamd server on given host
> (default: localhost).  Several hosts can be specified
> if separated by commas.
> 
> This only affects which spamd server the client asks to process the 
> message;  it doesn't affect any aspect of the actual processing.

I think you misunderstand me.  If spamc on machine A is invoked with -d
 then spamc will use whatever databases and
configurations are in effect for spamd on machine B.  This is what the
-d option is for.  The "actual processing" is done by spamd, whichever
instance (machine A or B) is addressed by the spamc client, so I do have
a choice here, and that's what I want to decide on.  spamc is basically
just a passive client which reads and writes emails and passes off the
job of spam processing to spamd, wherever it may be.

If spamc on machine B uses its local spamd instance (the same one
machine A is using) as a server, then the task I'm trying to do is
accomplished since both machines are ultimately using the same database.

> > Does anyone with some experience with spamassassin know which of these
> > two approaches would be better?  Which would be fastest?  Which would be
> > most conservative of bandwidth between the boxes?
> 
> A lot depends on the hardware you're using.  If you're trying to squeeze 
> some last bits of performance out of a heavily-loaded system by 
> eliminating the SQL duplication, you'll probably have to tune the spamd 
> instances differently as well (eg, the system running MySQL won't be 
> able to support as many spamd children as the other one).  You haven't 
> said what's in MySQL for SA;  IME anything more than a couple of hundred 
> users suck up too much IO for per-user Bayes and/or AWL (not to mention 
> the staggering disk requirements - even at today's disk prices).

The current load on what I've defined above as "machine B" is quite
manageable, and this is the box that's now handling over 90% of traffic
to probably a couple of hundred mailboxes on the system.  The MySQL
tables used by SA are at well less than a gig on a box that has close to
half a TB of drive space on it, and SA has been running there for over a
year.  The system load avg runs consistently under 1 except when
cron-initiated maintenance happens.

> The cluster I'm doing most of my SA tuning on these days currently has 3 
> machines running spamd, and a fourth running MySQL (and some other 
> unrelated services, otherwise it would run spamd as well).  Each machine 
> has the same SA config pointing to the same database on that fourth 
> machine - but clients don't see this, and can't affect it.
> 
> If the machines are not on the same local Ethernet segment, you're 
> probably better off leaving well enough alone, because any gains you 
> make in eliminating the SQL duplication will be lost waiting for data to 
> move across the network.  Or worse.

My intention here is to optimize administration, both for migration and
for those parts of SA for which I've programmed customer UIs.
Considering the number of checks involved in email by the MTA, what with
top level RBL checking (done by the MTA) and hitting SA twice, I don't
think waiting for one more transaction will be problematic.

Although I appreciate your advice, my question here is not _whether_ I
should do the integration, but which of the two methods of integrating
the databases will be most efficient of bandwidth and other resources.




Re: Last-5-percent tuning

2009-02-13 Thread John Hardin

On Fri, 13 Feb 2009, Lindsay Haisley wrote:

> On Fri, 2009-02-13 at 12:43 -0600, McDonald, Dan wrote:
> > On Fri, 2009-02-13 at 12:20 -0600, Lindsay Haisley wrote:
> > > On Fri, 2009-02-13 at 17:43 +, Martin Gregorie wrote:
> > > > I've heard it said that IPV6 will...
> > > You can always spoof an IP address of any type.  The only email header
> > > you can trust absolutely is the topmost Received header in an email.
> > > This address can't be spoofed.
> >
> > Never say never or always, since never will always get you in trouble...
>
> Oooh, good point :-)  Pigs _may_ someday fly.

Don't taunt the genetic engineers in the audience, please.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The one political issue that strips all politicians bare is
  individual gun rights.
---
 9 days until George Washington's 277th Birthday


Re: URI with spaces are not recognized

2009-02-13 Thread McDonald, Dan
On Fri, 2009-02-13 at 15:43 -0500, Kevin Parris wrote:
> Artificial intelligence will never overcome natural stupidity (or the
> clever ingenuity of criminals) ... if people actually DO that (copy
> the "url" and remove the spaces) there is some temptation to say they
> get what they deserve ... but on the other hand most of the spam/scam
> stuff out there is based on the premise that plenty of people are
> greedy, gullible, uninformed, overly trusting, stupid, or some
> combination of the above.


Whether they are clickable or not, they are still annoying. My hands are
only good for about 50,000 clicks per day, I don't want to waste any of
those on individual spams


-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com





Re: URI with spaces are not recognized

2009-02-13 Thread Wolfgang Zeikat
I think the discussion is getting carried in a direction where we are 
missing a point: spam detection.


Kevin Parris wrote:
> Artificial intelligence will never overcome natural stupidity (or the
> clever ingenuity of criminals) ... if people actually DO that (copy
> the "url" and remove the spaces) there is some temptation to say they
> get what they deserve ... but on the other hand most of the spam/scam
> stuff out there is based on the premise that plenty of people are
> greedy, gullible, uninformed, overly trusting, stupid, or some
> combination of the above.
>
> >>> Franz Schwartau  02/13/09 2:18 PM >>>
> > You won't solve a problem by defining there is no problem.
> >
> > In these spams people are requested to remove the spaces when
> > entering the given string ("url") in their browser.


IMHO, the point here is:
how can these obfuscated URI be detected as such and be submitted to 
URI(BL) rules, so that those mails can more easily be classified as what 
they are: spam - no matter what final recipients might "deserve" or do 
with them (or not).


Regards,

wolfgang




Re: Spamassassin not working after upgrade

2009-02-13 Thread Karsten Bräckelmann
> :0wf
> | /usr/bin/spamassassin

If there is even the slightest chance for a mail surge -- you probably
should add a lock file to that recipe.  (Not to mention using spamc
again, which you appear to already have switched to. ;)


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Spamassassin not working after upgrade

2009-02-13 Thread Karsten Bräckelmann
> This seems to have been SELinux related.  When I temporarily disable it,
> procmail is able to execute spamc and properly filter incoming messages. 
> Thanks for the suggestion.  This is a huge relief!

Ah, goodie. :)  Please file a bug with RH against SELinux, for both
permission denied issues (spamassassin and spamc) when called from
procmail.


> Karsten Bräckelmann wrote:
> >> procmail: Executing "/usr/bin/spamassassin"
> >> /bin/sh: /usr/bin/spamassassin: Permission denied
> >> procmail: Program failure (126) of "/usr/bin/spamassassin"
> >> procmail: Rescue of unfiltered data succeeded
> > 
> > RHEL5. Any chance this problem is SELinux related?




Re: URI with spaces are not recognized

2009-02-13 Thread Kevin Parris
Artificial intelligence will never overcome natural stupidity (or the clever 
ingenuity of criminals) ... if people actually DO that (copy the "url" and 
remove the spaces) there is some temptation to say they get what they deserve 
... but on the other hand most of the spam/scam stuff out there is based on the 
premise that plenty of people are greedy, gullible, uninformed, overly 
trusting, stupid, or some combination of the above.

>>> Franz Schwartau  02/13/09 2:18 PM >>>
C'mon...

Patient: "Doctor, if I press down here it really hurts..."
Doctor: "Don't press there then."

You won't solve a problem by defining there is no problem.

In these spams people are requested to remove the spaces when entering the 
given string ("url") in their browser.

Benny Pedersen wrote:
> On Thu, February 12, 2009 18:26, Franz Schwartau wrote:
>> www . abcdef .  net
>>
>> After reading the source for a while I found that $schemelessRE in
>> line 1720 of Mail::SpamAssassin::PerMsgStatus.pm seems to be
>> responsible for that. Unfortunately this regexp doesn't care
>> about whitespaces.
> 
> give me a url to a browser that can show above url is simple :)
> 
> even my firefox in my nokia phone wont show this, did i miss another
> one ?
> 
>> Has anyone a solution?
> 
> none so far have a problem ?
> 
>> Would be fine if I could use the "uri" directive
>> or even some uribl on this kind of "urls".
> 
> it will if there was a problem




Re: Last-5-percent tuning

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 12:43 -0600, McDonald, Dan wrote:
> On Fri, 2009-02-13 at 12:20 -0600, Lindsay Haisley wrote:
> > On Fri, 2009-02-13 at 17:43 +, Martin Gregorie wrote:
> > > I've heard it said that IPV6 will...
> > You can always spoof an IP address of any type.  The only email header
> > you can trust absolutely is the topmost Received header in an email.
> > This address can't be spoofed.  
> 
> Never say never or always, since never will always get you in trouble...

Oooh, good point :-)  Pigs _may_ someday fly.

> > If it were, it would have been
> > technically impossible to send the email.
> 
> It might be hard to spoof, but not impossible if you are able to
> intercept the data path somewhere along the way.  Otherwise, there would
> be no reason to block bogons...

You can block a bogon, but you can't carry on an IP dialog using it
because by definition a bogon is an IP packet claiming to be from an
un-allocated IP address.  If an SMTP request comes in to your server
with a bogus originating address then there's no way to carry on an SMTP
exchange with the client on the other end, and hence no email.  QED.
DoS packets frequently use bogus origination addresses but these aren't
intended to establish two-way communication.

Yes, you can intercept the path and re-originate the IP traffic, which
is what firewalls often do, but in this case the originating IP address
is indeed a true address, and if the traffic is malicious, then said
address is implicated, either through intent or technical compromise
(hacked!).




Re: Two servers, one database. A question

2009-02-13 Thread Kris Deugau

Lindsay Haisley wrote:
> I have two servers.  Currently they're both running instances of spamd
> with separate mysql databases, however I'd like to run both instances from
> the same database on one of the servers. There are two ways to do this:
>
> 1.  I can give the -d option to spamc where it's invoked in the mail
> system, with the target being spamd on the master spamassassin server
> via the VPN that connects the two boxes.  spamd is already configured to
> listen to it.

Mm, I don't think this does what you're hoping.  spamd on any given
system will use the configured database (local or otherwise) - this is
**NOT** something the client can request.

From man spamc:

   -d host[,host2], --dest=host[,host2]
   In TCP/IP mode, connect to spamd server on given host
   (default: localhost).  Several hosts can be specified
   if separated by commas.

This only affects which spamd server the client asks to process the
message;  it doesn't affect any aspect of the actual processing.

> 2.  I can let spamc invoke spamd on the local system but set the various
> dsn params in secrets.cf to point to the MySQL database on the master
> spamassassin server.  The mysql server on this box is already listening
> for queries from the other system via the VPN that connects them.

If all you're looking to do is use a single MySQL instance, then this is
your only choice.

> Does anyone with some experience with spamassassin know which of these
> two approaches would be better?  Which would be fastest?  Which would be
> most conservative of bandwidth between the boxes?

A lot depends on the hardware you're using.  If you're trying to squeeze
some last bits of performance out of a heavily-loaded system by
eliminating the SQL duplication, you'll probably have to tune the spamd
instances differently as well (eg, the system running MySQL won't be
able to support as many spamd children as the other one).  You haven't
said what's in MySQL for SA;  IME anything more than a couple of hundred
users suck up too much IO for per-user Bayes and/or AWL (not to mention
the staggering disk requirements - even at today's disk prices).

The cluster I'm doing most of my SA tuning on these days currently has 3
machines running spamd, and a fourth running MySQL (and some other
unrelated services, otherwise it would run spamd as well).  Each machine
has the same SA config pointing to the same database on that fourth
machine - but clients don't see this, and can't affect it.

If the machines are not on the same local Ethernet segment, you're
probably better off leaving well enough alone, because any gains you
make in eliminating the SQL duplication will be lost waiting for data to
move across the network.  Or worse.


-kgd


Re: URI with spaces are not recognized

2009-02-13 Thread John Hardin

On Fri, 13 Feb 2009, McDonald, Dan wrote:

> On Fri, 2009-02-13 at 11:55 -0800, John Hardin wrote:
> > On Fri, 13 Feb 2009, Franz Schwartau wrote:
> > > So, does anyone know a more general solution for this kind of spam
> > > instead of individual body rules?
> >
> > You might try a rule like:
> >
> >   body URI_SPC_OBFU_SPC /\bwww\s{1,20}\.\s{1,20}\w{5,20}\s{1,20}\.\s{1,20}net\b/i
>
> I'd go a little further:
>
> /\bwww\s{1,10}\.\s{1,10}\w{5,20}\s{1,10}\.\s{1,10}(?:com|net|org)\b/i

Well, yeah, that's of course possible. Only Franz knows the character of
the domain TLDs he's seeing, though.

info and biz are two other much-abused TLDs.
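
For anyone who wants to try the pattern against sample text before putting it in a rule, here is a small sketch that builds the same regex from a list of TLDs. It uses Python's re module rather than Perl, on the assumption that \b, \s and \w behave the same way for this pattern; the helper name and the sample strings are made up for illustration.

```python
import re

# TLDs mentioned in this thread; extend the list to suit your own spam.
TLDS = ["com", "net", "org", "info", "biz"]

def build_obfu_pattern(tlds):
    """Build the spaced-out 'www . domain . tld' pattern for the given TLDs."""
    alternation = "|".join(re.escape(t) for t in tlds)
    return re.compile(
        r"\bwww\s{1,10}\.\s{1,10}\w{5,20}\s{1,10}\.\s{1,10}(?:"
        + alternation
        + r")\b",
        re.IGNORECASE,
    )

pattern = build_obfu_pattern(TLDS)

# The obfuscated form should hit, the ordinary form should not.
print(bool(pattern.search("visit www . abcdef .  biz today")))
print(bool(pattern.search("visit www.abcdef.biz today")))
```

The ordinary form is deliberately not matched, since the normal URI parsing already handles that case.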

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Public Education: the bureaucratic process of replacing
  an empty mind with a closed one.  -- Thorax
---
 9 days until George Washington's 277th Birthday


Re: URI with spaces are not recognized

2009-02-13 Thread McDonald, Dan
On Fri, 2009-02-13 at 11:55 -0800, John Hardin wrote:
> On Fri, 13 Feb 2009, Benny Pedersen wrote:
> 
> > So, does anyone know a more general solution for this kind of spam
> > instead of individual body rules?
> 
> You might try a rule like:
> 
>   body URI_SPC_OBFU_SPC /\bwww\s{1,20}\.\s{1,20}\w{5,20}\s{1,20}\.\s{1,20}net\b/i

I'd go a little further:

/\bwww\s{1,10}\.\s{1,10}\w{5,20}\s{1,10}\.\s{1,10}(?:com|net|org)\b/i






Re: URI with spaces are not recognized

2009-02-13 Thread John Hardin

On Fri, 13 Feb 2009, Benny Pedersen wrote:

> On Fri, February 13, 2009 18:12, John Hardin wrote:
> > If a URI rule works, what's wrong with a body rule?
>
> nothing wrong making bad rules either, point is that if bad rules
> is needed one have also bad behaving browser problem

Why should the fact that a mail client won't render that URI as a
clickable link mean there shouldn't be a rule for it? Spammers have been
obfuscating URIs in this manner for a long time. There's nothing wrong
with rules for obfuscated URIs.

OT: Benny, could you refrain from setting your Reply-To to the email
address of the original poster? Setting it to the mailing list address is
fine, but setting it to the original poster is just passive-aggressive
rudeness.

On Fri, 13 Feb 2009, Franz Schwartau wrote:

> So, does anyone know a more general solution for this kind of spam
> instead of individual body rules?


You might try a rule like:

 body URI_SPC_OBFU_SPC /\bwww\s{1,20}\.\s{1,20}\w{5,20}\s{1,20}\.\s{1,20}net\b/i

I think it would be risky to make the URI parser attempt too much 
deobfuscation; however, accepting \s+\.\s+ as \. might be justified. 
Perhaps \s+dot\s+ as well.


If the spammer uses something more complex they're reducing the likelihood 
the recipient will bother to deobfuscate the URI, and it's more likely to 
be caught by bayes, so I'd suggest the ROI to SA for making it more 
aggressive isn't large enough.
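
To make the limited deobfuscation idea concrete, here is a rough sketch of what "accept \s+\.\s+ as \." (and \s+dot\s+ likewise) would do to a message body. It is written in Python purely as an illustration; it is not how the SA URI parser works, and the function name is hypothetical.

```python
import re

# Whitespace-padded dots, or the word "dot" set off by whitespace.
_SPACED_DOT = re.compile(r"\s+(?:\.|dot)\s+", re.IGNORECASE)

def deobfuscate(text: str) -> str:
    """Collapse ' . ' and ' dot ' into a plain '.' so that the result
    could be handed to ordinary URI matching."""
    return _SPACED_DOT.sub(".", text)

print(deobfuscate("www . abcdef .  net"))      # → www.abcdef.net
print(deobfuscate("www dot abcdef dot net"))   # → www.abcdef.net
```

Note that this also turns "the dot product" into "the.product", which is the sort of collateral damage that makes anything more aggressive risky.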


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Windows Vista: Windows ME for the XP generation.
---
 9 days until George Washington's 277th Birthday


Re: URI with spaces are not recognized

2009-02-13 Thread Benny Pedersen

On Fri, February 13, 2009 20:18, Franz Schwartau wrote:
> C'mon...

france

> Patient: "Doctor, if I press down here it really hurts..."
> Doctor: "Don't press there then."

thats real life, not email

> You won't solve a problem by defining there is no problem.

where is the problem ?, 40 cm from the screen or so ?

> In these spams people are requested to remove the spaces when
> entering the given string ("url") in their browser.

such users ask for problems :)

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: URI with spaces are not recognized

2009-02-13 Thread Franz Schwartau
C'mon...

Patient: "Doctor, if I press down here it really hurts..."
Doctor: "Don't press there then."

You won't solve a problem by defining there is no problem.

In these spams people are requested to remove the spaces when entering
the given string ("url") in their browser.

Benny Pedersen wrote:
> On Thu, February 12, 2009 18:26, Franz Schwartau wrote:
>> www . abcdef .  net
>>
>> After reading the source for a while I found that $schemelessRE in
>> line 1720 of Mail::SpamAssassin::PerMsgStatus.pm seems to be
>> responsible for that. Unfortunately this regexp doesn't care
>> about whitespaces.
> 
> give me a url to a browser that can show above url is simple :)
> 
> even my firefox in my nokia phone wont show this, did i miss another
> one ?
> 
>> Has anyone a solution?
> 
> none so far have a problem ?
> 
>> Would be fine if I could use the "uri" directive
>> or even some uribl on this kind of "urls".
> 
> it will if there was a problem



Re: URI with spaces are not recognized

2009-02-13 Thread Franz Schwartau
Hi John,

thanks for your answer. Probably I should have written more about my
problem.

We're getting a lot of spam with obfuscated urls in the form

www . domain .  net

The domain part changes quite often (about daily). The number of domains
is nearly 100 by now. Of course we have body rules for each domain/url
similar to your rule but our time to detect new domains/urls is too slow
(actually our customer has to tell us, that spam got through, which is
quite bad). All these "urls" point to the same content, resolve to the
same ip and are listed in some url black lists. Since spamassassin
doesn't recognize these obfuscated urls, url specific rules don't match.

So, does anyone know a more general solution for this kind of spam
instead of individual body rules?

Best regards
Franz

John Hardin wrote:
> On Fri, 13 Feb 2009, Benny Pedersen wrote:
> 
>> On Thu, February 12, 2009 18:26, Franz Schwartau wrote:
>>> www . abcdef .  net
>>> Would be fine if I could use the "uri" directive
> 
> If a URI rule works, what's wrong with a body rule?
> 
> body URI_SPC_OBFU_nn /\bwww\s{1,20}\.\s{1,20}abcdef\s{1,20}\.\s{1,20}net\b/i
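
One possible direction for a more general match, sketched in Python rather than as a SpamAssassin rule: replace the fixed domain in the quoted pattern with a generic label and TLD, and collapse each hit back into a normal host name that could then be checked against a blacklist. The names SPACED_HOST and find_spaced_hosts are made up for this illustration, and the length limits on the label and TLD are assumptions, not anything taken from SpamAssassin.

```python
import re

# Same shape as the per-domain rule quoted above, but with the domain label
# and the TLD generalised. Whitespace is required around both dots, so an
# ordinary "www.example.com" is deliberately not matched.
SPACED_HOST = re.compile(
    r"\bwww\s{1,20}\.\s{1,20}([a-z0-9-]{1,63})\s{1,20}\.\s{1,20}([a-z]{2,6})\b",
    re.IGNORECASE,
)


def find_spaced_hosts(body):
    """Return the de-obfuscated host names found in a message body."""
    return [
        "www.%s.%s" % (m.group(1).lower(), m.group(2).lower())
        for m in SPACED_HOST.finditer(body)
    ]


print(find_spaced_hosts("please visit www . abcdef .  net today"))
print(find_spaced_hosts("a normal link: www.example.com"))
```

Whether a pattern this broad is safe as a scoring rule depends on how often legitimate mail contains spaced-out host names, so it would need testing against real ham before use.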



Re: Last-5-percent tuning

2009-02-13 Thread McDonald, Dan
On Fri, 2009-02-13 at 12:20 -0600, Lindsay Haisley wrote:
> On Fri, 2009-02-13 at 17:43 +, Martin Gregorie wrote:
> > I've heard it said that IPV6 will...
> You can always spoof an IP address of any type.  The only email header
> you can trust absolutely is the topmost Received header in an email.
> This address can't be spoofed.  

Never say never or always, since never will always get you in trouble...

> If it were, it would have been
> technically impossible to send the email.

It might be hard to spoof, but not impossible if you are able to
intercept the data path somewhere along the way.  Otherwise, there would
be no reason to block bogons...


-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com





Re: Last-5-percent tuning

2009-02-13 Thread Kurt Buff
IPv6 will not banish NAT. It's too useful for other purposes.

On Fri, Feb 13, 2009 at 9:43 AM, Martin Gregorie  wrote:
> On Fri, 2009-02-13 at 18:01 +0100, Benny Pedersen wrote:
>> On Thu, February 12, 2009 19:29, John Hardin wrote:
>> > Ultimately that's what you have to do. The only way to automatically
>> > filter 100% of spam is to unplug your MTA from the 'net.
>>
>> unless one implement policyd to whitelist known senders and greylist
>> the rest and or whois sender ip and or sender domain, shame its not
>> pr recipient anywhere, in a perfect world there was no spam then
>>
> I've heard it said that IPV6 will put paid to privacy for
> whistle-blowers etc because, with that fully implemented, NAT will
> vanish and all IPs will be unique. By implication they'd be unspoofable,
> though I'm not sure I believe that. However, if that's true it will also
> leave the spammers out in the open.
>
> Martin
>
>
>


Re: URI with spaces are not recognized

2009-02-13 Thread Benny Pedersen

On Fri, February 13, 2009 18:12, John Hardin wrote:
> If a URI rule works, what's wrong with a body rule?

nothing wrong with making bad rules either, point is that if bad rules
are needed one also has a badly behaving browser problem

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: Last-5-percent tuning

2009-02-13 Thread Lindsay Haisley
On Fri, 2009-02-13 at 17:43 +, Martin Gregorie wrote:
> I've heard it said that IPV6 will put paid to privacy for
> whistle-blowers etc because, with that fully implemented, NAT will
> vanish and all IPs will be unique.

Mail servers, of necessity, _do_ use unique IPs, whether v4 or v6.  

>  By implication they'd be unspoofable,
> though I'm not sure I believe that.

If you want to learn more about IPv6, I suggest "IPv6 Essentials" by
Silvia Hagen, pub. by O'Reilly & Assoc.

You can always spoof an IP address of any type.  The only email header
you can trust absolutely is the topmost Received header in an email.
This address can't be spoofed.  If it were, it would have been
technically impossible to send the email.
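
As a rough illustration of what "the topmost Received header" means in practice, the sketch below pulls the bracketed IP address out of the first Received header of a message using Python's standard email module. The function name and the sample message are invented for this example (the addresses come from the reserved documentation ranges), and it assumes the topmost header was added by your own MTA and uses the common "[a.b.c.d]" IPv4 notation.

```python
import re
from email import message_from_string


def topmost_received_ip(raw_message):
    """Return the first bracketed IPv4 address in the topmost Received header, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # Headers are kept in message order, so index 0 is the most recently added one.
    match = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", received[0])
    return match.group(1) if match else None


# Made-up sample: the lower Received header could have been written by anyone.
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [192.0.2.10])\n"
    "\tby mx.example.org with ESMTP; Fri, 13 Feb 2009 12:20:00 -0600\n"
    "Received: from forged.example.com (unknown [203.0.113.5])\n"
    "\tby mail.example.net with SMTP; Fri, 13 Feb 2009 12:19:58 -0600\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

print(topmost_received_ip(SAMPLE))
```

Everything below the first header is only as reliable as the hosts that wrote it, which is why the sketch ignores the second Received line.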

-- 
Lindsay Haisley   | "Everything works|Accredited
FMP Computer Services |   if you let it" |  by the
512-259-1190  |(The Roadie)  |   Austin Better
http://www.fmp.com|  |  Business Bureau



Re: Last-5-percent tuning

2009-02-13 Thread Martin Gregorie
On Fri, 2009-02-13 at 18:01 +0100, Benny Pedersen wrote:
> On Thu, February 12, 2009 19:29, John Hardin wrote:
> > Ultimately that's what you have to do. The only way to automatically
> > filter 100% of spam is to unplug your MTA from the 'net.
> 
> unless one implement policyd to whitelist known senders and greylist
> the rest and or whois sender ip and or sender domain, shame its not
> pr recipient anywhere, in a perfect world there was no spam then
> 
I've heard it said that IPV6 will put paid to privacy for
whistle-blowers etc because, with that fully implemented, NAT will
vanish and all IPs will be unique. By implication they'd be unspoofable,
though I'm not sure I believe that. However, if that's true it will also
leave the spammers out in the open.

Martin




Re: URI with spaces are not recognized

2009-02-13 Thread John Hardin

On Fri, 13 Feb 2009, Benny Pedersen wrote:

> On Thu, February 12, 2009 18:26, Franz Schwartau wrote:
>> www . abcdef .  net
>> Would be fine if I could use the "uri" directive

If a URI rule works, what's wrong with a body rule?

body URI_SPC_OBFU_nn /\bwww\s{1,20}\.\s{1,20}abcdef\s{1,20}\.\s{1,20}net\b/i

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The fetters imposed on liberty at home have ever been forged out
  of the weapons provided for defense against real, pretended, or
  imaginary dangers from abroad.   -- James Madison, 1799
---
 9 days until George Washington's 277th Birthday


Re: Last-5-percent tuning

2009-02-13 Thread Benny Pedersen

On Thu, February 12, 2009 19:29, John Hardin wrote:
> Ultimately that's what you have to do. The only way to automatically
> filter 100% of spam is to unplug your MTA from the 'net.

unless one implement policyd to whitelist known senders and greylist
the rest and or whois sender ip and or sender domain, shame its not
pr recipient anywhere, in a perfect world there was no spam then

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: URI with spaces are not recognized

2009-02-13 Thread Benny Pedersen

On Thu, February 12, 2009 18:26, Franz Schwartau wrote:
> www . abcdef .  net
>
> After reading the source for a while I found that $schemelessRE in
> line 1720 of Mail::SpamAssassin::PerMsgStatus.pm seems to be
> responsible for that. Unfortunally this regexp doesn't care
> about whitespaces.

give me a url to a browser that can show above url is simple :)

even my firefox in my nokia phone wont show this, did i miss another
one ?

> Has anyone a solution?

none so far have a problem ?

> Would be fine if I could use the "uri" directive
> or even some uribl on this kind of "urls".

it will if there was a problem

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: Spamassassin not working after upgrade

2009-02-13 Thread nycsurf

This seems to have been SELinux related.  When I temporarily disable it,
procmail is able to execute spamc and properly filter incoming messages. 
Thanks for the suggestion.  This is a huge relief!

Best,
Greg




Karsten Bräckelmann-2 wrote:
> 
>> I recently upgraded to spamassassin-3.2.5-1.el5 using up2date and
>> spamassassin is no longer filtering messages. Spamassassin correctly
>> identifies the sample spam message when I do
> [...]
>> I've googled extensively to see if anyone else is having this problem and
>> what possible solutions might be, but nothing that I've tried (changing
>> config files, restarting spamd, etc.) has worked.
> 
> Uhm, according to your procmail logs below, you are not using spamd
> anyway. I do however strongly recommend to do so -- that is, in procmail
> use spamc instead of 'spamassassin'.
> 
> This will result in less load on the server and faster mail processing,
> since spamassassin doesn't have to be started for each mail. The spamd
> daemon needs to be running for that.  (Yes, this isn't related to the
> issue at hand.)
> 
> 
>> Here is the relevant part of the log file for a sample email after
>> turning
>> the verbose option on in .procmailrc:
> 
>> procmail: Executing "/usr/bin/spamassassin"
>> /bin/sh: /usr/bin/spamassassin: Permission denied
>> procmail: Program failure (126) of "/usr/bin/spamassassin"
>> procmail: Rescue of unfiltered data succeeded
> 
> RHEL5. Any chance this problem is SELinux related?
> 
> 
> -- 
> char
> *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
> main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i c<<=1:
> (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0;
> }}}
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Spamassassin-not-working-after-upgrade-tp21982029p21999350.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Cyrillic charsets normalization

2009-02-13 Thread Makoev Alan
There was recently a discussion of the "charset normalization" feature (see e.g.
http://markmail.org/message/hvdtbca6lm5tsjtm?q=list:org.apache.spamassassin.users+date:200901+&page=42).
I ran a simple check of the results that the Encode::Detect::Detector facility yields.
I manually selected a set of 39 spam messages in Russian (those that were not
MIME-encoded, so I could check them by just pressing F3 in mc): 32 in KOI8-R,
6 in CP-1251 and 1 in UTF-8. I then ran a simple script that feeds each message
body to Encode::Detect::Detector::detect, and got the following:
- among the 6 CP-1251 messages, 1 was detected as Mac-Cyrillic (which might be
pardonable when rendering text for humans, since these encodings differ in only 2
letters, but it may negatively affect text analysis results) and 1 was not
recognized at all (Encode::Detect::Detector::detect returned "undef");
- among the 32 KOI8-R messages, 3 were detected as CP-1255 (Hebrew);
- the 1 UTF-8 message was detected correctly.
Of course, this set is by no means representative, but it illustrates possible
drawbacks of using the "normalize_charset" option.
Strictly speaking, one could expect such a result, since the tricks widely used by
spammers (replacing Cyrillic letters with similar-looking Latin ones, replacing
digits with letters that look similar to digits and vice versa, adding random
letter sequences to poison Bayes, etc.) should affect the detection result.
Despite that, SA ignores the "charset=" parameter in the "Content-type:" header
field. So my question is: is this just due to a shortage of developer time, or are
there reasons for not using the charset indicated in the header field
as the source charset for normalization?
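
For comparison, trusting the declared charset instead of guessing it could look roughly like the sketch below. It is written in Python with the standard email module, not in SpamAssassin's Perl, and is only meant to show the idea; decode_with_declared_charset and the sample message are made up for this illustration, and it assumes a single-part message.

```python
from email import message_from_bytes


def decode_with_declared_charset(raw_bytes, fallback="utf-8"):
    """Decode a single-part message body using the charset from Content-Type.

    Hypothetical helper: falls back to `fallback` when no charset is declared
    or when Python does not know the declared name.
    """
    msg = message_from_bytes(raw_bytes)
    charset = msg.get_content_charset() or fallback
    payload = msg.get_payload(decode=True) or b""  # None for multipart messages
    try:
        return charset, payload.decode(charset, errors="replace")
    except LookupError:
        return fallback, payload.decode(fallback, errors="replace")


# Made-up sample: a KOI8-R body with a matching charset declaration.
RAW = (
    b"Content-Type: text/plain; charset=koi8-r\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + "Привет\n".encode("koi8-r")

charset, text = decode_with_declared_charset(RAW)
print(charset, text.strip())
```

The obvious weakness is that the declaration itself is under the sender's control, so a wrong or deliberately misleading "charset=" value would be decoded just as faithfully as a correct one.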



Re: Two servers, one database. A question

2009-02-13 Thread Andre


On Thu, 12 Feb 2009, Lindsay Haisley wrote:

> I have two servers.  Currently they're both running instances of spamd
> with separate mysql databases, however I'd like run both instances from
> the same database on one of the servers. There are two ways to do this:
>
> 1.  I can give the -d option to spamc where it's invoked in the mail
> system, with the target being spamd on the master spamassassin server
> via the VPN that connects the two boxes.  spamd is already configured to
> listen to it.

I'd prefer the above for the following reason: you only need to worry
about a single spamassassin server (as long as it can hold up to the
load). You prevent inconsistencies when upgrading etc.

>
> 2.  I can let spamc invoke spamd on the local system but set the various
> dsn params in secrets.cf to point to the MySQL database on the master
> spamassassin server.  The mysql server on this box is already listening
> for queries from the other system via the VPN that connects them.
>
> Does anyone with some experience with spamassassin know which of these
> two approaches would be better?  Which would be fastest?  Which would be
> most conservative of bandwidth between the boxes?

'Fastest' depends on the load on the servers.
Bandwidth will depend on how large your average message is, and what you
store in the database (user prefs, awl, bayes...)

-andre

>
> --
> Lindsay Haisley   | "Everything works|Accredited
> FMP Computer Services |   if you let it" |  by the
> 512-259-1190  |(The Roadie)  |   Austin Better
> http://www.fmp.com|  |  Business Bureau
>