Bayes using Redis backend

2013-06-19 Thread Axb
Is there anybody using SA's bayes with the Redis backend? If yes, please raise your hands. (I know of three, you can keep your hand down :) An imminent SA RC release will require a DB format change. Questions & comments welcome.

Re: New rule for HTML spam, using comments?

2013-06-19 Thread cepheid
Hi John, See the following example: http://pastebin.com/DAYJ7NnJ Lots of style gibberish for sure, but it failed to hit your rule (sa-update ran at 4am today so it should have picked up anything published). I'm guessing it's the parentheses. Whack the mole! =)

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Axb
On 06/19/2013 10:11 PM, ceph...@3phase.com wrote: Hi John, See the following example: http://pastebin.com/DAYJ7NnJ Lots of style gibberish for sure, but it failed to hit your rule (sa-update ran at 4am today so it should have picked up anything published). I'm guessing it's the parentheses.

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir Caspi
Another, nearly identical example I saw today, but which used trailing slashes (/ or //) instead of parentheses. http://pastebin.com/6XRwcjm3 Enjoy. =) --- Amir On Wed, June 19, 2013 2:11 pm, ceph...@3phase.com wrote: > Hi John, > > See the follo

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir Caspi
On Wed, June 19, 2013 2:33 pm, Axb wrote: > imo, it makes little sense to write rules to catch these hashbusters. As If the rule is sufficiently broad, it will catch them. If the rule is so strict that it catches only one trailing slash or something, then yes, it makes little sense... but I think

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Axb
On 06/19/2013 10:54 PM, Amir Caspi wrote: Perhaps SA should include a module/plugin to "unmunge" MailScanner munging? Has anyone written one, or if not, would anyone like to? ;-) (Since MailScanner is open-source perl, I imagine it should be relatively straightforward to find the munging code, w

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir 'CG' Caspi
On Wed, June 19, 2013 3:14 pm, Axb wrote: > iirc, MailScanner munges the URL before SA sees it so unless your plugin > idea involves a crystal ball, it's not possible. Yes, MailScanner gets to it before SA does, unless SA is called from within MailScanner (which it isn't, on my setup, but that is a

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Axb
On 06/19/2013 11:30 PM, Amir 'CG' Caspi wrote: Yes, MailScanner gets to it before SA does, unless SA is called from within MailScanner (which it isn't, on my setup, but that is a possible setup). However, the complete original URL is still contained within the munged one. It's in the alt attri

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir 'CG' Caspi
On Wed, June 19, 2013 3:47 pm, Axb wrote: > SA's URIBL plugin doesn't and shouldn't look in the alt attribute. Why not, exactly? I wouldn't look at it for _all_ img tags, only for ones that are clearly MailScanner-munged. That is, one would look for the patterns that MailScanner uses for munging
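
The idea in this message (read the alt attribute only on img tags that look MailScanner-munged, not on every img tag) might be sketched roughly as below. This is an illustration in Python, not SpamAssassin plugin code and not anything posted in the thread. The thread does not show what a munged tag actually looks like, so `MUNGE_MARKER` is a hypothetical placeholder pattern, and `MungedImgAltExtractor` and `extract_munged_urls` are made-up names.

```python
import re
from html.parser import HTMLParser

# HYPOTHETICAL marker: the thread never shows MailScanner's real munging
# pattern, so this regex is a stand-in that would need to be replaced.
MUNGE_MARKER = re.compile(r"mailscanner", re.IGNORECASE)

# Loose URL matcher, used only to pull candidate URLs out of alt text.
URL_IN_ALT = re.compile(r"https?://[^\s\"'<>]+")


class MungedImgAltExtractor(HTMLParser):
    """Collect URLs from alt attributes of img tags that match the marker."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # Attribute values can be None (e.g. <img alt>), so normalise them.
        attr_map = {name: (value or "") for name, value in attrs}
        # Assumption: a munged tag carries the marker in some attribute value.
        if not any(MUNGE_MARKER.search(v) for v in attr_map.values()):
            return  # ordinary img tag: its alt text is deliberately ignored
        self.urls.extend(URL_IN_ALT.findall(attr_map.get("alt", "")))


def extract_munged_urls(html):
    """Return URLs found in alt text of img tags that look munged."""
    parser = MungedImgAltExtractor()
    parser.feed(html)
    parser.close()
    return parser.urls
```

Gating on the marker first is what keeps this narrow: alt text on ordinary images is never inspected, which is the distinction the message draws. Whether the recovered URLs should then be passed to URIBL checks is the open question in the thread, and this sketch takes no position on it.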