[SAtalk] Re: Bigevil and thoughts....

2004-01-30 Thread Scott A Crosby
On Fri, 23 Jan 2004 12:30:13 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > I received a report of an FP in bigevil. The domain was > playaudiomessage.com. A quick google shows tons of hits in > news.admin.net-abuse.sightings. It had been my hope the bigevil > would be ZERO fp. However I'm no

[SAtalk] Re: Bigevil and thoughts....

2004-01-30 Thread Scott A Crosby
On Thu, 29 Jan 2004 14:44:36 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > > I'm not saying that the domain should be forgotten, but that it > > should at least be in a different list. > > > > 'Bigevil.cf' -- never once seen in ham. > > 'Maybeevil.cf' -- a small number of hits in ham > >

[SAtalk] Re: [RD] spammer reactions to antidrug (humorous)

2004-01-30 Thread Scott A Crosby
On Fri, 30 Jan 2004 10:55:07 -0500, Matt Kettler <[EMAIL PROTECTED]> writes: > Today I got an interesting form of obfuscation, apparently to avoid > antidrug.cf. > > > I'm not sure whether to bother with adding rules for this, or be > satisfied that the obfuscations are so severe that the message

[SAtalk] Re: Can someone explain this?

2004-01-30 Thread Scott A Crosby
On Fri, 30 Jan 2004 10:42:31 -0600, "Chris Barnes" <[EMAIL PROTECTED]> writes: > I'm confused. A spam message got through and had this in the header: > > > X-Spam-Status: No, hits=5.0 required=5.0 > tests=HTML_60_70,HTML_IMAGE_ONLY_04, > HTML_MESSAGE,HTML_WEB_BUGS,LOCAL_PERLMX_TAG_80,MSGID_FRO

[SAtalk] Re: This spam scores too low

2004-01-21 Thread Scott A Crosby
On 21 Jan 2004 12:13:40 -0600, Scott A Crosby <[EMAIL PROTECTED]> writes: > On Wed, 21 Jan 2004 12:57:55 +0100, Ralf Vitasek <[EMAIL PROTECTED]> writes: > > > Hi Jürgen! > > > > you need some rules for SA which can detect obfuscated spellings of > >

[SAtalk] Re: This spam scores too low

2004-01-21 Thread Scott A Crosby
On Wed, 21 Jan 2004 12:57:55 +0100, Ralf Vitasek <[EMAIL PROTECTED]> writes: > Hi Jürgen! > > you need some rules for SA which can detect obfuscated spellings of > those keywords like vagira, cilais a.s.o. > > here's a sample rule I normally use for such words > > body MY_OBF1 > /((?!*censored*)

[SAtalk] Re: More obfuscation

2004-01-20 Thread Scott A Crosby
On Tue, 20 Jan 2004 16:37:27 -0500 (EST), Charles Gregory <[EMAIL PROTECTED]> writes: > I'm starting to see mail with TEXT obfuscation, such as: >I heard you need viagrPa. > Note the capital P thrown in to our favorite 'v' word. > It is really beginning to look like we need a genuine spelling

[SAtalk] V-drug spam gets *0* hits on SA 2.55

2004-01-20 Thread Scott A Crosby
Read it and weep. :( Next question, how was it sent? The Received headers look relatively legit, so was this sent from a trojaned AOL user? I have *got* to implement that fuzzy matching algorithm. Scott --- Begin Message --- pronounce, How Vigras works. And you can better understand, what Vig

[SAtalk] Re: Matching a list of strings quickly.

2004-01-20 Thread Scott A Crosby
On Mon, 19 Jan 2004 22:47:07 -0800, "Mitch (WebCob)" <[EMAIL PROTECTED]> writes: > Question - your from doesn't match your to in the final example - right? Yes. I thought that pasting in a 300-line excerpt would be counterproductive. Scott ---

[SAtalk] Matching a list of strings quickly.

2004-01-19 Thread Scott A Crosby
A few weeks ago I described a technique to automatically convert a list of strings into a factored regexp for faster matching. You know, from foobat foobang fooziit to foo(bat|bang|ziit) Well, I've got a prototype complete and available here: http://www.cs.rice.edu/~scrosby/datami
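
The factoring idea in the excerpt above (foobat, foobang, fooziit sharing the prefix foo) can be illustrated with a small trie. This is only a sketch of the general technique, not the prototype from the post, whose code is not shown here; the function names are made up for illustration. It factors shared prefixes at every level, so its output nests more deeply than the one-level example in the message.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the empty-string key marks end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def trie_to_regex(node):
    """Emit a regexp fragment matching every suffix stored under this node."""
    branches = [
        re.escape(ch) + trie_to_regex(child)
        for ch, child in sorted(node.items())
        if ch != ""
    ]
    can_end = "" in node
    if not branches:
        return ""
    if len(branches) == 1 and not can_end:
        return branches[0]
    group = "(?:" + "|".join(branches) + ")"
    # If a word may stop here, everything after this point is optional.
    return group + "?" if can_end else group


def factor(words):
    """Hypothetical helper: list of literal strings -> one factored regexp."""
    return trie_to_regex(build_trie(words))


factored = factor(["foobat", "foobang", "fooziit"])
print(factored)  # foo(?:ba(?:ng|t)|ziit)
```

Because alternatives only branch where the strings actually differ, a backtracking matcher fails early on a non-matching prefix instead of retrying every literal in turn.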

[SAtalk] Re: Looking for comments on this rule: EMAIL in URL

2004-01-18 Thread Scott A Crosby
On Sun, 18 Jan 2004 23:51:00 -0500, Tim B <[EMAIL PROTECTED]> writes: > ack just shoot my copy and paste cleanup. > > uri MY_EMAILINURL_1/https?:([EMAIL PROTECTED])/i This can be subject to a mild denial of service attack. You probably mean to use '[EMAIL PROTECTED]' and '[^.]' instead of '.'

[SAtalk] Re: common patterns / improving bigevil

2004-01-18 Thread Scott A Crosby
On Sun, 18 Jan 2004 17:41:00 +0100, PieterB <[EMAIL PROTECTED]> writes: > Hi, > > I have an idea, similar to Scott A Crosby's datamining application. > I didn't use a datamining/analysis program, but used the Bayes > database. For example if you use: > > sa-learn --dump all | grep "^0\.999

[SAtalk] A new automatic tool for finding common patterns in spam

2004-01-18 Thread Scott A Crosby
I'm putting up a demo/prototype of some new techniques I'm building for datamining and analysis. This tool scans two large corpora of 500 MB or more of email to identify any substrings that occur frequently in one but infrequently in the other. You can choose the limits for 'frequently' and 'infrequ

[SAtalk] Re: Ann: "Rules De Jour": An automated way to keep up with the latest rulesets

2004-01-17 Thread Scott A Crosby
On Sat, 17 Jan 2004 10:15:02 -0700, [EMAIL PROTECTED] (Bob Proulx) writes: > Chris Thielen wrote: > > "Rules De Jour": An automated way to keep up with the latest rulesets. > > http://www.exit0.us/index.php/RulesDeJour > > # Get latest SpamAssassin rules. Runs at 4:28AM every day. > 28 4 * *

[SAtalk] Re: Spam Collecting

2004-01-16 Thread Scott A Crosby
On Fri, 16 Jan 2004 08:51:34 -0800, cube <[EMAIL PROTECTED]> writes: > Does anyone have a good way of collecting ham for the bayesian > filters. I can collect spam quite easily, but mixed in with my ham > is all kinds of spam. (There is a buttload of spam with less hits > than 1.) I manually cl

[SAtalk] Re: Korean Spam

2004-01-16 Thread Scott A Crosby
On Fri, 16 Jan 2004 16:42:03 +0100, jean-christophe valiere <[EMAIL PROTECTED]> writes: > Hi, > > Does anybody receive Korean spam or, more generally, Asian spam? > One of my customers receives about 60 Asian spam messages per day and around > 10 of them are not stopped by spamassassi

[SAtalk] Re: New HTML spam body obfuscation.

2004-01-13 Thread Scott A Crosby
On Tue, 13 Jan 2004 16:02:36 -0600, "Dallas L. Engelken" <[EMAIL PROTECTED]> writes: > > body JAVASCRIPT_ENCODING_1 /\b(?:\d{1,3}[\s\,]+){8}/ > > describe JAVASCRIPT_ENCODING_1 Contains comma seperated > > ascii representations score 0.1 # you can score this by > > itself if you want. > > >

[SAtalk] Eurika? A baysean model for dealing with bayes poison.

2004-01-08 Thread Scott A Crosby
On Thu, 08 Jan 2004 20:50:17 -0800, Chris Petersen <[EMAIL PROTECTED]> writes: > > Is there a > > SpamAssassin fix for it or some test I can increase to fix it?? > > Check the "lots of random words" thread from today for a couple of > (probably short-term) solutions... Other than that, I'm hopi

[SAtalk] Re: Making bigevil faster by finding common prefixes

2004-01-07 Thread Scott A Crosby
On Wed, 7 Jan 2004 11:03:35 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > > Are you having trouble doing the conversion automatically? > > Yup ;) > > > I can > > describe the algorithm to transform the regexps and to find > > maximum-size prefixes if you (or someone else) wants to > > imp

[SAtalk] Re: Spell Checking the Subject Header (RESULTS)

2004-01-07 Thread Scott A Crosby
On Tue, 6 Jan 2004 20:34:13 -0800, Robert Menschel <[EMAIL PROTECTED]> writes: > I have just updated my masscheck script, so future reports should look > more like: > > score RM_u_UnsubscribePHP3.000 # Dec 2003; 218s/0h of 81383 corpus > > (65609s/15774h) Thanks! Scott --

[SAtalk] Re: Spell Checking the Subject Header (RESULTS)

2004-01-06 Thread Scott A Crosby
On Tue, 30 Dec 2003 13:48:17 -0600, "Dallas L. Engelken" <[EMAIL PROTECTED]> writes: > > # SUBJ_SPELLING_00 -- 2283s/1850h of 10971 corpus, 2003-12-30 > # > This doesn't tell

[SAtalk] Making bigevil faster by finding common prefixes

2004-01-06 Thread Scott A Crosby
On Wed, 24 Dec 2003 10:59:50 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > Updated from last few days. Rules 20-23 have been played with a little. > Attempting to make the ruleset faster. I have some issues with doing the > rules this way, so I'm testing them out. Are you having trouble do

[SAtalk] Re: Bigevil 2.05d posted and regex question....

2003-12-27 Thread Scott A Crosby
On Mon, 22 Dec 2003 15:16:34 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > Updated from this weekends spam. That one Guy selling the Vdrug had about 8 > more domains. > > If I work the regex even further so it reads: > > (?:domain1|domain2|domain3)\.com > > rather then: > > (?:domain1\.

[SAtalk] Re: We have big evil now we need big good...

2003-12-20 Thread Scott A Crosby
On Sat, 20 Dec 2003 09:52:00 -0800, "Gary Smith" <[EMAIL PROTECTED]> writes: > So we implemented SA some time ago because our clients were getting > too much spam. Lately we have found that several html marked up > emails have been getting marked as spam. These ones are clearly > fp's. > > Some

[SAtalk] Re: checking outgoing mail

2003-12-20 Thread Scott A Crosby
On Sat, 20 Dec 2003 12:51:20 -0600, David Gibbs <[EMAIL PROTECTED]> writes: > Jeff Koch wrote: > > Good grief. What a 'holier than thou' attitude. > > Not in the slightest ... you didn't mention you had customers that might > be spammers (I won't touch that). > > Based on your original post, it

[SAtalk] Re: Clever spam (first of many, I'm afraid...)

2003-12-15 Thread Scott A Crosby
On Mon, 15 Dec 2003 14:38:59 -0600, Brad Koehn <[EMAIL PROTECTED]> writes: > Any spammer worth his salt runs his message through SA and other > popular anti-spam tools as best he can. Most of SA is relatively > static and slow to respond to changes in message content. The problem > comes in a few

[SAtalk] Re: RD: "justified" HTML

2003-12-15 Thread Scott A Crosby
's a Perl efficiency thing. If you are confident that your rule > works OK, you may want to change it slightly to avoid the braces. Look > in the archives for recent postings from Scott A Crosby. > <http://search.gmane.org/search.php?group=gmane.mail.spam.spamassassin.general&que

[SAtalk] Re: Clever spam (first of many, I'm afraid...)

2003-12-14 Thread Scott A Crosby
On Sun, 14 Dec 2003 14:23:21 -0500 (EST), "Carl R. Friend" <[EMAIL PROTECTED]> writes: >I've got a prototype eval() that looks at the incidence of > what I tentatively call "smallwords" (the "glue" words like "is", > "a", "the", "and", &c.) that hold the English language together > and flags a

[SAtalk] Re: Clever spam (first of many, I'm afraid...)

2003-12-14 Thread Scott A Crosby
On 14 Dec 2003 12:47:58 -0500, Rubin Bennett <[EMAIL PROTECTED]> writes: > And, has anyone given any thought to working the SA engine up in C or > something faster than Perl? I've seen many issues with system resources > and SA, and the answer keeps coming back as one of two responses: It is not

[SAtalk] Re: [RD] raw/rare/folded/plain/alphed body/subject rende ring streams

2003-12-11 Thread Scott A Crosby
On Thu, 11 Dec 2003 08:42:16 -0800, "Gary Funck" <[EMAIL PROTECTED]> writes: > > > > > One implementation might be to convert the rewrite rules into an > > > equivalent flex description, and let flex generate the automaton in > > > C. Compile the C, and build a Perl binding to it. > > Scott repl

[SAtalk] Re: Detecting strings of Gibberish

2003-12-11 Thread Scott A Crosby
On Thu, 11 Dec 2003 09:49:48 -0600, Larry Starr <[EMAIL PROTECTED]> writes: > I have noticed that many spam emails end with several lines of gibberish, > such as: > > lvwpdfobv qkviylqr qlmwacbc hpimhdty > mdmrkb lvivhdc xovwul wpcxeqj > lhaxomaje vrucjj ybxegs > > > Has any

[SAtalk] Re: [RD] raw/rare/folded/plain/alphed body/subject rende ring streams

2003-12-11 Thread Scott A Crosby
On Thu, 11 Dec 2003 07:31:10 -0800, "Gary Funck" <[EMAIL PROTECTED]> writes: > > The major catch with this particular implementation is that it cannot > > deal with nondeterministic transformations. What this means is that > > any consequent for a substitute rule must be a single character. ( '4

[SAtalk] Re: [RD] raw/rare/folded/plain/alphed body/subject rende ring streams

2003-12-11 Thread Scott A Crosby
On 11 Dec 2003 08:11:43 +0200, [EMAIL PROTECTED] writes: > Getting back on topic, the problem with a stepwise normalization of > the message is that you sort of assume that transformations are > applied consistently and mechanically. What would be really neat would > be to have an automaton which

[SAtalk] Re: Habeas test

2003-12-09 Thread Scott A Crosby
On Tue, 09 Dec 2003 10:18:35 -0800, Kelson Vibber <[EMAIL PROTECTED]> writes: > Charles Gregory wrote: > > There is nothing technically 'magical' about the Habeas > > headers. They could simply be faked. Habeas Corp says that they make > > use of copyright laws to pursue legal action against spamm

[SAtalk] Re: Generic V-whatever drug with no GV rule hits (fwd)

2003-12-09 Thread Scott A Crosby
On Mon, 8 Dec 2003 18:25:49 -0600 (CST), David B Funk <[EMAIL PROTECTED]> writes: > > > ie: V...i..a.gr..a > > > > > > As I suggested in my email, there's lots of combinations that spammers > > > can do to avoid the original rule. There's also lots of ways to > > > construct the rule to get a broa

[SAtalk] Re: Generic V-whatever drug with no GV rule hits (fwd)

2003-12-08 Thread Scott A Crosby
On Mon, 08 Dec 2003 16:43:15 -0500, Matt Kettler <[EMAIL PROTECTED]> writes: > At 04:33 PM 12/8/2003, David B Funk wrote: > >Small enhancement suggestion, modify each one of those '\W' with '?' > >thus making successive obfuscating characters optional. With your > >rule there -must- be an obfuscat

[SAtalk] Re: New spammer trick (HTML tables)?

2003-12-08 Thread Scott A Crosby
On Mon, 8 Dec 2003 10:23:25 -0500, Pedro Sam <[EMAIL PROTECTED]> writes: > > Personally, I think the fundamental problem is HTML. HTML is too > > powerful a display language to be filtered, and that's before > > JavaScript is added into the mix. Just look at the URL above. Almost > > all of thos

[SAtalk] Re: New spammer trick (HTML tables)?

2003-12-08 Thread Scott A Crosby
On 08 Dec 2003 11:00:22 +0200, [EMAIL PROTECTED] writes: > On 06 Dec 2003 17:21:54 -0600, Scott A Crosby <[EMAIL PROTECTED]> > posted to spamassassin-talk: > It would be good to have a rule to match the general pattern. It's > probably too much work to generate that s

[SAtalk] Re: New spammer trick (HTML tables)?

2003-12-06 Thread Scott A Crosby
On Sat, 6 Dec 2003 22:04:15 + (GMT), Martin Radford <[EMAIL PROTECTED]> writes: > Hi all, > > I don't know how new this trick is, but I've not seen it before -- the > spammer is using HTML tables to break up the message content. Also, > most of the interesting words are mis-spelled. It does

[SAtalk] Re: Simplifying BigEvilList rules

2003-12-05 Thread Scott A Crosby
On Fri, 5 Dec 2003 10:21:31 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > Basically you see the rules now in Alpha order. This is because I cat >> all > my lists together for the last few months, sorted, and ran uniq. My scripts > for writing the rules work with 2 formats: > > 1 domain per

[SAtalk] Re: What is this? Bayes poison?

2003-12-05 Thread Scott A Crosby
On Fri, 05 Dec 2003 02:07:01 -0500, Bryan Hoover <[EMAIL PROTECTED]> writes: > Buy hard to get V-a-l-i-u-m, [EMAIL PROTECTED], P.r.o.z.a.c and much more > on.line!! Look for: V[. /_,-]*a[. /_,-]*l[. /_,-]*i[. /_,-]*u[. /_,-]*m P[. /_,-]*r[. /_,-]*o[. /_,-]*z[. /_,-]*a[. /_,-]*c F[. /_,
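
The patterns suggested above all have the same shape: the letters of the word with an optional run of separator characters between each pair. A minimal sketch of generating them follows. The separator class is copied from the message; the helper name and the case-insensitive flag are assumptions made for illustration.

```python
import re

# Separator class exactly as written in the message above.
SEPARATORS = r"[. /_,-]*"


def obfuscation_pattern(word):
    """Hypothetical helper: allow any run of separators between letters."""
    return SEPARATORS.join(re.escape(ch) for ch in word)


valium = re.compile(obfuscation_pattern("Valium"), re.IGNORECASE)
prozac = re.compile(obfuscation_pattern("Prozac"), re.IGNORECASE)

sample = "Buy hard to get V-a-l-i-u-m, P.r.o.z.a.c and much more"
print(bool(valium.search(sample)), bool(prozac.search(sample)))
```

Since every separator run is optional, these patterns also match the plain, unobfuscated spelling; a rule meant to fire only on obfuscated forms would need an extra check for that.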

[SAtalk] Re: Simplifying BigEvilList rules

2003-12-04 Thread Scott A Crosby
On Thu, 04 Dec 2003 16:21:14 -0800, Greg Webster <[EMAIL PROTECTED]> writes: > Excellent. I am in agreement. > > I've sent a raw list of all the urls in the rules to Chris Santerre with > a promise that once I find some time I'll write up some perl code to > clean up and form rules out of them. >

[SAtalk] Re: Simplifying BigEvilList rules

2003-12-04 Thread Scott A Crosby
On Thu, 04 Dec 2003 11:43:30 -0800, Greg Webster <[EMAIL PROTECTED]> writes: > Seems like it would be much better to simplify and shorten these rules > with better regexp. > > Samples: > rawbody BigEvilList_22 > /\b(?:agnitum\.com|ahamembership\.com|aicpa-eca\.org|aic > pa\.org|aih01\.com|ai\.h

[SAtalk] Re: BIG HUGE EVIL RULE NEWS!!!!

2003-12-04 Thread Scott A Crosby
On Thu, 4 Dec 2003 16:39:35 -0500, Chris Santerre <[EMAIL PROTECTED]> writes: > These are definitely in SPAM and HAM. Same goes for the xmr3.com domain I > was talking about earlier. So Now I have a problem. When I said I wanted > zero FPs, I didn't foresee the fact the spammers were using some of

[SAtalk] Re: LOL - ascii art spam

2003-12-03 Thread Scott A Crosby
On Wed, 3 Dec 2003 09:18:10 -0200, Marcio Merlone <[EMAIL PROTECTED]> writes: > On Tue, 2 Dec 2003 21:34:19 -0600 > "Yackley, Matt" <[EMAIL PROTECTED]> wrote: > > > Damn are we back in the good old BBS days? :) > > > > I received this one today and damn near fell off my chair laughing. > > >

[SAtalk] Re: gibberish hook?

2003-11-27 Thread Scott A Crosby
On 27 Nov 2003 10:06:46 +0200, [EMAIL PROTECTED] writes: > On 27 Nov 2003 01:13:04 -0600, Scott A Crosby <[EMAIL PROTECTED]> > posted to spamassassin-devel and spamassassin-talk: > > On Wed, 26 Nov 2003 14:17:30 +0600, Alexander Litvinov > > <[EMAIL PROTECTED]>

[SAtalk] Re: gibberish hook?

2003-11-27 Thread Scott A Crosby
On Wed, 26 Nov 2003 14:17:30 +0600, Alexander Litvinov <[EMAIL PROTECTED]> writes: > > Solution is to learn a monogram, bigram and trigram character model > > for the ham you receive. Mix the statistics together (to account for > > partial information) and that'll be very good at detecting gibberi
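
One way to read the quoted suggestion is as an interpolated character n-gram model: count 1-, 2- and 3-character sequences in known ham, mix the three conditional probabilities, and score new text by its average log-probability per character. The sketch below is an interpretation, not code from the thread; the class name, the mixing weights, the floor value and the tiny training text are all assumptions chosen for illustration.

```python
import math
from collections import Counter


class CharModel:
    """Interpolated 1-, 2- and 3-gram character model (illustrative only)."""

    def __init__(self, weights=(0.1, 0.3, 0.6), floor=1e-6):
        self.weights = weights     # assumed mixing weights, not from the thread
        self.floor = floor         # keeps log() defined for unseen characters
        self.grams = Counter()     # counts of each 1-, 2- and 3-character string
        self.contexts = Counter()  # counts of each 0-, 1- and 2-character prefix

    def train(self, text):
        text = text.lower()
        for end in range(1, len(text) + 1):
            for n in (1, 2, 3):
                if end >= n:
                    gram = text[end - n:end]
                    self.grams[gram] += 1
                    self.contexts[gram[:-1]] += 1

    def score(self, text):
        """Average log-probability per character; lower looks more like gibberish."""
        text = text.lower()
        if not text:
            return 0.0
        total = 0.0
        for end in range(1, len(text) + 1):
            p = self.floor
            for n, weight in zip((1, 2, 3), self.weights):
                if end >= n:
                    gram = text[end - n:end]
                    seen = self.contexts[gram[:-1]]
                    if seen:
                        # P(last char | previous n-1 chars), weighted into the mix.
                        p += weight * self.grams[gram] / seen
            total += math.log(p)
        return total / len(text)


model = CharModel()
model.train(
    "the quick report is attached and the meeting is at noon. "
    "please send me the notes when you have time and let me know "
    "if the schedule works for you. thanks for the help with the server."
)
ham_score = model.score("please let me know when the report is ready")
gibberish_score = model.score("lvwpdfobv qkviylqr qlmwacbc hpimhdty")
print(ham_score > gibberish_score)
```

Mixing the three orders means a trigram that was never seen still gets partial credit from its bigram and single-character statistics, which is the "partial information" point in the quoted text. A real deployment would train on a large ham corpus and pick a threshold empirically.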

[SAtalk] Re: Re[2]: New to Spamassassin

2003-11-26 Thread Scott A Crosby
On Wed, 26 Nov 2003 16:02:05 -0800, Robert Menschel <[EMAIL PROTECTED]> writes: > Before you go through the pain of intensive Bayes learning with all that > transferring around (and work needed to keep full headers and such), you > may want to just let Bayes auto-learn work. > > Widen the auto-le

[SAtalk] Re: A faster and more scalable matching engine.

2003-11-26 Thread Scott A Crosby
On Wed, 26 Nov 2003 16:39:24 -0500, Pedro Sam <[EMAIL PROTECTED]> writes: > On November 26, 2003 12:58 pm, Kris Deugau wrote: > > Which is next to useless for even 30-40 rules.  You would have to have > > n![1] states for n rules- there is NO way to determine which individual > > rules will trigge

[SAtalk] Re: A faster and more scalable matching engine.

2003-11-26 Thread Scott A Crosby
On Wed, 26 Nov 2003 12:58:19 -0500, Kris Deugau <[EMAIL PROTECTED]> writes: > Pedro Sam wrote: > > hehe, we can give a state to each possible combination of HITS for the > > rules. So if rules 1, 3, 5, 7 hit, we give that a state, and if 2, 4, > > 6, 8 hit, we give it another state, and so on...

[SAtalk] Re: A faster and more scalable matching engine.

2003-11-25 Thread Scott A Crosby
On Tue, 25 Nov 2003 23:04:03 -0500, Pedro Sam <[EMAIL PROTECTED]> writes: > On November 25, 2003 10:31 pm, Alexander Litvinov wrote: > > Heh, it seems it would be nice to make SA scan messages faster. If I > > understand your idea correctly, you want not to run regexp one by one, but > > write th

[SAtalk] Re: A faster and more scalable matching engine.

2003-11-25 Thread Scott A Crosby
On Wed, 26 Nov 2003 09:31:36 +0600, Alexander Litvinov <[EMAIL PROTECTED]> writes: > Heh, it seems it would be nice to make SA scan messages faster. If I > understand your idea correctly, you want not to run regexp one by > one, but write the state machine for all regexps and walk on this > states

[SAtalk] Re: An Open Letter to the SA-talk forum

2003-11-25 Thread Scott A Crosby
To everyone here, give the guy a break. Logan is being honest. He did screw up. It wasn't fair to compare the classification accuracy of a program almost a year old with ones that get updated weekly. It also might be a bit unfair to claim that there is no support or updates. I'm sure a check for a

[SAtalk] Re: A faster and more scalable matching engine.

2003-11-25 Thread Scott A Crosby
On Tue, 25 Nov 2003 13:11:10 -0500, Roger Merchberger <[EMAIL PROTECTED]> writes: > At 03:27 11/25/2003 -0600, Scott A Crosby wrote: > >An automaton-based regexp engine is one that can compile a set of > >regular expressions down into an automaton, then run the automaton. The

[SAtalk] A faster and more scalable matching engine.

2003-11-25 Thread Scott A Crosby
I mentioned this about a year ago, but now that people are starting to write rulesets with hundreds to thousands of new rules, I thought I'd bring it up again. How happy are people with the performance of SA, especially with all of the new rules? The reason I ask is that I'm on-again, off-again w

[SAtalk] Re: What level to delete at?

2003-11-25 Thread Scott A Crosby
On Mon, 24 Nov 2003 22:07:56 -0500, "Greg Cirino - Cirelle Enterprises" <[EMAIL PROTECTED]> writes: > Scott wrote: > > | Having a single spam folder is a very bad decision > > | I never delete automatically, but the high catagory gets a 10 second > | glance every week, medium gets 20 seconds ev

[SAtalk] Re: What level to delete at?

2003-11-24 Thread Scott A Crosby
On Mon, 24 Nov 2003 16:58:19 -0500, Matt Chapman <[EMAIL PROTECTED]> writes: > Hello, > > I have been deleting at a score of 5 via Mimedefang. I notice that > some spam is scoring at 3.5 and 4ish. Is is better to tag at say > 3-4.9 and delete if it is any higher? > What are some of the setups

[SAtalk] Re: Sanity checking new uri rules?

2003-11-18 Thread Scott A Crosby
On Mon, 17 Nov 2003 13:22:49 -0500 (EST), William Stearns <[EMAIL PROTECTED]> writes: > So if I read you correctly, adding 4800 rules essentially triples > the cpu time needed to process a given message or collection of messages. > Are there ways to improve the performance of the ch

[SAtalk] Re: Re[2]: Sanity checking new uri rules?

2003-11-18 Thread Scott A Crosby
On Mon, 17 Nov 2003 19:19:00 -0800, Robert Menschel <[EMAIL PROTECTED]> writes: > Possibility 1: combine rules. If you can combine 10 tests into a single > rule, > > uri rulename /(?:spammer1|spammer2|s3|s4|s5|s6|s7|s8|s9|s10)\.com/i > then you'll have only 480 rules, not 4800. I don't know if th
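
The "combine rules" suggestion quoted above amounts to grouping names into batches and emitting one alternation per batch. A minimal sketch follows, assuming all names share the `.com` suffix as in the quoted example; the function name and the `LOCAL_SPAM_URI` rule prefix are made up for illustration. Only the rule shape `uri NAME /(?:a|b|...)\.com/i` comes from the message.

```python
def combine_uri_rules(names, per_rule=10, prefix="LOCAL_SPAM_URI"):
    """Hypothetical helper: batch domain names into combined uri rules."""
    rules = []
    for start in range(0, len(names), per_rule):
        chunk = names[start:start + per_rule]
        # Escape literal dots so a name such as "ai.h" cannot match "aixh".
        alternation = "|".join(name.replace(".", r"\.") for name in chunk)
        index = start // per_rule + 1
        rules.append(f"uri {prefix}_{index} /(?:{alternation})\\.com/i")
    return rules


names = [f"spammer{i}" for i in range(1, 4801)]
rules = combine_uri_rules(names)
print(len(rules))  # 480
print(rules[0])
```

With 4800 names and ten per rule this yields the 480 rules mentioned in the quoted text. Whether that actually reduces scan time depends on the regexp engine, which is the question the thread goes on to discuss.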

[SAtalk] Re: holy cow, FN city

2003-10-08 Thread Scott A Crosby
On 08 Oct 2003 17:34:44 -0700, Daniel Quinlan <[EMAIL PROTECTED]> writes: > Scott A Crosby <[EMAIL PROTECTED]> writes: > > > Sure. The goal of that is to add in new tokens that are unique and > > have never been seen before. Those can bias an email toward neutr

[SAtalk] Re: holy cow, FN city

2003-10-08 Thread Scott A Crosby
On Wed, 8 Oct 2003 08:34:46 -0700 (PDT), [EMAIL PROTECTED] writes: > Wow... 10 false negatives this morning. =/ > > Is 2.60's bayes really a lot better than 2.55's? > Here's an example of a FN that came through this morning: > Notice the gobbledygook text at the end - Sure. The goal of that i

[SAtalk] Re: How to detect *only* obfuscated strings?

2003-08-05 Thread Scott A Crosby
On Sat, 2 Aug 2003 13:29:54 -0700, "Gary Funck" <[EMAIL PROTECTED]> writes: > Simple example: > > body REMOVE_OBFUSCATE > /(Rem(o|0)ve|Delete).{0,10}y(o|0)ur.{0,10}(e[-]?mai(l|1)|address)/i > describe REMOVE_OBFUSCATE Remove y0ur e-mail > > Above, this pattern will match (among other thing

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-28 Thread Scott A Crosby
On Mon, 28 Jul 2003 14:54:08 -0700, [EMAIL PROTECTED] (Justin Mason) writes: > >Also, forcing the victim to burn a second for every 2kb is still > >interesting. There's nothing that keeps the attacker from repeating > >this sort of thing every paragraph, so a 60kb email takes >30 seconds. > > yea

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-28 Thread Scott A Crosby
On Mon, 28 Jul 2003 11:36:35 -0700, [EMAIL PROTECTED] (Justin Mason) writes: > Scott A Crosby writes: > >Even in the case of perl, O(n^2) is noticeable. Here, I show the number > >of '.'s and the corresponding runtime. Observe: > > > > > >1000 el

[SAtalk] Re: spam funny

2003-07-28 Thread Scott A Crosby
On Sat, 26 Jul 2003 11:18:39 +0700, Alexander Litvinov <[EMAIL PROTECTED]> writes: > > Of course, in theory spammers could start including things that look like > > PGP signatures. But since most people don't use PGP or GnuPG, we don't > > have to worry about this. > > > > Later on, if spammers s

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-28 Thread Scott A Crosby
On Fri, 18 Jul 2003 15:49:04 -0400, Vivek Khera <[EMAIL PROTECTED]> writes: > SAC> 2 '[EMAIL PROTECTED](?:[\-.0-9A-Z_a-z]+\.)+\w+' > > SAC> Feed it a bunch of dot's followed by a non-word... > > SAC> Say... '[EMAIL PROTECTED]' > > SAC> and, on some regexp interpreters, that line will take a f

[SAtalk] Re: SpamAssassin, Perl 5.8.1-rc2 and has randomization (was: Movie FILTER THIS VIRUS ALREADY!!!)

2003-07-17 Thread Scott A Crosby
On Fri, 18 Jul 2003 00:55:06 +0200, "Malte S. Stretz" <[EMAIL PROTECTED]> writes: > Found it in perlrun. I'm currently doing some runs, but till now no > problems. I can't think of any code in there which relies on the ordering of > hashes anyway... That's what they thought, until the fireworks b

[SAtalk] Re: SpamAssassin, Perl 5.8.1-rc2 and has randomization (was: Movie FILTER THIS VIRUS ALREADY!!!)

2003-07-17 Thread Scott A Crosby
On Thu, 17 Jul 2003 21:26:46 +0200, "Malte S. Stretz" <[EMAIL PROTECTED]> writes: > On Thursday 17 July 2003 20:56 CET Scott A Crosby wrote: > > In any case, if 5.8.1 goes out with the fix, it's going to be > > interesting how many latent bugs the fix exposes. T

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-17 Thread Scott A Crosby
On Thu, 17 Jul 2003 14:22:56 -0400, Vivek Khera <[EMAIL PROTECTED]> writes: > >>>>> "SAC" == Scott A Crosby writes: > > SAC> that list been set up to deny non-subscribers from posting. And Perl > SAC> is being changed to be robust against th

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-17 Thread Scott A Crosby
On Thu, 17 Jul 2003 09:45:47 -0700, [EMAIL PROTECTED] (Justin Mason) writes: > Scott A Crosby writes: > >I just had a 100+ message thread on perl5-porters discussing the > >impact of my recent research on perl. I would not have had that had > >that list been set up to deny

[SAtalk] Re: Re[2]: Re: Movie FILTER THIS VIRUS ALREADY!!!

2003-07-17 Thread Scott A Crosby
On Thu, 03 Jul 2003 21:02:42 -0700, [EMAIL PROTECTED] (Justin Mason) writes: > Anyway, I've set up "non-subscribers cannot post", which should help > a little. I think this is the wrong solution. Some people who post rarely may read the archives elsewhere, and now they can't post. A user may have

[SAtalk] This is interesting... Any idea where to report this?

2002-11-01 Thread Scott A Crosby
I got the following message in my inbox. It's not spam per se, but something I probably do want to filter. *** From: Yahoo!Member Services <[EMAIL PROTECTED]> Subject: [Newsletter] Registration confirmation - Yahoo! Mail To: [EMAIL PROTECTED] Date: Fri, 1 Nov 2002 15:13:13 -0800 (PST) Reply-To: [EM

[SAtalk] Re: URL blacklist

2002-10-16 Thread Scott A Crosby
On Tue, 1 Oct 2002 10:56:31 -0500, Robert Strickler <[EMAIL PROTECTED]> writes: > The biggest problem with converting a url to a checksum is that it won't take > long for the more sophisticated die-hard spammers to hack Apache (if it does > not currently support the capability) to allow a randomized

[SAtalk] A gem that looks like a perfectly normal reminder.

2002-09-12 Thread Scott A Crosby
Now, how do we detect something like this gem? -- X-From-Line: [EMAIL PROTECTED] Thu Sep 12 22:32:50 2002 Return-Path: <[EMAIL PROTECTED]> Received: from localhost (localhost [127.0.0.1]) by cs.rice.edu (Postfix) with ESMTP id 7C8874A9B8 for <[EMAIL PROTECTED]>; Thu, 12 Sep 2002