On Monday 22 August 2016 at 16:34:00, Marc Perkel wrote:

> On 08/22/16 07:28, Dianne Skoll wrote:
>
> > What percentage of emails using your algorithm are actually
> > decidable?
>
> Almost 100% if you look at a wide variety of tokens from multiple
> attributes. Subject, body, content flags, header structure, combinations
> of all domains reference, php scripts, name part of from addresses,
> behavior flags.

I would have said that a very large number of the words used in spam mails
are the same as the words used in ham mails, so I suspect I'm confused about
what constitutes a "token".

I fail to see how the "name part of from addresses" is unlikely to match ham,
for example, since I see quite a lot of spam apparently from myself.

Antony.

-- 
Never automate fully anything that does not have a manual override capability.
Never design anything that cannot work under degraded conditions in emergency.

Please reply to the list; please *don't* CC me.