On Monday 22 August 2016 at 16:34:00, Marc Perkel wrote:

> On 08/22/16 07:28, Dianne Skoll wrote:
> 
> > What percentage of emails using your algorithm are actually
> > decidable?
> 
> Almost 100% if you look at a wide variety of tokens from multiple
> attributes. Subject, body, content flags, header structure, combinations
> of all domains referenced, PHP scripts, name part of from addresses,
> behavior flags.

I would have said that a very large number of the words used in spam mails are 
the same as the words used in ham mails, so I suspect I'm confused about what 
constitutes a "token".

I fail to see how the "name part of from addresses" is unlikely to match ham, 
for example, since I see quite a lot of spam apparently from myself.
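
To make the question concrete, here is a minimal sketch of what I assume is 
meant by tokens drawn from multiple attributes: each word is tagged with the 
attribute it came from, so "from-name:antony" is a different token from 
"body:antony". This is only my guess at the idea, not the actual algorithm; 
the function name attribute_tokens and the prefixes are made up for 
illustration.

```python
import re
from email import message_from_string
from email.utils import parseaddr

WORD = re.compile(r"[a-z0-9']+")

def attribute_tokens(raw_message):
    """Return a set of tokens, each prefixed with the attribute it came from.

    Hypothetical illustration only: three attributes (From name part,
    Subject, plain-text body) stand in for the longer list in the quote.
    """
    msg = message_from_string(raw_message)
    tokens = set()

    # Name part of the From address: display name, else the local part.
    name, addr = parseaddr(msg.get("From", ""))
    name_part = name or addr.split("@")[0]
    for word in WORD.findall(name_part.lower()):
        tokens.add("from-name:" + word)

    for word in WORD.findall(msg.get("Subject", "").lower()):
        tokens.add("subject:" + word)

    # Only handle simple, non-multipart messages in this sketch.
    body = msg.get_payload()
    if isinstance(body, str):
        for word in WORD.findall(body.lower()):
            tokens.add("body:" + word)

    return tokens
```

Even with tokens scoped like this, a spam message forged to come from my own 
address yields exactly the same "from-name:" tokens as my genuine mail, which 
is the case I am asking about.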


Antony.

-- 
Never automate fully anything that does not have a manual override capability. 
Never design anything that cannot work under degraded conditions in emergency.

                                                   Please reply to the list;
                                                         please *don't* CC me.
