On Fri, 29 Aug 2003, Michael Stevens wrote:

> On Fri, Aug 29, 2003 at 12:22:25PM +0100, Roger Burton West wrote:
> > On Fri, Aug 29, 2003 at 12:17:28PM +0100, Michael Stevens wrote:
> >
> > >I'm fully in favour of banning text/html only messages, but
> > >I'm quite happy with multipart/alternative, as they usually render
> > >perfectly well in decent console MUAs.
> >
> > Spammers have noticed that people are banning text/html only; quite a
> > bit of the spam I've seen recently has been of the form:
> >
> > multipart/alternative
> >  text/plain (blank)
> >  text/html (body of spam)
>
> Okay, possibly a more intelligent filter that can spot this case
> as well? Harder to implement as a system filter in exim, though, I'd
> guess.

But then, to get around that, spammers just send

    multipart/alternative
     text/plain (body of gibberish)
     text/html (body of spam)

At which point it seems attractive to at least take a stab at making sure
that the /plain and /html branches resemble each other. But how? It
doesn't seem feasible to embed an HTML parser in a high volume mail
scanner ("meet the new SpamAssassin 3.0 -- now with Gecko!"). Checksums
seem unlikely to help, since the two branches never match byte for byte
even in legitimate mail. You could guess with metrics such as "the
average html branch is N times the size of the plain branch, give or take
X standard deviations" but ...yuck.
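
Just to make the "resemble each other" idea concrete, here is a rough
sketch in Python, standard library only. It strips tags from the /html
branch with a lightweight tag stripper (not a renderer, so it won't see
what a browser would show) and compares the word sequences of the two
branches. The function names and the idea of scoring by word overlap are
my own assumptions for illustration, not anything exim or SpamAssassin
actually does, and I haven't measured what it would cost at volume.

```python
import difflib
import re
from email import message_from_string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def _words(text):
    # Lowercased word tokens; punctuation and whitespace are ignored.
    return re.findall(r"\w+", text.lower())


def _decoded(part):
    # Decode a leaf part's payload, falling back to UTF-8 on odd charsets.
    data = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return data.decode(charset, errors="replace")
    except LookupError:
        return data.decode("utf-8", errors="replace")


def branch_similarity(raw_message):
    """Return a 0..1 word-sequence similarity between the first text/plain
    and first text/html parts, or None if either one is missing."""
    msg = message_from_string(raw_message)
    plain = html = None
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/plain" and plain is None:
            plain = _decoded(part)
        elif ctype == "text/html" and html is None:
            html = _decoded(part)
    if plain is None or html is None:
        return None
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Joining chunks with spaces splits words that were broken up by
    # inline tags, which is one way a spammer could lower the score.
    html_words = _words(" ".join(extractor.chunks))
    return difflib.SequenceMatcher(None, _words(plain), html_words).ratio()
```

A blank or gibberish /plain branch scores at or near 0, and a faithful
one scores near 1. Where to put the cutoff is a guess you'd have to tune
against real list traffic.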

A less painful approach might just be to queue multipart messages for
moderator review. As has been noted, there have only been a handful of
these in the past six months, not all of which were meant to go to the
list anyway.
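
For what it's worth, the routing rule itself is tiny. Here's a sketch in
Python (the function name and the three outcome labels are made up for
illustration, and a real list would do this in its MTA or list manager
config rather than in a script): text/html-only mail is refused, as
suggested upthread, anything multipart is held for a moderator, and the
rest goes through.

```python
from email import message_from_string


def triage(raw_message):
    """Return 'reject', 'moderate' or 'accept' for one raw message."""
    msg = message_from_string(raw_message)
    # Any multipart/* top-level type is held for a human to look at.
    if msg.get_content_maintype() == "multipart":
        return "moderate"
    # A single-part text/html message is refused outright.
    if msg.get_content_type() == "text/html":
        return "reject"
    # Everything else (text/plain is the default type) is delivered.
    return "accept"
```

Note that this also holds signed mail (multipart/signed) and anything
with an attachment, which may or may not be what you want.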


-- 
Chris Devers    [EMAIL PROTECTED]

ISO, n. [Origin: possibly Greek iso "equal" but now presumed acronym
  for International Standards Organization.]
A meta-standards organization set up in 1947 in order to establish
standards for the setting up of standard organizations. See also ANSI;
ASCII; STANDARD.

    -- from _The Computer Contradictionary_, Stan Kelly-Bootle, 1995
