Theppitak Karoonboonyanan, 2008-09-08 21:53:47 +0700 :

> I think this case is difficult for a spam filter. The text is
> deliberately shuffled across HTML tables so that the final rendered
> page is readable to human readers, but the source is completely
> meaningless to the filter. It then ends with a meaningful
> paragraph so that it passes the filter.
>
> Any trick to capture the final rendered page before passing it to
> the filter for analysis? I don't know.

  Maybe by extracting the thumbnail-generating code from Chrome
(unless it depends on X11) and piping that to OCR software...
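
  To make the problem concrete, here is a minimal sketch in Python of
why the source order and the rendered order differ. It is a toy
example of my own, not taken from real spam or from any filter: the
names CellCollector, source_order and rendered_order are made up, and
it only handles a single table row whose cells stack letters with
<br>. A real page would need a full layout engine, hence the idea of
rendering it first.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of each <td>, split into lines at <br>."""

    def __init__(self):
        super().__init__()
        self.cells = []       # one list of text lines per <td>
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.cells.append([""])
            self.in_cell = True
        elif tag == "br" and self.in_cell:
            # A line break starts a new line inside the current cell.
            self.cells[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1][-1] += data.strip()


def source_order(cells):
    """Text as a naive filter sees it: cell by cell, in source order."""
    return " ".join(line for cell in cells for line in cell)


def rendered_order(cells):
    """Text as a reader sees it: across the cells, one visual line at a time."""
    height = max(len(cell) for cell in cells)
    rows = []
    for i in range(height):
        rows.append("".join(cell[i] if i < len(cell) else "" for cell in cells))
    return "\n".join(rows)


# Hypothetical obfuscated message: each column holds one letter per line.
HTML = (
    "<table><tr>"
    "<td>c<br>p</td><td>h<br>i</td><td>e<br>l</td>"
    "<td>a<br>l</td><td>p<br>s</td>"
    "</tr></table>"
)

parser = CellCollector()
parser.feed(HTML)
print(source_order(parser.cells))
print(rendered_order(parser.cells))
```

  The first print shows the letters in the order they appear in the
source, the second shows them the way a reader sees them on screen.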

Roland,
tired of the arms-race too.
-- 
Roland Mas

What is yellow, weighs two hundred kilos, and sings?
A sumo wrestler in his bathroom.

