Your message reached the list at this time:

Received: from [202.154.34.135] (HELO v6.i6x.org) (202.154.34.135)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 30 Aug 2005 22:59:21 -0700

The spam message header you showed:

> Date: Wed, 31 Aug 2005 08:59:56 -0700

The Date header on that mail is about 10 hours after the time you posted
your question.
Hence:

>          *  1.3 DATE_IN_FUTURE_06_12 Date: is 6 to 12 hours after
> Received: date

Assuming you sent the mail to the list not long after you received it, the
Date header on that mail still shows it being between 6 and 12 hours in the
future from when you received it.
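
As a rough check, the gap between the two timestamps quoted above can be
computed directly. This is only a sketch in Python using the standard
library's email.utils module; it is not SpamAssassin's own code, and the
variable names and the 6-to-12-hour comparison are my own illustration of
what the rule description says.

```python
from email.utils import parsedate_to_datetime

# Timestamp from the Received header of your post to the list.
received = parsedate_to_datetime("Tue, 30 Aug 2005 22:59:21 -0700")
# Timestamp from the Date header of the spam you quoted.
date_hdr = parsedate_to_datetime("Wed, 31 Aug 2005 08:59:56 -0700")

# How far the Date header is ahead of the Received time.
skew = date_hdr - received
hours = skew.total_seconds() / 3600

# Illustrative test for the "6 to 12 hours after" window (assumed bounds).
in_06_12_window = 6 <= hours < 12

print(skew)             # 10:00:35
print(in_06_12_window)  # True
```

Note that this compares against the time your post reached the list, not the
Received time SA actually saw on the spam, so it only gives a lower bound on
the skew.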


> 3. I have train hundreds (or thousands) spam/ham mail to sa-learn but it
> seems it still not quite good detecting non-english mail.

>          *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>          *      [score: 1.0000]

This tells me that Bayes is 100% sure that the message is spam.  That sounds
pretty good to me, unless this isn't spam.  However, the mucked-up Date
header, together with the Date and first Received headers showing timezones
that are 12 hours apart, makes me think this is spam.

SA is written primarily by English speakers, and the rules are primarily
aimed at detecting English-language spam.  There are some add-on rulesets to
detect spam in other languages, but they generally aren't that well
maintained.  They would have to be maintained by contributions from people
who can write rules for spam in other languages, and few of the people who
could write such rules seem to contribute them.

Bayes should be pretty good about detecting spam in most languages that do
not require double-byte characters.  The current release of SA may have some
problems with double-byte characters that could make Bayes less effective
than it could be.

        Loren
