I think most people who have never worked with Chinese text don't know how
Chinese encodings work.
The Chinese I am talking about is encoded as BIG5, GB2312, or (better) UTF-8.
Until MS$ takes over the whole world, I don't think we will give up the
GB2312 and BIG5 charsets.
If you look at BIG5 and GB2312, both are double-byte encodings, and their
byte ranges largely overlap.
The same two-byte sequence is usually valid in both, but it maps to a
different character in each.
From the bytes alone, there is no way for DSPAM to see which is GB2312 and
which is BIG5...
I cannot just mark GB2312 email as spam, since BIG5 email with the same byte
sequences will be treated as spam too.
I have seen too many cases where DSPAM failed to detect the correct encoding.
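
To make the overlap described above concrete, here is a minimal Python sketch (not DSPAM code; the sample string is only an illustration). It encodes a short Chinese string as GB2312 and then decodes the same bytes as BIG5, which shows that the raw bytes carry no hint of which charset they belong to.

```python
# Illustration only: the same bytes are valid in both charsets
# but decode to different characters.
text = "中文"                      # sample string, chosen for illustration
raw = text.encode("gb2312")        # two bytes per character

as_gb = raw.decode("gb2312")       # the intended reading
# errors="replace" is a precaution in case a byte pair is unassigned in BIG5
as_big5 = raw.decode("big5", errors="replace")

print(raw.hex())                   # the bytes a byte-level filter would see
print(as_gb, as_big5)              # same bytes, different characters
```

A filter that only sees the bytes therefore produces identical tokens for unrelated GB2312 and BIG5 text, unless it takes the charset declared in the MIME headers into account.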
BTW, the biggest problem is the retraining process... no one here can explain it...
Good luck
Patrick
----- Original Message -----
From: "Dov Zamir" <[EMAIL PROTECTED]>
To: "Patrick T. Tsang" <[EMAIL PROTECTED]>
Cc: "Kent Tong" <[EMAIL PROTECTED]>; <[email protected]>
Sent: Saturday, January 27, 2007 3:42 PM
Subject: Re: [dspam-users] won't learn?
Quoting Patrick T. Tsang:
Hello Kent,
I have the same problem.
And I have already given up on DSPAM. The results are not good, and the
maintenance is too difficult to deal with.
No one here can answer my question about re-learning...
I think DSPAM has a good idea for handling spam, but it is not designed
for Chinese.
Patrick,
I don't think that is correct. DSPAM tokenizes the email; it has no
concept of language. It works just fine with Hebrew on my setup, so why
would it not work with Chinese?
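
As a rough illustration of what "tokenizes the email" means, here is a minimal Python sketch. It is not DSPAM's actual tokenizer: the delimiter set and the helper names `tokenize` and `chained` are assumptions made for this example. It splits text on whitespace and common punctuation and optionally pairs adjacent tokens, without any knowledge of the language.

```python
import re

# Hypothetical delimiter set for illustration; DSPAM's real set may differ.
DELIMITERS = re.compile(r"[\s.,;:!?\"()\[\]<>]+")

def tokenize(text):
    """Split text on delimiters; no language knowledge is involved."""
    return [t for t in DELIMITERS.split(text) if t]

def chained(tokens):
    """Pair each token with its successor (a simple bigram scheme)."""
    return [f"{a}+{b}" for a, b in zip(tokens, tokens[1:])]

print(tokenize("buy cheap pills now"))
print(tokenize("שלום עולם"))        # Hebrew: words are separated by spaces
print(tokenize("中文垃圾郵件"))      # Chinese: no spaces, so one long token
```

Under these assumptions, Hebrew splits into word tokens just like English, while a run of Chinese characters without spaces stays a single token, which may be part of why results differ between the two languages.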
Good luck
Patrick
----- Original Message -----
From: "Kent Tong" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Saturday, January 27, 2007 11:21 AM
Subject: Re: [dspam-users] won't learn?
Marcin Krol wrote:
1. Try looking up the DSPAM factors in the message headers,
(you can view full message by pressing Ctrl-U in Thunderbird
or F9 in The Bat), the headers may give you some clue?
I just found out even for a spam correctly identified as spam, if
I classify it again, it will say it's innocent. If I delete the
headers generated by dspam (including the "Received by:" headers
it and Cyrus generated), then it will classify it as spam.
However, for a spam that wasn't identified, even after training it,
dspam is still classifying its header-removed version as innocent.
2. Have you changed the default spam-probability algorithms
in dspam.conf? You could tweak those and see what changes.
No.
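
The header-removal experiment described above can be sketched in Python as follows. This is only a sketch under assumptions: it treats every header whose name starts with `X-DSPAM-` as DSPAM-generated, and the helper name `strip_dspam_headers` and the sample message are made up for this example. It does not touch the Received headers or any tag in the message body.

```python
from email import message_from_string

def strip_dspam_headers(raw_message: str) -> str:
    """Return the message with headers named X-DSPAM-* removed.

    Assumption: DSPAM-generated headers all use the X-DSPAM- prefix.
    """
    msg = message_from_string(raw_message)
    for name in list(msg.keys()):
        if name.lower().startswith("x-dspam-"):
            del msg[name]          # removes every header with this name
    return msg.as_string()

# Made-up sample message for illustration.
sample = (
    "From: a@example.com\n"
    "Subject: test\n"
    "X-DSPAM-Result: Spam\n"
    "X-DSPAM-Confidence: 0.99\n"
    "\n"
    "body text\n"
)
cleaned = strip_dspam_headers(sample)
print(cleaned)
```

The cleaned text could then be fed back to the classifier, so that the earlier verdict recorded in the headers does not become part of the input.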
--
Kent Tong
Useful news for CIO's at http://www2.cpttm.org.mo/cyberlab/cio-news
_________________________________________________________________________
This message has been scanned by Kibbutz Beit Kama's Anti Virus software,
and is believed to be clean of any viruses.
_________________________________________________________________________
!DSPAM:500,45bad9fc220911640823458!