Update of /cvsroot/spambayes/spambayes/pspam
In directory sc8-pr-cvs8.sourceforge.net:/tmp/cvs-serv30085/pspam
Modified Files:
scoremsg.py
Log Message:
Add simple parts of [ 824651 ] Multibyte (CJK etc.) message support
(Lets extractmessages and scoremsg work with charsets other than us-ascii, and
lets the Outlook plug-in handle tokens that aren't in the right encoding).
Index: scoremsg.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/pspam/scoremsg.py,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** scoremsg.py 18 Dec 2003 06:41:51 -0000 1.4
--- scoremsg.py 10 Jun 2006 04:57:11 -0000 1.5
***************
*** 4,7 ****
--- 4,9 ----
  import sys
  import email
+ import locale
+ from types import UnicodeType
  import ZODB
***************
*** 20,23 ****
--- 22,29 ----
  def main(fp):
+     charset = locale.getdefaultlocale()[1]
+     if not charset:
+         charset = 'us-ascii'
+ 
      db = pspam.database.open()
      r = db.open().root()
***************
*** 32,35 ****
--- 38,43 ----
      print "-----"
      for clue, prob in evidence:
+         if isinstance(clue, UnicodeType):
+             clue = clue.encode(charset, 'replace')
          print clue, prob
  ## print
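
For readers who want to see what the change does, here is a minimal sketch of
the same idea, written for Python 3 rather than the Python 2 of the patch. The
helper names pick_charset and printable_clue are hypothetical and do not appear
in scoremsg.py, and locale.getpreferredencoding() stands in for the patch's
locale.getdefaultlocale()[1]. The point is only to show the fallback to
us-ascii and the effect of the 'replace' error handler.

```python
import locale


def pick_charset(detected, fallback="us-ascii"):
    """Return the detected charset, or the fallback if none was reported."""
    # Mirrors the patch: an empty or None charset falls back to us-ascii.
    return detected or fallback


def printable_clue(clue, charset):
    """Encode a text clue for output; leave already-encoded bytes untouched."""
    if isinstance(clue, str):
        # 'replace' substitutes '?' for characters the charset cannot
        # represent, instead of raising UnicodeEncodeError.
        return clue.encode(charset, "replace")
    return clue


if __name__ == "__main__":
    # Assumption: the preferred encoding approximates the console charset.
    charset = pick_charset(locale.getpreferredencoding(False))
    for clue in ("free", "caf\u00e9", "\u65e5\u672c\u8a9e"):
        print(printable_clue(clue, charset))
```

With charset set to 'us-ascii', a clue such as u'caf\u00e9' comes out as
'caf?' rather than aborting the run with an encoding error, which appears to be
the behaviour the patch is after.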