Update of /cvsroot/spambayes/spambayes/pspam
In directory sc8-pr-cvs8.sourceforge.net:/tmp/cvs-serv30085/pspam

Modified Files:
        scoremsg.py 
Log Message:
Add simple parts of [ 824651 ] Multibyte (CJK etc.) message support

(Lets extractmessages and scoremsg work with charsets other than us-ascii, and 
lets the Outlook plug-in handle tokens that aren't in the right encoding).

Index: scoremsg.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/pspam/scoremsg.py,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** scoremsg.py 18 Dec 2003 06:41:51 -0000      1.4
--- scoremsg.py 10 Jun 2006 04:57:11 -0000      1.5
***************
*** 4,7 ****
--- 4,9 ----
  import sys
  import email
+ import locale
+ from types import UnicodeType
  
  import ZODB
***************
*** 20,23 ****
--- 22,29 ----
  
  def main(fp):
+     charset = locale.getdefaultlocale()[1]
+     if not charset:
+         charset = 'us-ascii'
+ 
      db = pspam.database.open()
      r = db.open().root()
***************
*** 32,35 ****
--- 38,43 ----
      print "-----"
      for clue, prob in evidence:
+         if isinstance(clue, UnicodeType):
+             clue = clue.encode(charset, 'replace')
          print clue, prob
  ##    print
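
The change above encodes Unicode clues into the locale's charset before printing, and falls back to us-ascii when no charset is reported. Below is a minimal standalone sketch of that idea, written for Python 3 rather than the Python 2 of the patch. The helper names `detect_charset` and `printable_clue` and the sample clues are hypothetical, and `locale.getpreferredencoding(False)` is assumed here as a stand-in for the patch's `locale.getdefaultlocale()[1]`.

```python
import locale


def detect_charset(fallback="us-ascii"):
    """Return the locale's preferred encoding, or `fallback` if none is reported."""
    # Assumption: getpreferredencoding(False) stands in for the
    # locale.getdefaultlocale()[1] call used in the patch.
    return locale.getpreferredencoding(False) or fallback


def printable_clue(clue, charset):
    """Encode a text clue to bytes in `charset`; pass byte strings through unchanged."""
    if isinstance(clue, str):
        # 'replace' substitutes '?' for characters the charset cannot
        # represent instead of raising UnicodeEncodeError.
        return clue.encode(charset, "replace")
    return clue


if __name__ == "__main__":
    charset = detect_charset()
    # Hypothetical evidence list; real clues come from the classifier.
    evidence = [("subject:free", 0.98), ("\u7121\u6599", 0.91)]
    for clue, prob in evidence:
        print(printable_clue(clue, charset), prob)
```

With the 'replace' error handler, a clue that cannot be represented in the target charset is still printed, at the cost of losing the unrepresentable characters.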

_______________________________________________
Spambayes-checkins mailing list
[email protected]
http://mail.python.org/mailman/listinfo/spambayes-checkins
