Cool, I don't mind being wrong.

I am glad you found time.
The patch you sent fails for some reason.
I changed that one line manually, but I'm not sure if there was much else
to it.

It worked though!

Thanks.
Sam
 


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Jim Ramsay
Sent: Tuesday, May 04, 2004 12:01 PM
To: [EMAIL PROTECTED]
Subject: Re: Unicode Error - TMDA-CGI 0.13 pending list


Samuel Hill wrote:

> Reading the message in pico shows the same thing, no \xa0 character at
> all. The way you see it below is the way it is; there is no character
> like that at all in the message itself.
> I have attached the message, though.
> If I place exactly what is in this text file (the attachment) in my
> pending queue, it dumps.

The text file does indeed have some funny \xa0 character in the Subject.
Reading it in Vim shows a funny blue '| ' character right where I'd
expect the \xa0 to be.  Also, an 'od -c 1083676255.82872.msg | grep
0001440' shows the character as octal 240 (which is 0xA0).
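
The same check can be sketched in a few lines of present-day Python 3
(not the Python that tmda-cgi ran on at the time).  The sample Subject
bytes and the helper name find_non_ascii are made up for illustration;
only the 0xA0 byte itself comes from the message discussed here.

```python
# Hypothetical Subject bytes containing the 0xA0 byte discussed above.
raw = b"Re:\xa0hello"

def find_non_ascii(data):
    """Return (offset, byte value) pairs for every byte >= 0x80."""
    return [(i, b) for i, b in enumerate(data) if b >= 0x80]

hits = find_non_ascii(raw)
for offset, value in hits:
    # 0xA0 prints as octal 240, matching what 'od -c' reports.
    print(f"offset {offset}: hex {value:#04x}, octal {value:o}")

# A strict ASCII decode rejects the byte...
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# ...while ISO-8859-1 maps it to U+00A0 (no-break space).
latin1_text = raw.decode("iso-8859-1")
```

This is why an editor can display the message without complaint while a
strict ASCII decode of the same bytes raises an error.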

> I believe that the \xa0 is actually coming from Unicode.py or 
> something right before it.

Sorry, I'll have to disagree here... the character is in the Subject 
line of the message you attached to that last post of yours.  It's just 
that most other email apps and/or text editors are smarter than tmda-cgi 
about it :)

> There seems to be a piece of substitution code in Unicode.py, maybe to 
> prevent tmda-cgi from dumping when it reads a message with certain 
> characters, but it is doing harm instead. So tmda-cgi is trying to do a 
> substitution? I know it is not tmda itself, because the message sitting
> in pending is all good; it only happens on read in tmda-cgi.
> 
> Example...
> AltChar  = re.compile("[\x80-\xFF]")
> 
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> 
> def Iso8859(Str):
>   RetVal = u""
>   while 1:   
>     Match = AltChar.search(Str)
>     if Match:
>       RetVal += Str[:Match.start()] + Xlate(Match.group(0))
>       Str = Str[Match.end():]
>     else:
>       break
>   RetVal += Str

This is only called if the character set requested is 'iso-8859-1' or 
'us-ascii'... According to your traceback, the character set requested 
is 'us_ascii' - so this is never called.  This character set is 
requested because the email itself says, lower down:

Content-Type: text/plain;
         charset="US_ASCII"

I have to apologise, I lied - I actually found time to do this and have 
fixed this in CVS now.  Please try the following patch and let me know if
it works for you:

--- start here ---
diff -u -r1.6 -r1.7
--- Unicode.py  18 Feb 2004 15:10:48 -0000      1.6
+++ Unicode.py  4 May 2004 15:45:18 -0000       1.7
@@ -72,7 +72,7 @@
    CharSet = CS.input_charset

    # Find appropriate decoder
-  if CharSet in ("iso-8859-1", "us-ascii"):
+  if CharSet in ("iso-8859-1", "us-ascii", "us_ascii" ):
      Decoder = Iso8859
    else:
      try:
--- end here ---
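
The patch above handles the underscore spelling by listing it
explicitly.  Purely as a sketch of an alternative (this is not what went
into CVS), the charset name could be normalized before the comparison so
that case and underscore/hyphen variants all match.  The names
ISO8859_CHARSETS, normalize_charset and uses_iso8859_decoder are
hypothetical and do not exist in Unicode.py.

```python
# Hypothetical helper, not part of tmda-cgi's Unicode.py.
ISO8859_CHARSETS = ("iso-8859-1", "us-ascii")

def normalize_charset(name):
    """Lower-case the name and treat '_' and '-' as the same separator."""
    return name.strip().lower().replace("_", "-")

def uses_iso8859_decoder(charset):
    """True if the charset should go through the Iso8859-style decoder."""
    return normalize_charset(charset) in ISO8859_CHARSETS

# The header in the problem message said charset="US_ASCII".
print(uses_iso8859_decoder("US_ASCII"))
```

The trade-off is that normalizing accepts spellings nobody has tested
against, whereas the explicit tuple in the patch only accepts the
spellings that have actually been seen.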

Thanks for helping me track down this bug!

-- 
Jim Ramsay
"Me fail English?  That's unpossible!"

_____________________________________________
tmda-users mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-users
