Re: Internationalised email subjects

2007-06-25 Thread bugmagnet
I'm an idiot!  Gabriel, you're right!  Turns out the ISP was running
Python 2.3, which has known issues with the GB2312 codec.  They've
upgraded to 2.4 and now everything runs smoothly!

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Internationalised email subjects

2007-06-22 Thread bugmagnet
Thanks Martin,

The Chinese characters are loaded from a MySQL table and are
encoded in GB2312.

I've added the following line at the top of the code:

# -*- coding: GB2312 -*-

I've also added the following line into the code:

h = Header(subject.encode('GB2312'), 'GB2312')

Note that the 'subject' variable consists of GB2312 encoded text, so I
am not sure if it is necessary to call the subject.encode('GB2312')
method.  When I try to execute this code, I get the following error:

File "/home/web88/html/app/test.py", line 17,
in Header(subject.encode('GB2312'), 'GB2312')
LookupError: unknown encoding: GB2312

Any idea what may be wrong?




Re: Internationalised email subjects

2007-06-22 Thread bugmagnet
Thanks Richie,

I've tried removing the encode('GB2312') line, so the code looks like
this:

h = Header(subject, 'GB2312')

However, this line still causes the following error message:

Traceback (most recent call last):
  File "/home/web88/html/app/sendmail.py", line 314, in
    h = Header(subject, 'GB2312')
  File "/usr/lib/python2.2/email/Header.py", line 188, in __init__
    self.append(s, charset, errors)
  File "/usr/lib/python2.2/email/Header.py", line 272, in append
    ustr = unicode(s, incodec, errors)
LookupError: unknown encoding: gb2312

Any ideas?



Re: Internationalised email subjects

2007-06-22 Thread Gabriel Genellina
On Fri, 22 Jun 2007 06:49:22 -0300, [EMAIL PROTECTED] wrote:

> I've tried removing the encode('GB2312') line, so the code looks like
> this:
>
> h = Header(subject, 'GB2312')
>
> However, this line still causes the following error message:
>
> Traceback (most recent call last):
>   File "/home/web88/html/app/sendmail.py", line 314, in
>     h = Header(subject, 'GB2312')
>   File "/usr/lib/python2.2/email/Header.py", line 188, in __init__
>     self.append(s, charset, errors)
>   File "/usr/lib/python2.2/email/Header.py", line 272, in append
>     ustr = unicode(s, incodec, errors)
> LookupError: unknown encoding: gb2312

It appears that you don't have the gb2312 codec - maybe it was not  
available with your rather old Python version (2.2). Upgrading to a newer  
version may help.
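
A quick way to confirm this kind of diagnosis is to ask the interpreter whether it knows the codec at all. The following is a minimal sketch written for Python 3 (the thread itself uses Python 2); the helper name codec_available is made up for illustration.

```python
import codecs


def codec_available(name):
    """Return True if this interpreter can look up the named codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Same exception type as in the traceback above.
        return False
    return True


has_gb2312 = codec_available('gb2312')
has_bogus = codec_available('no-such-codec-xyz')
```

On an interpreter without the CJK codecs, the first call would return False instead of raising, which makes it easy to report the problem before any mail is built.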

-- 
Gabriel Genellina



Re: Internationalised email subjects

2007-06-21 Thread Martin v. Löwis
[EMAIL PROTECTED] wrote:
> That's really strange.  The Chinese characters I am inputting into the
> post are not being displayed.  Basically, what I am doing is this:
>
> h = Header('(Some Chinese characters inserted here)', 'GB2312')

What encoding do the Chinese characters have at that point?

1. Don't try this at the interactive prompt. It will completely confuse
   you. Instead, use IDLE.
2. In IDLE, put
  # -*- coding: utf-8 -*-
  into the top of the source code file.
3. Write the header as a Unicode string, i.e. with a u prefix
4. Explicitly encode it, such as

h = Header(u'(Some Chinese characters inserted here)'.encode('GB2312'),
           'GB2312')

If you are *not* inserting the characters from the Python source
code directly, go back to my original question: What are the
characters encoded in?
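
The same idea can be sketched for Python 3, where text and bytes are separate types: decode the stored bytes using the encoding they really have, then hand the resulting text to Header together with the charset wanted on the wire. This is only a sketch under assumptions: make_subject is a hypothetical helper, and gb_bytes stands in for a value read from the database.

```python
from email.header import Header, decode_header


def make_subject(raw_bytes, source_encoding, header_charset):
    """Decode stored bytes to text, then wrap the text for a mail header."""
    text = raw_bytes.decode(source_encoding)
    return Header(text, header_charset)


# Stand-in for bytes fetched from a GB2312-encoded table column.
gb_bytes = '中文'.encode('gb2312')

subject = make_subject(gb_bytes, 'gb2312', 'gb2312')
encoded = subject.encode()  # ASCII-only RFC 2047 encoded word

# Decode it again to check that nothing was lost on the way.
(raw, charset_label), = decode_header(encoded)
round_trip = raw.decode(charset_label)
```

If the source encoding is wrong, the decode step fails immediately, which is easier to debug than a garbled subject in the mail client.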

HTH,
Martin


Re: Internationalised email subjects

2007-06-21 Thread bugmagnet
Seems some characters are missing from my last post.  The line that
says:

h = Header('  ', 'GB2312')

should say:

h = Header('  ', 'GB2312')




Re: Internationalised email subjects

2007-06-21 Thread bugmagnet
That's really strange.  The Chinese characters I am inputting into the
post are not being displayed.  Basically, what I am doing is this:

h = Header('(Some Chinese characters inserted here)', 'GB2312')

And when I run this code, I receive the following error message:

UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3:
illegal multibyte sequence

Any idea what I may be doing wrong?  How do I convert Chinese
characters into something like p\xf6stal in the original code posted
by Martin?  Can someone point me in the right direction?  I'm not even
sure what class/method to look into for this.



Re: Internationalised email subjects

2007-06-21 Thread Gabriel Genellina
On Thu, 21 Jun 2007 06:23:43 -0300, [EMAIL PROTECTED] wrote:

> That's really strange.  The Chinese characters I am inputting into the
> post are not being displayed.  Basically, what I am doing is this:
>
> h = Header('(Some Chinese characters inserted here)', 'GB2312')
>
> And when I run this code, I receive the following error message:
>
> UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3:
> illegal multibyte sequence

If you execute: print "some Chinese characters", do you get the right
results?
Are you sure your system is using gb2312? In case you don't know and don't
trust autodetection, try something like this:

py> from unicodedata import *
py> name("á".decode("latin-1"))
'NO-BREAK SPACE'
py> name("á".decode("cp850"))
'LATIN SMALL LETTER A WITH ACUTE'

The first attempt shows the wrong name, so my console *cannot* be using
latin-1. With cp850 I got the right results, so it *might* be cp850 (it
may also be another encoding that happens to match this single character).
Further tests may reveal that it is actually cp850.
You should try with some Chinese characters and see if your encoding is
actually gb2312.
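
For readers on Python 3, where a string literal is already text and has no decode method, the same test can be run on raw bytes. This is a sketch only: the describe helper is hypothetical, and b'\xa0' is the byte that the console in the session above would have produced for the letter.

```python
import unicodedata


def describe(raw, encodings):
    """Map each candidate encoding to the character names it yields."""
    results = {}
    for enc in encodings:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None  # these bytes are not valid in this encoding
        else:
            results[enc] = [unicodedata.name(ch, 'UNNAMED') for ch in text]
    return results


names = describe(b'\xa0', ['latin-1', 'cp850'])
# names['latin-1'] is ['NO-BREAK SPACE']
# names['cp850'] is ['LATIN SMALL LETTER A WITH ACUTE']

chinese = describe(b'\xd6\xd0', ['gb2312', 'utf-8'])
```

A candidate that maps to None can be ruled out; one that yields the expected character names remains possible, as described above.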

-- 
Gabriel Genellina



Re: Internationalised email subjects

2007-06-21 Thread Evan Klitzke
On 6/21/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> That's really strange.  The Chinese characters I am inputting into the
> post are not being displayed.  Basically, what I am doing is this:

You're not sending your email in UTF-8 (or another encoding that would
permit Chinese characters). Your email header shows:

Content-Type: text/plain; charset=us-ascii

You probably need to reconfigure your mail client to send Chinese characters.
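
For comparison, when a message is generated by a Python program rather than a mail client, the charset parameter in that header comes from the code that builds the body. A minimal sketch for Python 3, assuming the standard email package and an illustrative two-character body:

```python
from email.mime.text import MIMEText

# The third argument becomes the charset parameter of Content-Type.
body = MIMEText('中文', 'plain', 'utf-8')

content_type = body['Content-Type']
declared_charset = body.get_content_charset()

# The payload is transfer-encoded; decode it again to check the text.
recovered = body.get_payload(decode=True).decode(declared_charset)
```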

-- 
Evan Klitzke [EMAIL PROTECTED]


Google breaks international charset messages (was: Internationalised email subjects)

2007-06-21 Thread Ben Finney
[EMAIL PROTECTED] writes:

> Seems some characters are missing from my last post.  The line that
> says:
>
> h = Header('  ', 'GB2312')
>
> should say:
>
> h = Header('  ', 'GB2312')

Your message has this field in the header:

Content-Type: text/plain; charset=us-ascii

which is why the non-ASCII characters don't appear. This is the fault
of Google's charset munging.

Please, people who use Google for mail and Usenet, kick them until
they present utf-8 as the default encoding, instead of downgrading
to us-ascii.

-- 
I lost a button-hole.  -- Steven Wright
Ben Finney


Re: Google breaks international charset messages (was: Internationalised email subjects)

2007-06-21 Thread Evan Klitzke
On 6/21/07, Ben Finney [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] writes:
>
>> Seems some characters are missing from my last post.  The line that
>> says:
>>
>> h = Header('  ', 'GB2312')
>>
>> should say:
>>
>> h = Header('  ', 'GB2312')
>
> Your message has this field in the header:
>
> Content-Type: text/plain; charset=us-ascii
>
> which is why the non-ASCII characters don't appear. This is the fault
> of Google's charset munging.
>
> Please, people who use Google for mail and Usenet, kick them until
> they present utf-8 as the default encoding, instead of downgrading
> to us-ascii.

Ironically, you're sending out us-ascii encoded emails as well. Like
it or not, 7-bit ASCII is the standard for SMTP, so it's a reasonable
default character encoding to send MIME encoded messages in -- and
it's trivial to change the outgoing character set to UTF-8 in
Gmail/Google Apps.

-- 
Evan Klitzke [EMAIL PROTECTED]


Internationalised email subjects

2007-06-20 Thread bugmagnet
I am writing a simple email program in Python that will send out
emails containing Chinese characters in the message subject and body.
I am not having any trouble getting the email body displayed correctly
in Chinese inside the email client.  However, the email subject and
sender name (which are also in Chinese) are garbled in the email
client.

Here is the code snippet:

writer = MimeWriter.MimeWriter(out)
headers = {"From": senderName + ' <' + senderEmail + '>',
           "To": recipientEmail, "Reply-to": senderEmail}

writer.addheader("Subject", subject)
writer.addheader("MIME-Version", "1.0")
writer.addheader('From', headers['From'])
writer.addheader('To', headers['To'])
writer.addheader('Reply-to', headers['Reply-to'])

I'm quite new to Python (and programming in general) and am having a
hard time wrapping my head around the internationalization functions
of Python, so was hoping someone could point me in the right
direction.  Is there a different method I need to use in order for
the  sender name and subject to be displayed correctly?  Is there an
extra step I am missing?  Some sample code would be very helpful.

Thanks!



Re: Internationalised email subjects

2007-06-20 Thread Martin Skou
From:
http://docs.python.org/lib/module-email.header.html

  >>> from email.message import Message
  >>> from email.header import Header
  >>> msg = Message()
  >>> h = Header('p\xf6stal', 'iso-8859-1')
  >>> msg['Subject'] = h
  >>> print msg.as_string()
  Subject: =?iso-8859-1?q?p=F6stal?=
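
The documentation example covers the subject only. A sender name can be handled in a similar way; the sketch below is for Python 3 and assumes email.utils.formataddr with its charset argument. The build_headers helper, the sample name and the address sender@example.com are all made up for illustration, and the exact encoded form is left unstated because it depends on the email package's charset tables.

```python
from email.header import Header, decode_header
from email.message import Message
from email.utils import formataddr, parseaddr


def build_headers(sender_name, sender_email, subject, charset):
    """Return a Message whose From and Subject are safe for 7-bit mail."""
    msg = Message()
    # formataddr encodes a non-ASCII display name and leaves the address alone.
    msg['From'] = formataddr((sender_name, sender_email), charset)
    msg['Subject'] = Header(subject, charset)
    return msg


msg = build_headers('中文', 'sender@example.com', '中文', 'gb2312')
wire_text = msg.as_string()

# Split the From value and decode the display name to check it.
encoded_name, address = parseaddr(msg['From'])
(raw, charset_label), = decode_header(encoded_name)
decoded_name = raw.decode(charset_label)
```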

/Martin


Re: Internationalised email subjects

2007-06-20 Thread bugmagnet
Thanks Martin, I actually have read that page before.  The part that
confuses me is the line:

h = Header('p\xf6stal', 'iso-8859-1')

I have tried using:

h = Header('  ', 'GB2312')

but when I run the code, I get the following error:

UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3:
illegal multibyte sequence

Is there something I need to do in order to encode the Chinese
characters into the GB2312 character set?
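
One possible reading of this error is that the bytes passed to Header are not GB2312 to begin with, for example because the editor saved the file as UTF-8. The sketch below, written for Python 3 rather than the Python 2 used here, shows the difference between encoding text and decoding bytes; the sample text and the decodes_as helper are assumptions for illustration.

```python
text = '中文'

gb_bytes = text.encode('gb2312')    # text -> GB2312 bytes
utf8_bytes = text.encode('utf-8')   # text -> UTF-8 bytes


def decodes_as(raw, encoding):
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


gb_ok = decodes_as(gb_bytes, 'gb2312')
utf8_as_gb = decodes_as(utf8_bytes, 'gb2312')

# Converting between the two always goes through text.
transcoded = utf8_bytes.decode('utf-8').encode('gb2312')
```

Here gb_ok is True and utf8_as_gb is False: labelling UTF-8 bytes as GB2312 fails in the decoder, which is the same kind of failure as the one reported above.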
