"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> Is there a string function to trim all non-ascii characters out of a
> string?
> Let's say I have a string in python (which is utf8 encoded), is there a
> python function with which I can convert it to a string composed of
> only ascii characters?
> 
> Thank you.

Yes, just decode it to unicode (which you should do first with any 
encoded string) and then encode it back to ascii with the error handler 
set to the behaviour you want:

>>> s = '\xc2\xa342'
>>> s.decode('utf8').encode('ascii', 'replace')
'?42'
>>> s.decode('utf8').encode('ascii', 'ignore')
'42'
>>> s.decode('utf8').encode('ascii', 'xmlcharrefreplace')
'&#163;42'
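
The session above is Python 2, where a plain str holds bytes and has a
decode method. For anyone on Python 3, here is a rough sketch of the same
idea, assuming the input arrives as UTF-8 bytes (or as an already decoded
str). The helper name to_ascii is made up for illustration and is not part
of the standard library:

```python
# Sketch for Python 3: bytes hold the encoded data, str holds the text.
def to_ascii(data, errors="ignore"):
    """Hypothetical helper: reduce UTF-8 input to ASCII-only text."""
    # Decode first if we were given bytes; a str is already decoded.
    text = data.decode("utf-8") if isinstance(data, bytes) else data
    # Encode to ASCII with the chosen error handler, then return a str.
    return text.encode("ascii", errors).decode("ascii")

s = b"\xc2\xa342"  # UTF-8 bytes for the pound sign followed by "42"
print(to_ascii(s, "replace"))            # ?42
print(to_ascii(s, "ignore"))             # 42
print(to_ascii(s, "xmlcharrefreplace"))  # &#163;42
```

The error handler names ('replace', 'ignore', 'xmlcharrefreplace') are the
same ones used in the session above; only the bytes/str split differs.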
-- 
http://mail.python.org/mailman/listinfo/python-list
