"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > Hi, > > Is there a string function to trim all non-ascii characters out of a > string? > Let say I have a string in python (which is utf8 encoded), is there a > python function which I can convert that to a string which composed of > only ascii characters? > > Thank you.
Yes, just decode it to unicode (which you should do as the first thing
for any encoded string) and then encode it back to ascii with the error
handling set how you want:

>>> s = '\xc2\xa342'
>>> s.decode('utf8').encode('ascii', 'replace')
'?42'
>>> s.decode('utf8').encode('ascii', 'ignore')
'42'
>>> s.decode('utf8').encode('ascii', 'xmlcharrefreplace')
'&#163;42'
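
The session above is Python 2, where a plain str holds bytes. On
Python 3 the same decode-then-encode idea might look roughly like the
sketch below. It assumes the input arrives as UTF-8 bytes, and the
helper name to_ascii is made up for illustration; it is not a standard
library function.

```python
def to_ascii(raw, errors="ignore"):
    """Decode UTF-8 bytes, then re-encode as ASCII with the given error handler.

    to_ascii is a hypothetical helper, not part of the standard library.
    """
    text = raw.decode("utf-8")              # bytes -> str (unicode)
    ascii_bytes = text.encode("ascii", errors)  # str -> ASCII-only bytes
    return ascii_bytes.decode("ascii")      # back to str for convenience


raw = b"\xc2\xa342"  # UTF-8 bytes for a pound sign followed by "42"

print(to_ascii(raw, "replace"))            # ?42
print(to_ascii(raw, "ignore"))             # 42
print(to_ascii(raw, "xmlcharrefreplace"))  # &#163;42
```

If you already have a str rather than bytes on Python 3, you can skip
the first decode and call .encode('ascii', errors) on it directly.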