[EMAIL PROTECTED] wrote:

> How can I convert an encoded string
> 
> sEncodedHtmlText = '&#1087;&#1088;&#1080;&#1074;&#1077;&#1090;&#32;&#1087;&#1080;&#1090;&#1086;&#1085;'
> 
> into a human-readable one:
> 
> sDecodedHtmlText == 'привет питон'

How about:

>>> import re
>>> sEncodedHtmlText = 'text: &#1087;&#1088;&#1080;&#1074;&#1077;&#1090;&#1087;&#1080;&#1090;&#1086;&#1085;'
>>> def unescape(m):
    return unichr(int(m.group(0)[2:-1]))

>>> print re.sub('&#[0-9]+;', unescape, sEncodedHtmlText)
text: ???????????

I'm afraid my newsreader couldn't cope with either your original text or my 
output, but I think this gives the string you wanted. You should probably 
also decode sEncodedHtmlText to unicode first; otherwise anything which 
isn't an entity escape will be converted to unicode using the default ascii 
encoding, and that fails on any non-ASCII byte.
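
For anyone reading this on Python 3, here is a rough sketch of the same idea 
with the decode step done first. It assumes the text arrived as UTF-8 bytes 
(the variable name raw and the helper name decode_numeric_entities are mine, 
not from the original post). It also shows html.unescape from the standard 
library, which handles these numeric references as well as named entities.

```python
import html
import re


def decode_numeric_entities(text):
    """Replace decimal references such as &#1087; with the character itself."""
    def unescape(m):
        # group(1) holds only the digits between "&#" and ";"
        return chr(int(m.group(1)))
    return re.sub(r'&#([0-9]+);', unescape, text)


# Assumption: the input is UTF-8 encoded bytes; decode before substituting
raw = (b'text: &#1087;&#1088;&#1080;&#1074;&#1077;&#1090;&#32;'
       b'&#1087;&#1080;&#1090;&#1086;&#1085;')
text = raw.decode('utf-8')

print(decode_numeric_entities(text))  # regex approach, as in the reply above
print(html.unescape(text))            # standard-library equivalent
```

Both calls should print "text: привет питон". The regex version only touches 
decimal references; html.unescape also covers hex and named ones.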
-- 
http://mail.python.org/mailman/listinfo/python-list
