On May 30, 8:53 am, Tommy Nordgren <[EMAIL PROTECTED]> wrote:
> On 29 maj 2007, at 17.52, Clodoaldo wrote:
>
> > I was looking for a function to transform a unicode string into
> > htmlentities. Not only the usual html escaping thing but all
> > characters.
> >
> > As I didn't find I wrote my own:
> >
> > # -*- coding: utf-8 -*-
> > from htmlentitydefs import codepoint2name
> >
> > def unicode2htmlentities(u):
> >
> >     htmlentities = list()
> >
> >     for c in u:
> >         if ord(c) < 128:
> >             htmlentities.append(c)
> >         else:
> >             htmlentities.append('&%s;' % codepoint2name[ord(c)])
> >
> >     return ''.join(htmlentities)
> >
> > print unicode2htmlentities(u'São Paulo')
> >
> > Is there a function like that in one of python builtin modules? If not
> > is there a better way to do it?
> >
> > Regards, Clodoaldo Pinto Neto
>
> In many cases, the need to use html/xhtml entities can be avoided by
> generating utf8-coded pages.

Sure. All my pages are UTF-8 encoded. The case I'm dealing with is an
email link whose subject has non-ASCII characters, as in:

<a href=mailto:[EMAIL PROTECTED]>Mail to</a>

Somehow, when the user clicks on the link, the subject reaches his
email client with the non-ASCII characters turned into garbage.

And before someone points out that I should not expose email addresses:
the email is only linked with the consent of the owner, and the source
is obfuscated to make it harder for a robot to harvest it.

Regards, Clodoaldo
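
For what it's worth, here is a minimal sketch of one way such a link could be built. It is not a confirmed fix for the garbage described above. It assumes Python 3 (the code quoted earlier in the thread is Python 2), and the address someone@example.org and the helper names mailto_href, ascii_html and mailto_link are made up for illustration. The subject is percent-encoded as UTF-8 with urllib.parse.quote, which is the form RFC 6068 describes for mailto URLs; whether a particular mail client decodes it correctly is not guaranteed. The visible text is reduced to ASCII with the built-in 'xmlcharrefreplace' error handler, which emits numeric character references (&#227;) rather than named entities (&atilde;).

```python
from html import escape
from urllib.parse import quote


def mailto_href(address, subject):
    """Return a mailto: URL whose subject is percent-encoded UTF-8."""
    # quote() encodes str input as UTF-8 by default; safe='' makes it
    # escape every reserved character in the subject, including '/'.
    return 'mailto:%s?subject=%s' % (quote(address, safe='@'),
                                     quote(subject, safe=''))


def ascii_html(text):
    """Escape markup characters, then replace every non-ASCII character
    with a numeric character reference."""
    return escape(text).encode('ascii', 'xmlcharrefreplace').decode('ascii')


def mailto_link(address, subject, label):
    """Build the complete <a> element (hypothetical helper)."""
    # The URL is HTML-escaped so that '&' or '"' cannot break the attribute.
    href = escape(mailto_href(address, subject), quote=True)
    return '<a href="%s">%s</a>' % (href, ascii_html(label))


# someone@example.org is a placeholder address, not one from this thread.
print(mailto_link('someone@example.org', 'São Paulo', 'Mail to'))
```

The two encodings serve different layers: percent-encoding belongs to the URL inside href, while character references belong to the HTML text. Using entities inside the mailto URL itself would leave the URL with raw non-ASCII characters once the browser decodes the attribute.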