On Nov 27, 5:08 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Nov 28, 8:45 am, [EMAIL PROTECTED] wrote:
> > On Nov 27, 3:35 pm, Martin Landa <[EMAIL PROTECTED]> wrote:
> > > Hi all,
> > >
> > > sorry for a newbie question. I have unicode string (or better say
> > > latin2 encoding) containing non-ascii characters, e.g.
> > >
> > > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
> > >
> > > I would like to convert this string to plain ascii (using some lookup
> > > table for latin2)
> > >
> > > to get
> > >
> > > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
> > >
> > > Thanks for any hints! Regards, Martin Landa
> >
> > With a little googling, I found this:
> >
> > http://www.peterbe.com/plog/unicode-to-ascii
>
> and if the OP has the patience to read *ALL* the comments on that blog
> entry, he will find that comment[-2] points to
>
> http://effbot.python-hosting.com/file/stuff/sandbox/text/unaccent.py
>
> and comment[-1] (from the blog owner) is "Brilliant! Thank you."
>
> The bottom line is that there is no universal easy solution; you need
> to handcraft a translation table suited to your particular purpose
> (e.g. do you want u-with-umlaut to become u or ue?). The
> unicodedata.normalize function is useful for off-line preparation of a
> set of candidate mappings for that table; it should not be applied
> either on-line or blindly.
>
> Cheers,
> John

Sorry... I didn't know about translation tables, or I would have mentioned that instead. My bad.

Mike
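
For the archives, here is a rough sketch of the two-step approach John describes. It is written for Python 3 and is only an illustration, not code from this thread. The names candidate_mappings, CZECH_TO_ASCII and to_ascii are made up for the example, and the table covers only the Czech letters I could think of. The first function uses unicodedata.normalize off-line to suggest mappings for review; the second applies a hand-checked table with str.translate.

```python
import unicodedata


def candidate_mappings(chars):
    """Off-line helper: suggest an ASCII replacement for each character.

    The suggestions come from NFKD decomposition with combining marks
    dropped. They are candidates to review by hand, not a final table.
    """
    candidates = {}
    for ch in chars:
        decomposed = unicodedata.normalize("NFKD", ch)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
        if ascii_only:
            candidates[ch] = ascii_only
    return candidates


# Hand-reviewed table (hypothetical, Czech letters only). This is where
# purpose-specific choices go, e.g. mapping u-with-umlaut to "ue" instead of "u".
CZECH_TO_ASCII = str.maketrans({
    "á": "a", "č": "c", "ď": "d", "é": "e", "ě": "e", "í": "i", "ň": "n",
    "ó": "o", "ř": "r", "š": "s", "ť": "t", "ú": "u", "ů": "u", "ý": "y",
    "ž": "z",
    "Á": "A", "Č": "C", "Ď": "D", "É": "E", "Ě": "E", "Í": "I", "Ň": "N",
    "Ó": "O", "Ř": "R", "Š": "S", "Ť": "T", "Ú": "U", "Ů": "U", "Ý": "Y",
    "Ž": "Z",
})


def to_ascii(text, table=CZECH_TO_ASCII):
    """Apply the hand-built table; unmapped characters pass through unchanged."""
    return text.translate(table)


s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
print(to_ascii(s))
```

Characters missing from the table are left as they are rather than silently dropped, so gaps in the table stay visible in the output.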