[EMAIL PROTECTED] wrote:
> danielx wrote:
>> [EMAIL PROTECTED] wrote:
>>> Here is my script:
>>>
>>> from mechanize import *
>>> from BeautifulSoup import *
>>> import StringIO
>>> b = Browser()
>>> f = b.open("http://www.translate.ru/text.asp?lang=ru")
>>> b.select_form(nr=0)
>>> b["source"] = "hello python"
>>> html = b.submit().get_data()
>>> soup = BeautifulSoup(html)
>>> print soup.find("span", id = "r_text").string
>>>
>>> OUTPUT:
>>> &#1087;&#1088;&#1080;&#1074;&#1077;&#1090;
>>> &#1087;&#1080;&#1090;&#1086;&#1085;
>>> ----------
>>> In russian it looks like:
>>> "привет питон"
>>>
>>> How can I translate this using standard Python libraries??
>>>
>>> --
>
> Thank you for response.
> It doesn't matter what is 'BeautifulSoup'...

However, the best solution is to ask BeautifulSoup to do that for you.
If you do

soup = BeautifulSoup(your_html_page, convertEntities="html")

you should not have to worry about the problem you had. This converts
all the HTML entities (the five you see as soup.entitydefs) and all the
"&#xxx;" stuff to their Python unicode strings.

yichun

> General question is:
>
> How can I convert encoded string
>
> sEncodedHtmlText = '&#1087;&#1088;&#1080;&#1074;&#1077;&#1090; &#1087;&#1080;&#1090;&#1086;&#1085;'
>
> into human readable:
>
> sDecodedHtmlText == 'привет питон'
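
For the general question of decoding "&#xxx;" references with only the standard library, here is a minimal sketch. It assumes Python 3.4 or later, where html.unescape is available; the thread itself uses Python 2 and BeautifulSoup 3, so this is a modern equivalent rather than what the posters ran. The helper decode_numeric_refs is a hypothetical name, included only to show what the conversion does for decimal references.

```python
import html
import re

# Sample input: decimal character references for the Russian text.
encoded = ("&#1087;&#1088;&#1080;&#1074;&#1077;&#1090; "
           "&#1087;&#1080;&#1090;&#1086;&#1085;")

# html.unescape converts numeric references (&#1087;, &#x43f;) and
# named ones (&amp;, &lt;, ...) into the characters they stand for.
decoded = html.unescape(encoded)
print(decoded)  # привет питон


def decode_numeric_refs(text):
    """Hypothetical helper: replace only decimal &#NNNN; references.

    Each match is turned into the character with that code point;
    named and hexadecimal references are left untouched.
    """
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


print(decode_numeric_refs(encoded) == decoded)  # True
```

In practice html.unescape alone should be enough; the regular-expression version only makes the code-point lookup explicit.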