[OT] Re: a unicode question?

2006-04-11 Thread Peter Otten
John Machin wrote: > ... and yes Peter, info also travels faster from China than it does > from Armenia :-()) Q: Can info travel faster from Armenia than from China? Radio Yerevan: In principle, yes. Just make sure that it doesn't go the other way round the globe or meet some friends on the way.

Re: a unicode question?

2006-04-10 Thread John Machin
E, it gets worse: not only is the title written in Chinese, it is encoded as gb2312 -- here is the repr() of the first few chunks: "\n\n\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) : \xc4\xda\xb2\xbf\xc8\xcb\xd4\xb1\xb3\xd6\xb9\xc9 - \xcb\xd1\xba\xfc\xb9\xc9\xc6\xb1\n\n" and here is wha
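
As a rough illustration of the gb2312 point, the leading bytes of that repr() can be decoded with Python's built-in gb2312 codec. This is only a sketch in Python 3 syntax (the thread itself used Python 2), and the variable names are made up for the example.

```python
# First bytes of the page title, copied from the repr() quoted above.
raw_title = b"\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028)"

# Decoding with the gb2312 codec turns the raw bytes into Chinese text.
title = raw_title.decode("gb2312")

# Eight bytes encode four Chinese characters; "(600028)" is plain ASCII.
print(title)
print(len(raw_title), len(title))
```

The byte string is longer than the decoded text because each Chinese character occupies two bytes in gb2312.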

Re: a unicode question?

2006-04-10 Thread Serge Orlov
[EMAIL PROTECTED] wrote: > Mr. John Machin > > This question comes from the following code. I use PyXML to build a DOM > tree. > > from xml.dom.ext.reader import HtmlLib > doc = > HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') > title_elem = doc.documentElement.getElem

Re: a unicode question?

2006-04-09 Thread zdwang
Mr. John Machin This question comes from the following code. I use PyXML to build a DOM tree. from xml.dom.ext.reader import HtmlLib doc = HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') title_elem = doc.documentElement.getElementsByTagName("TITLE")[0] title_string = t

Re: a unicode question?

2006-04-09 Thread zdwang
Mr. John Machin, Thank you very much! -- http://mail.python.org/mailman/listinfo/python-list

Re: a unicode question?

2006-04-09 Thread John Machin
What do you mean by "ansi string"? Here is a superficially not-unreasonable answer to your more specific question: # >>> s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s3 = s1.encode('latin1') # >>> s2 == s3 # True But what are you
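
The latin1 answer works because that codec maps each code point below 256 to the single byte with the same value. Here is a minimal sketch of the idea in Python 3 syntax rather than the Python 2 of the thread, so `str` plays the role of the old `unicode` type and `bytes` the role of the old `str`. The ASCII failure is only a guess at the exception the original poster saw, and the final gb2312 decode assumes the encoding named elsewhere in this thread.

```python
# Mis-decoded text: each character is really one raw byte of the page.
s1 = "\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) "
s2 = b"\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) "

# The ASCII codec cannot represent code points above 127; this is a
# likely (assumed) cause of the exception mentioned in the question.
try:
    s1.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# latin1 maps code points 0-255 straight to bytes 0-255, so this succeeds.
s3 = s1.encode("latin1")
print(ascii_failed, s2 == s3)

# Assumption: the bytes are gb2312, so decoding them gives the intended text.
fixed = s3.decode("gb2312")
print(fixed)
```

The latin1 step only recovers the original bytes; it is the later decode with the correct codec that produces readable text.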

a unicode question?

2006-04-09 Thread zdwang
Hello, I have a unicode string that I want to change to an ansi string, but it raises an exception. Could you help me? ## I want to change s1 to s2. s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '

Re: Once again a unicode question

2005-03-26 Thread Nicolas Evrard
* Serge Orlov [23:45 26/03/05 CET]: Nicolas Evrard wrote: Hello, I'm puzzled by this test I made while trying to transform an HTML page to plain text. I cannot send unicode to feed, nor str, so how can I do this? Seems like the parser is in a broken state after the first exception. Fe

Re: Once again a unicode question

2005-03-26 Thread Serge Orlov
Nicolas Evrard wrote: > Hello, > > I'm puzzled by this test I made while trying to transform an HTML page > to plain text. I cannot send unicode to feed, nor str, so > how can I do this? Seems like the parser is in a broken state after the first exception. Feed only binary strings to i

Once again a unicode question

2005-03-26 Thread Nicolas Evrard
Hello, I'm puzzled by this test I made while trying to transform an HTML page to plain text. I cannot send unicode to feed, nor str, so how can I do this? [EMAIL PROTECTED]:~$ python2.4 .Python 2.4.1c2 (#2, Mar 19 2005, 01:04:19) .[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2 .Type "help", "