[issue6611] HTMLParser cannot deal with mixture of arbitrary data and character reference

2009-08-01 Thread Liu DongMiao
Liu DongMiao liudongm...@gmail.com added the comment: I think this should not be considered a bug. Since we don't know the encoding of the str, we can't handle str and unicode together. In my example, the str is in UTF-8, so I need to convert the unicode to a str in UTF-8. I will take bones' suggestion.
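A minimal sketch of the conversion described in this comment, assuming the input is a UTF-8 byte string and only decimal character references need handling. The helper name `replace_charrefs_utf8` is hypothetical; it is not part of HTMLParser and is not necessarily what was suggested in the tracker.

```python
import re

try:
    unichr  # Python 2
except NameError:
    unichr = chr  # Python 3: chr already returns a text character

def replace_charrefs_utf8(raw):
    """Replace decimal references such as &#233; in a UTF-8 byte string.

    Hypothetical helper: each reference is turned into a text character
    and then encoded to UTF-8, so the result stays a byte string and is
    never mixed with unicode.
    """
    def repl(match):
        code = int(match.group(1))
        return unichr(code).encode("utf-8")
    return re.sub(br"&#([0-9]+);", repl, raw)

raw = u"caf&#233; \u4e2d".encode("utf-8")
result = replace_charrefs_utf8(raw)
```

Because the caller chooses the encoding explicitly, nothing has to guess what the byte string contains.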

[issue6611] HTMLParser cannot deal with mixture of arbitrary data and character reference

2009-07-31 Thread Liu DongMiao
New submission from Liu DongMiao liudongm...@gmail.com: HTMLParser (Python 2.6.2) cannot deal with a mixture of arbitrary data and character references. In lines 365-373, replaceEntities(s) returns unichr(charref) as unicode, which cannot be mixed with arbitrary data in a str. A fix way
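The failure mode can be sketched as follows. This is an illustration and not the HTMLParser code itself: in Python 2, concatenating a str with a unicode object implicitly decodes the str as ASCII, and the hypothetical function `mix_like_python2` makes that step explicit so the example behaves the same on Python 3.

```python
# Arbitrary non-ASCII data as a byte string (UTF-8 here, as an assumption).
data = u"\u4e2d".encode("utf-8")

# The text character that a reference such as &#233; resolves to.
ref = u"\xe9"

def mix_like_python2(raw_bytes, text):
    # Python 2 performs this ASCII decode implicitly for str + unicode.
    return raw_bytes.decode("ascii") + text

try:
    mix_like_python2(data, ref)
    failed = False
except UnicodeDecodeError:
    # Bytes outside the ASCII range cannot be decoded, so the mix fails.
    failed = True
```

Pure ASCII data passes through the same function without error, which is why the problem only shows up with arbitrary non-ASCII input.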