Johannes Bauer wrote:
Hello group,
I'm trying to use a class derived from htmllib.HTMLParser to parse a website
which I fetched via
httplib.HTTPConnection().request().getresponse().read(). Now the problem
is: as soon as I pass the htmllib.HTMLParser UTF-8 code, it chokes. The
code is something like this:
prs =
On Fri, 10 Oct 2008 00:13:36 +0200, Johannes Bauer wrote:
Terry Reedy wrote:
I believe you are confusing unicode with unicode encoded into bytes
with the UTF-8 encoding. You are having a problem feeding a unicode string,
not 'UTF-8 code', which in Python can only mean a UTF-8 encoded byte
string.
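The distinction drawn above can be illustrated with a small sketch. It uses Python 3 spelling, which is an assumption on my part: the thread predates it, and in Python 2 the two types are called unicode and str rather than str and bytes. The sample text is only for illustration.

```python
# Python 3 spelling: str is a Unicode string, bytes is an encoded byte string.
# (In Python 2, as used in this thread, these are unicode and str.)
text = "föö"                     # Unicode string: three code points
encoded = text.encode("utf-8")   # UTF-8 byte string: ö takes two bytes each

print(type(text).__name__, len(text))
print(type(encoded).__name__, len(encoded))

# Decoding with the same codec gives back the original Unicode string.
assert encoded.decode("utf-8") == text
```

A parser that expects one of the two kinds of string may well fail when handed the other, which is why it matters which one "website" actually is.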
On Thu, Oct 9, 2008 at 4:54 PM, Johannes Bauer wrote:
Hello group,
Now when I feed website directly to the parser, everything is fine.
However, I want to make some modifications before I parse it, namely UTF-8
modifications in the style:
website = website.replace(u"föö",
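One way to keep such a replacement consistent is to decode the fetched bytes once, do all modifications on the Unicode string, and only then feed the parser. The following is a minimal sketch under several assumptions: it is written for Python 3, where htmllib no longer exists, so html.parser.HTMLParser stands in for it; the class TextCollector, the function parse_website, the sample page, and the replacement text "bär" are all hypothetical (the second argument in the post above is cut off); and the page is assumed to be UTF-8.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical parser subclass that just collects text nodes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_website(raw_bytes, encoding="utf-8"):
    # Decode first, so every later step works on a Unicode string.
    website = raw_bytes.decode(encoding)
    # Unicode-to-Unicode replacement; "bär" is an illustrative placeholder.
    website = website.replace("föö", "bär")
    parser = TextCollector()
    parser.feed(website)
    parser.close()
    return "".join(parser.chunks)


# Stand-in for the bytes returned by the HTTP response's read().
raw = "<html><body><p>föö and more</p></body></html>".encode("utf-8")
print(parse_website(raw))
```

If the server declares a different charset in its Content-Type header, that encoding would have to be passed in place of the "utf-8" default.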