On Feb 10, 11:09 am, kj <no.em...@please.post> wrote:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0:
> ordinal not in range(128)
>

You'll have to understand some terminology first.

"codec" is a description of how to encode and decode unicode data to and from a stream of bytes.

"decode" means you are taking a series of bytes and converting it to unicode.

"encode" is the opposite: take a unicode string and convert it to a stream of bytes.

"ascii" is a codec that can only represent code points 0-127, using bytes 0-127. "utf-8", "utf-16", etc. are other codecs. There are a lot of them. Only some of them (e.g., utf-8, utf-16) can encode all of unicode. Most (e.g., ascii) can only handle a subset of unicode.

In this case, you've fed the decoder a stream of bytes whose first byte is 0xc2 (194). Since the decoder thinks it's working with ascii, it doesn't know what to do with any byte above 127.

There are a number of ways to fix this:

(1) Feed it unicode instead, so it doesn't try to decode it.

(2) Tell it what encoding you are using, because it's obviously not ascii.

>
> FWIW, I'm using Python 2.6.  The example above happens to come from
> a script that extracts data from HTML files, which are all in
> English, but they are a daily occurrence when I write code to
> process non-English text.  The script uses Beautiful Soup.  I won't
> post a lot of code because, as I said, what I'm after is not so
> much a way around this specific error as much as the tools and
> techniques to troubleshoot it and fix it on my own.  But to ground
> the problem a bit I'll say that the exception above happens during
> the execution of a statement of the form:
>
>   x = '%s %s' % (y, z)
>
> Also, I found that, with the exact same values y and z as above,
> all of the following statements work perfectly fine:
>
>   x = '%s' % y
>   x = '%s' % z
>   print y
>   print z
>   print y, z
>

What are y and z? Are they unicode objects or byte strings (str)? What are their values? Checking type(y), type(z), repr(y) and repr(z) will tell you.

It sounds like someone, probably Beautiful Soup, is trying to turn your byte strings into unicode. A full stack trace would be useful to see who did what where.
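
To make the two fixes concrete, here is a minimal sketch. The sample bytes and the helper name try_decode are made up for illustration; I'm assuming the data is really utf-8, which the 0xc2 lead byte suggests but doesn't prove. My guess is that in your '%s %s' % (y, z) case one value is a byte string and the other is unicode, so Python 2 quietly decodes the byte string with ascii, but that's only a guess until we see the types.

```python
# Hypothetical sample data: the utf-8 encoding of U+00A0 (no-break space),
# which turns up a lot in HTML. Its first byte is 0xc2, as in the traceback.
raw = b'\xc2\xa0'


def try_decode(data, encoding):
    """Return (text, None) on success, or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding as ascii fails, because 0xc2 is outside the range 0-127.
_, ascii_err = try_decode(raw, 'ascii')

# Fix (2): name the real encoding (assumed here to be utf-8).
decoded, utf8_err = try_decode(raw, 'utf-8')

# Fix (1): once every operand is unicode, formatting never has to
# fall back on an implicit ascii decode.
joined = u'%s %s' % (decoded, u'price')
```

The idea is to decode bytes to unicode once, at the point where they enter your program, with an explicit encoding, and to encode again only on the way out. Then the implicit ascii decode never gets a chance to run.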