> (1) what is produced on Anjanesh's machine

>>> sys.getdefaultencoding()
'utf-8'
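
A side note on the value above: in Python 3, `sys.getdefaultencoding()` reports the codec used for implicit `str`/`bytes` conversions, while `open()` in text mode without an `encoding` argument falls back to the locale's preferred encoding. The two can differ, so a minimal sketch for checking both on the machine in question might look like this (the variable names are only illustrative):

```python
import locale
import sys

# Codec used for implicit str <-> bytes conversions; 'utf-8' on Python 3.
default_codec = sys.getdefaultencoding()

# Codec that open() falls back to in text mode when no encoding is passed.
# This depends on the platform and locale settings.
locale_codec = locale.getpreferredencoding(False)

print('sys.getdefaultencoding():', default_codec)
print('locale.getpreferredencoding(False):', locale_codec)
```

If the second value is not UTF-8, files opened without an explicit `encoding` are decoded with that locale codec instead.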

> (2) it looks like a small snippet from a Python source file!

It's a file containing just JSON data, but it has some unicode characters as well, since it has data from the web.

> Anjanesh, Is it a .py file

It's a .json file. I have a bunch of these json files which I'm parsing using the json library.

> Instead of "something like", please report exactly what is there:
>
> print(ascii(open('the_file', 'rb').read()[10442-20:10442+21]))

>>> print(ascii(open('the_file', 'rb').read()[10442-20:10442+21]))
b'":42,"query":"0 1\xc2\xbb\xc3\x9d \\u2021 0\\u201a0 \\u2'

> Trouble with cases like this is as soon as they become interesting, the OP
> often snatches somebody's one-liner that "works" (i.e. doesn't raise an
> exception), makes a quick break for the county line, and they're not seen
> again :-)

Actually, I moved the files to my Ubuntu PC, which has Python 2.5.2, and it didn't give the encoding issue. I just couldn't spend that much time on why a couple of these files had encoding issues in Py3, since I had to parse a whole lot of files.

--
http://mail.python.org/mailman/listinfo/python-list
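
For anyone hitting the same problem, the reported excerpt can be examined directly. The sketch below only uses the 41 bytes quoted in this message; it shows that the byte at offset 10442 is `0x9d`, that the non-ASCII bytes are well-formed UTF-8, and that Python's `cp1252` codec cannot decode them. Whether cp1252 was actually the codec in use on the original machine is an assumption, and `load_json_utf8` is a hypothetical helper, not something from the thread.

```python
import json

# Exact bytes reported in this message: 41 bytes centred on file offset 10442.
excerpt = b'":42,"query":"0 1\xc2\xbb\xc3\x9d \\u2021 0\\u201a0 \\u2'

# The slice starts at 10442 - 20, so index 20 is the byte at offset 10442.
byte_at_10442 = excerpt[20]
print(hex(byte_at_10442))

# The four non-ASCII bytes form two well-formed UTF-8 sequences.
decoded_pair = excerpt[17:21].decode('utf-8')
print(ascii(decoded_pair))

# 0x9d has no mapping in Python's cp1252 codec, so decoding with it fails.
try:
    excerpt.decode('cp1252')
    cp1252_ok = True
except UnicodeDecodeError:
    cp1252_ok = False
print('decodes as cp1252:', cp1252_ok)


def load_json_utf8(path):
    """Hypothetical helper: parse a JSON file, always decoding it as UTF-8."""
    with open(path, encoding='utf-8') as handle:
        return json.load(handle)
```

Passing `encoding='utf-8'` explicitly makes the result independent of the locale, which is one plausible reason the same files could parse on one machine and fail on another.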