Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-08-04 Thread Yateen
Hi Bill, thanks for the valuable input. I found a better solution, and I believe it is the simplest one. Better still, the solution is on the application side and not on the Django side. What I did was this: when my parser starts reading data from files (for which I don't know the encoding), it first
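
The archived message is cut off before it says what the parser does first, so the following is not necessarily Yateen's solution. It is only a minimal sketch of one common application-side approach, assuming the parser reads raw bytes and tries a short list of candidate encodings in order. The function name `decode_unknown` and the candidate list are hypothetical.

```python
def decode_unknown(raw, candidates=("utf-8", "cp1252")):
    """Decode bytes of unknown encoding; return (text, encoding_used).

    Hypothetical helper: the candidate list is an assumption, not
    something stated in the thread.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    # latin-1 maps every possible byte to a code point, so it never fails.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_unknown(b"caf\xe9")
print(repr(text), used)
```

Trying UTF-8 first matters because it is strict: bytes from another encoding rarely decode as valid UTF-8 by accident, whereas single-byte encodings accept almost anything.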

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-23 Thread Bill Freeman
I don't really have enough context (or, at the moment, time) to do a serious review. It may well be that you are safe. iri_to_uri() looks like the key, since you will almost certainly trip over a value that isn't eligible as a URL (clients cut and paste from MS Word and equivalent all the time).
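
To illustrate what converting an IRI to a URI involves, here is a rough sketch using only the Python standard library rather than Django's own `iri_to_uri()`. The helper name `to_ascii_uri`, the sample URL, and the exact set of characters left unescaped are assumptions made for this example; Django's function may differ in detail.

```python
from urllib.parse import quote


def to_ascii_uri(iri):
    """Percent-encode non-ASCII characters as UTF-8 bytes.

    Hypothetical stand-in for illustration only. The 'safe' set is an
    assumption: it leaves URL punctuation and existing % escapes alone.
    """
    return quote(iri, safe="/#%[]=:;$&()+,!?*@'~")


# A pasted URL containing accented characters (made-up sample).
uri = to_ascii_uri("http://example.com/caf\u00e9?q=na\u00efve")
print(uri)
```

The result contains only ASCII characters, so it can be stored and later decoded with any ASCII-compatible codec without raising a decode error.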

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-22 Thread Yateen
OK, I made some changes and things appear to be working. My intention was to receive URLs, parse them to get the base URL, put them in the database (Postgres), and then, via an HTTP query through the Django interface (using psycopg2), retrieve these URLs and display them to the user in the browser in a t

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-06 Thread Yateen
Hi Bill, Thanks. You were right. The Postgres encoding and Django encoding are different. The parser is Python. Postgres encoding was SQL_ASCII. I changed it to UTF8, and the parser failed to insert in DB!! I believe I need to fix it first.

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-06 Thread Bill Freeman
I doubt that you can fault Postgres. It doesn't need to care about the encoding of contents, other than it must find the end of the string (and the conventions used may depend on the interface and connection settings). When you say contents of a url, do you mean the url itself, or the page referr

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-05 Thread Yateen
Thanks Bill. Do you mean that Postgres should also have thrown errors? My worry is different here. The characters that I am getting are valid contents of an HTTP URL, and my parser is able to parse them and put them in the database. However, the Django interface is not able to read them. If I am required to a

Re: UnicodeDecodeError: 'utf8' codec can't decode

2010-07-01 Thread Bill Freeman
(most recent call last):
>  File "", line 1, in
>  File "/firstpool/yjoshi/permanent/starbi/python2.6.1/lib/python2.6/encodings/utf_8.py", line 16, in decode
>    return codecs.utf_8_decode(input, errors, True)
> UnicodeDecodeError: 'utf8' codec can't deco

UnicodeDecodeError: 'utf8' codec can't decode

2010-07-01 Thread Yateen
ib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 7-10: invalid data

Can anyone please shed some light on this? Why is this occurring? What is the solution?
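
For readers who want to see this class of error in isolation, the sketch below reproduces it with made-up sample bytes (not the data from this thread). It assumes a current Python 3 interpreter, where the message wording differs slightly from the Python 2.6 traceback above.

```python
# Bytes that are valid Latin-1 but not valid UTF-8 (hypothetical sample).
raw = b"caf\xe9 menu"

error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # The exception records which byte range could not be decoded.
    error = exc
    print("cannot decode, starting at byte offset", exc.start)

# Decoding with the encoding that actually produced the bytes succeeds.
text = raw.decode("latin-1")
print(text)
```

The error means the bytes were written in some encoding other than UTF-8, so the fix is to decode them with the encoding that produced them, or to normalize the data to UTF-8 before it is stored.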