Re: catch UnicodeDecodeError

2012-08-29 Thread Robert Miles
On 7/26/2012 5:51 AM, Jaroslav Dobrek wrote: And the cool thing is: you can! :) In Python 2.6 and later, the new Py3 open() function is a bit more hidden, but it's still available: from io import open filename = "somefile.txt" try: with open(filename, encoding="utf-8")

Re: catch UnicodeDecodeError

2012-07-26 Thread Philipp Hagemeister
On 07/26/2012 02:24 PM, Stefan Behnel wrote: > Read again: "*code* line". The OP was apparently failing to see that > the error did not originate in the source code lines that he had > wrapped with a try-except statement but somewhere else, thus leading to the misguided impression that the exceptio

Re: catch UnicodeDecodeError

2012-07-26 Thread Stefan Behnel
Philipp Hagemeister, 26.07.2012 14:17: > On 07/26/2012 01:15 PM, Stefan Behnel wrote: >>> exits with a UnicodeDecodeError. >> ... that tells you the exact code line where the error occurred. > > Which property of a UnicodeDecodeError does include that information? > > On cPython 2.7 and 3.2, I se

Re: catch UnicodeDecodeError

2012-07-26 Thread Philipp Hagemeister
On 07/26/2012 01:15 PM, Stefan Behnel wrote: >> exits with a UnicodeDecodeError. > ... that tells you the exact code line where the error occurred. Which property of a UnicodeDecodeError does include that information? On cPython 2.7 and 3.2, I see only start and end, both of which refer to the nu
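
For reference, a minimal sketch of what the exception object itself carries. The helper name describe_decode_error and the sample bytes are made up for illustration; encoding, object, start, end and reason are the standard attributes of UnicodeDecodeError. Note that start and end are offsets into the input bytes, not source-code line numbers; the failing code line comes from the traceback.

```python
def describe_decode_error(data, encoding="utf-8"):
    """Return details of the first decode failure in data, or None if it decodes."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return {
            "encoding": exc.encoding,   # codec that was in use
            "start": exc.start,         # offset of the first bad byte
            "end": exc.end,             # offset just past the bad bytes
            "reason": exc.reason,       # human-readable explanation
            "bad_bytes": exc.object[exc.start:exc.end],
        }
    return None


# Hypothetical sample input: three ASCII bytes, one invalid byte, three more.
info = describe_decode_error(b"abc\xffdef")
print(info)
```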

Re: catch UnicodeDecodeError

2012-07-26 Thread jaroslav . dobrek
> that tells you the exact code line where the error occurred. No need to > look around. You are right: try: for line in f: do_something() except UnicodeDecodeError: do_something_different() does exactly what one would expect it to do. Thank you very much for pointing this out
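
A self-contained sketch of the pattern confirmed above, assuming Python 3 (or 2.6+ with io.open). The names count_lines and _write_temp and the sample file contents are hypothetical. The point is that the try block has to enclose the for loop, because that is where the text is decoded.

```python
import os
import tempfile
from io import open  # same as the builtin on Python 3


def count_lines(path, encoding="utf-8"):
    """Return the number of lines, or None if the file does not decode."""
    try:
        with open(path, encoding=encoding) as f:
            count = 0
            for _line in f:  # text is decoded implicitly here
                count += 1
            return count
    except UnicodeDecodeError:
        return None


def _write_temp(data):
    """Write raw bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


good = _write_temp(b"one\ntwo\n")
bad = _write_temp(b"one\n\xff\xff\n")  # \xff is never valid in UTF-8
good_result = count_lines(good)
bad_result = count_lines(bad)
os.remove(good)
os.remove(bad)
print(good_result, bad_result)
```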

Re: catch UnicodeDecodeError

2012-07-26 Thread Stefan Behnel
Jaroslav Dobrek, 26.07.2012 12:51: >>> try: >>> for line in f: # here text is decoded implicitly >>>do_something() >>> except UnicodeDecodeError(): >>> do_something_different() > > the code above (without the brackets) is semantically bad: The > exception is not caught. Sure it is

Re: catch UnicodeDecodeError

2012-07-26 Thread Jaroslav Dobrek
On Jul 26, 12:19 pm, wxjmfa...@gmail.com wrote: > On Thursday, July 26, 2012 9:46:27 AM UTC+2, Jaroslav Dobrek wrote: > > On Jul 25, 8:50 pm, Dave Angel wrote: > > > On 07/25/2012 08:09 AM, jaroslav.dob...@gmail.com wrote:

Re: catch UnicodeDecodeError

2012-07-26 Thread Jaroslav Dobrek
> And the cool thing is: you can! :) > > In Python 2.6 and later, the new Py3 open() function is a bit more hidden, > but it's still available: > >     from io import open > >     filename = "somefile.txt" >     try: >         with open(filename, encoding="utf-8") as f: >             for line in f:

Re: catch UnicodeDecodeError

2012-07-26 Thread wxjmfauth
On Thursday, July 26, 2012 9:46:27 AM UTC+2, Jaroslav Dobrek wrote: > On Jul 25, 8:50 pm, Dave Angel wrote: > > On 07/25/2012 08:09 AM, jaroslav.dob...@gmail.com wrote: > > > On Wednesday, July 25, 2012 1:35:09 PM UTC+2, Philipp Hagemeister > w

Re: catch UnicodeDecodeError

2012-07-26 Thread Chris Angelico
On Thu, Jul 26, 2012 at 5:46 PM, Jaroslav Dobrek wrote: > My problem is solved. What I need to do is explicitly decode text when > reading it. Then I can catch exceptions. I might do this in future > programs. Apologies if it's already been said (I'm only skimming this thread), but ISTM that you

Re: catch UnicodeDecodeError

2012-07-26 Thread Stefan Behnel
Jaroslav Dobrek, 26.07.2012 09:46: > My problem is solved. What I need to do is explicitly decode text when > reading it. Then I can catch exceptions. I might do this in future > programs. Yes, that's the standard procedure. Decode on the way in, encode on the way out, use Unicode everywhere in be
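
A small sketch of the "decode on the way in, encode on the way out" rule, assuming Python 3. The function name shout and its parameters are invented for illustration; only bytes.decode, str.upper and str.encode are real calls.

```python
def shout(raw_in, in_encoding="utf-8", out_encoding="utf-8"):
    """Upper-case some encoded text, working on Unicode in between."""
    text = raw_in.decode(in_encoding)    # decode on the way in
    result = text.upper()                # all processing happens on Unicode text
    return result.encode(out_encoding)   # encode on the way out


# Hypothetical input: Latin-1 bytes in, UTF-8 bytes out.
latin1_bytes = "h\u00e9llo".encode("latin-1")
utf8_bytes = shout(latin1_bytes, in_encoding="latin-1")
print(utf8_bytes)
```

Because the decode step is an explicit call, a UnicodeDecodeError raised there can be caught right at the boundary.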

Re: catch UnicodeDecodeError

2012-07-26 Thread Jaroslav Dobrek
On Jul 25, 8:50 pm, Dave Angel wrote: > On 07/25/2012 08:09 AM, jaroslav.dob...@gmail.com wrote: > > On Wednesday, July 25, 2012 1:35:09 PM UTC+2, Philipp Hagemeister wrote: > >> Hi Jaroslav, > > >> you can catch a UnicodeDecodeError just like any other exception. Can > >> you pr

Re: catch UnicodeDecodeError

2012-07-25 Thread Dave Angel
On 07/25/2012 08:09 AM, jaroslav.dob...@gmail.com wrote: > On Wednesday, July 25, 2012 1:35:09 PM UTC+2, Philipp Hagemeister wrote: >> Hi Jaroslav, >> >> you can catch a UnicodeDecodeError just like any other exception. Can >> you provide a full example program that shows your problem? >> >> This w

Re: catch UnicodeDecodeError

2012-07-25 Thread jaroslav . dobrek
On Wednesday, July 25, 2012 1:35:09 PM UTC+2, Philipp Hagemeister wrote: > Hi Jaroslav, > > you can catch a UnicodeDecodeError just like any other exception. Can > you provide a full example program that shows your problem? > > This works fine on my system: > > > import sys > open('tmp', 'wb').

Re: catch UnicodeDecodeError

2012-07-25 Thread Philipp Hagemeister
Hi Jaroslav, you can catch a UnicodeDecodeError just like any other exception. Can you provide a full example program that shows your problem? This works fine on my system: import sys open('tmp', 'wb').write(b'\xff\xff') try: buf = open('tmp', 'rb').read() buf.decode('utf-8') except Uni

Re: catch UnicodeDecodeError

2012-07-25 Thread Andrew Berg
On 7/25/2012 6:05 AM, jaroslav.dob...@gmail.com wrote: > What I really want to do is use something like > > try: > # open file, read line, or do something else, I don't care > except UnicodeDecodeError: > sys.exit("Found a bad char in file " + file + " line " + str(line_number) > > Yet, n
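
One possible way to get the file name and line number into the message, sketched under the assumption that the files are meant to be UTF-8 (where the newline byte never occurs inside a multi-byte sequence, so splitting into lines before decoding is safe). find_bad_line is a hypothetical helper, not something posted in the thread.

```python
import os
import tempfile


def find_bad_line(path, encoding="utf-8"):
    """Return the 1-based number of the first undecodable line, or None."""
    with open(path, "rb") as f:  # read raw bytes, decode each line explicitly
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                return line_number
    return None


# Hypothetical sample file whose second line contains an invalid byte.
fd, sample = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"good\nbad \xff\nfine\n")

bad_line = find_bad_line(sample)
os.remove(sample)
if bad_line is not None:
    print("Found a bad char in file " + sample + " line " + str(bad_line))
```

A caller that wants the original behaviour can pass the same message to sys.exit() instead of print().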

catch UnicodeDecodeError

2012-07-25 Thread jaroslav . dobrek
Hello, very often I have the following problem: I write a program that processes many files which it assumes to be encoded in utf-8. Then, some day, there is a non-utf-8 character in one of several hundred or thousand (new) files. The program exits with an error message like this: UnicodeDec