Re: [Tutor] Assistance with UnicodeDecodeError
Actually, it's more likely that the char you are grabbing is UTF-16, not UTF-8, which is moving into the double byte... *

* An assumption based on the following output:

>>> u = u'\u2014'
>>> s = u.encode("utf-16")
>>> print(s)
■¶
>>> s = u.encode("utf-32")
>>> print(s)
■ ¶
>>> s = u.encode("utf-16LE")
>>> print(s)
¶
>>> s = u.encode("utf-16BE")
>>> print(s)
¶

See https://en.wikipedia.org/wiki/Character_encoding to help with the understanding of character encoding, code pages and why they are important.

James
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
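For anyone following along on Python 3, here is a minimal sketch of what those encodings produce for U+2014. The helper name hex_bytes is made up for illustration; str.encode and bytes.hex are standard. Looking at the raw bytes as hex avoids the console turning them into symbols like the ones in the session above.

```python
# U+2014 is the EM DASH from the original error message.
dash = "\u2014"


def hex_bytes(text, encoding):
    """Encode text and return the resulting bytes as a hex string."""
    return text.encode(encoding).hex()


# The -le / -be codec variants fix the byte order and write no byte order mark.
utf8 = hex_bytes(dash, "utf-8")
utf16le = hex_bytes(dash, "utf-16-le")
utf16be = hex_bytes(dash, "utf-16-be")
utf32le = hex_bytes(dash, "utf-32-le")

for name, value in [("utf-8", utf8), ("utf-16-le", utf16le),
                    ("utf-16-be", utf16be), ("utf-32-le", utf32le)]:
    print(name, value)
```

The plain "utf-16" and "utf-32" codecs additionally prepend a byte order mark, which is most likely where the leading ■ in the console output comes from.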
Re: [Tutor] Assistance with UnicodeDecodeError
> I am trying to scrape text from a website using Python 2.7 in Windows 8 and
> I am getting this error *"**UnicodeDecodeError: 'charmap codec can't encode
> character u'\u2014 in position 11231 character maps to <undefined>"*

For starters, move away from Python 2 unless you have a good reason to use it. Unicode is built into Python 3, whereas it's an afterthought in Python 2.

What's happening is that Python doesn't understand the character set in use, so it's throwing the exception. You need to tell Python what encoding to use (not all websites are "utf-8").

Code example (using Python 2.7):

>>> u = u'\u2014'
>>> print(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2014' in position 0: character maps to <undefined>
>>> s = u.encode("utf-8")
>>> print(s)
ÔÇö

I also strongly suggest you read: https://docs.python.org/2/howto/unicode.html

There is much cursing to come. Unicode, and especially multi-byte character string processing, is a nightmare!

Good luck ;-)

James
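To make "tell Python what encoding to use" concrete, here is a rough Python 3 sketch. It assumes the page really is UTF-8 and that the console uses cp850, as in the traceback above; neither is known for the OP's actual site or machine. The function names decode_page and safe_for_console are invented for this example, and the raw bytes are a stand-in for whatever the scraper downloads.

```python
def decode_page(raw_bytes, declared_encoding="utf-8"):
    """Decode downloaded bytes, replacing undecodable bytes instead of raising."""
    return raw_bytes.decode(declared_encoding, errors="replace")


def safe_for_console(text, console_encoding="cp850"):
    """Substitute '?' for characters the console code page cannot represent."""
    return text.encode(console_encoding, errors="replace").decode(console_encoding)


# Stand-in for bytes fetched from a website (assumed to be UTF-8).
raw = "price \u2014 10".encode("utf-8")

page = decode_page(raw)             # proper text, em dash intact
printable = safe_for_console(page)  # em dash swapped for '?'
print(printable)

# Reading the UTF-8 bytes of the dash as cp850 gives the three odd
# characters printed at the end of the Python 2.7 session above.
mojibake = raw[6:9].decode("cp850")
```

The general idea is to decode once on input with the encoding the site declares, work with text internally, and only deal with the console's limits at the point of printing.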
Re: [Tutor] Assistance with UnicodeDecodeError
On 02/02/2015 02:52 AM, Cristian Di Stefano wrote:
> Hi Dave,
>
> you should set the correct encoding (maybe utf-8) in order to handle data from the web.
>
> You cannot handle Unicode data with a simple string; you should encode to ASCII or manage the data with the unicode type.
>
> Best
>
> Cristian

Please don't top-post, as it confuses who wrote what part and in what sequence. But I can see you're already confused, as you're addressing me when replying to J Mberia.

In any case, one cannot encode to ASCII, so you have to be much more explicit in what you're trying to say. Or just wait till the OP clarifies his own code.

> On 31/01/2015 23:44, Dave Angel wrote:
>> On 01/31/2015 08:37 AM, J Mberia wrote:
>>> Hi,

--
DaveA
Re: [Tutor] Assistance with UnicodeDecodeError
Hi Dave,

you should set the correct encoding (maybe utf-8) in order to handle data from the web.

You cannot handle Unicode data with a simple string; you should encode to ASCII or manage the data with the unicode type.

Best

Cristian

On 31/01/2015 23:44, Dave Angel wrote:
> On 01/31/2015 08:37 AM, J Mberia wrote:
>> Hi,
>
> Welcome to Python tutor. Thanks for posting using text email, and for specifying both your Python version and operating system.
>
>> I am teaching myself programming in Python and need assistance with a UnicodeDecodeError.
>>
>> I am trying to scrape text from a website using Python 2.7 in Windows 8 and I am getting this error
>> *"**UnicodeDecodeError: 'charmap codec can't encode character u'\u2014 in position 11231 character maps to <undefined>"*
>>
>> *How do I resolve this? Please assist.*
>
> You can start by posting the whole error message, including the stack trace. Then you probably should include an appropriate segment of your code.
>
> The message means that you've got some invalid characters that you're trying to convert. That can mean either that the data is invalid or that you're specifying the wrong encoding, directly or implicitly.
Re: [Tutor] Assistance with UnicodeDecodeError
On 01/31/2015 08:37 AM, J Mberia wrote:
> Hi,

Welcome to Python tutor. Thanks for posting using text email, and for specifying both your Python version and operating system.

> I am teaching myself programming in Python and need assistance with a UnicodeDecodeError.
>
> I am trying to scrape text from a website using Python 2.7 in Windows 8 and I am getting this error
> *"**UnicodeDecodeError: 'charmap codec can't encode character u'\u2014 in position 11231 character maps to <undefined>"*
>
> *How do I resolve this? Please assist.*

You can start by posting the whole error message, including the stack trace. Then you probably should include an appropriate segment of your code.

The message means that you've got some invalid characters that you're trying to convert. That can mean either that the data is invalid or that you're specifying the wrong encoding, directly or implicitly.

--
DaveA
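As a rough illustration of the kind of detail worth posting, here is a Python 3 sketch that reproduces a failure like the one reported and collects the full stack trace along with the offending character. The function name describe_failure and the sample text are hypothetical, and cp850 is only an assumed stand-in for whatever encoding is being applied implicitly in the OP's code.

```python
import traceback


def describe_failure(text, encoding):
    """Return None if text encodes cleanly, else a dict describing the failure."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        return {
            "kind": type(exc).__name__,
            "position": exc.start,                        # index of the bad character
            "character": exc.object[exc.start:exc.end],   # the character itself
            "reason": exc.reason,
            "traceback": traceback.format_exc(),          # full stack trace as text
        }
    return None


# Hypothetical sample text containing the em dash from the report.
report = describe_failure("a \u2014 b", "cp850")  # fails: no em dash in cp850
ok = describe_failure("a \u2014 b", "utf-8")      # succeeds: UTF-8 covers it

if report is not None:
    print(report["traceback"])
```

Note that the exception raised here is a UnicodeEncodeError, which matches the "can't encode" wording in the reported message, so the failure is probably happening on output rather than while reading the page.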
[Tutor] Assistance with UnicodeDecodeError
Hi,

I am teaching myself programming in Python and need assistance with a UnicodeDecodeError.

I am trying to scrape text from a website using Python 2.7 in Windows 8 and I am getting this error:

*"**UnicodeDecodeError: 'charmap codec can't encode character u'\u2014 in position 11231 character maps to <undefined>"*

*How do I resolve this? Please assist.*

*Jerry*