Re: Newbie question about text encoding

Dave Angel Tue, 24 Feb 2015 03:28:02 -0800

On 02/24/2015 05:49 AM, pierrick.brih...@gmail.com wrote:

Hello,


Working with pyshp, this is my code :

What version of Python, what version of pyshp, from where, and what OS? These are the first things to supply in any query that goes outside of the standard library.

For example, you might be running CPython 3.2.1 on Ubuntu 14.04.1, and have installed pyshp 1.2.1 from https://pypi.python.org/pypi/pyshp


Or some other combination.
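
A quick way to collect those details is a few lines like the following. This is only a sketch: the function name environment_report is made up for illustration, and it assumes the pyshp module (imported as shapefile) exposes a __version__ attribute, falling back to "unknown" if it does not.

```python
import platform
import sys

def environment_report():
    """Collect interpreter, OS and pyshp version details as a list of lines."""
    lines = [
        "Python: " + sys.version.split()[0],
        "Implementation: " + platform.python_implementation(),
        "OS: " + platform.platform(),
    ]
    try:
        import shapefile  # pyshp installs a module named "shapefile"
        # Assumption: the module has a __version__ attribute; fall back if not.
        lines.append("pyshp: " + str(getattr(shapefile, "__version__", "unknown")))
    except ImportError:
        lines.append("pyshp: not installed")
    return lines

print("\n".join(environment_report()))
```

Pasting that output into the question, along with where pyshp was installed from, saves a round trip.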


import shapefile

inFile = shapefile.Reader("blah")

for sr in inFile.shapeRecords():
     rec = sr.record[2]
     print("Output : ", rec, type(rec))

Output:  hippodrome du resto <class 'str'>
Output:  b'stade de man\xe9 braz' <class 'bytes'>

Why do I get 2 different types ?
How to get a string object when I have accented characters ?

Thank you,

p.b.

From my (cursory) reading of the pyshp docs on the above page, I cannot see what the [2] element of the record list should look like. So I'd have to guess.

The bytes object is presumably an encoded version of the character string. I don't see anything on that page about unicode, or decode, so you might have to guess the encoding. Anyway, you can decode the bytestring into a regular string if you can correctly guess the encoding method, such as utf-8.


If that were the right decoding, you could just use
    mystring = rec.decode()

But utf-8 does not seem to be the right encoding for that bytestring (a \xe9 byte followed by a space is not valid utf-8). So you'll need a form like:

    mystring = rec.decode(encoding='xxx')

for some value of xxx.
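
As a rough sketch of how to narrow down xxx, you can try a few candidate encodings on the bytestring from the question and look at the results. The candidate list here is my guess, not something taken from the pyshp docs, and the helper names try_decodings and to_text are made up for illustration.

```python
raw = b'stade de man\xe9 braz'  # the bytes value shown in the question

# Candidate encodings to try; this list is an assumption, not from pyshp.
candidates = ['utf-8', 'latin-1', 'cp1252']

def try_decodings(data, encodings):
    """Map each encoding name to the decoded text, or None if decoding fails."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

def to_text(value, encoding='latin-1'):
    """Return str values unchanged; decode bytes with the (guessed) encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

decoded = try_decodings(raw, candidates)
for name in candidates:
    print(name, '->', decoded[name])
```

Here utf-8 fails, while latin-1 and cp1252 both turn \xe9 into an accented e, which reads as plausible French. Note that latin-1 never raises an error for any byte, so a successful decode does not prove the guess is right; you still have to look at the text. Once you have settled on an encoding, something like to_text(sr.record[2]) gives you a str whether pyshp handed back str or bytes.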


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list
