Truncating trailing null bytes in 'S' breaks decoding for encodings that need them
>>> a = np.array([si.encode('utf-16LE') for si in ['Õsc', 'zxc']], dtype='S')
>>> a
array([b'\xd5\x00s\x00c', b'z\x00x\x00c'],
      dtype='|S6')
>>> [ai.decode('utf-16LE') for ai in a]
Traceback (most recent call last):
  File "<pyshell#118>", line 1, in <module>
    [ai.decode('utf-16LE') for ai in a]
  File "<pyshell#118>", line 1, in <listcomp>
    [ai.decode('utf-16LE') for ai in a]
  File "C:\Programs\Python33\lib\encodings\utf_16_le.py", line 16, in decode
    return codecs.utf_16_le_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x63 in position 4: truncated data
Messy workaround (length-1 array slices, in contrast to scalars, are not truncated in `tostring`):
>>> [a[i:i+1].tostring().decode('utf-16LE') for i in range(len(a))]
['Õsc', 'zxc']
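
A slightly less messy variant of the same idea, as a rough sketch: take the raw buffer of the whole array once and cut it into `itemsize` chunks, so no element ever passes through the truncating scalar conversion. The helper name `decode_fixed_width` is made up for illustration and is not a NumPy function. It uses `tobytes`, the newer spelling of `tostring`, and assumes a 1-D 'S' array whose strings do not themselves end in NUL characters.

```python
import numpy as np


def decode_fixed_width(arr, encoding):
    """Decode each element of a 1-D 'S' array from its full buffer.

    Hypothetical helper, not part of NumPy.
    """
    raw = arr.tobytes()        # raw bytes of all elements, trailing nulls kept
    n = arr.dtype.itemsize     # number of bytes stored per element
    chunks = [raw[i:i + n] for i in range(0, len(raw), n)]
    # Padding bytes decode to NUL characters, so strip them after decoding,
    # not before (stripping bytes first is what breaks utf-16).
    return [c.decode(encoding).rstrip('\x00') for c in chunks]


a = np.array([s.encode('utf-16-le') for s in ['Õsc', 'zxc']], dtype='S')

# The scalar has lost its last byte, the dtype has not.
print(len(a[0]), a.dtype.itemsize)
print(decode_fixed_width(a, 'utf-16-le'))
```

Stripping after decoding also handles shorter strings that are padded out to the common item size.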
Found while playing with examples in the other thread.
Josef
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion