Sun, 25 Jul 2010 10:17:53 -0400, Thomas Robitaille wrote:
> The following example illustrates a problem I'm encountering with the
> np.fromstring function in Python 3:
>
> Python 3.1.2 (r312:79360M, Mar 24 2010, 01:33:18)
> [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> string = "".join(chr(i) for i in range(256))
> >>> a = np.fromstring(string, dtype=np.int8)
> >>> print(len(string))
> 256
> >>> print(len(a))
> 384
>
> The array 'a' should have the same size as 'string' since I'm using a
> 1-byte datatype. Is this a bug, or do I need to change the way I use
> this function in Python 3?

That's a bug. np.fromstring apparently encodes the Unicode string you pass
in to UTF-8 implicitly, instead of trying to encode it as ASCII and failing,
as it does on Python 2:

    >>> np.fromstring("\xe4".decode('latin1'), dtype=np.int8)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)

You probably meant to use byte strings, though:

    string = b"".join(chr(i).encode('latin1') for i in range(256))

--
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
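
The 384 in the original report is consistent with an implicit UTF-8 encoding: the 128 code points below U+0080 take one byte each and the 128 from U+0080 to U+00FF take two, so 128 + 2*128 = 384. The following is a minimal sketch of that arithmetic and of the byte-string approach suggested above. It is not taken from the thread: the variable names are illustrative, and it assumes a NumPy that provides np.frombuffer, used here in place of np.fromstring because binary-mode np.fromstring is deprecated in later NumPy releases.

```python
import numpy as np

# The same 256 code points as in the report, U+0000 through U+00FF, as text.
text = "".join(chr(i) for i in range(256))

# Encoding explicitly shows where the extra bytes come from.
utf8_bytes = text.encode("utf-8")      # two bytes for each code point >= U+0080
latin1_bytes = text.encode("latin1")   # exactly one byte per code point

# Building the bytes object directly avoids the text round trip altogether.
raw = bytes(range(256))

# frombuffer interprets the raw bytes as int8 without any implicit encoding.
# Note that the resulting array is a read-only view of the bytes object.
a = np.frombuffer(raw, dtype=np.int8)

print(len(text), len(utf8_bytes), len(latin1_bytes), len(a))
```

Passing bytes rather than str keeps the element count equal to the byte count for a 1-byte dtype, which is what the original example expected.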