Hi, On Tue, Mar 15, 2011 at 5:30 PM, Christoph Gohlke <cgoh...@uci.edu> wrote: > > > On 3/15/2011 5:13 PM, Matthew Brett wrote: >> Hi, >> >> On Tue, Mar 15, 2011 at 10:23 AM, Matthew Brett<matthew.br...@gmail.com> >> wrote: >>> Hi, >>> >>> On Tue, Mar 15, 2011 at 10:12 AM, Pauli Virtanen<p...@iki.fi> wrote: >>>> Tue, 15 Mar 2011 10:06:09 -0700, Matthew Brett wrote: >>>>> Sorry to ask, and I ask partly because I'm in the middle of a py3k port, >>>>> but is this the right fix to this problem? I was confused by the >>>>> presence of the old PyString_AsString function. >>>> >>>> It's not a correct fix. The original code seems also wrong ("index" can >>>> either be Unicode or Bytes/String object), and will probably bomb when >>>> indexing with Unicode strings on Python 2. The right thing to do is to >>>> make it show the repr of the "index" object. >>> >>> OK - I realize I'm being very lazy here but, do you mean: >>> >>> PyErr_Format(PyExc_ValueError, >>>>> "field named %s not found.", >>>>> PyString_AsString(PyObject_Repr(index))); >> >> Being less lazy, and having read the cpython source, and read >> Christoph's mail more carefully, I believe Christoph's patch is >> correct... >> >> Unicode indexing of structured array fields doesn't raise an error on >> python 2.x; I assume because PyString_AsString is returning a char* >> using the Unicode default encoding, as per the docs. >> > > I think the patch is correct for Python 3 but, as Pauli pointed out, the > original code can crash also under Python 2.x when indexing with an > unicode string that contains non-ascii7 characters, which seems much > less likely and apparently has been undetected for quite a while. For > example, this crashes for me on Python 2.7: > > import numpy as np > a = np.zeros((1,), dtype=[('f1', 'f')]) > a[u's'] = 1 # works > a[u'µ'] = 1 # crash > > So, the proposed patch works for Python 3, but there could be a better > patch fixing also the corner case on Python 2.
Thanks. How about something like the check further up the file: if (PyUnicode_Check(op)) { temp = PyUnicode_AsUnicodeEscapeString(op); } PyErr_Format(PyExc_ValueError, "field named %s not found.", PyBytes_AsString(temp)); ? I'm happy to submit a pull request with tests if that kind of thing looks sensible to y'all, Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion