On Wed, Oct 31, 2018 at 4:01 PM Charles R Harris <charlesr.har...@gmail.com> wrote:
> On Wed, Oct 31, 2018 at 3:59 PM Allan Haldane <allanhald...@gmail.com> wrote:
>
>> On 10/30/18 5:04 AM, Matti Picus wrote:
>>> TL;DR - should we revert the attribute-hiding constructs in
>>> ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
>>>
>>> Background
>>>
>>> NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
>>> PyArrayObject "opaque" and hid the fields behind a PyArrayObject_fields
>>> structure
>>> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
>>> with a comment about moving this to a private header. To access the
>>> fields, users are supposed to use PyArray_FIELDNAME functions such as
>>> PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
>>> that NumPy might move away from a C-struct-based underlying data
>>> structure. Other changes were also made to enum names, but those are
>>> relatively painless to find-and-replace.
>>>
>>> NumPy has a mechanism to manage deprecating APIs: C users define
>>> NPY_NO_DEPRECATED_API to a desired level, say NPY_1_8_API_VERSION, and
>>> can then access the API "as if" they were using NumPy 1.8. Users who do
>>> not define NPY_NO_DEPRECATED_API get a warning when compiling and
>>> default to the pre-1.8 API (aliasing of PyArrayObject to
>>> PyArrayObject_fields and direct access to the C struct fields). This is
>>> convenient for downstream users, both because the new API does not
>>> provide much added value and because it is much easier to write a->nd
>>> than PyArray_NDIM(a). For instance, pandas uses direct assignment to
>>> the data field for fast chunked JSON parsing
>>> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
>>> and working around the new API in pandas would require more
>>> engineering. Also, Cython has a mechanism to transpile Python code into
>>> C, mapping slow Python attribute lookup to fast C struct field access
>>> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
>>>
>>> In a parallel but not really related universe, Cython recently upgraded
>>> its object mapping so that we can quiet the annoying "size changed"
>>> runtime warning https://github.com/numpy/numpy/issues/11788 without
>>> requiring warning filters, but that requires updating the numpy.pxd
>>> file provided with Cython, and it was proposed that NumPy actually
>>> vendor its own file rather than depending on the Cython one
>>> (https://github.com/numpy/numpy/issues/11803).
>>>
>>> The problem
>>>
>>> We have now made further changes to our API. In NumPy 1.14 we changed
>>> UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
>>> PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
>>> emitted when NPY_NO_DEPRECATED_API is not defined is annoying. The new
>>> API cannot be supported by Cython without some deep surgery
>>> (https://github.com/cython/cython/pull/2640). When I tried dogfooding
>>> an updated numpy.pxd for the only Cython code in NumPy, mtrand.pyx, I
>>> came across some of these issues
>>> (https://github.com/numpy/numpy/pull/12284). Forcing the new API will
>>> require downstream users to refactor code or re-engineer constructs,
>>> as in the pandas example above.
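[Editor's note: to make the contrast concrete, here is a minimal sketch of
the two access styles Matti describes, as they would appear in a downstream
C extension. The function name total_elements is invented for this example,
and NPY_1_8_API_VERSION is just one possible API level to opt in to; this is
an illustration, not code taken from NumPy or pandas.]

    /* Opt in to the post-deprecation API: hides the struct fields and
     * silences the "Using deprecated NumPy API" compile-time warning. */
    #define NPY_NO_DEPRECATED_API NPY_1_8_API_VERSION
    #include <Python.h>
    #include <numpy/arrayobject.h>

    /* New style: only the accessor macros are used, so this keeps
     * compiling even with the struct fields hidden. */
    static npy_intp
    total_elements(PyArrayObject *a)
    {
        npy_intp n = 1;
        int i;
        for (i = 0; i < PyArray_NDIM(a); i++) {
            n *= PyArray_DIMS(a)[i];
        }
        return n;
    }

    /* Old style, which the deprecation targets: direct field access such as
     *     n *= a->dimensions[i];   for i < a->nd
     * only compiles when NPY_NO_DEPRECATED_API is left undefined, i.e. when
     * PyArrayObject is still aliased to PyArrayObject_fields. */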
>> I haven't understood the Cython issue, but I just want to mention that
>> for optimization purposes it's nice to be able to modify the fields,
>> as in the pandas/JSON example above.
>>
>> In particular, PyArray_ConcatenateArrays uses some tricks which
>> temporarily clobber the data pointer and shape of an array to
>> concatenate arrays efficiently. It seems fairly safe to me. These
>> tricks would be nice to re-use in a C port of the new block code we
>> merged recently.
>>
>> Those optimizations aren't possible if we only use the opaque
>> PyArrayObject API.
>
> It's OK for NumPy internals to directly access the structures, as
> presumably they will be updated if anything changes. Maybe it would be
> useful for Cython to have a flag like Py_LIMITED_API?

That probably only makes sense if we enable such a flag by default - which
is a big backwards-compatibility break that users can then undo by setting
Py_LIMITED_API=0. Otherwise the vast majority of users will never use it,
and hence we still cannot change the C API without breaking the world.
Such breakage would be fine for conda, because it special-cases NumPy in
the same way as Python. For wheel/pip users, however, it would cause major
issues.

Ralf
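[Editor's note: for readers unfamiliar with the trick Allan refers to, below
is a rough, hypothetical sketch of the "sliding view" pattern of temporarily
rewriting an array's data pointer and leading dimension. It is written in
the spirit of PyArray_ConcatenateArrays but is not the actual NumPy
implementation; the name concat_along_axis0 is invented. It assumes the
caller has already allocated ret with a compatible dtype and shape along
axis 0, that the inputs are at least 1-D, and that import_array() was called
in the enclosing extension module. It deliberately does not define
NPY_NO_DEPRECATED_API, because it needs the PyArrayObject_fields members.]

    #include <Python.h>
    #include <numpy/arrayobject.h>

    static PyArrayObject *
    concat_along_axis0(PyArrayObject **arrays, npy_intp narrays,
                       PyArrayObject *ret)
    {
        PyArrayObject_fields *sliding_view;
        npy_intp i;

        /* One view of the output; its fields get clobbered below. */
        sliding_view = (PyArrayObject_fields *)PyArray_View(ret, NULL, NULL);
        if (sliding_view == NULL) {
            return NULL;
        }

        for (i = 0; i < narrays; i++) {
            /* Shrink the view's leading dimension to match this input... */
            sliding_view->dimensions[0] = PyArray_DIM(arrays[i], 0);

            /* ...copy the input into the current window of the output... */
            if (PyArray_CopyInto((PyArrayObject *)sliding_view,
                                 arrays[i]) < 0) {
                Py_DECREF(sliding_view);
                return NULL;
            }

            /* ...then slide the data pointer to the next window. */
            sliding_view->data += sliding_view->dimensions[0] *
                                  sliding_view->strides[0];
        }

        Py_DECREF(sliding_view);
        return ret;
    }

[As far as I know, this pattern cannot be expressed through the opaque
accessor API alone: PyArray_DATA and PyArray_DIMS give read access, but
there is no public setter for an existing array's data pointer.]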