Hi,

I am just continuing the technical side of the ABI/API discussion here, as it is unrelated to the 1.7.x release.

On Tue, Jun 26, 2012 at 11:41 AM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> On 06/26/2012 11:58 AM, David Cournapeau wrote:
>> On Tue, Jun 26, 2012 at 10:27 AM, Dag Sverre Seljebotn
>> <d.s.seljeb...@astro.uio.no> wrote:
>>> On 06/26/2012 05:35 AM, David Cournapeau wrote:
>>>> On Tue, Jun 26, 2012 at 4:10 AM, Ondřej Čertík<ondrej.cer...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> My understanding is that Travis is simply trying to stress "We have to
>>>>> think about the implications of our changes on existing users." and
>>>>> also that little changes (with the best intentions!) that however mean
>>>>> either a breakage or confusion for users (due to historical reasons)
>>>>> should be avoided if possible. And I very strongly feel the same way.
>>>>> And I think that most people on this list do as well.
>>>>
>>>> I think Travis is more concerned about API than ABI changes (in that
>>>> example for 1.4, the ABI breakage was caused by a change that was
>>>> pushed by Travis IIRC).
>>>>
>>>> The relative importance of API vs ABI is a tough one: I think ABI
>>>> breakage is as bad as API breakage (but matter in different
>>>> circumstances), but it is hard to improve the situation around our ABI
>>>> without changing the API (especially everything around macros and
>>>> publicly accessible structures). Changing this is politically
>>>
>>> But I think it is *possible* to get to a situation where ABI isn't
>>> broken without changing API. I have posted such a proposal.
>>> If one uses the kind of C-level duck typing I describe in the link
>>> below, one would do
>>>
>>> typedef PyObject PyArrayObject;
>>>
>>> typedef struct {
>>> ...
>>> } NumPyArray; /* used to be PyArrayObject */
>>
>> Maybe we're just in violent agreement, but whatever ends up being used
>> would require to change the *current* C API, right ? If one wants to
>
> Accessing arr->dims[i] directly would need to change. But that's been
> discouraged for a long time.
> By "API" I meant access through the macros.
>
> One of the changes under discussion here is to change PyArray_SHAPE from
> a macro that accepts both PyObject* and PyArrayObject* to a function
> that only accepts PyArrayObject* (hence breakage). I'm saying that under
> my proposal, assuming I or somebody else can find the time to implement
> it under, you can both make it a function and have it accept both
> PyObject* and PyArrayObject* (since they are the same), undoing the
> breakage but allowing to hide the ABI.
>
> (It doesn't give you full flexibility in ABI, it does require that you
> somewhere have an "npy_intp dims[nd]" with the same lifetime as your
> object, etc., but I don't consider that a big disadvantage).
>
>> allow for changes in our structures more freely, we have to hide them
>> from the headers, which means breaking the code that depends on the
>> structure binary layout. Any code that access those directly will need
>> to be changed.
>>
>> There is the particular issue of iterator, which seem quite difficult
>> to make "ABI-safe" without losing significant performance.
>
> I don't agree (for some meanings of "ABI-safe"). You can export the data
> (dataptr/shape/strides) through the ABI, then the iterator uses these in
> whatever way it wishes consumer-side. Sort of like PEP 3118 without the
> performance degradation. The only sane way IMO of doing iteration is
> building it into the consumer anyway.

(I have not read the whole cython discussion yet)

What do you mean by "building iteration in the consumer"? My
understanding is that any data export would be done through a level of
indirection (dataptr/shape/strides). Conceptually, I can't see how one
could keep ABI without that level of indirection without some compile.
In the case of the iterator, that means multiple pointer dereferences
per sample -- i.e. the tight loop issue you mentioned earlier for
PyArray_DATA is the common case for the iterator.

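To make the two readings concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: demo_array and the demo_* accessors are hypothetical stand-ins for an opaque array type and its exported functions, not anything in the NumPy C API. The first loop goes through the accessors for every sample (the indirection I am worried about); the second fetches dataptr/shape/strides once and iterates on the consumer side, which is how I read the suggestion above.

```c
#include <stddef.h>

/* Hypothetical array type. In an ABI-hiding design this struct would not
 * appear in public headers; consumers would only see the accessors below. */
typedef struct {
    char *data;             /* pointer to the first element */
    int nd;                 /* number of dimensions (fixed at 2 here) */
    ptrdiff_t dims[2];      /* shape */
    ptrdiff_t strides[2];   /* strides in bytes */
} demo_array;

/* Hypothetical exported accessors (assumptions, not NumPy functions).
 * Across a shared-library boundary these calls could not be inlined. */
char *demo_data(const demo_array *a) { return a->data; }
const ptrdiff_t *demo_dims(const demo_array *a) { return a->dims; }
const ptrdiff_t *demo_strides(const demo_array *a) { return a->strides; }

/* Variant 1: every sample goes through the accessors again. */
double sum_via_accessors(const demo_array *a)
{
    double total = 0.0;
    for (ptrdiff_t i = 0; i < demo_dims(a)[0]; ++i) {
        for (ptrdiff_t j = 0; j < demo_dims(a)[1]; ++j) {
            total += *(const double *)(demo_data(a)
                                       + i * demo_strides(a)[0]
                                       + j * demo_strides(a)[1]);
        }
    }
    return total;
}

/* Variant 2: export data/shape/strides once, then iterate consumer-side
 * using only local variables inside the tight loop. */
double sum_consumer_side(const demo_array *a)
{
    const char *data = demo_data(a);
    const ptrdiff_t n0 = demo_dims(a)[0], n1 = demo_dims(a)[1];
    const ptrdiff_t s0 = demo_strides(a)[0], s1 = demo_strides(a)[1];
    double total = 0.0;
    for (ptrdiff_t i = 0; i < n0; ++i) {
        const char *row = data + i * s0;
        for (ptrdiff_t j = 0; j < n1; ++j) {
            total += *(const double *)(row + j * s1);
        }
    }
    return total;
}
```

Both functions compute the same sum; the difference is only where the lookups happen. The second variant still needs the exported values to stay valid for the duration of the loop, which matches the lifetime requirement mentioned in the quoted text.
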
I can only see two ways of doing fast (special-cased) iteration:
compile-time special casing or runtime optimization. Compile-time
requires access to the internals (even if one were to use C++ with
advanced template magic à la STL/iterator, I don't think one can get
performance if everything is not in the headers, but maybe C++
compilers are super smart these days in ways I can't comprehend). I
would think runtime is the long-term solution, but that's far away.

David