> I think the long-term generality is a lot bigger than that:
>
>  - Compressed arrays
>  - Interfaces to HDF files
>  - Distributed-memory arrays
>  - Blocked arrays
>  - Semi-sparse and sparse (diagonal, but also triangular, symmetric,
>    repeating, ...)
>  - Lazy evaluation: "generating_multiply(mydata, zero_mask)"
>
> While what me and Mark F. cares about is computational efficiency for
> current arrays, this generality is almost unavoidable.
>
> In fact -- from ideas Travis have posted to this list earlier +
> continuum.io, I assume this wider scope is something you and Travis must
> necessarily have thought a lot about.
>
> Anyway, I agree with Mark F. that right design is probably a new,
> low-level, (very small!) C library with no Python dependencies that just
> provides some APIs to try to standardize this "how to communicate array
> data" at a more basic level than NumPy (and much smaller and different
> scope than the various "distill NumPy to a C core" things that's been
> talked about the past years, something I have zero interest in).
>
> If NumPy devs are interested in this discussion on a detailed level,
> please say so; me and Mark F might go to Skype (or even meet in person)
> to get higher bandwidth than ML, and if more people should be invited
> then it's good to know.
>

I, for one, am very interested in this discussion. This is very much
along the lines I have been thinking. To me it is much more important to
solidify the concepts of the "interface" and what is essential about it
than to create yet another library.

I think that, alongside your general notion of an N-d block transfer
API, you would also need 1-d, 2-d and maybe 3-d specializations which
take an additional "axis" argument to denote which sub-region is being
described. But this is probably enough.

I am not sure what the specific relationship is between your thoughts
and the email thread Mark referenced, but I do know that there is a deep
connection to the *concept* of ufuncs, which are currently the core
abstraction for iterating over low-level calculations. You want the
ability to create more powerful iteration constructs (like broadcasting,
generalized ufuncs and windowed kernel funcs) while only having to
define the kernel calculation once.

Couple a more generalized ufunc notion with an improved low-level
interface concept, and you could have a system for doing anything that
is independent of NumPy. NumPy would then be just one of many array
concepts that could co-exist and share development resources.

Your thoughts are definitely the future. We are currently building such
a thing, and we would like it to be open source. We are preparing a
proposal to DARPA under their XDATA program to help fund this. Email me
offlist if you would like to be a part of this proposal. You don't have
to be a U.S. citizen to participate.

Thanks,

-Travis

> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion