On Tue, Jul 12, 2016 at 7:56 AM, Travis Oliphant <tra...@continuum.io> wrote:
> http://www.continuum.io
>
> On Mon, Jul 11, 2016 at 12:58 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>
>> On Mon, Jul 11, 2016 at 11:39 AM, Chris Barker <chris.bar...@noaa.gov> wrote:
>>
>>> On Sun, Jul 10, 2016 at 8:12 PM, Nathan Goldbaum <nathan12...@gmail.com> wrote:
>>>
>>>> Maybe this can be an informal BOF session?
>>>
>>> or maybe a formal BoF? after all, how formal do they get?
>>>
>>> Anyway, it was my understanding that we really needed to do some
>>> significant refactoring of how numpy deals with dtypes in order to do this
>>> kind of thing cleanly -- so where has that gone since last year?
>>>
>>> Maybe this conversation should be about how to build a more flexible
>>> dtype system generally, rather than specifically about unit support.
>>> (though unit support is a great use-case to focus on)
>>
>> Note that Mark Wiebe will also be giving a talk Friday, so he may be
>> around. As the last person to add a type to Numpy and the designer of DyND
>> he might have some useful input. DyND development is pretty active and I'm
>> always curious how we can somehow move in that direction.
>
> There has been a lot of work over the past 6 months on making DyND
> implement the "pluribus" concept that I have talked about briefly in the
> past. DyND now has a separate C++ ndt data-type library. The Python
> interface to that type library is still unified in the dynd module but it
> is separable and work is in progress to make a separate Python-wrapper to
> this type library. The dynd type library is datashape described at
> http://datashape.pydata.org
>
> This type system is extensible and could be the foundation of a
> re-factored NumPy.
> My view (and what I am encouraging work in the
> direction of) is that array computing in Python should be refactored into a
> "type-subsystem" (I think ndt is the right model there), a generic
> ufunc-system (I think dynd has a very promising approach there as well),
> and then a container (the memoryview already in Python might be
> enough). These modules could be separately installed, maintained and
> eventually moved into Python itself.
>
> Then, a potential future NumPy project could be ported to be a layer of
> calculations and connections to other C-libraries on top of this system.
> Many parts of the current code could be re-used in that effort --- or the
> new system could be part of a re-factoring of NumPy to make the innards of
> NumPy more accessible to a JIT compiler.
>
> We are already far enough along that this could be pursued by a
> motivated person. It would take 18 months to complete the system but
> first-light would be less than 6 months for a dedicated, motivated, and
> talented resource. DyND, as well as Cython and/or Numba, is far enough
> along to make this pretty straightforward. For this re-factored
> array-computing project to take the NumPy name, this community would have
> to decide that that is the right thing to do. But other projects like
> Pandas and/or xarray and/or numpy-py and/or NumPy on Jython could use this
> sub-system also.
>
> It has taken me a long time to actually get to the point where I would
> recommend a specific way forward. I have thought about this for many
> years and don't make these recommendations lightly. The pluribus concept
> is my recommendation about what would be best now and in the future --- and
> I will be pursuing this concept and working to get to a point where this
> community will accept it if possible, because it would be ideal if this new
> array library were still called NumPy.
>
> My working view is that someone will have to build the new prototype NumPy
> for the community to evaluate whether it's the right approach and get
> consensus that it is the right way forward. There is enough there now
> with DyND, data-shape, and Numba/Cython to do this fairly quickly. It
> is not strictly necessary to use DyND or Numba or even data-shape to
> accomplish this general plan --- but these are already available and a
> great place to start as they have been built explicitly with the intention
> of improving array-computing in Python.
>
> This potential NumPy could be backwards compatible from an API perspective
> (including a C-API) --- though recompilation would be necessary and there
> would be some semantic differences in corner-cases that could either be
> fixed where necessary or simply made part of the new version.
>
> I will be at the Continuum Happy hour on Thursday at our offices and
> welcome anyone to come discuss things with me there --- I am also willing
> to meet with anyone on Thursday and Friday if I can --- but I don't have a
> ticket to SciPy itself. Please CC me directly if you have questions. I
> try to follow the numpy-discussion mailing list but I am not always
> successful at keeping up.
>
> To be clear, as some have misinterpreted me in the past: while I
> originally wrote NumPy (borrowing heavily from Numeric and drawing
> inspiration from Numarray and receiving a lot of help for specific modules
> from many of you), the community has continued to develop NumPy and now has
> a proper governance model. I am now simply an interested NumPy user and
> previous NumPy developer who finally has some concrete ideas to share based
> on work that I have been funding, leading, and encouraging for the past
> several years.
>
> I am still very interested in helping NumPy progress, but we are also
> going to be taking these ideas to create a general concept of the "buffer
> protocol in Python" to enable cross-language code-sharing and, through it,
> more code re-use for data analytics among language communities. This is the
> concept of "data-fabric", which is pre-alpha vapor-ware at this point but
> with some ideas expressed at http://datashape.pydata.org and here:
> https://github.com/blaze/datafabric and is something DyND is enabling.
>
> NumPy itself has a clear governance model and whether NumPy (the project)
> adopts any of the new array-computing concepts I am proposing will depend
> on this community's decisions as well as work done by motivated developers
> willing to work on prototypes. I will be willing to help get funding for
> someone motivated to work on this.

Thanks Travis! I'm going to let the technical parts sink in for a bit first,
but wanted to say already that your continued interest and sharing of new
ideas are much appreciated.

Cheers,
Ralf
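
For readers following along: the remark in Travis's message that "the memoryview already in Python might be enough" as a container can be tried with the standard library alone. The sketch below is only an illustration under stated assumptions. The names storage, flat, grid and map_into are made up for this example and are not part of DyND, datashape or NumPy, and the Python loop merely stands in for a real ufunc system.

```python
import struct

# Container: raw, untyped storage exposed through the buffer protocol.
n_items = 6
storage = bytearray(n_items * struct.calcsize("d"))

# Type information is layered on top of the storage, not baked into it:
# the same bytes are viewed as six C doubles, flat or as a 2x3 grid.
flat = memoryview(storage).cast("d")
grid = memoryview(storage).cast("d", shape=[2, 3])

for i in range(n_items):
    flat[i] = float(i + 1)


def map_into(func, src, dst):
    """Apply func to every item of the 1-D view src, writing into dst.

    Hypothetical stand-in for a generic ufunc loop; it relies only on
    the buffer protocol, not on any array library.
    """
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same length")
    for i in range(len(src)):
        dst[i] = func(src[i])


# A second buffer receives the results; again the shape is just a view.
out_storage = bytearray(len(storage))
out_flat = memoryview(out_storage).cast("d")
out_grid = memoryview(out_storage).cast("d", shape=[2, 3])

map_into(lambda x: x * x, flat, out_flat)

print(grid.tolist())      # both views share the bytes in `storage`
print(out_grid.tolist())
```

The point of keeping the storage, the element format and shape, and the elementwise loop separate is that each piece could in principle be swapped out independently, which is the kind of split the message proposes.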
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion