On May 10, 2012, at 12:21 AM, Charles R Harris wrote:

> On Wed, May 9, 2012 at 11:05 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>
> On Wednesday, May 9, 2012, Nathaniel Smith wrote:
>
> My only objection to this proposal is that committing to this approach
> seems premature. The existing masked array objects act quite
> differently from numpy.ma, so why do you believe that they're a good
> foundation for numpy.ma, and why will users want to switch to their
> semantics over numpy.ma's semantics? These aren't rhetorical
> questions, it seems like they must have concrete answers, but I don't
> know what they are.
>
> Based on the design decisions made in the original NEP, a re-made numpy.ma
> would have to lose _some_ features, particularly the ability to share masks.
> Save for that and some very obscure behaviors that are undocumented, it is
> possible to remake numpy.ma as a compatibility layer.
>
> That being said, I think that there are some fundamental questions that have
> caused concern. If I recall, there were unresolved questions about behaviors
> surrounding assignments to elements of a view.
>
> I see the project as broken down like this:
> 1.) internal architecture (largely abi issues)
> 2.) external architecture (hooks throughout numpy to utilize the new features
> where possible such as where= argument)
> 3.) getter/setter semantics
> 4.) mathematical semantics
>
> At this moment, I think we have pieces of 2 and they are fairly
> non-controversial. It is 1 that I see as being the immediate hold-up here. 3
> & 4 are non-trivial, but because they are mostly about interfaces, I think we
> can be willing to accept some very basic, fundamental, barebones components
> here in order to lay the groundwork for a more complete API later.
>
> To talk of Travis's proposal, doing nothing is no-go. Not moving forward
> would dishearten the community. Making a ndmasked type is very intriguing. I
> see it as a step towards eventually deprecating ndarray? Also, how would it
> behave with np.asarray() and np.asanyarray()? My other concern is a possible
> violation of DRY. How difficult would it be to maintain two ndarrays in
> parallel?
>
> As for the flag approach, this still doesn't solve the problem of legacy code
> (or did I misunderstand?)
>
> My understanding of the flag is to allow the code to stay in and get reworked
> and experimented with while keeping it from contaminating conventional use.
>
> The whole point of putting the code in was to experiment and adjust. The
> rather bizarre idea that it needs to be perfect from the get go is
> disheartening, and is seldom how new things get developed. Sure, there is a
> plan up front, but there needs to be feedback and change. And in fact, I
> haven't seen much feedback about the actual code, I don't even know that the
> people complaining have tried using it to see where it hurts. I'd like that
> sort of feedback.
>
I don't think anyone is saying it needs to be perfect from the get go. What I am saying is that this is fundamental enough to downstream users that this kind of thing is best done as a separate object. The flag could still be used to make all Python-level array constructors build ndmasked objects.

But this doesn't address the C-level story, where there is quite a bit of downstream use in which people have treated the NumPy array as just a pointer to memory, without considering that there might be a mask attached that should be inspected as well. The NEP addresses this a little bit for those C or C++ consumers of the ndarray who always use PyArray_FromAny, which can fail if the array has non-NULL mask contents. However, it is *not* true that all downstream users use PyArray_FromAny. A large number of users just use something like PyArray_Check and then PyArray_DATA to get the pointer to the data buffer, and then go from there thinking of their data as a strided memory chunk only (no extra mask). The NEP fundamentally changes this simple invariant, which has been in NumPy, and in Numeric before it, for a long, long time.

I really don't see how we can do this in a 1.7 release. It has too many unknown, and I think unknowable, downstream effects. But I think we could introduce another array object that is the masked_array, with a Python-level flag that makes it the default array in Python.

There are a few more subtleties. PyArray_Check by default will pass sub-classes, so if the new ndmasked array were a sub-class then it would pass that check (just as current numpy.ma arrays and matrices pass it today). However, there is a PyArray_CheckExact macro which could be used to ensure the object is actually of PyArray_Type. There is also the PyArg_ParseTuple call with the "O!" format, which I have seen used many times to ensure an exact NumPy array.
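
To make the PyArray_Check versus PyArray_CheckExact point a bit more concrete, here is a rough Python-level analogue. It uses today's numpy.ma rather than the proposed ndmasked type, and the helper names (passes_check, passes_check_exact, raw_buffer_sum) are made up for illustration: isinstance stands in for PyArray_Check, an exact type comparison stands in for PyArray_CheckExact, and np.asarray stands in for grabbing the data buffer. It is only a sketch of the failure mode, not the C code itself.

```python
import numpy as np


def passes_check(obj):
    """Rough analogue of PyArray_Check: sub-classes are accepted."""
    return isinstance(obj, np.ndarray)


def passes_check_exact(obj):
    """Rough analogue of PyArray_CheckExact: only the base type is accepted."""
    return type(obj) is np.ndarray


def raw_buffer_sum(obj):
    """Hypothetical consumer that treats its input as plain strided memory."""
    if not passes_check(obj):
        raise TypeError("expected an ndarray")
    # np.asarray on a numpy.ma array returns the underlying data as a base
    # ndarray, so the mask is silently ignored from here on.
    return int(np.asarray(obj).sum())


plain = np.array([1, 2, 3])
masked = np.ma.masked_array([1, 2, 3], mask=[False, True, False])

print(passes_check(masked), passes_check_exact(masked))  # sub-class vs exact
print(raw_buffer_sum(masked), int(masked.sum()))         # mask ignored vs honored
```

A consumer written like raw_buffer_sum accepts the masked array and counts the masked element, while the mask-aware sum skips it. Switching the guard to the exact check would reject the sub-class up front instead of silently reading masked data.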
-Travis

> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion