On Apr 16, 2012, at 11:59 PM, Matthew Brett wrote:

> Hi,
> 
> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant <tra...@continuum.io> wrote:
>>>> 
>>>> I think the answer to this is yes, but it could be as a feature-filled 
>>>> sub-class (like the current numpy.ma, except in C).
>>> 
>>> I'd love to hear that argument fleshed out in more detail - do you have 
>>> time?
>> 
>> 
>> My proposal here is to basically take the current github NumPy 
>> data-structure and make this a sub-type (in C) of the NumPy 1.6 
>> data-structure which is unchanged in NumPy 1.7.
>> 
>> This would not require removing code but would require another PyTypeObject 
>> and associated structures.  I expect Mark could do this work in 2-4 weeks.   
>> We also have other developers who could help in order to get the sub-type in 
>> NumPy 1.7.     What kind of details would you like to see?
> 
> I was dimly thinking of the same questions that Chuck had - about how
> subclassing would relate to the ufunc changes.

Basically, there are two sets of changes as far as I understand right now:  

        1) ufunc infrastructure understands masked arrays
        2) ndarray grew attributes to represent masked arrays

I am proposing that we keep 1) but change 2) so that only certain kinds of 
NumPy arrays actually have the extra function pointers (effectively a 
sub-type).   In essence, the NumPy 1.6 PyArrayObject would become a base 
object, and the other members of the C structure would not even be present 
unless the Masked flag is set.   Such changes would not require ripping code 
out --- just altering the presentation a bit.   Yet they could have large 
long-term implications that we should explore before they are set in stone.
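
To make the layout idea concrete, here is a minimal sketch in C.  It is an 
illustration under stated assumptions, not the actual NumPy code: the names 
BaseArray, MaskedArray, SKETCH_FLAG_MASKED and the helper functions are all 
hypothetical, and the base struct is reduced to three fields.  The extended 
struct embeds the base struct as its first member, and the extra member is 
only touched after checking the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bit for this sketch; not the real NumPy flag value. */
#define SKETCH_FLAG_MASKED 0x1

/* Stand-in for the NumPy 1.6 array struct: fields every array has. */
typedef struct {
    char *data;
    int   nd;
    int   flags;
} BaseArray;

/* Extended layout.  The base struct is the first member, so a pointer
 * to a MaskedArray can also be used as a pointer to a BaseArray. */
typedef struct {
    BaseArray base;
    char     *mask_data;   /* exists only in the extended layout */
} MaskedArray;

/* Code that only knows BaseArray can still test for the extension. */
static int is_masked(const BaseArray *a)
{
    return (a->flags & SKETCH_FLAG_MASKED) != 0;
}

/* Downcast to the extended layout; valid only when the flag is set. */
static MaskedArray *as_masked(BaseArray *a)
{
    assert(is_masked(a));
    return (MaskedArray *)a;
}

/* A plain array allocates only the base struct. */
static BaseArray *new_plain(void)
{
    return calloc(1, sizeof(BaseArray));
}

/* A masked array allocates the larger struct and sets the flag. */
static BaseArray *new_masked(void)
{
    MaskedArray *m = calloc(1, sizeof(MaskedArray));
    if (m == NULL) {
        return NULL;
    }
    m->base.flags |= SKETCH_FLAG_MASKED;
    return &m->base;
}
```

In this sketch a plain array pays nothing for the mask pointer, while code 
written against BaseArray keeps working on both kinds of object.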

Whether masked arrays should be a formal sub-class is actually an unrelated 
question, and I generally lean in the direction of not encouraging sub-classes 
of the ndarray.   The big questions are: Does this object work in the 
calculation infrastructure?   Can I add an array to a masked array?   Does it 
have a sum method?   I think it could be argued that a masked array does have an 
"is a" relationship with an array.   It can also be argued that it is better to 
have a "has a" relationship with an array and be its own object.   Either way, 
the first part of this object could still be binary compatible with a NumPy 
array, and that is what I'm really suggesting. 

-Travis





> 
>> I just think we need more data and uses and this would provide a way to get 
>> that without making a forced decision one way or another.
> 
> Is the proposal that this would be an alternative API to numpy.ma?
> Is numpy.ma not itself satisfactory as a test of these uses, because
> of performance or some other reason?
> 
>>>>> 2) Will likely changes to the masked array API make any difference to
>>>>> the number of extra pointers?  Likely answer no?
>>>>> 
>>>>> Is that right?
>>>> 
>>>> The answer to this is very likely no on the Python side.  But, on the 
>>>> C-side, there could be some differences (i.e. are masked arrays a 
>>>> sub-class of the ndarray or not).
>>>> 
>>>>> 
>>>>> I have the impression that the masked array API discussion still has
>>>>> not come out fully into the unforgiving light of discussion day, but
>>>>> if the answer to 2) is No, then I suppose the API discussion is not
>>>>> relevant to the 3 pointers change.
>>>> 
>>>> You are correct that the API discussion is separate from this one.     
>>>> Overall,  I was surprised at how fervently people would oppose ABI 
>>>> changes.   As has been pointed out, NumPy and Numeric before it were not 
>>>> really designed to prevent having to recompile when changes were made.   
>>>> I'm still not sure that a better overall solution is not to promote better 
>>>> availability of downstream binary packages rather than to worry 
>>>> excessively about ABI changes in NumPy.    But, that is the current climate.
>>> 
>>> The objectors object to any binary ABI change, but not specifically
>>> three pointers rather than two or one?
>> 
>> Adding pointers is not really an ABI change (but removing them after they 
>> were there would be...)  It's really just the addition of data to the NumPy 
>> array structure that they aren't going to use.  Most of the time it would 
>> not be a real problem (the number of use-cases where you have a lot of small 
>> NumPy arrays is small), but when it is a problem it is very annoying.
>> 
>>> 
>>> Is their point then about ABI breakage?  Because that seems like a
>>> different point again.
>> 
>> Yes, it's not that.
>> 
>>> 
>>> Or is it possible that they are in fact worried about the masked array API?
>> 
>> I don't think most people whose opinion would be helpful are really tuned in 
>> to the discussion at this point.  I think they just want us to come up with 
>> an answer and then move forward.    But, they will judge us based on the 
>> solution we come up with.
>> 
>>> 
>>>> Mark and I will talk about this long and hard.  Mark has ideas about where 
>>>> he wants to see NumPy go, but I don't think we have fully accounted for 
>>>> where NumPy and its user base *are*, and there may be better ways to 
>>>> approach this evolution.    If others are interested in the outcome of the 
>>>> discussion please speak up (either on the list or privately) and we will 
>>>> make sure your views get heard and accounted for.
>>> 
>>> I started writing something about this but I guess you'd know what I'd
>>> write, so I only humbly ask that you consider whether it might be
>>> doing real damage to allow substantial discussion that is not
>>> documented or argued out in public.
>> 
>> It will be documented and argued in public.     We are just going to have 
>> one off-list conversation to try and speed up the process.    You make a 
>> valid point, and I appreciate the perspective.     Please speak up again 
>> after hearing the report if something is not clear.   I don't want this to 
>> even have the appearance of a "back-room" deal.
>> 
>> Mark and I will have conversations about NumPy while he is in Austin.   
>> There are many other active stake-holders whose opinions and views are 
>> essential for major changes.    Mark and I are working on other things 
>> besides just NumPy and all NumPy changes will be discussed on list and 
>> require consensus or super-majority for NumPy itself to change.     I'm not 
>> sure if that helps.   Is there more we can do?
> 
> As you might have heard me say before, my concern is that it has not
> been easy to have good discussions on this list.   I think the problem
> has been that it has not been clear what the culture was, and how
> decisions got made, and that has led to some uncomfortable and
> unhelpful discussions.  My plea would be for you as BDF$N to strongly
> encourage on-list discussions and discourage off-list discussions as
> far as possible, and to help us make the difficult public effort to
> bash out the arguments to clarity and consensus.  I know that's a big
> ask.
> 
> See you,
> 
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
