Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-22 Thread Paul Anton Letnes

On 21 Apr 2012, at 00:16, Drew Frank wrote:

> On Fri, Apr 20, 2012 at 11:45 AM, Chris Barker  wrote:
>> 
>> On Fri, Apr 20, 2012 at 11:39 AM, Dag Sverre Seljebotn
>>  wrote:
>>> Oh, right. I was thinking "small" as in "fits in L2 cache", not small as
>>> in a few dozen entries.
> 
> Another example of a small array use-case: I've been using numpy for
> my research in multi-target tracking, which involves something like a
> bunch of entangled hidden Markov models. I represent target states
> with small 2d arrays (e.g. 2x2, 4x4, ..) and observations with small
> 1d arrays (1 or 2 elements). It may be possible to combine a bunch of
> these small arrays into a single larger array and use indexing to
> extract views, but it is much cleaner and more intuitive to use
> separate, small arrays. It's also convenient to use numpy arrays
> rather than a custom class because I use the linear algebra
> functionality as well as integration with other libraries (e.g.
> matplotlib).
> 
> I also work with approximate probabilistic inference in graphical
> models (belief propagation, etc), which is another area where it can
> be nice to work with many small arrays.
> 
> In any case, I just wanted to chime in with my small bit of evidence
> for people wanting to use numpy for work with small arrays, even if
> they are currently pretty slow. If there were a special version of a
> numpy array that would be faster for cases like this, I would
> definitely make use of it.
> 
> Drew

Although performance hasn't been a killer for me, I've been using numpy arrays 
(or matrices) for Mueller matrices [0] and Stokes vectors [1]. These describe 
the polarization of light and are always 4x1 vectors or 4x4 matrices. It would 
be nice if my code ran in one night instead of one week, although this is still
tolerable in my case. Again, just an example of how small-vector/matrix 
performance can be important in certain use cases.
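
For readers unfamiliar with these objects, here is a minimal sketch of the
kind of small-array operation involved: a 4x4 Mueller matrix applied to a
4-element Stokes vector. The ideal horizontal linear polarizer and the
unpolarized input beam are textbook examples picked for illustration; they
are assumptions, not taken from my actual code.

```python
import numpy as np

# Stokes vector (I, Q, U, V) of unpolarized light with unit intensity.
stokes_in = np.array([1.0, 0.0, 0.0, 0.0])

# Mueller matrix of an ideal horizontal linear polarizer.
polarizer = 0.5 * np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

# Passing light through an optical element is a 4x4 by 4x1 product.
stokes_out = polarizer.dot(stokes_in)
print(stokes_out)
```

Half of the intensity is transmitted and the output is fully horizontally
polarized. A simulation repeats products like this a very large number of
times, which is where per-call overhead on tiny arrays adds up.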

Paul

[0] https://en.wikipedia.org/wiki/Mueller_calculus
[1] https://en.wikipedia.org/wiki/Stokes_vector
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Drew Frank
On Fri, Apr 20, 2012 at 11:45 AM, Chris Barker  wrote:
>
> On Fri, Apr 20, 2012 at 11:39 AM, Dag Sverre Seljebotn
>  wrote:
> > Oh, right. I was thinking "small" as in "fits in L2 cache", not small as
> > in a few dozen entries.

Another example of a small array use-case: I've been using numpy for
my research in multi-target tracking, which involves something like a
bunch of entangled hidden Markov models. I represent target states
with small 2d arrays (e.g. 2x2, 4x4, ..) and observations with small
1d arrays (1 or 2 elements). It may be possible to combine a bunch of
these small arrays into a single larger array and use indexing to
extract views, but it is much cleaner and more intuitive to use
separate, small arrays. It's also convenient to use numpy arrays
rather than a custom class because I use the linear algebra
functionality as well as integration with other libraries (e.g.
matplotlib).
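
To make the "single larger array plus views" alternative concrete, here is
a rough sketch. The number of targets and the 2x2 state shape are made-up
values for illustration, not my real setup.

```python
import numpy as np

# Hypothetical setup: 1000 targets, each with a 2x2 state matrix,
# stored together in one (1000, 2, 2) array.
n_targets = 1000
states = np.zeros((n_targets, 2, 2))

# Indexing with an integer returns a view into the big array, not a copy.
first = states[0]
first[:] = np.eye(2)            # writes through to states[0]

shares = np.shares_memory(first, states)
print(shares)
print(states[0])

# Some operations can then run over all targets in a single call.
traces = np.trace(states, axis1=1, axis2=2)
print(traces.shape)
```

This avoids creating many separate array objects, at the cost of the extra
index bookkeeping that makes separate small arrays feel cleaner.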

I also work with approximate probabilistic inference in graphical
models (belief propagation, etc), which is another area where it can
be nice to work with many small arrays.

In any case, I just wanted to chime in with my small bit of evidence
for people wanting to use numpy for work with small arrays, even if
they are currently pretty slow. If there were a special version of a
numpy array that would be faster for cases like this, I would
definitely make use of it.

Drew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Chris Barker
On Fri, Apr 20, 2012 at 11:39 AM, Dag Sverre Seljebotn
 wrote:
> Oh, right. I was thinking "small" as in "fits in L2 cache", not small as
> in a few dozen entries.

or even two or three entries.

I often use a (2,) or (3,) numpy array to represent an (x,y) point
(usually pulled out from a Nx2 array).

I like it 'cause I can do array math, etc. It makes the code cleaner,
but it's actually faster to use tuples and do the indexing by hand :-(

but yes, having something built-in, or at least very compatible with
numpy would be best.
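
A quick way to check this on your own machine is a timing sketch like the
one below. The point values and the repeat count are arbitrary choices for
illustration, and the numbers will vary with the machine and the numpy
version, so treat it as a measurement recipe rather than a result.

```python
import timeit
import numpy as np

p_arr, q_arr = np.array([1.0, 2.0]), np.array([3.0, 5.0])
p_tup, q_tup = (1.0, 2.0), (3.0, 5.0)

def add_arrays():
    # Array math: clean, but pays numpy's per-call overhead.
    return p_arr + q_arr

def add_tuples():
    # Indexing by hand on plain tuples.
    return (p_tup[0] + q_tup[0], p_tup[1] + q_tup[1])

n = 100000
t_arr = timeit.timeit(add_arrays, number=n)
t_tup = timeit.timeit(add_tuples, number=n)
print("array add: %.3f us per call" % (1e6 * t_arr / n))
print("tuple add: %.3f us per call" % (1e6 * t_tup / n))
```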

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Dag Sverre Seljebotn
On 04/20/2012 08:35 PM, Fernando Perez wrote:
> On Fri, Apr 20, 2012 at 11:27 AM, Dag Sverre Seljebotn
>   wrote:
>>
>> I don't think you gain that much by using a different type though? Those 
>> optimized code paths could be plugged into NumPy as well.
>
> Could be: this was years ago, and the bottleneck for me was in the
> constructor and in basic arithmetic.  I had to make millions of these
> vectors and I needed to do basic arithmetic, but they were always 1-d
> and had one to 6 entries only.  So writing a very static constructor
> with very low overhead did make a huge difference in that project.

Oh, right. I was thinking "small" as in "fits in L2 cache", not small as 
in a few dozen entries. You definitely still need a Cython class then.

Dag

>
> Also, when I wrote this code numpy didn't exist, I was using Numeric.
>
> Perhaps the same results could be obtained in numpy itself with
> judicious coding, I don't know.  But in that project, ~600 lines of
> really easy pyrex code (it would be cython today) made a *huge*
> performance difference for me.


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Fernando Perez
On Fri, Apr 20, 2012 at 11:27 AM, Dag Sverre Seljebotn
 wrote:
>
> I don't think you gain that much by using a different type though? Those 
> optimized code paths could be plugged into NumPy as well.

Could be: this was years ago, and the bottleneck for me was in the
constructor and in basic arithmetic.  I had to make millions of these
vectors and I needed to do basic arithmetic, but they were always 1-d
and had one to 6 entries only.  So writing a very static constructor
with very low overhead did make a huge difference in that project.

Also, when I wrote this code numpy didn't exist, I was using Numeric.

Perhaps the same results could be obtained in numpy itself with
judicious coding, I don't know.  But in that project, ~600 lines of
really easy pyrex code (it would be cython today) made a *huge*
performance difference for me.

Cheers,

f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Dag Sverre Seljebotn


Fernando Perez  wrote:

>On Fri, Apr 20, 2012 at 9:49 AM, Chris Barker 
>wrote:
>>
>> I recall discussion a couple times in the past of having some
>> special-case numpy arrays for the simple, small cases -- perhaps 1-d
>> or 2-d C-contiguous only, for instance. That might be a better way to
>> address the small-array performance issue, and free us of concerns
>> about minor growth to the core ndarray object.
>
>+1 on that: I once wrote such code in pyrex (years ago) and it worked
>extremely well for me.  No fancy features, very small footprint and
>highly optimized codepaths that gave me excellent performance.

I don't think you gain that much by using a different type though? Those 
optimized code paths could be plugged into NumPy as well.

I always assumed that it would be possible to optimize NumPy, just that nobody 
invested time in it.

Starting from scratch you gain that you don't have to work with and understand 
NumPy's codebase, but I honestly think that's a small price to pay for 
compatibility.

Dag


>
>
>Cheers,
>
>f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Fernando Perez
On Fri, Apr 20, 2012 at 9:49 AM, Chris Barker  wrote:
>
> I recall discussion a couple times in the past of having some
> special-case numpy arrays for the simple, small cases -- perhaps 1-d
> or 2-d C-contiguous only, for instance. That might be a better way to
> address the small-array performance issue, and free us of concerns
> about minor growth to the core ndarray object.

+1 on that: I once wrote such code in pyrex (years ago) and it worked
extremely well for me.  No fancy features, very small footprint and
highly optimized codepaths that gave me excellent performance.


Cheers,

f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Chris Barker
On Mon, Apr 16, 2012 at 7:46 PM, Travis Oliphant  wrote:
> As Chuck points out, 3 more pointers is not necessarily that big of a deal if 
> you are talking about a large array (though for small arrays it could matter).

yup -- for the most part, numpy arrays are best for working with large
data sets, in which case a little bit bigger core object doesn't
matter. But there are many times that we do want to work with small
arrays (particularly ones that are pulled out of a larger array --
iterating over an array of (x,y) points or the like)

However, numpy overhead is already pretty heavy for such use, so it
may not matter.
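
To put rough numbers on that, the sketch below compares the data size of an
array with the total size Python reports for it, and relates both to the
three extra pointers being discussed. The array sizes are arbitrary, and
the exact byte counts depend on the platform and numpy version.

```python
import struct
import sys
import numpy as np

point = np.zeros(2)          # a single (x, y) point
big = np.zeros(1000000)      # a large data set

pointer_size = struct.calcsize("P")   # bytes per C pointer on this platform
extra = 3 * pointer_size              # the "3 more pointers" under discussion

for name, arr in [("point", point), ("big", big)]:
    total = sys.getsizeof(arr)        # array object plus the data it owns
    print(name, "data bytes:", arr.nbytes, "total bytes:", total,
          "extra/total: %.6f" % (extra / float(total)))
```

For the big array the extra pointers are lost in the noise; for the (2,)
array the fixed per-object cost is already larger than the data itself.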

I recall discussion a couple times in the past of having some
special-case numpy arrays for the simple, small cases -- perhaps 1-d
or 2-d C-contiguous only, for instance. That might be a better way to
address the small-array performance issue, and free us of concerns
about minor growth to the core ndarray object.

-Chris






Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-20 Thread Frédéric Bastien
Hi,

I just discovered that the NA mask will modify the base ndarray
object. So I read about it to find the consequences for our C code. Up
to now I have fully read:

http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

and partially read:

https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
https://github.com/njsmith/numpy/wiki/NA-discussion-status

In those documents, I see a problem with legacy code that will receive
an NA masked array as input. If I missed something, tell me.


All our C functions check their input arrays with PyArray_Check and
PyArray_ISALIGNED. If the NA mask is set inside the ndarray C
object, our C functions, which don't know about it, will treat those
inputs as not masked. So the user will get unexpected results: the
output will be an ndarray without a mask, but the code will have used the
masked values.

This will also happen with all other C code that uses ndarray.

In our case, all the input check is done at the same place, so adding
the check with "PyArray_HasNASupport(PyArrayObject* obj)" to raise an
error will be easy for us. But I don't think this is the case for most
other C code.
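
The failure mode can be imitated in Python with the existing numpy.ma
module. This is only an analogy: numpy.ma is a separate subclass, not the
NA mask stored inside the ndarray, and legacy_sum and guarded_sum are
hypothetical stand-ins for our C functions and the added input check.

```python
import numpy as np

def legacy_sum(x):
    # Stand-in for old code that only knows about plain ndarrays.
    # np.asarray() on a numpy.ma.MaskedArray returns the underlying
    # data and silently drops the mask.
    return np.asarray(x).sum()

def guarded_sum(x):
    # Hypothetical guard: refuse inputs that carry masked values
    # instead of computing with them.
    if isinstance(x, np.ma.MaskedArray) and np.ma.is_masked(x):
        raise TypeError("masked input is not supported")
    return np.asarray(x).sum()

values = np.ma.array([1, 2, 1000], mask=[False, False, True])

print(values.sum())         # mask-aware sum skips the masked 1000
print(legacy_sum(values))   # the masked value leaks into the result
```

The guard turns a silently wrong answer into an error, which is the
behaviour I would want from code that has not been updated.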

So I would prefer a separate object to protect users from code not
being updated to reject NA masked inputs.

An alternative would be to have PyArray_Check return False for the NA
masked array, but I don't like that as this breaks the semantics that it
checks for the class.

A last option I see would be to make the NPY_ARRAY_BEHAVED flag also
check that the array is not an NA masked array. I suppose a lot of C code
does this check. But this is not a bulletproof check, as not all code
(ours, for example) uses it.


Also, I don't mind the added pointers to the structure as we use big arrays.

thanks

Frédéric


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Tim Cera
I have never found mailing lists good places for discussion and consensus.
I think the format itself does not lend itself to involvement, to carefully
considered positions (or the ability to change them), or to voting, since all
of it can be so easily lost within the quoting, the back and forth, people
walking away, etc.  And you also want involvement from people who don't
have x hours to craft a finely worded, politically correct, and detailed
response.  I am not advocating this particular system, but something like
http://meta.programmers.stackexchange.com/ would be a better platform for
building to a decision when there are many choices to be made.

Now about ma, NA, missing...

I am just an engineer working in water resources and I had lots of
difficulty reading the NEP (so sleepy), so I will be the first to admit
that I probably have something wrong.  Just for reference (since I missed
it the first time around), Nathaniel posted this page at
https://github.com/njsmith/numpy/wiki/NA-discussion-status

I think that I could adapt to everything that is discussed in the NEP, but
I do have some comments about things that puzzled me.  I don't need
answers, but if I am puzzled maybe others are also.

First - 'maskna=True'?
Tested on development version of numpy...
>>> a = np.arange(10, maskna = True)
>>> a[:2] = np.NA
>>> a
array([NA, NA, 2, 3, 4, 5, 6, 7, 8, 9])

Why do I have to specify 'maskna = True'?  If NA and ndarray are intended
to be combined in some way, then I don't think that I need this.  During
development, I understand, but the NEP shouldn't have it.  Heck, even if
you keep NA and ndarrays separate, when someone tries to set an ndarray
element to np.NA, convert to an NA array instead of raising a ValueError.
I say that very casually, as if I knew how to do it.  I do have a proof,
but the margin is too small to include it.  :-)

I am torn about 'skipna=True'.  I think I understand the desire for
explicit behavior, but I suspect that every operation that I would use a NA
array for, would require 'skipna=True'.  Actually, I don't use that many
reducing functions, so maybe not a big deal.  Regardless of the skipna
setting, a related idea that could be useful for reducing functions is
to set an 'includesna' attribute with the returned scalar value.
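
As a rough analogy using features that exist in released numpy, NaN in a
float array already shows the two behaviours: np.sum propagates the missing
value and np.nansum skips it. The sum_with_flag helper is hypothetical and
only illustrates the 'includesna' idea; it is not part of the NEP.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

propagated = np.sum(x)     # the missing value poisons the result
skipped = np.nansum(x)     # the missing value is ignored
print(propagated, skipped)

def sum_with_flag(arr):
    # Hypothetical helper: return the skipping sum together with a
    # flag saying whether any missing values were present.
    return np.nansum(arr), bool(np.isnan(arr).any())

print(sum_with_flag(x))
```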

The view() didn't work as described in the NEP, where np.NA isn't
propagated back to the original array.  This could be because the NEP
references a 'missingdata' work in progress branch and I don't know what
has been merged.  I can force the NEP described behavior if I set
'd.flags.ownmaskna=True'.
>>> d = a.view()
>>> d
 array([NA, NA, 2, 3, 4, 5, 6, 7, 8, 9])
>>> d[0] = 4
>>> a
 array([4, NA, 2, 3, 4, 5, 6, 7, 8, 9])
>>> d
 array([4, NA, 2, 3, 4, 5, 6, 7, 8, 9])
>>> d[6] = np.NA
>>> d
 array([4, NA, 2, 3, 4, 5, NA, 7, 8, 9])
>>> a
 array([4, NA, 2, 3, 4, 5, NA, 7, 8, 9])

In the NEP 'Accessing a Boolean Mask' section there is a comment about...
actually I don't understand this section at all.  Especially the part about
a boolean byte-level mask.  Why would it need to be a byte-level mask in
order to be viewed?  As for the logic of mask = True or False, that can be
easily handled by using a better name for the flag.  'mask = True' means
that the value is masked (missing), whereas 'exposed = True' would mean
that the value is not masked (not missing).

The biggest question mark to me is that 'a[0] = np.NA' is destructive and
(using numpy.ma) 'a.mask[0] = True' is not.  Is that a big deal?  I am
trying to think back on all of my 'ma' code and try to remember if I
masked, then unmasked values and I don't recall any time that I did that.
 Of course my use cases are constrained to what I have done in the past.
 It feels like a bad idea, for the sake of saving the memory for the mask
bits.
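
Here is the numpy.ma side of that comparison as a small sketch, using made-up
values. It shows that masking and then unmasking an element leaves the
stored value intact.

```python
import numpy as np

# mask=False gives the array a full boolean mask, initially all False.
a = np.ma.array(np.arange(10, 20), mask=False)

a.mask[0] = True                  # hide the first element
hidden = a[0] is np.ma.masked     # reading it now gives the masked constant
kept = a.data[0]                  # the underlying value is still stored
print(hidden, kept)

a.mask[0] = False                 # unmask: the original value comes back
print(a[0])
```

With a destructive 'a[0] = np.NA' there would be nothing to come back to,
which is the difference I am asking about.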

Now, the amazing thing is that understanding so little, doing even less of
the work, I get to vote. Sounds like America!

I would really like to see NA in the wild, and I think that I can adapt my
ma code to it, so +1.  If it has to wait until 1.8, +1.  If it has to wait
until 1.9, +1.

Kindest regards,
Tim


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Matthew Brett
On Tue, Apr 17, 2012 at 12:32 PM, Fernando Perez  wrote:
> On Tue, Apr 17, 2012 at 12:10 PM, Matthew Brett  
> wrote:
>> Right - but that would be an absurd overstatement of what I said.
>> There's no point in addressing something I didn't say and no sensible
>> person would think.   Indeed, it makes the discussion harder.
>
> Well, in that case neither Eric Firing nor I are 'sensible persons',
> since that's how we both understood what you said (Eric's email
> appeared to me as a more concise/better phrased version of the same
> points I was making). You said:
>
> """
> I'm glad to hear that discussion is happening, but please do have it
> on list.   If it's off list it is easy for people to feel they are being
> bypassed, and that the public discussion is not important.
> """
>
> I don't think it's an 'absurd overstatement' to interpret that as
> "don't have discussions off-list", but hey, it may just be me :)

The absurd over-statement is the following:

" I'm afraid I have to disagree: you seem to be proposing an absolute,
'zero-tolerance'-style policy against any off-list discussion.  "

>> meta-problem that is a real problem is that we've shown ourselves that
>> we are not currently good at having discussions on list.
>
> Oh, I know that did happen in the past regarding this very topic (the
> big NA mess last summer); what I meant was to try and trust that *this
> time around* things might be already moving in a better direction,
> which it seems to me they are.  It seems to me that everyone is
> genuinely trying to tackle the discussion/consensus questions head-on
> right on the list, and that's why I proposed waiting to see if there
> were really any problems before asking Nathaniel not to have any
> discussion off-list (esp. since we have no evidence that what they
> talked about had any impact on any decisions bypassing the open
> forum).

The question - which seems to me to be sensible, rational and
important - is how to get better at on-list discussion, and whether
taking this particular discussion mainly off-list is good or bad in
that respect.

See you,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Fernando Perez
On Tue, Apr 17, 2012 at 12:10 PM, Matthew Brett  wrote:
> Right - but that would be an absurd overstatement of what I said.
> There's no point in addressing something I didn't say and no sensible
> person would think.   Indeed, it makes the discussion harder.

Well, in that case neither Eric Firing nor I are 'sensible persons',
since that's how we both understood what you said (Eric's email
appeared to me as a more concise/better phrased version of the same
points I was making). You said:

"""
I'm glad to hear that discussion is happening, but please do have it
on list.   If it's off list it is easy for people to feel they are being
bypassed, and that the public discussion is not important.
"""

I don't think it's an 'absurd overstatement' to interpret that as
"don't have discussions off-list", but hey, it may just be me :)

> meta-problem that is a real problem is that we've shown ourselves that
> we are not currently good at having discussions on list.

Oh, I know that did happen in the past regarding this very topic (the
big NA mess last summer); what I meant was to try and trust that *this
time around* things might be already moving in a better direction,
which it seems to me they are.  It seems to me that everyone is
genuinely trying to tackle the discussion/consensus questions head-on
right on the list, and that's why I proposed waiting to see if there
were really any problems before asking Nathaniel not to have any
discussion off-list (esp. since we have no evidence that what they
talked about had any impact on any decisions bypassing the open
forum).

Best,

f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Matthew Brett
Hi,

On Tue, Apr 17, 2012 at 12:04 PM, Fernando Perez  wrote:
> On Tue, Apr 17, 2012 at 11:40 AM, Matthew Brett  
> wrote:
>> I'm glad to hear that discussion is happening, but please do have it
>> on list.   If it's off list it is easy for people to feel they are being
>> bypassed, and that the public discussion is not important.
>
> I'm afraid I have to disagree: you seem to be proposing an absolute,
> 'zero-tolerance'-style policy against any off-list discussion.  The
> only thing ZT policies achieve is to remove common sense and human
> judgement from a process, invariably causing more harm than they do
> good, no matter how well intentioned.

Right - but that would be an absurd overstatement of what I said.
There's no point in addressing something I didn't say and no sensible
person would think.   Indeed, it makes the discussion harder.

It's just exhausting to have to keep stating the obvious.  Of course
discussions happen off-list.  Of course sometimes that has to happen.
Of course that can be a better and quicker way of having discussions.

However, in this case the

> Let's try to trust for one minute that the actual decisions will be
> made here with solid debate and project-wide input, and seek change
> only if we have evidence that this isn't happening (not evidence of a
> meta-problem that isn't a problem here).

meta-problem that is a real problem is that we've shown ourselves that
we are not currently good at having discussions on list.

There are clearly reasons for that, and also clearly, they can be
addressed.   The particular point I am making is neither silly nor
extreme nor vapid.  It is simply that, in order to make discussion
work better on the list, it is in my view better to make an explicit
effort to make the discussions - explicit.

Yours in Bay Area opposition,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Fernando Perez
On Tue, Apr 17, 2012 at 11:40 AM, Matthew Brett  wrote:
> I'm glad to hear that discussion is happening, but please do have it
> on list.   If it's off list it is easy for people to feel they are being
> bypassed, and that the public discussion is not important.

I'm afraid I have to disagree: you seem to be proposing an absolute,
'zero-tolerance'-style policy against any off-list discussion.  The
only thing ZT policies achieve is to remove common sense and human
judgement from a process, invariably causing more harm than they do
good, no matter how well intentioned.

There are perfectly reasonable cases where a quick phone call may be a
more effective and sensible way to work than an on-list discussion.
The question isn't whether someone, somewhere, had an off-list
discussion or not; it's whether *the main decision making process* is
being handled transparently or not.

I trust that Nathaniel and Travis had a sensible reason to speak
off-list; as long as it appears clear that the *decisions about numpy*
are being made via public discussion with room for all necessary input
and confrontation of disparate viewpoints, I don't care what they talk
about in private.

In IPython, I am constantly fielding private emails that I very often
refer to the list because they make more sense there, but I also have
off-list discussions when I consider that to be the right thing to do.
 And I certainly hope nobody ever asks me to *never* have an off-list
discussion.  I try very hard to ensure that the direction of the
project is very transparent, with redundant points (people) of access
to critical resources and a good vetting of key decisions with public
input (e.g. our first IPEP at
https://github.com/ipython/ipython/issues/1611).  If I am failing at
that, I hope people will call me out *on that point*, but not on
whether I ever pick up the phone or email to talk about IPython
off-list.

Let's try to trust for one minute that the actual decisions will be
made here with solid debate and project-wide input, and seek change
only if we have evidence that this isn't happening (not evidence of a
meta-problem that isn't a problem here).

Best,

f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Eric Firing
On 04/17/2012 08:40 AM, Matthew Brett wrote:
> Hi,
>
> On Tue, Apr 17, 2012 at 7:24 AM, Nathaniel Smith  wrote:
>> On Tue, Apr 17, 2012 at 5:59 AM, Matthew Brett  
>> wrote:
>>> Hi,
>>>
>>> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant  
>>> wrote:
>>>> Mark and I will have conversations about NumPy while he is in Austin.
>>>> There are many other active stake-holders whose opinions and views are
>>>> essential for major changes.    Mark and I are working on other things
>>>> besides just NumPy and all NumPy changes will be discussed on list and
>>>> require consensus or super-majority for NumPy itself to change.     I'm
>>>> not sure if that helps.   Is there more we can do?
>>>
>>> As you might have heard me say before, my concern is that it has not
>>> been easy to have good discussions on this list.   I think the problem
>>> has been that it has not been clear what the culture was, and how
>>> decisions got made, and that had led to some uncomfortable and
>>> unhelpful discussions.  My plea would be for you as BDF$N to strongly
>>> encourage on-list discussions and discourage off-list discussions as
>>> far as possible, and to help us make the difficult public effort to
>>> bash out the arguments to clarity and consensus.  I know that's a big
>>> ask.
>>
>> Hi Matthew,
>>
>> As you know, I agree with everything you just said :-). So in interest
>> of transparency, I should add: I have been in touch with Travis some
>> off-list, and the main topic has been how to proceed in a way that
>> lets us achieve public consensus.

...when possible without paralysis.

>
> I'm glad to hear that discussion is happening, but please do have it
> on list.   If it's off list it is easy for people to feel they are being
> bypassed, and that the public discussion is not important.  So, yes,
> you might get a better outcome for this specific case, but a worse
> outcome in the long term, because the list will start to feel that
> it's for signing off or voting rather than discussion, and that - I
> feel sure - would lead to worse decisions.

I think you are over-stating the case a bit.  Taking what you say 
literally, one might conclude that numpy people should never meet and 
chat, or phone each other up and chat.  But such small conversations are 
an important extension and facilitator of individual thinking. Major 
decisions do need to get hashed out publicly, but mailing list 
discussions are only one part of the thinking and decision process.

Eric

>
> The other issue is that there's a reason you are having the discussion
> off-list - which is that it was getting difficult on-list.  But -
> again - a personal view - that really has to be addressed directly by
> setting out the rules of engagement and modeling the kind of
> discussion we want to have.
>
> Cheers,
>
> Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Matthew Brett
Hi,

On Tue, Apr 17, 2012 at 7:24 AM, Nathaniel Smith  wrote:
> On Tue, Apr 17, 2012 at 5:59 AM, Matthew Brett  
> wrote:
>> Hi,
>>
>> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant  wrote:
>>> Mark and I will have conversations about NumPy while he is in Austin.   
>>> There are many other active stake-holders whose opinions and views are 
>>> essential for major changes.    Mark and I are working on other things 
>>> besides just NumPy and all NumPy changes will be discussed on list and 
>>> require consensus or super-majority for NumPy itself to change.     I'm not 
>>> sure if that helps.   Is there more we can do?
>>
>> As you might have heard me say before, my concern is that it has not
>> been easy to have good discussions on this list.   I think the problem
>> has been that it has not been clear what the culture was, and how
>> decisions got made, and that had led to some uncomfortable and
>> unhelpful discussions.  My plea would be for you as BDF$N to strongly
>> encourage on-list discussions and discourage off-list discussions as
>> far as possible, and to help us make the difficult public effort to
>> bash out the arguments to clarity and consensus.  I know that's a big
>> ask.
>
> Hi Matthew,
>
> As you know, I agree with everything you just said :-). So in interest
> of transparency, I should add: I have been in touch with Travis some
> off-list, and the main topic has been how to proceed in a way that
> lets us achieve public consensus.

I'm glad to hear that discussion is happening, but please do have it
on list.   If it's off list it is easy for people to feel they are being
bypassed, and that the public discussion is not important.  So, yes,
you might get a better outcome for this specific case, but a worse
outcome in the long term, because the list will start to feel that
it's for signing off or voting rather than discussion, and that - I
feel sure - would lead to worse decisions.

The other issue is that there's a reason you are having the discussion
off-list - which is that it was getting difficult on-list.  But -
again - a personal view - that really has to be addressed directly by
setting out the rules of engagement and modeling the kind of
discussion we want to have.

Cheers,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Nathaniel Smith
On Tue, Apr 17, 2012 at 5:59 AM, Matthew Brett  wrote:
> Hi,
>
> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant  wrote:
>> Mark and I will have conversations about NumPy while he is in Austin.   
>> There are many other active stake-holders whose opinions and views are 
>> essential for major changes.    Mark and I are working on other things 
>> besides just NumPy and all NumPy changes will be discussed on list and 
>> require consensus or super-majority for NumPy itself to change.     I'm not 
>> sure if that helps.   Is there more we can do?
>
> As you might have heard me say before, my concern is that it has not
> been easy to have good discussions on this list.   I think the problem
> has been that it has not been clear what the culture was, and how
> decisions got made, and that had led to some uncomfortable and
> unhelpful discussions.  My plea would be for you as BDF$N to strongly
> encourage on-list discussions and discourage off-list discussions as
> far as possible, and to help us make the difficult public effort to
> bash out the arguments to clarity and consensus.  I know that's a big
> ask.

Hi Matthew,

As you know, I agree with everything you just said :-). So in the interest
of transparency, I should add: I have been in touch with Travis some
off-list, and the main topic has been how to proceed in a way that
lets us achieve public consensus.

-- Nathaniel


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-17 Thread Nathaniel Smith
On Tue, Apr 17, 2012 at 6:44 AM, Travis Oliphant  wrote:
> Basically, there are two sets of changes as far as I understand right now:
>
>        1) ufunc infrastructure understands masked arrays
>        2) ndarray grew attributes to represent masked arrays
>
> I am proposing that we keep 1) but change 2) so that only certain kinds of 
> NumPy arrays actually have the extra function pointers (effectively a 
> sub-type).   In essence, what I'm proposing is that the NumPy 1.6 
> PyArrayObject become a base-object, but the other members of the C-structure 
> are not even present unless the Masked flag is set.   Such changes would not 
> require ripping code out --- just altering the presentation a bit.   Yet, 
> they could have large long-term implications, that we should explore before 
> they get fixed.
>
> Whether masked arrays should be a formal sub-class is actually an un-related 
> question and I generally lean in the direction of not encouraging sub-classes 
> of the ndarray.   The big questions are: does this object work in the 
> calculation infrastructure?   Can I add an array to a masked array?   Does it 
> have a sum method?   I think it could be argued that a masked array does have 
> an "is a" relationship with an array.   It can also be argued that it is 
> better to have a "has a" relationship with an array and be its own object.   
> Either way, this object could still have its first part be binary compatible 
> with a NumPy Array, and that is what I'm really suggesting.

It sounds like the main implementation issue here is that this masked
array class needs some way to coordinate with the ufunc infrastructure
to efficiently and reliably handle the mask in calculations. The core
ufunc code now knows how to handle masks, and this functionality is
needed for where= and NA-dtypes, so obviously it's staying,
independent of what we decide to do with masked arrays. So the
question is just, how do we get the masked array and the ufuncs
talking to each other so they can do the right thing. Perhaps we
should focus, then, on how to create a better hooking mechanism for
ufuncs? Something along these lines?
  http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html
If done in a solid enough way, this would also solve other problems,
e.g. we could make ufuncs work reliably on sparse matrices, which
seems to trip people up on scipy-user every month or two. Of course,
it's very tricky to get right :-(
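
To make the hooking idea concrete, here is a toy sketch in plain Python.
It is not NumPy's API: ElementwiseOp, MaskedList and the
__elementwise_hook__ method name are all hypothetical, and lists stand in
for arrays. The only point is that the operation asks each operand's type
whether it wants to take over before falling back to the default loop.

```python
import operator


class ElementwiseOp:
    """Toy stand-in for a ufunc: applies a binary scalar function elementwise."""

    def __init__(self, name, scalar_func):
        self.name = name
        self.scalar_func = scalar_func

    def __call__(self, a, b):
        # Give each operand's type a chance to take over the computation.
        for operand in (a, b):
            hook = getattr(type(operand), "__elementwise_hook__", None)
            if hook is not None:
                result = hook(operand, self, a, b)
                if result is not NotImplemented:
                    return result
        # Default path: plain sequences, no mask handling at all.
        return [self.scalar_func(x, y) for x, y in zip(a, b)]


class MaskedList:
    """Hypothetical masked container; True in mask means 'masked out'."""

    def __init__(self, data, mask):
        self.data = list(data)
        self.mask = list(mask)

    def __elementwise_hook__(self, op, a, b):
        def split(x):
            if isinstance(x, MaskedList):
                return x.data, x.mask
            return list(x), [False] * len(x)

        a_data, a_mask = split(a)
        b_data, b_mask = split(b)
        # An output element is masked if either input element is masked.
        mask = [ma or mb for ma, mb in zip(a_mask, b_mask)]
        data = [None if m else op.scalar_func(x, y)
                for x, y, m in zip(a_data, b_data, mask)]
        return MaskedList(data, mask)


add = ElementwiseOp("add", operator.add)
plain_result = add([1, 2], [3, 4])
masked_result = add(MaskedList([1, 2, 3], [False, True, False]), [10, 20, 30])
```

The same dispatch would let a sparse-matrix class register its own
handler, which is the other use case mentioned above.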

As for the masked array API: I'm still not convinced we know how we
want these things to behave. The masked arrays in master currently
implement MISSING semantics, but AFAICT everyone who wants MISSING
semantics prefers NA-dtypes or even plain old NaN's over a masked
implementation. And some of the current implementation's biggest
backers, like Chuck, have argued that they should switch to
skipNA=True, which is more of an IGNORED-style semantic. OTOH, there's
still disagreement over how IGNORED-style semantics should even work
(I'm thinking of that discussion about commutativity). The best existing
model is numpy.ma -- but the numpy.ma API is quite different from the
NEP, in more ways than just the default setting for skipNA. numpy.ma
uses the opposite convention for mask values, it has additional
concepts like the fillvalue, hardmask versus softmask, and then
there's the whole way the NEP uses views to manage the mask. And I
don't know which of these numpy.ma features are useful, which are
extraneous, and which are currently useful but will become extraneous
once the users who really wanted something more like NA-dtypes switch
to those.
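
A minimal sketch of the difference between the two semantics for a
reduction, assuming a numpy.ma-style mask where True means masked. The
masked_sum helper and the NA sentinel are made up for illustration and
are not part of NumPy or the NEP; the skipna argument mirrors the skipNA
switch mentioned above.

```python
NA = object()  # hypothetical sentinel standing in for a "missing" result


def masked_sum(values, mask, skipna=False):
    """Sum values, treating masked elements as MISSING or IGNORED.

    skipna=False: MISSING-style, any masked element makes the result NA.
    skipna=True:  IGNORED-style, masked elements are simply skipped.
    """
    total = 0
    for value, is_masked in zip(values, mask):
        if is_masked:
            if skipna:
                continue
            return NA
        total += value
    return total


values = [1, 2, 3]
mask = [False, True, False]
missing_style = masked_sum(values, mask)
ignored_style = masked_sum(values, mask, skipna=True)
```

With the same data and mask, the MISSING-style call propagates NA while
the IGNORED-style call returns the sum of the unmasked elements.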

So we all agree that masked arrays can be useful, and that numpy.ma
has problems. But straightforwardly porting numpy.ma to C doesn't seem
like it would help much, and neither does simply declaring that
numpy.ma has been deprecated in favour of a new NEP-like API.

So, I dunno. It seems like it might make the most sense to:
1) take the mask fields out of the core ndarray (while leaving the
rest of Mark's infrastructure, as per above)
2) make sure we have the hooks needed so that numpy.ma, and NEP-like
APIs, and whatever other experiments people want to try, can all
integrate well with ufuncs, and make any other extensions that are
generally useful and required so that they can work well
3) once we've experimented, move the winner into the core. Or whatever
else makes sense to do once we understand what we're trying to
accomplish.

-- Nathaniel


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Gael Varoquaux
On Mon, Apr 16, 2012 at 10:40:53PM -0500, Travis Oliphant wrote:
> > The objectors object to any binary ABI change, but not specifically
> > three pointers rather than two or one?

> Adding pointers is not really an ABI change (but removing them after
> they were there would be...)  It's really just the addition of data to
> the NumPy array structure that they aren't going to use.  Most of the
> time it would not be a real problem (the number of use-cases where you
> have a lot of small NumPy arrays is small), but when it is a problem it
> is very annoying. 

I think that something that the numpy community must be very careful
about is ABI breakage. At the scale of a large and heavy institution, it
is very costly. In my mind, this is the argument that should guide the
discussion: is going one way or the other (removing NA or not)
likely to lead us into ABI breakage?

My 2 cents,

Gaël


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant

On Apr 16, 2012, at 11:59 PM, Matthew Brett wrote:

> Hi,
> 
> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant  wrote:
 
>>>> I think the answer to this is yes, but it could be as a feature-filled 
>>>> sub-class (like the current numpy.ma, except in C).
>>> 
>>> I'd love to hear that argument fleshed out in more detail - do you have 
>>> time?
>> 
>> 
>> My proposal here is to basically take the current github NumPy 
>> data-structure and make this a sub-type (in C) of the NumPy 1.6 
>> data-structure which is unchanged in NumPy 1.7.
>> 
>> This would not require removing code but would require another PyTypeObject 
>> and associated structures.  I expect Mark could do this work in 2-4 weeks.   
>> We also have other developers who could help in order to get the sub-type in 
>> NumPy 1.7. What kind of details would you like to see?
> 
> I was dimly thinking of the same questions that Chuck had - about how
> subclassing would relate to the ufunc changes.

Basically, there are two sets of changes as far as I understand right now:  

1) ufunc infrastructure understands masked arrays
2) ndarray grew attributes to represent masked arrays

I am proposing that we keep 1) but change 2) so that only certain kinds of 
NumPy arrays actually have the extra function pointers (effectively a 
sub-type).   In essence, what I'm proposing is that the NumPy 1.6 PyArrayObject 
become a base-object, but the other members of the C-structure are not even 
present unless the Masked flag is set.   Such changes would not require ripping 
code out --- just altering the presentation a bit.   Yet, they could have large 
long-term implications, that we should explore before they get fixed.

Whether masked arrays should be a formal sub-class is actually an un-related 
question and I generally lean in the direction of not encouraging sub-classes 
of the ndarray.   The big questions are: does this object work in the 
calculation infrastructure?   Can I add an array to a masked array?   Does it 
have a sum method?   I think it could be argued that a masked array does have an 
"is a" relationship with an array.   It can also be argued that it is better to 
have a "has a" relationship with an array and be its own object.   Either way, 
this object could still have its first part be binary compatible with a NumPy 
Array, and that is what I'm really suggesting. 
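
As a rough illustration of what "first part binary compatible" means,
here is a sketch using Python's ctypes to mimic the C layout. The struct
and field names (BaseArray, MaskedArray, mask_dtype, mask_data,
mask_strides) and the MASKED_FLAG value are hypothetical and heavily
simplified; this is not the real PyArrayObject definition.

```python
import ctypes

MASKED_FLAG = 0x1  # hypothetical flag bit, not NumPy's real value


class BaseArray(ctypes.Structure):
    """Simplified stand-in for the unchanged 1.6-style array struct."""
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("nd", ctypes.c_int),
        ("flags", ctypes.c_int),
    ]


class MaskedArray(ctypes.Structure):
    """Sub-type: the base struct comes first, the mask pointers follow."""
    _fields_ = [
        ("base", BaseArray),
        ("mask_dtype", ctypes.c_void_p),
        ("mask_data", ctypes.c_void_p),
        ("mask_strides", ctypes.c_void_p),
    ]


m = MaskedArray()
m.base.nd = 2
m.base.flags = MASKED_FLAG

# Code that only knows about BaseArray can read the same memory unchanged.
as_base = ctypes.cast(ctypes.pointer(m), ctypes.POINTER(BaseArray)).contents

# Only the sub-type pays for the three extra pointers.
extra = ctypes.sizeof(MaskedArray) - ctypes.sizeof(BaseArray)
```

Because the base struct sits at offset 0, a pointer to the sub-type can
be handed to code written against the base layout, and plain arrays
never carry the extra fields.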

-Travis





> 
>> I just think we need more data and uses and this would provide a way to get 
>> that without making a forced decision one way or another.
> 
> Is the proposal that this would be an alternative API to numpy.ma?
> Is numpy.ma not itself satisfactory as a test of these uses, because
> of performance or some other reason?
> 
>>>>> 2) Will likely changes to the masked array API make any difference to
>>>>> the number of extra pointers?  Likely answer no?
>>>>> 
>>>>> Is that right?
>>>> 
>>>> The answer to this is very likely no on the Python side.  But, on the 
>>>> C-side, there could be some differences (i.e. are masked arrays a 
>>>> sub-class of the ndarray or not).
>>>> 
>>>>> 
>>>>> I have the impression that the masked array API discussion still has
>>>>> not come out fully into the unforgiving light of discussion day, but
>>>>> if the answer to 2) is No, then I suppose the API discussion is not
>>>>> relevant to the 3 pointers change.
>>>> 
>>>> You are correct that the API discussion is separate from this one. 
>>>> Overall,  I was surprised at how fervently people would oppose ABI 
>>>> changes.   As has been pointed out, NumPy and Numeric before it were not 
>>>> really designed to prevent having to recompile when changes were made.   
>>>> I'm still not sure that a better overall solution is not to promote better 
>>>> availability of downstream binary packages than excessively worry about 
>>>> ABI changes in NumPy.    But, that is the current climate.
>>> 
>>> The objectors object to any binary ABI change, but not specifically
>>> three pointers rather than two or one?
>> 
>> Adding pointers is not really an ABI change (but removing them after they 
>> were there would be...)  It's really just the addition of data to the NumPy 
>> array structure that they aren't going to use.  Most of the time it would 
>> not be a real problem (the number of use-cases where you have a lot of small 
>> NumPy arrays is small), but when it is a problem it is very annoying.
>> 
>>> 
>>> Is their point then about ABI breakage?  Because that seems like a
>>> different point again.
>> 
>> Yes, it's not that.
>> 
>>> 
>>> Or is it possible that they are in fact worried about the masked array API?
>> 
>> I don't think most people whose opinion would be helpful are really tuned in 
>> to the discussion at this point.  I think they just want us to come up with 
>> an answer and then move forward.    But, they will judge us based on the 
>> solution we come up with.
>> 
>>> 
 Mark and I will talk about this long and hard.  Mark has ideas about 

Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Charles R Harris
On Mon, Apr 16, 2012 at 10:38 PM, Travis Oliphant wrote:

>
> On Apr 16, 2012, at 11:01 PM, Charles R Harris wrote:
>
>
>
> On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant wrote:
>
>>
>> On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:
>>
>> > Hi,
>> >
>> > On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant 
>> wrote:
>> >
>> >> I have heard from a few people that they are not excited by the growth
>> of
>> >> the NumPy data-structure by the 3 pointers needed to hold the
>> masked-array
>> >> storage.   This is especially true when there is talk to potentially
>> add
>> >> additional attributes to the NumPy array (for labels and other
>> >> meta-information).  If you are willing to let us know how you feel
>> about
>> >> this, please speak up.
>> >
>> > I guess there are two questions here
>> >
>> > 1) Will something like the current version of masked arrays have a
>> > long term future in numpy, regardless of eventual API? Most likely
>> > answer - yes?
>>
>> I think the answer to this is yes, but it could be as a feature-filled
>> sub-class (like the current numpy.ma, except in C).
>>
>
> I think making numpy.ma a subclass of ndarray has caused all sorts of
> trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from
> ndarray for implementation of various parts. The upshot is that almost
> everything has to be overridden, so it didn't buy much.
>
>
> This is a valid point.   One could create a new object that is binary
> compatible with the NumPy Array but not really a sub-class but provides the
> array interface.   We could keep Mark's modifications to the array
> interface as well so that it can communicate a mask.
>
>
Another place inheritance causes problems is PyUnicodeArrType inheriting
from PyUnicodeType. There the difficulty is that the unicode
itemsize/encoding may not match
between the types. IIRC, it isn't recommended that derived classes change
the itemsize. Numpy also has the different byte orderings...

The Python types are sort of like virtual classes, so in some sense they
are designed for inheritance. We could maybe set up some sort of parallel
numpy type system with empty slots and such but we would need to decide
what those slots are ahead of time. And if we got really serious, ABI
backwards compatibility would break big time.



Chuck


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Matthew Brett
Hi,

On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant  wrote:
>>>
>>> I think the answer to this is yes, but it could be as a feature-filled 
>>> sub-class (like the current numpy.ma, except in C).
>>
>> I'd love to hear that argument fleshed out in more detail - do you have time?
>
>
> My proposal here is to basically take the current github NumPy data-structure 
> and make this a sub-type (in C) of the NumPy 1.6 data-structure which is 
> unchanged in NumPy 1.7.
>
> This would not require removing code but would require another PyTypeObject 
> and associated structures.  I expect Mark could do this work in 2-4 weeks.   
> We also have other developers who could help in order to get the sub-type in 
> NumPy 1.7.     What kind of details would you like to see?

I was dimly thinking of the same questions that Chuck had - about how
subclassing would relate to the ufunc changes.

> I just think we need more data and uses and this would provide a way to get 
> that without making a forced decision one way or another.

Is the proposal that this would be an alternative API to numpy.ma?
Is numpy.ma not itself satisfactory as a test of these uses, because
of performance or some other reason?

>>>> 2) Will likely changes to the masked array API make any difference to
>>>> the number of extra pointers?  Likely answer no?
>>>>
>>>> Is that right?
>>>
>>> The answer to this is very likely no on the Python side.  But, on the 
>>> C-side, there could be some differences (i.e. are masked arrays a sub-class 
>>> of the ndarray or not).
>>>
>>>>
>>>> I have the impression that the masked array API discussion still has
>>>> not come out fully into the unforgiving light of discussion day, but
>>>> if the answer to 2) is No, then I suppose the API discussion is not
>>>> relevant to the 3 pointers change.
>>>
>>> You are correct that the API discussion is separate from this one.     
>>> Overall,  I was surprised at how fervently people would oppose ABI changes. 
>>>   As has been pointed out, NumPy and Numeric before it were not really 
>>> designed to prevent having to recompile when changes were made.   I'm still 
>>> not sure that a better overall solution is not to promote better 
>>> availability of downstream binary packages than excessively worry about ABI 
>>> changes in NumPy.    But, that is the current climate.
>>
>> The objectors object to any binary ABI change, but not specifically
>> three pointers rather than two or one?
>
> Adding pointers is not really an ABI change (but removing them after they 
> were there would be...)  It's really just the addition of data to the NumPy 
> array structure that they aren't going to use.  Most of the time it would not 
> be a real problem (the number of use-cases where you have a lot of small 
> NumPy arrays is small), but when it is a problem it is very annoying.
>
>>
>> Is their point then about ABI breakage?  Because that seems like a
>> different point again.
>
> Yes, it's not that.
>
>>
>> Or is it possible that they are in fact worried about the masked array API?
>
> I don't think most people whose opinion would be helpful are really tuned in 
> to the discussion at this point.  I think they just want us to come up with 
> an answer and then move forward.    But, they will judge us based on the 
> solution we come up with.
>
>>
>>> Mark and I will talk about this long and hard.  Mark has ideas about where 
>>> he wants to see NumPy go, but I don't think we have fully accounted for 
>>> where NumPy and its user base *is* and there may be better ways to approach 
>>> this evolution.    If others are interested in the outcome of the 
>>> discussion please speak up (either on the list or privately) and we will 
>>> make sure your views get heard and accounted for.
>>
>> I started writing something about this but I guess you'd know what I'd
>> write, so I only humbly ask that you consider whether it might be
>> doing real damage to allow substantial discussion that is not
>> documented or argued out in public.
>
> It will be documented and argued in public.     We are just going to have one 
> off-list conversation to try and speed up the process.    You make a valid 
> point, and I appreciate the perspective.     Please speak up again after 
> hearing the report if something is not clear.   I don't want this to even 
> have the appearance of a "back-room" deal.
>
> Mark and I will have conversations about NumPy while he is in Austin.   There 
> are many other active stake-holders whose opinions and views are essential 
> for major changes.    Mark and I are working on other things besides just 
> NumPy and all NumPy changes will be discussed on list and require consensus 
> or super-majority for NumPy itself to change.     I'm not sure if that helps. 
>   Is there more we can do?

As you might have heard me say before, my concern is that it has not
been easy to have good discussions on this list.   I think the problem
has been that it has not been clear what the culture wa

Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant

On Apr 16, 2012, at 11:01 PM, Charles R Harris wrote:

> 
> 
> On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant  wrote:
> 
> On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:
> 
> > Hi,
> >
> > On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant  
> > wrote:
> >
> >> I have heard from a few people that they are not excited by the growth of
> >> the NumPy data-structure by the 3 pointers needed to hold the masked-array
> >> storage.   This is especially true when there is talk to potentially add
> >> additional attributes to the NumPy array (for labels and other
> >> meta-information).  If you are willing to let us know how you feel 
> >> about
> >> this, please speak up.
> >
> > I guess there are two questions here
> >
> > 1) Will something like the current version of masked arrays have a
> > long term future in numpy, regardless of eventual API? Most likely
> > answer - yes?
> 
> I think the answer to this is yes, but it could be as a feature-filled 
> sub-class (like the current numpy.ma, except in C).
> 
> I think making numpy.ma a subclass of ndarray has caused all sorts of 
> trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from 
> ndarray for implementation of various parts. The upshot is that almost 
> everything has to be overridden, so it didn't buy much.

This is a valid point.   One could create a new object that is binary 
compatible with the NumPy Array but not really a sub-class but provides the 
array interface.   We could keep Mark's modifications to the array interface 
as well so that it can communicate a mask. 

-Travis




>  
> 
> > 2) Will likely changes to the masked array API make any difference to
> > the number of extra pointers?  Likely answer no?
> >
> > Is that right?
> 
> The answer to this is very likely no on the Python side.  But, on the C-side, 
> there could be some differences (i.e. are masked arrays a sub-class of the 
> ndarray or not).
> 
> >
> > I have the impression that the masked array API discussion still has
> > not come out fully into the unforgiving light of discussion day, but
> > if the answer to 2) is No, then I suppose the API discussion is not
> > relevant to the 3 pointers change.
> 
> You are correct that the API discussion is separate from this one. 
> Overall,  I was surprised at how fervently people would oppose ABI changes.   
> As has been pointed out, NumPy and Numeric before it were not really designed 
> to prevent having to recompile when changes were made.   I'm still not sure 
> that a better overall solution is not to promote better availability of 
> downstream binary packages than excessively worry about ABI changes in NumPy. 
> But, that is the current climate.
> 
> In that climate, my concern is that we haven't finalized the API but are 
> rapidly cementing the *structure* of NumPy arrays into a modified form that 
> has real downstream implications.   Two other people I have talked to share 
> this concern (nobody who has posted on this list before but who are heavy 
> users of NumPy).   I may have missed the threads where it was discussed, but 
> have these structure changes and their implications been fully discussed?   
> Is there anyone else who is concerned about adding 3 more pointers (12 bytes 
> or 24 bytes) to the NumPy structure?
> 
> As Chuck points out, 3 more pointers is not necessarily that big of a deal if 
> you are talking about a large array (though for small arrays it could 
> matter).   But, I personally know of half-written NEPs that propose to add 
> more pointers to the NumPy array:
> 
>    * to allow meta-information to be attached to a NumPy array
>    * to allow labels to be attached to a NumPy array (ala data-array)
>    * to allow multiple chunks for an array.
> 
> Are people O.K. with 5 or 6 more pointers on every NumPy array?   We could 
> also think about adding just one more pointer to a new "enhanced" structure 
> that contains multiple enhancements to the NumPy array.
> 
> 
> Yes, this whole thing could get out of hand with too many extras. One of the 
> things you could discuss with Mark is how to deal with this, or limit the 
> modifications. At some point the ndarray class could become cumbersome, 
> complicated, and difficult to maintain. We need to be careful that it doesn't 
> go that way. I'd like to keep it as simple as possible, the question is what 
> is fundamental. The main long term advantage of having masks part of the base 
> is the possibility of adapted loops in ufuncs, which would give the advantage 
> of speed. But that is just how it looks from where I stand, no doubt others 
> have different priorities.
> 
> But, this whole line of discussion sounds a lot like a true sub-class of the 
> NumPy array at the C-level.   It has the benefit that only people that use 
> the features of the sub-class have to worry about using the extra space.
> 
> Mark and I will talk about this long and hard.  Mark has ideas about where he 
> wants to see NumP

Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Charles R Harris
On Mon, Apr 16, 2012 at 8:46 PM, Travis Oliphant wrote:

>
> On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:
>
> > Hi,
> >
> > On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant 
> wrote:
> >
> >> I have heard from a few people that they are not excited by the growth
> of
> >> the NumPy data-structure by the 3 pointers needed to hold the
> masked-array
> >> storage.   This is especially true when there is talk to potentially add
> >> additional attributes to the NumPy array (for labels and other
> >> meta-information).  If you are willing to let us know how you feel
> about
> >> this, please speak up.
> >
> > I guess there are two questions here
> >
> > 1) Will something like the current version of masked arrays have a
> > long term future in numpy, regardless of eventual API? Most likely
> > answer - yes?
>
> I think the answer to this is yes, but it could be as a feature-filled
> sub-class (like the current numpy.ma, except in C).
>

I think making numpy.ma a subclass of ndarray has caused all sorts of
trouble. It doesn't satisfy 'is a', rather it tries to use inheritance from
ndarray for implementation of various parts. The upshot is that almost
everything has to be overridden, so it didn't buy much.


>
> > 2) Will likely changes to the masked array API make any difference to
> > the number of extra pointers?  Likely answer no?
> >
> > Is that right?
>
> The answer to this is very likely no on the Python side.  But, on the
> C-side, there could be some differences (i.e. are masked arrays a sub-class
> of the ndarray or not).
>
> >
> > I have the impression that the masked array API discussion still has
> > not come out fully into the unforgiving light of discussion day, but
> > if the answer to 2) is No, then I suppose the API discussion is not
> > relevant to the 3 pointers change.
>
> You are correct that the API discussion is separate from this one.
> Overall,  I was surprised at how fervently people would oppose ABI changes.
>   As has been pointed out, NumPy and Numeric before it were not really
> designed to prevent having to recompile when changes were made.   I'm still
> not sure that a better overall solution is not to promote better
> availability of downstream binary packages than excessively worry about ABI
> changes in NumPy.    But, that is the current climate.
>
> In that climate, my concern is that we haven't finalized the API but are
> rapidly cementing the *structure* of NumPy arrays into a modified form that
> has real downstream implications.   Two other people I have talked to share
> this concern (nobody who has posted on this list before but who are heavy
> users of NumPy).   I may have missed the threads where it was discussed,
> but have these structure changes and their implications been fully
> discussed?   Is there anyone else who is concerned about adding 3 more
> pointers (12 bytes or 24 bytes) to the NumPy structure?
>
> As Chuck points out, 3 more pointers is not necessarily that big of a deal
> if you are talking about a large array (though for small arrays it could
> matter).   But, I personally know of half-written NEPs that propose to add
> more pointers to the NumPy array:
>
>    * to allow meta-information to be attached to a NumPy array
>    * to allow labels to be attached to a NumPy array (ala data-array)
>    * to allow multiple chunks for an array.
>
> Are people O.K. with 5 or 6 more pointers on every NumPy array?   We
> could also think about adding just one more pointer to a new "enhanced"
> structure that contains multiple enhancements to the NumPy array.
>
>
Yes, this whole thing could get out of hand with too many extras. One of
the things you could discuss with Mark is how to deal with this, or limit
the modifications. At some point the ndarray class could become cumbersome,
complicated, and difficult to maintain. We need to be careful that it
doesn't go that way. I'd like to keep it as simple as possible, the
question is what is fundamental. The main long term advantage of having
masks part of the base is the possibility of adapted loops in ufuncs, which
would give the advantage of speed. But that is just how it looks from where
I stand, no doubt others have different priorities.
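
For what it's worth, the "one more pointer to an enhanced structure"
option quoted above can be sketched with ctypes as follows. All struct
and field names here (Extensions, ArrayInline, ArrayWithExt, ext) are
hypothetical and the structs are heavily simplified; this only compares
the per-array size of the two layouts.

```python
import ctypes


class Extensions(ctypes.Structure):
    """Hypothetical bundle of optional features, allocated only when needed."""
    _fields_ = [
        ("mask_data", ctypes.c_void_p),
        ("labels", ctypes.c_void_p),
        ("metadata", ctypes.c_void_p),
        ("chunks", ctypes.c_void_p),
    ]


class ArrayInline(ctypes.Structure):
    """Every optional pointer lives directly in the array struct."""
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("mask_data", ctypes.c_void_p),
        ("labels", ctypes.c_void_p),
        ("metadata", ctypes.c_void_p),
        ("chunks", ctypes.c_void_p),
    ]


class ArrayWithExt(ctypes.Structure):
    """Only one extra pointer per array; NULL when no extras are used."""
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("ext", ctypes.POINTER(Extensions)),
    ]


PTR = ctypes.sizeof(ctypes.c_void_p)

plain = ArrayWithExt()                       # ext is NULL, nothing allocated
ext_block = Extensions()
enhanced = ArrayWithExt(ext=ctypes.pointer(ext_block))

inline_size = ctypes.sizeof(ArrayInline)     # 5 pointers on every array
ext_size = ctypes.sizeof(ArrayWithExt)       # 2 pointers on every array
```

The trade-off is an extra indirection whenever an enhancement is
actually used, in exchange for a fixed one-pointer cost on arrays that
use none of them.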

> But, this whole line of discussion sounds a lot like a true sub-class of
> the NumPy array at the C-level.   It has the benefit that only people that
> use the features of the sub-class have to worry about using the extra space.
>
> Mark and I will talk about this long and hard.  Mark has ideas about where
> he wants to see NumPy go, but I don't think we have fully accounted for
> where NumPy and its user base *is* and there may be better ways to approach
> this evolution.    If others are interested in the outcome of the
> discussion please speak up (either on the list or privately) and we will
> make sure your views get heard and accounted for.
>
>
Chuck

Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant
>> 
>> I think the answer to this is yes, but it could be as a feature-filled 
>> sub-class (like the current numpy.ma, except in C).
> 
> I'd love to hear that argument fleshed out in more detail - do you have time?


My proposal here is to basically take the current github NumPy data-structure 
and make this a sub-type (in C) of the NumPy 1.6 data-structure which is 
unchanged in NumPy 1.7.   

This would not require removing code but would require another PyTypeObject and 
associated structures.  I expect Mark could do this work in 2-4 weeks.   We 
also have other developers who could help in order to get the sub-type in NumPy 
1.7. What kind of details would you like to see? 

In this way, the masked-array approach to missing data could be pursued by 
those who prefer that approach without affecting any other users of numpy 
arrays (and the numpy.ma sub-class could be deprecated). I would also like 
to add missing-data dtypes (ideally before NumPy 1.7, but it is not a 
requirement of release). 

I just think we need more data and uses and this would provide a way to get 
that without making a forced decision one way or another. 

> 
>>> 2) Will likely changes to the masked array API make any difference to
>>> the number of extra pointers?  Likely answer no?
>>> 
>>> Is that right?
>> 
>> The answer to this is very likely no on the Python side.  But, on the 
>> C-side, there could be some differences (i.e. are masked arrays a sub-class 
>> of the ndarray or not).
>> 
>>> 
>>> I have the impression that the masked array API discussion still has
>>> not come out fully into the unforgiving light of discussion day, but
>>> if the answer to 2) is No, then I suppose the API discussion is not
>>> relevant to the 3 pointers change.
>> 
>> You are correct that the API discussion is separate from this one. 
>> Overall,  I was surprised at how fervently people would oppose ABI changes.  
>>  As has been pointed out, NumPy and Numeric before it were not really 
>> designed to prevent having to recompile when changes were made.   I'm still 
>> not sure that a better overall solution is not to promote better 
>> availability of downstream binary packages than excessively worry about ABI 
>> changes in NumPy.    But, that is the current climate.
> 
> The objectors object to any binary ABI change, but not specifically
> three pointers rather than two or one?

Adding pointers is not really an ABI change (but removing them after they were 
there would be...).  It's really just the addition of data to the NumPy array 
structure that they aren't going to use.  Most of the time it would not be a 
real problem (the number of use-cases where you have a lot of small NumPy 
arrays is small), but when it is a problem it is very annoying. 

> 
> Is their point then about ABI breakage?  Because that seems like a
> different point again.

Yes, it's not that. 

> 
> Or is it possible that they are in fact worried about the masked array API?

I don't think most people whose opinion would be helpful are really tuned in to 
the discussion at this point.  I think they just want us to come up with an 
answer and then move forward.  But, they will judge us based on the solution 
we come up with. 

> 
>> Mark and I will talk about this long and hard.  Mark has ideas about where 
>> he wants to see NumPy go, but I don't think we have fully accounted for 
>> where NumPy and its user base *is* and there may be better ways to approach 
>> this evolution.  If others are interested in the outcome of the discussion 
>> please speak up (either on the list or privately) and we will make sure your 
>> views get heard and accounted for.
> 
> I started writing something about this but I guess you'd know what I'd
> write, so I only humbly ask that you consider whether it might be
> doing real damage to allow substantial discussion that is not
> documented or argued out in public.

It will be documented and argued in public. We are just going to have one 
off-list conversation to try and speed up the process.  You make a valid 
point, and I appreciate the perspective. Please speak up again after 
hearing the report if something is not clear.   I don't want this to even have 
the appearance of a "back-room" deal. 

Mark and I will have conversations about NumPy while he is in Austin.   There 
are many other active stake-holders whose opinions and views are essential for 
major changes.  Mark and I are working on other things besides just NumPy and 
all NumPy changes will be discussed on list and require consensus or 
super-majority for NumPy itself to change. I'm not sure if that helps.   Is 
there more we can do? 

Thanks, 

-Travis



> 
> See you,
> 
> Matthew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant
Ralf, 

I wouldn't change your plans just yet for NumPy 1.7.   With Mark available full 
time for the next few weeks, I think we will be able to make rapid progress on 
whatever is decided -- in fact if people are available to help but just need 
resources let me know off list.  

I just want to make sure that the process for making significant changes to 
NumPy does not disenfranchise any voice.   Like bug reports and 
feature requests, complaints are food to a project, just as usage is oxygen.  
   In my view, we should take any concern that is raised from the perspective 
that NumPy is "guilty until proven innocent."  This takes some intentional 
effort.   I have found that because of how much work it takes to design and 
implement software, my natural perspective is to be defensive, but I have 
always appreciated the outcome when all view-points are considered seriously 
and addressed respectfully.  

Best regards,

-Travis

 


On Apr 16, 2012, at 6:01 PM, Ralf Gommers wrote:

> 
> 
> On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez  wrote:
> On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers
>  wrote:
> > That's the first time I've heard this. Until now, we have talked a lot about
> > adding bitmasks and API changes, not about complete removal. My assumption
> > was that the experimental label was enough. From Nathaniel's reaction I
> > gathered the same. It looks like too many conversations on this topic are
> > happening off-list.
> 
> My impression was that Travis was just suggesting that as an option
> here for discussion, not presenting it as something discussed
> elsewhere.  
> 
> From "I have heard from a few people that they are not excited " I deduce 
> it was discussed to some extent.
> 
> I read Travis' email precisely as restarting the
> discussion for consideration of the issues in full public view
> 
> It wasn't restating anything, it's completely opposite to the part that I 
> thought we did reach consensus on (*not* backing out changes). I stated as 
> much when first discussing a 1.7.0 in December, 
> http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027, 
> with no one disagreeing.
> 
> It's perfectly fine to reconsider any previous decisions/discussions of 
> course. 
> 
> However, I do now draw the conclusion that it's best to wait for this issue 
> to be resolved before considering a new release. I had been working on 
> closing tickets and cleaning up loose ends for 1.7.0, and pinging others to 
> do the same. I guess I'll stop doing that for now, until the renewed NA 
> debate has been settled.
> 
> If there are bug fixes that are important (like the Debian segfaults with 
> Python debug builds), we can do a 1.6.2 release.
> 
> Ralf
> 
> (+
> calls/skype open to anyone interested for bandwidth purposes), so in
> this case I don't think there's any background off-list to worry
> about.  At least that's how I read it...
> 
> Cheers,
> 
> f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Matthew Brett
Hi,

On Mon, Apr 16, 2012 at 7:46 PM, Travis Oliphant  wrote:
>
> On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:
>
>> Hi,
>>
>> On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant  wrote:
>>
>>> I have heard from a few people that they are not excited by the growth of
>>> the NumPy data-structure by the 3 pointers needed to hold the masked-array
>>> storage.   This is especially true when there is talk to potentially add
>>> additional attributes to the NumPy array (for labels and other
>>> meta-information).      If you are willing to let us know how you feel about
>>> this, please speak up.
>>
>> I guess there are two questions here
>>
>> 1) Will something like the current version of masked arrays have a
>> long term future in numpy, regardless of eventual API? Most likely
>> answer - yes?
>
> I think the answer to this is yes, but it could be as a feature-filled 
> sub-class (like the current numpy.ma, except in C).

I'd love to hear that argument fleshed out in more detail - do you have time?

>> 2) Will likely changes to the masked array API make any difference to
>> the number of extra pointers?  Likely answer no?
>>
>> Is that right?
>
> The answer to this is very likely no on the Python side.  But, on the C-side, 
> there could be some differences (i.e. are masked arrays a sub-class of the 
> ndarray or not).
>
>>
>> I have the impression that the masked array API discussion still has
>> not come out fully into the unforgiving light of discussion day, but
>> if the answer to 2) is No, then I suppose the API discussion is not
>> relevant to the 3 pointers change.
>
> You are correct that the API discussion is separate from this one.     
> Overall,  I was surprised at how fervently people would oppose ABI changes.   
> As has been pointed out, NumPy and Numeric before it were not really designed 
> to prevent having to recompile when changes were made.   I'm still not sure 
> that a better overall solution is not to promote better availability of 
> downstream binary packages than excessively worry about ABI changes in NumPy. 
>    But, that is the current climate.

The objectors object to any binary ABI change, but not specifically
three pointers rather than two or one?

Is their point then about ABI breakage?  Because that seems like a
different point again.

Or is it possible that they are in fact worried about the masked array API?

> Mark and I will talk about this long and hard.  Mark has ideas about where he 
> wants to see NumPy go, but I don't think we have fully accounted for where 
> NumPy and its user base *is* and there may be better ways to approach this 
> evolution.    If others are interested in the outcome of the discussion 
> please speak up (either on the list or privately) and we will make sure your 
> views get heard and accounted for.

I started writing something about this but I guess you'd know what I'd
write, so I only humbly ask that you consider whether it might be
doing real damage to allow substantial discussion that is not
documented or argued out in public.

See you,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant

On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:

> Hi,
> 
> On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant  wrote:
> 
>> I have heard from a few people that they are not excited by the growth of
>> the NumPy data-structure by the 3 pointers needed to hold the masked-array
>> storage.   This is especially true when there is talk to potentially add
>> additional attributes to the NumPy array (for labels and other
>> meta-information).  If you are willing to let us know how you feel about
>> this, please speak up.
> 
> I guess there are two questions here
> 
> 1) Will something like the current version of masked arrays have a
> long term future in numpy, regardless of eventual API? Most likely
> answer - yes?

I think the answer to this is yes, but it could be as a feature-filled 
sub-class (like the current numpy.ma, except in C). 

> 2) Will likely changes to the masked array API make any difference to
> the number of extra pointers?  Likely answer no?
> 
> Is that right?

The answer to this is very likely no on the Python side.  But, on the C-side, 
there could be some differences (i.e. are masked arrays a sub-class of the 
ndarray or not). 

> 
> I have the impression that the masked array API discussion still has
> not come out fully into the unforgiving light of discussion day, but
> if the answer to 2) is No, then I suppose the API discussion is not
> relevant to the 3 pointers change.

You are correct that the API discussion is separate from this one. Overall, 
 I was surprised at how fervently people would oppose ABI changes.   As has 
been pointed out, NumPy and Numeric before it were not really designed to 
prevent having to recompile when changes were made.   I'm still not sure that a 
better overall solution is not to promote better availability of downstream 
binary packages than excessively worry about ABI changes in NumPy.  But, that 
is the current climate. 

In that climate, my concern is that we haven't finalized the API but are 
rapidly cementing the *structure* of NumPy arrays into a modified form that has 
real downstream implications.   Two other people I have talked to share this 
concern (neither has posted on this list before, but both are heavy users of 
NumPy).  I may have missed the threads where it was discussed, but have these 
structure changes and their implications been fully discussed?   Is there 
anyone else who is concerned about adding 3 more pointers (12 bytes or 24 
bytes) to the NumPy structure? 
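
To put rough numbers on that question, here is a small sketch of the arithmetic. It only assumes that a pointer is 4 bytes on a 32-bit build and 8 bytes on a 64-bit build; the helper names extra_bytes and overhead_ratio are made up for illustration, and the float64 item size of 8 bytes is an assumption about the array contents.

```python
import ctypes

# Size of one C pointer on the platform running this script
# (typically 4 bytes on a 32-bit build, 8 bytes on a 64-bit build).
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)


def extra_bytes(n_pointers, pointer_size=POINTER_SIZE):
    """Bytes added to every array object by n_pointers extra pointer fields."""
    return n_pointers * pointer_size


def overhead_ratio(n_pointers, n_elements, itemsize=8, pointer_size=POINTER_SIZE):
    """Extra header bytes divided by the bytes of actual array data.

    itemsize=8 assumes float64 elements (an illustrative choice).
    """
    return extra_bytes(n_pointers, pointer_size) / (n_elements * itemsize)


if __name__ == "__main__":
    for size in (4, 8):
        print(f"{size}-byte pointers: 3 extra pointers = {extra_bytes(3, size)} bytes")
    # A 2x2 float64 array holds 32 bytes of data; a 1000x1000 one holds 8 MB.
    print("2x2 array:", overhead_ratio(3, 4, pointer_size=8))
    print("1000x1000 array:", overhead_ratio(3, 1_000_000, pointer_size=8))
```

On a 64-bit build the three pointers add 24 bytes, which is 75% of the data size of a 2x2 float64 array but negligible for a large one. This counts only the new fields, not the rest of the array header.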

As Chuck points out, 3 more pointers is not necessarily that big of a deal if 
you are talking about a large array (though for small arrays it could matter).  
 But, I personally know of half-written NEPs that propose to add more pointers 
to the NumPy array: 

* to allow meta-information to be attached to a NumPy array
* to allow labels to be attached to a NumPy array (ala data-array)
* to allow multiple chunks for an array.

Are people O.K. with 5 or 6 more pointers on every NumPy array?  We could 
also think about adding just one more pointer to a new "enhanced" structure 
that contains multiple enhancements to the NumPy array.

But, this whole line of discussion sounds a lot like a true sub-class of the 
NumPy array at the C-level.  It has the benefit that only people that use the 
features of the sub-class have to worry about using the extra space.  
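
The three layouts mentioned above (several pointers on every array, a single pointer to an "enhanced" structure, or a C-level sub-class) can be compared roughly by modelling them with ctypes. This is only a sketch: the struct and field names are hypothetical, and the base header is a simplified stand-in, not NumPy's actual array structure.

```python
import ctypes

P = ctypes.sizeof(ctypes.c_void_p)  # pointer size on this platform

# Hypothetical, simplified array header; NOT NumPy's real layout.
BASE_FIELDS = [
    ("data", ctypes.c_void_p),
    ("nd", ctypes.c_int),
    ("dimensions", ctypes.c_void_p),
    ("strides", ctypes.c_void_p),
]

# Illustrative names for the optional features discussed in this thread:
# three mask pointers plus meta-information, labels and chunks.
OPTIONAL_FIELDS = [
    ("mask_dtype", ctypes.c_void_p),
    ("mask_data", ctypes.c_void_p),
    ("mask_strides", ctypes.c_void_p),
    ("meta", ctypes.c_void_p),
    ("labels", ctypes.c_void_p),
    ("chunks", ctypes.c_void_p),
]


class BaseArray(ctypes.Structure):
    """The unchanged base structure."""
    _fields_ = BASE_FIELDS


class InlineArray(ctypes.Structure):
    """Every array carries all six optional pointers, used or not."""
    _fields_ = BASE_FIELDS + OPTIONAL_FIELDS


class Enhancements(ctypes.Structure):
    """Allocated only for arrays that actually use an optional feature."""
    _fields_ = OPTIONAL_FIELDS


class OnePointerArray(ctypes.Structure):
    """Every array carries one pointer, NULL when no feature is used."""
    _fields_ = BASE_FIELDS + [("enhanced", ctypes.POINTER(Enhancements))]


class MaskedArray(BaseArray):
    """Sub-class: ctypes appends these fields after the base layout."""
    _fields_ = OPTIONAL_FIELDS[:3]


if __name__ == "__main__":
    for cls in (BaseArray, InlineArray, OnePointerArray, MaskedArray):
        print(cls.__name__, ctypes.sizeof(cls), "bytes")
```

In this model only the sub-class leaves the base structure at its original size, so arrays that never use masks pay nothing extra. The one-pointer variant costs every array a single pointer, plus a separate allocation for arrays that do use an enhancement.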

Mark and I will talk about this long and hard.  Mark has ideas about where he 
wants to see NumPy go, but I don't think we have fully accounted for where 
NumPy and its user base *is* and there may be better ways to approach this 
evolution.  If others are interested in the outcome of the discussion please 
speak up (either on the list or privately) and we will make sure your views get 
heard and accounted for. 

Best regards,

-Travis





> 
> See y'all,
> 
> Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Matthew Brett
Hi,

On Mon, Apr 16, 2012 at 6:03 PM, Matthew Brett  wrote:
> Hi,
>
> On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant  wrote:
>
>> I have heard from a few people that they are not excited by the growth of
>> the NumPy data-structure by the 3 pointers needed to hold the masked-array
>> storage.   This is especially true when there is talk to potentially add
>> additional attributes to the NumPy array (for labels and other
>> meta-information).      If you are willing to let us know how you feel about
>> this, please speak up.
>
> I guess there are two questions here
>
> 1) Will something like the current version of masked arrays have a
> long term future in numpy, regardless of eventual API? Most likely
> answer - yes?
> 2) Will likely changes to the masked array API make any difference to
> the number of extra pointers?  Likely answer no?
>
> Is that right?
>
> I have the impression that the masked array API discussion still has
> not come out fully into the unforgiving light of discussion day, but
> if the answer to 2) is No, then I suppose the API discussion is not
> relevant to the 3 pointers change.

Sorry, if the answers to 1 and 2 are Yes and No then the API
discussion may not be relevant.

Cheers,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Matthew Brett
Hi,

On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant  wrote:

> I have heard from a few people that they are not excited by the growth of
> the NumPy data-structure by the 3 pointers needed to hold the masked-array
> storage.   This is especially true when there is talk to potentially add
> additional attributes to the NumPy array (for labels and other
> meta-information).      If you are willing to let us know how you feel about
> this, please speak up.

I guess there are two questions here

1) Will something like the current version of masked arrays have a
long term future in numpy, regardless of eventual API? Most likely
answer - yes?
2) Will likely changes to the masked array API make any difference to
the number of extra pointers?  Likely answer no?

Is that right?

I have the impression that the masked array API discussion still has
not come out fully into the unforgiving light of discussion day, but
if the answer to 2) is No, then I suppose the API discussion is not
relevant to the 3 pointers change.

See y'all,

Matthew


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Charles R Harris
On Mon, Apr 16, 2012 at 5:17 PM, Travis Oliphant wrote:

> The comments I have heard have been from people who haven't wanted to make
> them on this list.   I wish they would, but I understand that not everyone
> wants to be drawn into a long discussion.  They have not been discussions.
>
> My bias is to just move forward with what is there.   After a week or two
> of discussion, I expect that we will resolve this one way or another.  The
> result may be to just move forward as previously planned.  However, that might
> not be the best move forward either.   These are significant changes and
> they do impact users.  We need to understand those implications and take
> very seriously any concerns from users.
>
> There is time to look at this carefully.   We need to take the time.   I
> am really posting so that the discussions Mark and I have this week (I
> haven't seen Mark since PyCon) can be productive with as many other people
> participating as possible.
>
>
I would suggest that you and Mark have a good talk first, then report here
with some specifics that you think need discussion, along with specifics
from the unnamed sources. The somewhat vague "some say" doesn't help much
and in the absence of specifics the discussion is likely to proceed along
the same old lines if it happens at all. Meanwhile there is a disturbance
in the force that makes us all uneasy.

Chuck


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant
The comments I have heard have been from people who haven't wanted to make them 
on this list.   I wish they would, but I understand that not everyone wants to 
be drawn into a long discussion.  They have not been discussions.

My bias is to just move forward with what is there.   After a week or two of 
discussion, I expect that we will resolve this one way or another.  The result 
may be to just move forward as previously planned.  However, that might not be the 
best move forward either.   These are significant changes and they do impact 
users.  We need to understand those implications and take very seriously any 
concerns from users.

There is time to look at this carefully.   We need to take the time.   I am 
really posting so that the discussions Mark and I have this week (I haven't 
seen Mark since PyCon) can be productive with as many other people 
participating as possible.

--
Travis Oliphant
(on a mobile)
512-826-7480


On Apr 16, 2012, at 6:01 PM, Ralf Gommers  wrote:

> 
> 
> On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez  wrote:
> On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers
>  wrote:
> > That's the first time I've heard this. Until now, we have talked a lot about
> > adding bitmasks and API changes, not about complete removal. My assumption
> > was that the experimental label was enough. From Nathaniel's reaction I
> > gathered the same. It looks like too many conversations on this topic are
> > happening off-list.
> 
> My impression was that Travis was just suggesting that as an option
> here for discussion, not presenting it as something discussed
> elsewhere.  
> 
> From "I have heard from a few people that they are not excited " I deduce 
> it was discussed to some extent.
> 
> I read Travis' email precisely as restarting the
> discussion for consideration of the issues in full public view
> 
> It wasn't restating anything, it's completely opposite to the part that I 
> thought we did reach consensus on (*not* backing out changes). I stated as 
> much when first discussing a 1.7.0 in December, 
> http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027, 
> with no one disagreeing.
> 
> It's perfectly fine to reconsider any previous decisions/discussions of 
> course. 
> 
> However, I do now draw the conclusion that it's best to wait for this issue 
> to be resolved before considering a new release. I had been working on 
> closing tickets and cleaning up loose ends for 1.7.0, and pinging others to 
> do the same. I guess I'll stop doing that for now, until the renewed NA 
> debate has been settled.
> 
> If there are bug fixes that are important (like the Debian segfaults with 
> Python debug builds), we can do a 1.6.2 release.
> 
> Ralf
> 
> (+
> calls/skype open to anyone interested for bandwidth purposes), so in
> this case I don't think there's any background off-list to worry
> about.  At least that's how I read it...
> 
> Cheers,
> 
> f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Ralf Gommers
On Tue, Apr 17, 2012 at 12:27 AM, Fernando Perez wrote:

> On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers
>  wrote:
> > That's the first time I've heard this. Until now, we have talked a lot
> about
> > adding bitmasks and API changes, not about complete removal. My
> assumption
> > was that the experimental label was enough. From Nathaniel's reaction I
> > gathered the same. It looks like too many conversations on this topic are
> > happening off-list.
>
> My impression was that Travis was just suggesting that as an option
> here for discussion, not presenting it as something discussed
> elsewhere.


From "I have heard from a few people that they are not excited " I
deduce it was discussed to some extent.

I read Travis' email precisely as restarting the
> discussion for consideration of the issues in full public view


It wasn't restating anything, it's completely opposite to the part that I
thought we did reach consensus on (*not* backing out changes). I stated as
much when first discussing a 1.7.0 in December,
http://thread.gmane.org/gmane.comp.python.numeric.general/47022/focus=47027,
with no one disagreeing.

It's perfectly fine to reconsider any previous decisions/discussions of
course.

However, I do now draw the conclusion that it's best to wait for this issue
to be resolved before considering a new release. I had been working on
closing tickets and cleaning up loose ends for 1.7.0, and pinging others to
do the same. I guess I'll stop doing that for now, until the renewed NA
debate has been settled.

If there are bug fixes that are important (like the Debian segfaults with
Python debug builds), we can do a 1.6.2 release.

Ralf

(+
> calls/skype open to anyone interested for bandwidth purposes), so in
> this case I don't think there's any background off-list to worry
> about.  At least that's how I read it...
>
> Cheers,
>
> f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Charles R Harris
On Mon, Apr 16, 2012 at 4:33 PM, Travis Oliphant wrote:

> No off list discussions have been happening material to this point.   I am
> basically stating my view for the first time.  I have delayed because I
> realize it is not a pleasant view and I was hoping I could end up resolving
> it favorably.
>
> But,  it needs to be discussed before 1.7 is released.
>
>
What is the problem with three extra pointers?



Chuck


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant
No off list discussions have been happening material to this point.   I am 
basically stating my view for the first time.  I have delayed because I realize 
it is not a pleasant view and I was hoping I could end up resolving it 
favorably.

But,  it needs to be discussed before 1.7 is released.  

--
Travis Oliphant
(on a mobile)
512-826-7480


On Apr 16, 2012, at 5:27 PM, Fernando Perez  wrote:

> On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers
>  wrote:
>> That's the first time I've heard this. Until now, we have talked a lot about
>> adding bitmasks and API changes, not about complete removal. My assumption
>> was that the experimental label was enough. From Nathaniel's reaction I
>> gathered the same. It looks like too many conversations on this topic are
>> happening off-list.
> 
> My impression was that Travis was just suggesting that as an option
> here for discussion, not presenting it as something discussed
> elsewhere.  I read Travis' email precisely as restarting the
> discussion for consideration of the issues in full public view (+
> calls/skype open to anyone interested for bandwidth purposes), so in
> this case I don't think there's any background off-list to worry
> about.  At least that's how I read it...
> 
> Cheers,
> 
> f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Fernando Perez
On Mon, Apr 16, 2012 at 3:21 PM, Ralf Gommers
 wrote:
> That's the first time I've heard this. Until now, we have talked a lot about
> adding bitmasks and API changes, not about complete removal. My assumption
> was that the experimental label was enough. From Nathaniel's reaction I
> gathered the same. It looks like too many conversations on this topic are
> happening off-list.

My impression was that Travis was just suggesting that as an option
here for discussion, not presenting it as something discussed
elsewhere.  I read Travis' email precisely as restarting the
discussion for consideration of the issues in full public view (+
calls/skype open to anyone interested for bandwidth purposes), so in
this case I don't think there's any background off-list to worry
about.  At least that's how I read it...

Cheers,

f


Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Ralf Gommers
On Tue, Apr 17, 2012 at 12:06 AM, Travis Oliphant wrote:

> There is an issue with the NumPy 1.7 release that we all need to
> understand.   Doesn't including the missing-data attributes in the NumPy
> structure in a released version of NumPy basically commit to including
> those attributes in NumPy 1.8?
>

We clearly labeled NA as experimental, so some changes are to be expected.
But not complete removal - so yes, if we release them they should stay in
some form.


>  I'm not comfortable with that, is everyone else?  One possibility is to
> move those attributes to a C-level sub-class of NumPy.
>

That's the first time I've heard this. Until now, we have talked a lot
about adding bitmasks and API changes, not about complete removal. My
assumption was that the experimental label was enough. From Nathaniel's
reaction I gathered the same. It looks like too many conversations on this
topic are happening off-list.

Ralf


> I have heard from a few people that they are not excited by the growth of
> the NumPy data-structure by the 3 pointers needed to hold the masked-array
> storage.   This is especially true when there is talk to potentially add
> additional attributes to the NumPy array (for labels and other
> meta-information).  If you are willing to let us know how you feel
> about this, please speak up.
>
> Mark Wiebe will be in Austin for about 3 months.  He and I will be hashing
> some of this out in the first week or two.  We will present any proposal
> and ask questions to this list before acting. We will be using some
> phone calls and face-to-face communications to increase the bandwidth and
> speed of the conversations (not to exclude anyone).  If you would like to
> be part of the in-person discussions let me know -- or just make your views
> known here --- they will be taken seriously.
>
> The goal is consensus for any major change in NumPy.   If we can't get
> consensus, then we vote on this list and use a super-majority.   If we
> can't get a super-majority, then except in rare circumstances we can't move
> forward.  Heavy users of NumPy get higher voting privileges.
>
> My perspective is that we don't have consensus on the current additions to
> the NumPy data-structure to have the current additional attributes on the
> NumPy data-structure be included for long-term release.
>
> Best,
>
> -Travis
>
>
>
>
>
> On Mar 25, 2012, at 6:27 PM, Charles R Harris wrote:
>
>
>
> On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers wrote:
>
>>
>>
>> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> There are several problems with numpy master that need to be fixed before a
>>> release can be considered.
>>>
>>>1. Datetime on windows with mingw.
>>>2. Bus error on SPARC, ticket #2076.
>>>3. NA and real/complex views of complex arrays.
>>>
>>> Number 1 has proved to be particularly difficult, any help or
>>> suggestions for that would be much appreciated. The current work has been
>>> going in pull request 214.
>>>
>>> This isn't to say that there aren't a ton of other things that need
>>> fixing or that we can skip out on the current stack of pull requests, but I
>>> think it is impossible to consider a release while those three problems are
>>> outstanding.
>>>
>> Why do you consider (2) a blocker? Not saying it's not important, but
>> there are eight other open tickets with segfaults. Some are more esoteric
>> than others, but I don't see why for example #1713 and #1808 are less
>> important than this one.
>>
>> #1522 provides a patch that fixes a segfault by the way, could use a
>> review.
>>
>>
> I wasn't aware of the other segfaults, I'd like to get them all fixed...
> The list was meant to elicit additions.
>
> I don't know where the missed floating point errors come from, but they
> are somewhat dependent on the compiler doing the right thing and hardware
> support. I'd welcome any insight into why we get them on SPARC (underflow)
> and Windows (overflow). The windows buildbot doesn't seem to be updating
> correctly since it is still missing the combinations method that is now
> part of the test module.
>
> Chuck
>


[Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

2012-04-16 Thread Travis Oliphant
There is an issue with the NumPy 1.7 release that we all need to understand.   
Doesn't including the missing-data attributes in the NumPy structure in a 
released version of NumPy basically commit to including those attributes in 
NumPy 1.8? I'm not comfortable with that, is everyone else?  One 
possibility is to move those attributes to a C-level sub-class of NumPy.  

I have heard from a few people that they are not excited by the growth of the
NumPy data-structure by the 3 pointers needed to hold the masked-array storage.
This is especially true when there is talk of potentially adding additional
attributes to the NumPy array (for labels and other meta-information). If
you are willing to let us know how you feel about this, please speak up.
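
To put the size concern in rough numbers, here is a minimal sketch, not a
measurement of the actual NumPy C struct. It assumes only the figure quoted
above (3 extra pointers per array) and compares that cost with the data buffer
of a few array shapes. The helper name extra_overhead and the chosen shapes are
illustrative, not part of NumPy.

```python
import ctypes

import numpy as np

# Size of one pointer on this platform (8 bytes on typical 64-bit builds).
POINTER_BYTES = ctypes.sizeof(ctypes.c_void_p)
# Assumption taken from the message above: 3 extra pointers per array.
EXTRA_POINTERS = 3


def extra_overhead(shape, dtype=np.float64):
    """Return (extra_bytes, data_bytes, ratio) for one array of this shape.

    Hypothetical helper: it compares only the extra pointer fields with the
    size of the data buffer and ignores the rest of the array header.
    """
    data_bytes = np.zeros(shape, dtype=dtype).nbytes
    extra_bytes = EXTRA_POINTERS * POINTER_BYTES
    return extra_bytes, data_bytes, extra_bytes / data_bytes


for shape in [(2, 2), (4, 4), (1000, 1000)]:
    extra, data, ratio = extra_overhead(shape)
    print(f"{shape}: {extra} extra bytes vs {data} data bytes ({ratio:.2%})")
```

The extra bytes are a fixed cost per array object, so they matter mainly for
code that creates many small arrays and are negligible for large ones.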

Mark Wiebe will be in Austin for about 3 months. He and I will be hashing some
of this out in the first week or two. We will present any proposal and ask
questions to this list before acting. We will be using some phone calls and
face-to-face communications to increase the bandwidth and speed of the
conversations (not to exclude anyone). If you would like to be part of the
in-person discussions let me know -- or just make your views known here --
they will be taken seriously.

The goal is consensus for any major change in NumPy. If we can't get
consensus, then we vote on this list and use a super-majority. If we can't
get a super-majority, then except in rare circumstances we can't move forward.
Heavy users of NumPy get higher voting privileges.

My perspective is that we don't have the consensus needed for the current
additional attributes on the NumPy data-structure to be included in a
long-term release.

Best, 

-Travis





On Mar 25, 2012, at 6:27 PM, Charles R Harris wrote:

> 
> 
> On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers  
> wrote:
> 
> 
> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris 
>  wrote:
> Hi All,
> 
> There are several problems with numpy master that need to be fixed before a 
> release can be considered.
> 1. Datetime on windows with mingw.
> 2. Bus error on SPARC, ticket #2076.
> 3. NA and real/complex views of complex arrays.
> Number 1 has proved to be particularly difficult, any help or 
> suggestions for that would be much appreciated. The current work has been 
> going on in pull request 214.
> 
> This isn't to say that there aren't a ton of other things that need fixing or 
> that we can skip out on the current stack of pull requests, but I think it is 
> impossible to consider a release while those three problems are outstanding.
> Why do you consider (2) a blocker? Not saying it's not important, but there 
> are eight other open tickets with segfaults. Some are more esoteric than 
> others, but I don't see why for example #1713 and #1808 are less important 
> than this one.
> 
> #1522 provides a patch that fixes a segfault by the way, could use a review.
> 
> 
> I wasn't aware of the other segfaults, I'd like to get them all fixed... The 
> list was meant to elicit additions.
> 
> I don't know where the missed floating point errors come from, but they are 
> somewhat dependent on the compiler doing the right thing and hardware 
> support. I'd welcome any insight into why we get them on SPARC (underflow) 
> and Windows (overflow). The windows buildbot doesn't seem to be updating 
> correctly since it is still missing the combinations method that is now part 
> of the test module.
> 
> Chuck 
> 
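
For anyone who wants to check whether their own platform reports the overflow
and underflow conditions mentioned above, here is a small sketch using
np.errstate. It is not taken from the NumPy test suite; the function name
check_fp_errors and the input values are chosen purely for illustration. On a
platform where the floating point status flags work as expected, both
operations should raise FloatingPointError.

```python
import numpy as np


def check_fp_errors():
    """Report whether overflow and underflow are detected on this platform.

    Hypothetical helper: returns a dict mapping each condition to True if
    NumPy raised FloatingPointError for it, False if the error was missed.
    """
    big = np.array([1e308])    # multiplying by 10 exceeds the float64 range
    tiny = np.array([1e-200])  # squaring falls below the smallest subnormal
    detected = {}
    with np.errstate(over="raise", under="raise"):
        try:
            big * 10.0
            detected["overflow"] = False
        except FloatingPointError:
            detected["overflow"] = True
        try:
            tiny * tiny
            detected["underflow"] = False
        except FloatingPointError:
            detected["underflow"] = True
    return detected


report = check_fp_errors()
print(report)
```

A False entry would point at the kind of compiler or hardware dependence
described in the quoted message, since NumPy relies on the floating point
status flags being set correctly.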