Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-17 Thread Matthew Brett
Hi,

On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire
cjord...@uw.edu wrote:
 On Fri, Feb 17, 2012 at 8:30 PM, Sturla Molden stu...@molden.no wrote:


 Den 18. feb. 2012 kl. 05:01 skrev Jason Grout jason-s...@creativetrax.com:

 On 2/17/12 9:54 PM, Sturla Molden wrote:
 We would have to write a C++ programming tutorial that is based on Python 
 knowledge instead of C knowledge.

 I personally would love such a thing.  It's been a while since I did
 anything nontrivial on my own in C++.


 One example: How do we code multiple return values?

 In Python:
 - Return a tuple.

 In C:
 - Use pointers (evilness)

 In C++:
 - Return a std::tuple, as you would in Python.
 - Use references, as you would in Fortran or Pascal.
 - Use pointers, as you would in C.

 C++ textbooks always pick the last...

 I would show the first and second methods, and perhaps intentionally 
 forget the last.
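
 For concreteness, a minimal sketch of the Python idiom such a tutorial
 would start from (a made-up example, nothing more):

     def bounds(values):
         # Multiple return values in Python: just return a tuple.
         return min(values), max(values)

     low, high = bounds([3, 1, 4, 1, 5])  # unpacked at the call site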

 Sturla


 On the flip side, Cython looked pretty... but I didn't get the
 performance gains I wanted, and had to spend a lot of time figuring
 out whether the problem was Cython itself, my needing to add types,
 buggy support for numpy, or the algorithm itself.

At the time, was the numpy support buggy?  I personally haven't had
many problems with Cython and numpy.

 The C files generated by Cython were
 enormous and difficult to read. They really weren't meant for human
 consumption.

Yes, it takes some practice to get used to what Cython will do, and
how to optimize the output.
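
To give a flavor, here is a minimal sketch of what 'adding types' means,
written in Cython's pure Python mode so the snippet below is also valid
Python (the function and numbers are made up):

    import cython

    @cython.locals(i=cython.Py_ssize_t, total=cython.double)
    def csum(xs):
        # Untyped, Cython compiles this loop through generic
        # Python-object C-API calls; with the declarations above it
        # can emit a much tighter C loop.
        total = 0.0
        for i in range(len(xs)):
            total += xs[i]
        return total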

 As Sturla has said, regardless of the quality of the
 current product, it isn't stable.

I've personally found it more or less rock solid.  Could you say what
you mean by "it isn't stable"?

Best,

Matthew


Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-17 Thread Matthew Brett
Hi, again (sorry),

On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire
cjord...@uw.edu wrote:
 On the broader topic of recruitment... sure, Cython has a lower barrier
 to entry than C++. But there are many, many more C++ developers and
 resources out there than Cython resources. And it likely will stay
 that way for quite some time.

On the other hand, in the current development community around numpy,
and among the subscribers to this mailing list, I suspect there is
more Cython experience than C++ experience.

Of course it might be that so-far undiscovered C++ developers are
drawn to a C++ rewrite of Numpy.  But is that really likely?  I can
see a C++ developer being drawn to a C++ performance library they would
use in their C++ applications, but it's harder for me to imagine a C++
programmer being drawn to a Python library because the internals are
C++.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread Matthew Brett
Hi,

On Thu, Feb 16, 2012 at 4:23 AM, Francesc Alted franc...@continuum.io wrote:
 On Feb 16, 2012, at 12:15 PM, Jason Grout wrote:

 On 2/15/12 6:27 PM, Dag Sverre Seljebotn wrote:
 But in the very end, when agreement can't
 be reached by other means, the developers are the one making the calls.
 (This is simply a consequence that they are the only ones who can
 credibly threaten to fork the project.)

 Interesting point.  I hope I'm not pitching a log onto the fire here,
 but in numpy's case, there are very many capable developers on other
 projects who depend on numpy who could credibly threaten a fork if they
 felt numpy was drastically going wrong.

 Jason, that there are capable developers out there who are able to fork NumPy 
 (or any other project you can imagine) is a given.  The point Dag was 
 signaling is that such a threat is more likely to come from *inside* the 
 community.

 And you pointed out an important aspect too by saying "if they felt numpy was 
 drastically going wrong".  It gives me the impression that some people are 
 very frightened that something really bad will happen, well before it 
 happens.  While I agree that this is *possible*, I'd also advocate giving 
 Travis the benefit of the doubt.  I'm convinced he (and Continuum as a whole) is 
 making things happen that will benefit the entire NumPy community; but in 
 case something goes really wrong and catastrophic, it is always a relief to 
 know that things can be reverted in the pure open source tradition (by either 
 doing a fork, creating a new foundation, or even better, proposing a new way 
 to do things).  What does not sound reasonable to me is to allow fear to 
 block Continuum's efforts to make a better NumPy.  I think it is better to 
 relax a bit, see how things are going, and then judge by looking at the 
 *results*.

I'm finding this conversation a bit frustrating.

The question on the table as I understand it, is just the following:

Is there any governance structure / procedure / set of guidelines that
would help ensure the long-term health of the numpy project?

The subtext of your response is that you regard *any structure at all*
as damaging to the numpy effort and in particular, as damaging to the
efforts of Continuum.  It seems to me that is a very extreme point of
view, and I think, honestly, it is not tenable.

But surely - surely - the best thing to do here is to formulate
something that might be acceptable, and for everyone to say what they
think the problems would be.  Do you agree?

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread Matthew Brett
Hi,

Just for my own sake, can I clarify what you are saying here?

On Thu, Feb 16, 2012 at 1:11 PM, Travis Oliphant tra...@continuum.io wrote:
  I'm not a big fan of design-by-committee as I haven't seen it be very 
 successful in creating new technologies.   It is pretty good at enforcing the 
 status quo.  If I felt like that is what NumPy needed I would be fine with it.

Was it your impression that what was being proposed, was design by committee?

 However, I feel that NumPy is going to be surpassed by other solutions if 
 steps are not taken to improve the code-base *and* add new features.

As far as you are concerned, is there any controversy about that?

 For the next 6-12 months, I am comfortable taking the benevolent dictator 
 role.   During that time, I hope we can find many more core developers and 
 then re-visit the discussion.  My view is that design decisions should be a 
 consensus based on current contributors to the code base and major users.   
 To continue to be relevant, NumPy has to serve its customers.   They are the 
 ones who will have the final say.   If others feel like they can do better, a 
 fork is an option.  I don't want that to happen, but it is the only effective 
 and practical governance structure that exists in my mind outside of the 
 self-governance of the people that participate.

To confirm, you are saying that you can imagine no improvement in the
current governance structure?

 No organizational structure can make up for the lack of great people putting 
 their hearts and efforts into a great cause.

But you agree that there might be an organizational structure that
would make this harder or easier?

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread Matthew Brett
Hi,

On Thu, Feb 16, 2012 at 3:58 PM, Travis Oliphant tra...@continuum.io wrote:

 Matthew,

 What you should take from my post is that I appreciate your concern for the 
 future of the NumPy project, and am grateful that you have an eye to the sort 
 of things that can go wrong --- it will help ensure they don't go wrong.

 But, I personally don't agree that it is necessary to put any more formal 
 structure in place at this time, and we should wait for 6-12 months, and see 
 where we are at while doing everything we can to get more people interested 
 in contributing to the project.     I'm comfortable playing the role of BDF12 
 with a cadre of developers/contributors who seeks to come to consensus.    I 
 believe there are sufficient checks on the process that will make it quite 
 difficult for me to *abuse* that in the short term.   Charles, Ralf, Mark, 
 David, Robert, Josef, you, and many others are already quite adept at calling 
 me out when I do things they don't like or think are problematic.    I 
 encourage them to continue this.   I can't promise I'll do everything you 
 want, but I can promise I will listen and take your opinions seriously --- 
 just like I take the opinions of every contributor to the NumPy and SciPy 
 lists seriously (though weighted by the work-effort they have put on the 
 project).
  We can all only continue to do our best to help out wherever we can.

 Just so we are clear:  Continuum's current major client  is the larger 
 NumPy/SciPy community itself and this will remain the case for at least 
 several months.    You have nothing to fear from other clients we are 
 trying to please.   Thus, we are incentivized to keep as many people happy as 
 possible.    In the second place, the Foundation's major client is the same 
 community (and even broader) and the rest of the board is committed to the 
 overall success of the ecosystem.   There is a reason the board is comprised 
 of a wide-representation of that eco-system.   I am very hopeful that 
 numfocus will evolve over time to have an active community of people who 
 participate in its processes and plans to support as many projects as it can 
 given the bandwidth and funding available to it.

 So, if I don't participate in this discussion, anymore, it's because I am 
 working on some open-source things I'd like to show at PyCon, and time is 
 ticking down.    If you really feel strongly about this, then I would 
 suggest that you come up with a proposal for governance that you would like 
 us all to review.  At the SciPy conference in Austin this summer we can talk 
 about it --- when many of us will be face-to-face.

This has not been an encouraging episode in striving for consensus.

I see virtually no movement from your implied position at the
beginning of this thread, other than the following: 1) yes, you are in
charge; 2) you'll consider other options in 6 to 12 months.

I think you're saying here that you won't reply any more on this
thread, and I suppose that reflects the importance you attach to this
problem.

I will not myself propose a governance model because I do not consider
myself to have enough influence (on various metrics) to make it likely
it would be supported.  I wish that wasn't my perception of how things
are done here.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread Matthew Brett
Hi,

On Thu, Feb 16, 2012 at 5:26 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 2/16/2012 7:22 PM, Matthew Brett wrote:
 This has not been an encouraging episode in striving for consensus.

 Striving for consensus does not mean that a minority
 automatically gets veto rights.

'Striving' for consensus does imply some attempt to get to grips with
the arguments, and working on some compromise to accommodate both
parties.

It seems to me there was very great latitude for finding such a
compromise here, but Travis has terminated the discussion and I see no
sign of a compromise.

Striving for consensus can't, of course, be regulated.  The desire has
to be there.   It's probably true, as Nathaniel says, that there isn't
much you can do to legislate on that.  We can only try to persuade.  I
was trying to do that, I failed, and I'll have to look back and see if
there was something else I could have done that would have been more
useful to the same end.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-16 Thread Matthew Brett
Hi John,

On Thu, Feb 16, 2012 at 8:20 PM, John Hunter jdh2...@gmail.com wrote:


 On Thu, Feb 16, 2012 at 7:26 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/16/2012 7:22 PM, Matthew Brett wrote:
  This has not been an encouraging episode in striving for consensus.

 I disagree.
 Failure to reach consensus does not imply lack of striving.


 Hey Alan, thanks for your thoughtful and nuanced views.  I agree  with
 everything you've said, but have a few additional points.

I thought I'd looked deep in my heart and failed to find paranoia
about corporate involvement in numpy.

I am happy that Travis formed Continuum and look forward to the
progress we can expect for numpy.

I don't think the conversation was much about 'democracy'.  As far as
I was concerned, anything in the range from 'no change, but at least
being specific' to 'full veto power for mailing list members' was up
for discussion, and anything in between.

I wish we had not had to deal with the various red herrings here, such
as whether Continuum is good or bad, whether Travis has been given
adequate credit, or whether companies are bad for software.   But, we
did.  It's fine.  Argument over now.

Best,

Matthew


Re: [Numpy-discussion] Buildbot/continuous integration (was Re: Issue Tracking)

2012-02-16 Thread Matthew Brett
Hi,

On Thu, Feb 16, 2012 at 10:11 PM, Travis Oliphant tra...@continuum.io wrote:
 The OS X slaves (especially PPC) are very valuable for testing.    We have an 
 intern who could help keep the build-bots going if you would give her access 
 to those machines.

 Thanks for being willing to offer them.

No problem.  The OSX machines should be reliably available.  Please do
put your intern in touch, I'll give her access.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 5:51 AM, Alan G Isaac alan.is...@gmail.com wrote:
 On 2/14/2012 10:07 PM, Bruce Southey wrote:
 The one thing that gets overlooked here is that there is a huge
 diversity of users with very different skill levels. But very few
 people have an understanding of the core code. (In fact the other
 thread about type-casting suggests that it is extremely few people.)
 So in all of this, I do not yet see 'community'.


 As an active user and long-time list member
 who has never even looked at the core code,
 I perhaps presumptuously urge a moderation
 of rhetoric. I object to the idea that users
 like myself do not form part of the community.

 This list has 1400 subscribers, and the fact that
 most of us are quiet most of the time does not mean we
 are not interested or attentive to the discussions,
 including discussions of governance.

 It looks to me like this will be great for NumPy.
 People who would otherwise not be able to spend much
 time on NumPy will be spending a lot of time improving
 the code and adding features. In my view, this will help
 NumPy advance which will enlarge the user community, which will
 slowly but inevitably enlarge the contributor community.
 I'm pretty excited about Travis's bold efforts to find
 ways to allow him and others to spend more time on NumPy.
 I wish him the best of luck.

I think it is important to stick to the thread topic here, which is
'Governance'.

It's not about whether it is good or bad that Travis has re-engaged in
Numpy and is funding development in Numpy through his company.   I'm
personally very glad to see Travis back on the list and engaged again,
but that's really not what the thread is about.

The thread is about whether we need explicit Numpy governance,
especially in the situation where one new company will surely dominate
numpy development in the short term at least.

I would say - for the benefit of Continuum Analytics and for the Numpy
community - there should be explicit governance that takes this
relationship into account.

I believe that leaving the governance informal and underspecified at
this stage would be a grave mistake, for everyone concerned.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

Thanks for these interesting and specific questions.

On Wed, Feb 15, 2012 at 11:33 AM, Eric Firing efir...@hawaii.edu wrote:
 On 02/15/2012 08:50 AM, Matthew Brett wrote:
 Hi,

 On Wed, Feb 15, 2012 at 5:51 AM, Alan G Isaacalan.is...@gmail.com  wrote:
 On 2/14/2012 10:07 PM, Bruce Southey wrote:
 The one thing that gets overlooked here is that there is a huge
 diversity of users with very different skill levels. But very few
 people have an understanding of the core code. (In fact the other
 thread about type-casting suggests that it is extremely few people.)
 So in all of this, I do not yet see 'community'.


 As an active user and long-time list member
 who has never even looked at the core code,
 I perhaps presumptuously urge a moderation
 of rhetoric. I object to the idea that users
 like myself do not form part of the community.

 This list has 1400 subscribers, and the fact that
 most of us are quiet most of the time does not mean we
 are not interested or attentive to the discussions,
 including discussions of governance.

 It looks to me like this will be great for NumPy.
 People who would otherwise not be able to spend much
 time on NumPy will be spending a lot of time improving
 the code and adding features. In my view, this will help
 NumPy advance which will enlarge the user community, which will
 slowly but inevitably enlarge the contributor community.
 I'm pretty excited about Travis's bold efforts to find
 ways to allow him and others to spend more time on NumPy.
 I wish him the best of luck.

 I think it is important to stick to the thread topic here, which is
 'Governance'.

 Do you have in mind a model of how this might work?  (I suspect you have
 already answered a question like that in some earlier thread; sorry.)  A
 comparable project that is doing it right?

The example that had come up previously was the book by Karl Fogel:

http://producingoss.com/en/social-infrastructure.html
http://producingoss.com/en/consensus-democracy.html

In particular, see the section "When Consensus Cannot Be Reached, Vote" on
the second page.

Here's an example of a voting policy:

http://www.apache.org/foundation/voting.html

Debian is a famous example:

http://www.debian.org/devel/constitution

Obviously some open-source projects do not have much of a formal
governance structure, but I think in our case a) we have already run
into problems with big decisions and b) we have now reached a
situation where there is serious potential for actual or perceived
problems with conflicts of interest.

 Governance implies enforcement power, doesn't it?  Where, how, and by
 whom would the power be exercised?

The governance that I had in mind is more to do with review and
constraint of power.  Thus, I believe we need a set of rules to govern
how we deal with serious disputes, such as the masked array NA debate,
or, previously, the ABI breakage discussion around numpy 1.5.0.

To go to a specific use-case.  Let us imagine that Continuum think of
an excellent feature they want in Numpy but that many others think
would make the underlying array object too complicated.  How would the
desires of Continuum be weighed against the desires of other members
of the community?

 It's not about whether it is good or bad that Travis has re-engaged in
 Numpy and is funding development in Numpy through his company.   I'm
 personally very glad to see Travis back on the list and engaged again,
 but that's really not what the thread is about.

 The thread is about whether we need explicit Numpy governance,
 especially in the situation where one new company will surely dominate
 numpy development in the short term at least.

 I would say - for the benefit of Continuum Analytics and for the Numpy
 community, there should be explicit governance, that takes this
 relationship into account.

 Please elaborate; are you saying that Continuum Analytics must develop
 numpy as decided by some outside body?

No - of course not.   Here's the discussion from Karl Fogel's book:

http://producingoss.com/en/contracting.html

I'm proposing governance not as some council that contracts work, but
as a committee set up with formal rules that can resolve disputes and
decide rule changes as they arise.  This committee needs to be able to do
this to make sure that the interests of the community (developers of
numpy outside Continuum) are being represented.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 11:46 AM, Benjamin Root ben.r...@ou.edu wrote:


 On Wed, Feb 15, 2012 at 1:32 PM, Alan G Isaac alan.is...@gmail.com wrote:
 Can you provide an example where a more formal
 governance structure for NumPy would have meant
 more or better code development? (Please do not
 suggest the NA discussion!)


 Why not the NA discussion?  Would we really want to have that happen again?
 Note that it still isn't fully resolved and progress still needs to be made
 (I think the last thread did an excellent job of fleshing out the ideas, but
 it became too much to digest.  We may need to have someone go through the
 information, reduce it down and make one last push to bring it to a
 conclusion).  The NA discussion is the perfect example where a governance
 structure would help resolve disputes.

Yes, that was the most obvious example. I don't know about you, but I
can't see any sign of that one being resolved.

The other obvious example was the dispute about ABI breakage for numpy
1.5.0 where I believe Travis did invoke some sort of committee to
vote, but (Travis can correct me if I'm wrong), the committee was
named ad-hoc and contacted off-list.



 Can you provide an example of what you might
 envision as a more formal governance structure?
 (I assume that any such structure will not put people
 who are not core contributors to NumPy in a position
 to tell core contributors what to spend their time on.)

 Early last December, Chuck Harris estimated that three
 people were active NumPy developers.  I liked the idea of
 creating a board of these 3 and a rule that says any
 active developer can request to join the board, that
 additions are determined by majority vote of the existing
 board, and  that having the board both small and odd
 numbered is a priority.  I also suggested inviting to this
 board a developer or two from important projects that are
 very NumPy dependent (e.g., Matplotlib).

 I still like this idea.  Would it fully satisfy you?


 I actually like that idea.  Matthew, is this along the lines of what you
 were thinking?

Honestly it would make me very happy if the discussion moved to what
form the governance should take.  I would have thought that 3 was too
small a number.   We should look at what other projects do.   I think
that this committee needs to be people who know numpy code; projects
using numpy could advise, but I think people developing numpy should
vote.

There should be rules of engagement - a constitution - especially for how
to deal with disputes with Continuum or other contracting organizations.

I would personally very much like to see a commitment to consensus,
where possible along these lines (as noted previously by Nathaniel):

http://producingoss.com/en/consensus-democracy.html

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 12:55 PM, Mark Wiebe mwwi...@gmail.com wrote:
 On Wed, Feb 15, 2012 at 12:09 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Wed, Feb 15, 2012 at 11:46 AM, Benjamin Root ben.r...@ou.edu wrote:
 
 
  On Wed, Feb 15, 2012 at 1:32 PM, Alan G Isaac alan.is...@gmail.com
  wrote:
  Can you provide an example where a more formal
  governance structure for NumPy would have meant
  more or better code development? (Please do not
  suggest the NA discussion!)
 
 
  Why not the NA discussion?  Would we really want to have that happen
  again?
  Note that it still isn't fully resolved and progress still needs to be
  made
  (I think the last thread did an excellent job of fleshing out the ideas,
  but
  it became too much to digest.  We may need to have someone go through
  the
  information, reduce it down and make one last push to bring it to a
  conclusion).  The NA discussion is the perfect example where a
  governance
  structure would help resolve disputes.

 Yes, that was the most obvious example. I don't know about you, but I
 can't see any sign of that one being resolved.

 The other obvious example was the dispute about ABI breakage for numpy
 1.5.0 where I believe Travis did invoke some sort of committee to
 vote, but (Travis can correct me if I'm wrong), the committee was
 named ad-hoc and contacted off-list.

 
 
  Can you provide an example of what you might
  envision as a more formal governance structure?
  (I assume that any such structure will not put people
  who are not core contributors to NumPy in a position
  to tell core contributors what to spend their time on.)
 
  Early last December, Chuck Harris estimated that three
  people were active NumPy developers.  I liked the idea of
  creating a board of these 3 and a rule that says any
  active developer can request to join the board, that
  additions are determined by majority vote of the existing
  board, and  that having the board both small and odd
  numbered is a priority.  I also suggested inviting to this
  board a developer or two from important projects that are
  very NumPy dependent (e.g., Matplotlib).
 
  I still like this idea.  Would it fully satisfy you?
 
 
  I actually like that idea.  Matthew, is this along the lines of what you
  were thinking?

 Honestly it would make me very happy if the discussion moved to what
 form the governance should take.  I would have thought that 3 was too
 small a number.


 One thing to note about this point is that during the NA discussion, the
 only people doing active C-level development were Charles and me. I suspect
 a discussion about how to recruit more people into that group might be more
 important than governance at this point in time.

Mark - a) thanks for replying, it's good to hear your voice and b) I
don't think there's any competition between the discussion about
governance and the need to recruit more people into the group who
understand the C code.

Remember we are deciding here between governance - of a form to be
decided - and no governance - which I think is the current situation.
I know your desire is to see more people contributing to the C code.
It would help a lot if you could say what you think the barriers are,
how they could be lowered, and the risks that you see as a result of
the numpy C expertise moving essentially into one company.  Then we
can formulate some governance that would help lower those barriers and
reduce those risks.

 If we need a formal structure, maybe a good approach is giving Travis the
 final say for now, until a trigger point occurs. That could be 6 months
 after the number of active developers hits 5, or something like that. At
 that point, we would reopen the discussion with a larger group of people who
 would directly play in that role, and any decision made then will probably
 be better than a decision we make now while the development team is so
 small.

Honestly - as I was saying to Alan and indirectly to Ben - any formal
model - at all - is preferable to the current situation. Personally, I
would say that making the founder of a company, which is working to
make money from Numpy, the only decision maker on numpy - is - scary.
But maybe it's the best way.   Again, we're all high-functioning,
sensible people; I'm sure it's possible for us to formulate what the
risks are and what the potential solutions are, and come up with the best
- maybe short-term - solution.

See you,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 12:45 PM, Alan G Isaac alan.is...@gmail.com wrote:
 My analysis is fundamentally different than Matthew
 and Benjamin's for a few reasons.

 1. The problem has been miscast.
    The economic interests of the developers *always*
    have had an apparent conflict with the economic
    interests of the users: users want developers to work more
    on the code, and developers need to make a living, which
    often involves spending their time on other things.
    On this score, nothing has really changed.
 2. It seems pretty clear that Matthew wants some governance
    power to be held by individuals who are not actively
    developing NumPy.  As Chuck Harris pointed out long ago,
    that dog ain't going to hunt.
 3. Constitutions can be broken (and are, all the time).
    Designing a stable institution requires making it in
    the interests of the members to participate.

 Any formal governance structure that can be desirable
 for the NumPy community as a whole has to be desirable
 for the core developers.  The right way to produce a
 governance structure is to make concrete proposals and
 show how these proposals are in the interest of the
 *developers* (as well as of the users).

 For example, Benjamin obliquely suggested that with an
 appropriate governance board, the NA discussion could
 have simply been shut down by having the developers
 vote (as part of their governance).  This might be in
 the interest of the developers and of the community
 (I'm not sure), but I doubt it is what Matthew has in mind.
 In any case, until proposals are put on the table along
 with a clear effort to illustrate why it is in the interest
 of the *developers* to adopt the proposals, I really do not
 see this discussion moving forward.

That's helpful - it would be good to discuss concrete proposals.
Would you care to flesh out your proposal in more detail or is it as
you quoted it before?

Where do you stand on the desirability of consensus?

Do you have any suggestions on how to ensure that the non-Continuum
community has sufficient weight in decision making?

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 2:30 PM, Peter Wang pw...@streamitive.com wrote:
 On Feb 15, 2012, at 3:36 PM, Matthew Brett wrote:

 Honestly - as I was saying to Alan and indirectly to Ben - any formal
 model - at all - is preferable to the current situation. Personally, I
 would say that making the founder of a company, which is working to
 make money from Numpy, the only decision maker on numpy - is - scary.

 How is this different from the situation of the last 4 years?  Travis was 
 President at Enthought, which makes money from not only Numpy but SciPy as 
 well.  In addition to employing Travis, Enthought also employs many other 
 key contributors to Numpy and Scipy, like Robert and David.

The difference is fairly obvious to me, but stop me if I'm wrong.
First - although Enthought was in a position to influence numpy
development, it didn't exert that influence very much, partly, I
suppose, because Travis did not have time to contribute to numpy.  The
exception is of course the masked array stuff by Mark that caused a
lot of controversy.

  Furthermore, the Scipy and Numpy mailing lists and repos and web pages were 
 all hosted at Enthought.  If they didn't like how a particular discussion was 
 going, they could have memory-holed the entire conversation from the 
 archives, or worse yet, revoked commit access and reverted changes.

Obviously we should be realistic about the risks.   Situations like
that are very unlikely.

 But such things never transpired, and of course most of us know that such 
 things would never happen.

Right.

 I don't see why the current situation is any different from the previous 
situation, other than the fact that Travis actually plans on actively 
developing Numpy again, and that hardly seems scary.

It would be silly to be worried about Travis contributing to numpy, in general.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 4:27 PM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
 On 02/15/2012 02:24 PM, Mark Wiebe wrote:
 On Wed, Feb 15, 2012 at 1:36 PM, Matthew Brett matthew.br...@gmail.com wrote:

     Hi,

      On Wed, Feb 15, 2012 at 12:55 PM, Mark Wiebe mwwi...@gmail.com wrote:
        On Wed, Feb 15, 2012 at 12:09 PM, Matthew Brett matthew.br...@gmail.com
        wrote:
      
       Hi,
      
        On Wed, Feb 15, 2012 at 11:46 AM, Benjamin Root ben.r...@ou.edu wrote:
       
       
         On Wed, Feb 15, 2012 at 1:32 PM, Alan G Isaac alan.is...@gmail.com
         wrote:
        Can you provide an example where a more formal
        governance structure for NumPy would have meant
        more or better code development? (Please do not
        suggest the NA discussion!)
       
       
        Why not the NA discussion?  Would we really want to have that
     happen
        again?
        Note that it still isn't fully resolved and progress still
     needs to be
        made
        (I think the last thread did an excellent job of fleshing out
     the ideas,
        but
        it became too much to digest.  We may need to have someone go
     through
        the
        information, reduce it down and make one last push to bring it
     to a
        conclusion).  The NA discussion is the perfect example where a
        governance
        structure would help resolve disputes.
      
       Yes, that was the most obvious example. I don't know about you,
     but I
       can't see any sign of that one being resolved.
      
       The other obvious example was the dispute about ABI breakage for
     numpy
       1.5.0 where I believe Travis did invoke some sort of committee to
       vote, but (Travis can correct me if I'm wrong), the committee was
       named ad-hoc and contacted off-list.
      
       
       
        Can you provide an example of what you might
        envision as a more formal governance structure?
        (I assume that any such structure will not put people
        who are not core contributors to NumPy in a position
        to tell core contributors what to spend their time on.)
       
        Early last December, Chuck Harris estimated that three
        people were active NumPy developers.  I liked the idea of
        creating a board of these 3 and a rule that says any
        active developer can request to join the board, that
        additions are determined by majority vote of the existing
        board, and  that having the board both small and odd
        numbered is a priority.  I also suggested inviting to this
        board a developer or two from important projects that are
        very NumPy dependent (e.g., Matplotlib).
       
        I still like this idea.  Would it fully satisfy you?
       
       
        I actually like that idea.  Matthew, is this along the lines
     of what you
        were thinking?
      
       Honestly it would make me very happy if the discussion moved to what
       form the governance should take.  I would have thought that 3
     was too
       small a number.
      
      
       One thing to note about this point is that during the NA
     discussion, the
       only people doing active C-level development were Charles and me.
     I suspect
       a discussion about how to recruit more people into that group
     might be more
       important than governance at this point in time.

     Mark - a) thanks for replying, it's good to hear your voice and b) I
     don't think there's any competition between the discussion about
     governance and the need to recruit more people into the group who
     understand the C code.


 There hasn't really been any discussion about recruiting developers to
 compete with the governance topic; now we can let the topics compete. :)

 Some of the mechanisms which will help are already being set in motion
 through the discussion about better infrastructure support like bug
 trackers and continuous integration. The forthcoming roadmap discussion
 Travis alluded to, where we will propose a roadmap for review by the
 numpy user community, will include many more such points.

     Remember we are deciding here between governance - of a form to be
     decided - and no governance - which I think is the current situation.
     I know your desire is to see more people contributing to the C code.
     It would help a lot if you could say what you think the barriers are,
     how they could be lowered, and the risks that you see as a result of
     the numpy C expertise moving essentially into one company.  Then we
     can formulate some governance that would help lower those barriers and
     reduce those risks.


 There certainly is governance now, it's just informal. It's a
 combination of how the design discussions are carried out, how pull
 requests occur, and who has commit rights.

Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 4:27 PM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
 On 02/15/2012 02:24 PM, Mark Wiebe wrote:

 There certainly is governance now, it's just informal. It's a
 combination of how the design discussions are carried out, how pull
 requests occur, and who has commit rights.

 +1

 If non-contributing users came along on the Cython list demanding that
 we set up a system to select non-developers along on a board that would
 have discussions in order to veto pull requests, I don't know whether
 we'd ignore it or ridicule it or try to show some patience, but we
 certainly wouldn't take it seriously.

In the spirit (as I read it) of Dag's post, maybe we should accept that
this thread is not going anywhere much, and summarize:

The current situation is the following:

Travis is de facto BDFL for Numpy.
Disputes get resolved by convening an ad-hoc group of interested and/or
active developers to resolve or vote, maybe off-list.  How this
happens is for Travis to call.

I think that's reasonable?

As far as I can make out, in favor of the current status quo with no
significant modification are:

Travis (is that right)?
Mark
Peter
Bryan vdv
Perry
Dag

In favor of some sort of formalization of governance to be decided are:

Me
Ben R (did I get that right?)
Bruce Southey
Souheil Inati
TJ
Joe H

I am not quite sure which side of that fence are:

Josef
Alan
Chuck

If I missed someone who gave an opinion - sorry - please do speak up.

I think it's clear that if you, Travis, don't want to go in this
direction, there isn't much chance of anything happening, and I think
those of us who think something needs doing will have to keep quiet,
as Dag suggests.

I would only suggest that you (Travis) specify that you will take the
BDFL role so that we can be clear about the informal governance at
least.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 6:07 PM,  josef.p...@gmail.com wrote:
 On Wed, Feb 15, 2012 at 8:49 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Feb 15, 2012 at 4:27 PM, Dag Sverre Seljebotn
 d.s.seljeb...@astro.uio.no wrote:
 On 02/15/2012 02:24 PM, Mark Wiebe wrote:

 There certainly is governance now, it's just informal. It's a
 combination of how the design discussions are carried out, how pull
 requests occur, and who has commit rights.

 +1

 If non-contributing users came along on the Cython list demanding that
 we set up a system to select non-developers along on a board that would
 have discussions in order to veto pull requests, I don't know whether
 we'd ignore it or ridicule it or try to show some patience, but we
 certainly wouldn't take it seriously.

 In the spirit (as I read) of Dag's post, maybe we should accept that
 this thread is not going anywhere much, and summarize:

 The current situation is the following:

 Travis is de facto BDFL for Numpy.
 Disputes get resolved by convening an ad-hoc group of interested and/or
 active developers to resolve or vote, maybe off-list.  How this
 happens is for Travis to call.

 I think that's reasonable?

 As far as I can make out, in favor of the current status quo with no
 significant modification are:

 Travis (is that right)?
 Mark
 Peter
 Bryan vdv
 Perry
 Dag

 In favor of some sort of formalization of governance to be decided are:

 Me
 Ben R (did I get that right?)
 Bruce Southey
 Souheil Inati
 TJ
 Joe H

 I am not quite sure which side of that fence are:

 Josef

 Actually in the sense of separation of powers, I would vote for Chuck
 as president, Travis as prime minister and an independent release
 manager as supreme court, and the noisy mailing list community as
 parliament.

That sounds dangerously Canadian ...

But actually - I was hoping for an answer to whether you felt there
was a need for a more formal governance structure, or not.

 (I don't see a constitution yet.)

My feeling is there is not enough appetite for any change for that to
be worth thinking about, but I might be wrong.

See you,

Matthew


Re: [Numpy-discussion] Numpy governance update

2012-02-15 Thread Matthew Brett
Hi,

On Wed, Feb 15, 2012 at 9:47 PM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
 On 02/15/2012 05:02 PM, Matthew Brett wrote:
 Hi,

 On Wed, Feb 15, 2012 at 4:27 PM, Dag Sverre Seljebotn
 d.s.seljeb...@astro.uio.no  wrote:
 On 02/15/2012 02:24 PM, Mark Wiebe wrote:
 There certainly is governance now, it's just informal. It's a
 combination of how the design discussions are carried out, how pull
 requests occur, and who has commit rights.

 +1

 If non-contributing users came along on the Cython list demanding that
 we set up a system to select non-developers along on a board that would
 have discussions in order to veto pull requests, I don't know whether
 we'd ignore it or ridicule it or try to show some patience, but we
 certainly wouldn't take it seriously.

 Ouch.  Is that me, one of the non-contributing users?  Was I
 suggesting that we set up a system to select non-developers to a
 board?   I must say, now you mention it, I do feel a bit ridiculous.

 In retrospect I was unfair and my email was way too harsh. Anyway, I'm
 really happy with your follow-up in turning this into something more
 constructive.

Don't worry - thanks for this reply.

You believe, I suppose, that there are no significant risks in nearly
all the numpy core development being done by a new company, or at
least, that there can be little benefit to a governance discussion in
that situation.  I think you are wrong, but of course it's a tenable
point of view.

 The question is more about what can possibly be done about it. To really
 shift power, my hunch is that the only practical way would be to, like
 Mark said, make sure there are very active non-Continuum-employed
 developers. But perhaps I'm wrong.

It's not obvious to me that there isn't a set of guidelines,
procedures, structures that would help to keep things clear in this
situation.  Obviously it would be good to have more non-Continuum
developers, but also obviously, there is a risk that that won't
happen.

 Sometimes it is worth taking some risks because it means one can go
 forward faster. Possibly *a lot* faster, if one shifts things from email
 to personal communication.

Yes, obviously it's in no-one's interest to slow down the Continuum
developers.   I wonder, though, whether there is a way of organizing
things that does not slow down the Continuum developers, but does
keep the sense of community involvement and ownership.

 It is not like the current versions of NumPy disappear. If things do go
 wrong and NumPy is developed in some crazy direction, it's easy to go
 for the stagnated option simply by taking the current release and
 maintaining bugfixes on it.

But we all want to avoid a fork, which is what that could easily become.

See you,

Matthew


Re: [Numpy-discussion] Typecasting changes from 1.5.1 to 1.6.1

2012-02-14 Thread Matthew Brett
Hi Travis,

On Mon, Feb 13, 2012 at 11:46 PM, Travis Oliphant tra...@continuum.io wrote:
 Here is the code I used to determine the coercion table of types.   I first 
 used *all* of the numeric_ops, narrowed it down to those with 2 inputs and 1 
 output, and then determined the run-time coercion table.   Then, I removed 
 ops that had the same tables until I was left with binary ops that had 
 different coercion tables.

 Some operations were NotImplemented and I used 'X' in the table for those 
 combinations.

 The table for each op is a dictionary with keys given by (type1, type2) and 
 values given by a length-4 list of the result types for:  
 [scalar-scalar, scalar-array, array-scalar, array-array], where the first term 
 is type1 and the second term is type2.

 This resulting dictionary of tables for each op is then saved to a file.   I 
 ran this code for NumPy 1.5.1 64-bit and then again for NumPy 1.6.1 64-bit.   
 I also ran this code for NumPy 1.4.1 64-bit and NumPy 1.3.1.dev 64-bit.

 The code to compare them is also attached.    I'm also attaching the changes 
 that have occurred from 1.3.1.dev to 1.4.1, from 1.4.1 to 1.5.1, and finally 
 from 1.5.1 to 1.6.1.

 As you can see there were changes in each release.   Most of these were minor 
 prior to the change from 1.5.1 to 1.6.1. I am still reviewing the changes 
 from 1.5.1 to 1.6.1.    At first blush, it looks like there are a lot of 
 changes to swallow that are not necessarily minor.    I really would like to 
 just say all is well, and it's no big deal.   I hope that users really don't 
 care and nobody's code is really relying on array-scalar combination 
 conversions.
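
For readers following along, a rough sketch of the kind of probe being
described might look like the following (the type list and all names
here are illustrative; the actual script was attached to the original
mail):

    import numpy as np

    # Illustrative subset; the real script covered all numeric types.
    types = [np.int8, np.int32, np.float32, np.float64, np.complex128]

    def coercion_table(op):
        # Map (type1, type2) to the result dtypes of op for the four
        # operand combinations: scalar-scalar, scalar-array,
        # array-scalar, array-array.
        table = {}
        for t1 in types:
            for t2 in types:
                results = []
                for a, b in [(t1(1), t2(1)),
                             (t1(1), np.ones(1, t2)),
                             (np.ones(1, t1), t2(1)),
                             (np.ones(1, t1), np.ones(1, t2))]:
                    try:
                        results.append(np.asarray(op(a, b)).dtype)
                    except TypeError:
                        results.append('X')  # combination not implemented
                table[(np.dtype(t1), np.dtype(t2))] = results
        return table

    # Ops whose tables differ coerce differently; comparing saved
    # tables across NumPy versions shows the behavior changes.
    add_table = coercion_table(np.add)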

Thanks for looking into this.

It strikes me that changes in behavior here could be dangerous and
easily missed, and it does seem to me that it is worth a pause to
consider what the effect of the changes might be.

Obviously, now that both 1.6 and 1.6.1 are in the wild, there will be
costs to reverting as well.

Best,

Matthew


[Numpy-discussion] Numpy governance update - was: Updated differences between 1.5.1 to 1.6.1

2012-02-14 Thread Matthew Brett
Hi,

On Tue, Feb 14, 2012 at 10:25 AM, Travis Oliphant tra...@continuum.io wrote:

 On Feb 14, 2012, at 3:32 AM, David Cournapeau wrote:

 Hi Travis,

 It is great that some resources can be spent to have people paid to
 work on NumPy. Thank you for making that happen.

 I am slightly confused about roadmaps for numpy 1.8 and 2.0. This
 needs discussion on the ML, and our release manager currently is Ralf
 - he is the one who ultimately decides what goes when.

 Thank you for reminding me of this.  Ralf and I spoke several days ago, and 
 have been working on how to give him more time to spend on SciPy full-time.   
 As a result, he will be release managing NumPy 1.7, but for NumPy 1.8, I will 
 be the release manager again.   Ralf will continue serving as release manager 
 for SciPy.

 For NumPy 2.0 and beyond, Mark Wiebe will likely be the release manager.   I 
 only know that I won't be release manager past NumPy 1.X.

 I am also not
 completely comfortable with having a roadmap advertised at PyCon that is not
 coming from the community.

 This is my bad wording which is a function of being up very late.    At PyCon 
 we will be discussing the roadmap conversations that are taking place on this 
 list.   We won't be presenting anything there related to the NumPy project 
 that has not first been discussed here.

 The community will have ample opportunity to provide input, suggestions, and 
 criticisms for anything that goes into NumPy --- the same as I've always done 
 before when releasing open source software.   In fact, I will also be 
 discussing at PyCon, the creation of NumFOCUS (NumPy Foundation for Open Code 
 for Usable Science) which has been organized precisely for ensuring that 
 NumPy, SciPy, Matplotlib, and IPython stay community-focused and 
 community-led even while receiving input and money from multiple companies 
 and organizations.

 There is a mailing list for numfocus that you can sign up for if you would 
 like to be part of those discussions.   Let me know if you would like more 
 information about that.    John Hunter, Fernando Perez, me, Perry Greenfield, 
 and Jarrod Millman are the initial board of the Foundation.   But, I expect 
 the Foundation directors to evolve over time.

I should say that I have no knowledge of the events above other than
from the mailing list (I say that only because some of you may know
that I'm a friend and colleague of Jarrod and Fernando).

Travis - I hope you don't mind, but here I post some links that I have
just found:

http://technicaldiscovery.blogspot.com/2012/01/transition-to-continuum.html
http://www.continuum.io/

I see that you've founded a new company, Continuum Analytics, and you
are working with Peter Wang, Mark Wiebe, Francesc Alted (PyTables),
and Bryan Van de Ven.  I think you mentioned this earlier in one of
the recent threads.

In practice this gives your company an overwhelming voice in the
direction of numpy.

From the blog post you say:

"This may also mean different business models and licensing around
some of the NumPy-related code that the company writes."

Obviously your company will need to make enough money to cover your
salaries and more.  There is huge potential here for clashes of
interest, and for perceived clashes of interest.  The perceived
clashes are just as damaging as the actual clashes.

I still don't think we've got a Numpy steering group.  The
combination of the huge concentration of numpy resources in your
company, and a lack of explicit community governance, seems to me to
be something that needs to be fixed urgently.  Do you agree?

Is there any reason why the numfocus group was formed without obvious
public discussion about its composition, remit or governance?   I'm
not objecting to its composition, but I think it is a mistake to make
large decisions like this without public consultation.

I imagine that what happened was that things moved too fast to make it
attractive to slow the process by public discussion.   I implore you
to slow down and commit yourself to having that discussion in full and
in public, in the interests of the common ownership of the project.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update - was: Updated differences between 1.5.1 to 1.6.1

2012-02-14 Thread Matthew Brett
Hi,

On Tue, Feb 14, 2012 at 1:54 PM, Travis Oliphant tra...@continuum.io wrote:

 There is a mailing list for numfocus that you can sign up for if you would
 like to be part of those discussions.   Let me know if you would like more
 information about that.    John Hunter, Fernando Perez, me, Perry
 Greenfield, and Jarrod Millman are the initial board of the Foundation.
 But, I expect the Foundation directors to evolve over time.


 I should say that I have no knowledge of the events above other than
 from the mailing list (I say that only because some of you may know
 that I'm a friend and colleague of Jarrod and Fernando).


 Thanks for speaking up, Matthew.   I knew that this was my first
 announcement of the Foundation to this list.   Things are still just
 starting around that organization, and so there is plenty of time for input.
   This sort of thing has actually been under-way for a long time --- it just
 has not received much impetus until now for one reason or another.

 To be clear, there were several email posts about a Foundation to this list
 last fall and we took the discussion of the Foundation that has really been
 in the works for a couple of years (thanks to Jarrod), to a Google Group
 (very poorly named) called Fastechula.    There were 33 people who signed up for
 that list and discussions continued sporadically on that list away from this
 one.

 When we selected the name NumFOCUS just a few weeks ago, we created the list
 for numfocus and then I signed everyone up for that list who was on the
 other one.      I apologize if anyone felt left out.   That is not my
 intention.

My point is that there are two ways to go about this process: one is
open and the other is closed.  In the open version, someone proposes
such a group to the mailing lists.  They ask for expressions of
interest.  The discussion might then move to another mailing list that
is publicly known and widely advertised.  Members of the board are
proposed in public.  There might be some sort of formal or informal
voting process.  The reason to prefer this to the more informal
private negotiations is that a) the community feels a greater
ownership and control of the process and b) it is much harder to
weaken or subvert an organization that explicitly does all its
business in public.

The counter-argument usually goes 'members X, Y and Z are of
impeccable integrity and would only do what is best for the public
good'.  And usually, members X, Y and Z are indeed of impeccable
integrity.   Nevertheless I'm sure I don't have to unpack the evidence
that this approach frequently fails and can fail in a catastrophic
way.

 Perceptions can be damaging.   This is one of the big reasons for the
 organization of the Foundation -- to be a place separate from any commercial
 venture which can direct resources to a vision whose goal is more
 democratically determined.

Are you proposing that the Foundation oversee Numpy governance and
direction?   From your chosen members I'm guessing that the idea is
for the foundation to think about broad strategy rather than - say -
whether missing values should be encoded with masked arrays?

 I think we do have a NumPy steering group if you want to call it that.
  It is currently me, Mark Wiebe, and Charles Harris.    Ralf Gommers, Pauli
 Virtanen, David Cournapeau and Robert Kern also have opinions that carry
 significant weight.    Are there other people that should be on this list?
  There are other people who also speak up on this list whose opinions will
 be listened to and heard.   In fact, I hope that many more people will come
 to the list and speak out as development increases.

The point I was making was that the concentration of numpy development
hours and talent in your company makes it urgent that the numpy
governance is set out formally, that the interests of the company are
made clear, and that the steering group can be assured of explicit and
public independence from the interests of the company, if and when
that becomes necessary.   In the past, the numpy steering group has
seemed a virtual organization, formed ad-hoc when needed, and with no
formal governance.   I'm saying that I firmly believe that has to
change, to avoid the actual or perceived loss of community ownership.

Best,

Matthew


Re: [Numpy-discussion] Numpy governance update - was: Updated differences between 1.5.1 to 1.6.1

2012-02-14 Thread Matthew Brett
Hi,

On Tue, Feb 14, 2012 at 3:58 PM, Travis Oliphant tra...@continuum.io wrote:

 When we selected the name NumFOCUS just a few weeks ago, we created the list
 for numfocus and then I signed everyone up for that list who was on the
 other one.      I apologize if anyone felt left out.   That is not my
 intention.

 My point is that there are two ways to go about this process: one is
 open and the other is closed.  In the open version, someone proposes
 such a group to the mailing lists.  They ask for expressions of
 interest.  The discussion might then move to another mailing list that
 is publicly known and widely advertised.  Members of the board are
 proposed in public.  There might be some sort of formal or informal
 voting process.  The reason to prefer this to the more informal
 private negotiations is that a) the community feels a greater
 ownership and control of the process and b) it is much harder to
 weaken or subvert an organization that explicitly does all its
 business in public.

 Your points are well taken.   However, my point is that this has been 
 discussed on an open mailing list.   Things weren't *as* open as they could 
 have been, perhaps, in terms of board selection.  But, there was opportunity 
 for people to provide input.

I am on the numpy, scipy, matplotlib, ipython and cython mailing
lists.  Jarrod and Fernando are friends of mine.  I've been obviously
concerned about numpy governance for some time.  I didn't know about
this mailing list, had only a vague idea that some sort of foundation
was being proposed and I had no idea at all that you'd selected a
board.  Would you say that was closer to 'open' or closer to 'closed'?

 Perceptions can be damaging.   This is one of the big reasons for the
 organization of the Foundation -- to be a place separate from any commercial
 venture which can direct resources to a vision whose goal is more
 democratically determined.

 Are you proposing that the Foundation oversee Numpy governance and
 direction?   From your chosen members I'm guessing that the idea is
 for the foundation to think about broad strategy rather than - say -
 whether missing values should be encoded with masked arrays?

 No, I am not proposing that.    The Foundation will be focused on 
 higher-level broad strategy sorts of things:  mostly around how to raise 
 money and how to direct that money to projects that have their own 
 development cycles.   I would think the Foundation would be interested in 
 paying for things like issue trackers and continuous integration servers as 
 well.     It will leave NumPy management to this list and the people who have 
 gathered around this watering hole.    Obviously, there will be points of 
 connection, but exactly how this will play-out depends on who shows up to 
 both organizations.


 I think we do have a NumPy steering group if you want to call it that.
  It is currently me, Mark Wiebe, and Charles Harris.    Ralf Gommers, Pauli
 Virtanen, David Cournapeau and Robert Kern also have opinions that carry
 significant weight.    Are there other people that should be on this list?
  There are other people who also speak up on this list whose opinions will
 be listened to and heard.   In fact, I hope that many more people will come
 to the list and speak out as development increases.

 The point I was making was that the concentration of numpy development
 hours and talent in your company makes it urgent that the numpy
 governance is set out formally, that the interests of the company are
 made clear, and that the steering group can be assured of explicit and
 public independence from the interests of the company, if and when
 that becomes necessary.   In the past, the numpy steering group has
 seemed a virtual organization, formed ad-hoc when needed, and with no
 formal governance.   I'm saying that I firmly believe that has to
 change, to avoid the actual or perceived loss of community ownership.

 I hear your point.    Thank you for sharing it.    Fortunately, we are having 
 this discussion, and plan to continue to have it as any concerns arise.    I 
 think the situation is actually less concentrated than it used to be when the 
 SciPy steering committee was discussed.  On that note,  I think the SciPy 
 steering committee needs serious revision as well.    But, we've all just 
 been getting along pretty well without too much formalism, so far, so perhaps 
 that is enough for now.

But a) there have already been serious unresolved disagreements on
this list (I note no resolution of the masks / NA debate) and b) the
whole point is to set up structures that can deal with the problems
before or as they arise.  After the problem arises, it is too late.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy governance update - was: Updated differences between 1.5.1 to 1.6.1

2012-02-14 Thread Matthew Brett
Hi,

On Tue, Feb 14, 2012 at 4:43 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Feb 14, 2012 at 3:58 PM, Travis Oliphant tra...@continuum.io wrote:

 When we selected the name NumFOCUS just a few weeks ago, we created the 
 list
 for numfocus and then I signed everyone up for that list who was on the
 other one.      I apologize if anyone felt left out.   That is not my
 intention.

  My point is that there are two ways to go about this process: one is
 open and the other is closed.  In the open version, someone proposes
 such a group to the mailing lists.  They ask for expressions of
 interest.  The discussion might then move to another mailing list that
 is publicly known and widely advertised.  Members of the board are
 proposed in public.  There might be some sort of formal or informal
 voting process.  The reason to prefer this to the more informal
 private negotiations is that a) the community feels a greater
 ownership and control of the process and b) it is much harder to
 weaken or subvert an organization that explicitly does all its
 business in public.

 Your points are well taken.   However, my point is that this has been 
 discussed on an open mailing list.   Things weren't *as* open as they could 
 have been, perhaps, in terms of board selection.  But, there was opportunity 
 for people to provide input.

 I am on the numpy, scipy, matplotlib, ipython and cython mailing
 lists.  Jarrod and Fernando are friends of mine.  I've been obviously
 concerned about numpy governance for some time.  I didn't know about
 this mailing list, had only a vague idea that some sort of foundation
 was being proposed and I had no idea at all that you'd selected a
 board.  Would you say that was closer to 'open' or closer to 'closed'?

By the way - I want to be clear - I am not suggesting that I should
have been one of the people involved in these discussions.  If you
were choosing a small number of people to discuss this with, one of
them should not be me.  I am saying that, if I didn't know, it's
reasonable to assume that very few people knew, who weren't being
explicitly told, and that this means that the process was,
effectively, closed.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] can_cast with structured array output - bug?

2012-02-14 Thread Matthew Brett
Hi,

On Mon, Feb 13, 2012 at 7:02 PM, Mark Wiebe mwwi...@gmail.com wrote:
 I took a look into the code to see what is causing this, and the reason is
 that nothing has ever been implemented to deal with the fields. This means
 it falls back to treating all struct dtypes as if they were a plain void
 dtype, which allows anything to be cast to it.

 While I was redoing the casting subsystem for 1.6, I did think on this
 issue, and decided that it wasn't worth tackling it at the time because the
 'safe'/'same_kind'/'unsafe' don't seem sufficient to handle what might be
 desired. I tried to leave this alone as much as possible.

 Some random thoughts about this are:

 * Casting a scalar to a struct dtype: should it be safe if the scalar can be
 safely cast to each member of the struct dtype? This is the NumPy
 broadcasting rule applied to dtypes as if the struct dtype is another
 dimension.
 * Casting one struct dtype to another: If the fields of the source are a
 subset of the target, and the types can safely convert, should that be a
 safe cast? If the fields of the source are not a subset of the target,
 should that still be a same_kind cast? Should a second enum be added that
 complements the safe/same_kind/unsafe one, but is specific to how
 struct fields are added or removed?

 This is closely related to adding ufunc support for struct dtypes, and the
 choices here should probably be decided at the same time as designing how
 the ufuncs should work.

Thanks for the discussion - that's very helpful.

How about, at a first pass, returning True for conversion of void
types only if input dtype == output dtype, then adding more
sophisticated rules later?
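
Something like this hypothetical helper - not existing numpy API, just to
make the proposal concrete:

import numpy as np

def can_cast_void(from_dtype, to_dtype, casting='safe'):
    # First pass: a cast involving a structured (void) dtype is allowed
    # only when the two dtypes are identical; everything else defers to
    # the existing rules.
    from_dtype, to_dtype = np.dtype(from_dtype), np.dtype(to_dtype)
    if from_dtype.kind == 'V' or to_dtype.kind == 'V':
        return from_dtype == to_dtype
    return np.can_cast(from_dtype, to_dtype, casting)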

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Issue Tracking

2012-02-13 Thread Matthew Brett
Hi,

On Mon, Feb 13, 2012 at 12:44 PM, Travis Oliphant tra...@continuum.io wrote:

 On Mon, Feb 13, 2012 at 12:12 AM, Travis Oliphant tra...@continuum.io
 wrote:

 I'm wondering about using one of these commercial issue tracking plans for
 NumPy and would like thoughts and comments.    Both of these plans allow
 Open Source projects to have unlimited plans for free.

 Free usage of a tool that's itself not open source is not all that different
 from using Github, so no objections from me.


 YouTrack from JetBrains:

 http://www.jetbrains.com/youtrack/features/issue_tracking.html

 This looks promising. It seems to have good Github integration, and I
 checked that you can easily export all your issues (so no lock-in). It's a
 company that isn't going anywhere (I hope), and they do a very nice job with
 PyCharm.


 I do like the team behind JetBrains.   And I've seen and heard good things
 about TeamCity.   Thanks for reminding me about the build-bot situation.
  That is one thing I would like to address sooner rather than later as
 well.

We've (nipy) got a buildbot collection working OK.   If you want to go
that way you are welcome to use our machines.  It's a somewhat flaky
setup though.

http://nipy.bic.berkeley.edu/builders

I have the impression that the Cython / SAGE team are happy with their
Jenkins configuration.

Ondrej did some nice stuff on integrating a build with the github pull requests:

https://github.com/sympy/sympy-bot

Some discussion of buildbot and Jenkins:

http://vperic.blogspot.com/2011/05/continuous-integration-and-sympy.html

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Issue Tracking

2012-02-13 Thread Matthew Brett
Hi,

On Mon, Feb 13, 2012 at 2:33 PM,  jason-s...@creativetrax.com wrote:
 On 2/13/12 2:56 PM, Matthew Brett wrote:
 I have the impression that the Cython / SAGE team are happy with their
 Jenkins configuration.

 I'm not aware of a Jenkins buildbot system for Sage, though I think
 Cython uses such a system: https://sage.math.washington.edu:8091/hudson/

 We do have a number of systems we build and test Sage on, though I don't
 think we have continuous integration yet.  I've CCd Jeroen Demeyer, who
 is the current release manager for Sage.  Jeroen, do we have an
 automatic buildbot system for Sage?

Ah - sorry - I was thinking of the Cython system on the SAGE server.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Change in scalar upcasting rules for 1.6.x?

2012-02-13 Thread Matthew Brett
Hi,

I recently noticed a change in the upcasting rules in numpy 1.6.0 /
1.6.1 and I just wanted to check it was intentional.

For all versions of numpy I've tested, we have:

>>> import numpy as np
>>> Adata = np.array([127], dtype=np.int8)
>>> Bdata = np.int16(127)
>>> (Adata + Bdata).dtype
dtype('int8')

That is - adding an integer scalar of a larger dtype does not result
in upcasting of the output dtype, if the data in the scalar type fits
in the smaller.

For numpy < 1.6.0 we have this:

>>> Bdata = np.int16(128)
>>> (Adata + Bdata).dtype
dtype('int8')

That is - even if the data in the scalar does not fit in the dtype of
the array to which it is being added, there is no upcasting.

For numpy >= 1.6.0 we have this:

>>> Bdata = np.int16(128)
>>> (Adata + Bdata).dtype
dtype('int16')

There is upcasting...

I can see why the numpy 1.6.0 way might be preferable but it is an API
change I suppose.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] can_cast with structured array output - bug?

2012-02-13 Thread Matthew Brett
Hi,

I've also just noticed this oddity:

In [17]: np.can_cast('c', 'u1')
Out[17]: False

OK so far, but...

In [18]: np.can_cast('c', [('f1', 'u1')])
Out[18]: True

In [19]: np.can_cast('c', [('f1', 'u1')], 'safe')
Out[19]: True

In [20]: np.can_cast(np.ones(10, dtype='c'), [('f1', 'u1')])
Out[20]: True

I think this must be a bug.

In the other direction, it makes more sense to me:

In [24]: np.can_cast([('f1', 'u1')], 'c')
Out[24]: False

In [25]: np.can_cast([('f1', 'u1')], [('f1', 'u1')])
Out[25]: True

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Unexpected reorganization of internal data

2012-01-31 Thread Matthew Brett
Hi,

On Tue, Jan 31, 2012 at 8:29 AM, Mads Ipsen madsip...@gmail.com wrote:
 Hi,

 I am confused. Here's the reason:

 The following structure is a representation of N points in 3D space:

 U = numpy.array([[x1,y1,z1], [x1,y1,z1],...,[xn,yn,zn]])

 So the array U has shape (N,3). This order makes sense to me since U[i] will
 give you the i'th point in the set. Now, I want to pass this array to a C++
 function that does some stuff with the points. Here's how I do that

 void Foo::doStuff(int n, PyObject * numpy_data)
 {
     // Get pointer to data
     double * const positions = (double *) PyArray_DATA(numpy_data);

     // Print positions
     for (int i=0; i<n; ++i)
     {
         float x = static_cast<float>(positions[3*i+0]);
         float y = static_cast<float>(positions[3*i+1]);
         float z = static_cast<float>(positions[3*i+2]);

         printf("Pos[%d] = %f %f %f\n", i, x, y, z);
     }
 }

 When I call this routine, using a swig wrapped Python interface to the C++
 class, everything prints out nice.

 Now, I want to apply a rotation to all the positions. So I set up some
 rotation matrix R like this:

 R = numpy.array([[r11,r12,r13],
  [r21,r22,r23],
  [r31,r32,r33]])

 To apply the matrix to the data in one crunch, I do

 V = numpy.dot(R, U.transpose()).transpose()

 Now when I call my C++ function from the Python side, all the data in V is
 printed, but it has been transposed. So apparently the internal data
 structure handled by numpy has been reorganized, even though I called
 transpose() twice, which I would expect to cancel out each other.

 However, if I do:

 V = numpy.array(U.transpose()).transpose()

 and call the C++ routine, everything is perfectly fine, ie. the data
 structure is as expected.

 What went wrong?

The numpy array reserves the right to organize its data internally.
For example, a numpy array can be in Fortran order in memory, or C
order in memory, and many more complicated schemes.  You might want to
have a look at:

http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#internal-memory-layout-of-an-ndarray

If you depend on a particular order for your array memory, you might
want to look at:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.ascontiguousarray.html
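
For example - a sketch of the fix for your case - make an explicit
C-contiguous copy before handing the buffer to PyArray_DATA:

import numpy as np

R = np.eye(3)                  # some rotation matrix
U = np.random.rand(10, 3)      # N points in 3D

V = np.dot(R, U.transpose()).transpose()  # a transposed (Fortran-order) view
V = np.ascontiguousarray(V)    # copy into C memory order for the C++ side
assert V.flags['C_CONTIGUOUS']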

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] adding unsigned int and int

2011-12-06 Thread Matthew Brett
Hi,

On Tue, Dec 6, 2011 at 4:45 AM, Skipper Seabold jsseab...@gmail.com wrote:
 Hi,

 Is this intended?

 [~/]
 [1]: np.result_type(np.uint, np.int)
 [1]: dtype('float64')

I would guess so - if your system ints are 64 bit.  int64 can't
contain the range for uint64, nor can uint64 contain all int64,  If
there had been a larger int type, it would promote to int, I believe.
At least on my system:

In [4]: np.result_type(np.int32, np.uint32)
Out[4]: dtype('int64')
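
The price of promoting to float64 is that not every (u)int64 value
survives exactly - a quick check, assuming 64-bit system ints:

In [5]: np.uint64(2**63) + np.int64(1)   # promoted to float64, the +1 is lost
Out[5]: 9.2233720368547758e+18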

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy Governance

2011-12-05 Thread Matthew Brett
Hi,

2011/12/5 Stéfan van der Walt ste...@sun.ac.za:
 As for barriers to entry, improving the the nature of discourse on the
 mailing list (when it comes to thorny issues) would be good.
 Technical barriers are not that hard to breach for our community;
 setting the right social atmosphere is crucial.

I'm just about to get on a plane and am going to be out of internet
range for a while, so, in the spirit of constructive discussion:

In the spirit of use-cases:

Would it be fair to say that the two contentious recent discussions have been:

The numpy ABI breakage, 2.0 vs 1.5.1 discussion
The masked array discussion(s) ?

What did we do wrong or right in each of these two discussions?  What
could we have done better?  What process would help us to do better?

Travis - for your board-only-post mailing list - my feeling is that
this is going in the wrong direction.  The effect of the board-only
mailing list is to explicitly remove non-qualified people from the
discussion.   This will make it more explicit that the substantial
decisions will be made by a few important people.   Do you (Travis -
or Mark?) think that, if this had happened earlier in the masked array
discussion, it would have been less contentious, or had more
substantial content?  My instinct would be the reverse, and the best
solution would have been to pause and commit to beating out the issues
and getting agreement.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy Governance

2011-12-03 Thread Matthew Brett
Hi Travis,

On Sat, Dec 3, 2011 at 6:18 PM, Travis Oliphant teoliph...@gmail.com wrote:

 Hi everyone,

 There have been some wonderfully vigorous discussions over the past few 
 months that have made it clear that we need some clarity about how decisions 
 will be made in the NumPy community.

 When we were a smaller bunch of people it seemed easier to come to an 
 agreement and things pretty much evolved based on (mostly) consensus and who 
 was available to actually do the work.

 There is a need for a more clear structure so that we know how decisions will 
 get made and so that code can move forward while paying attention to the 
 current user-base.   There has been a steering committee structure for 
 SciPy in the past, and I have certainly been prone to lump both NumPy and 
 SciPy together given that I have a strong interest in and have spent a great 
 amount of time working on both projects.    Others have also spent time on 
 both projects.

 However, I think it is critical at this stage to clearly separate the 
 projects and define a governing structure that is fair and agreeable for 
 NumPy.   SciPy has multiple modules and will probably need structure around 
 each module independently.    For now, I wanted to open up a discussion to 
 see what people thought about NumPy's governance.

 My initial thoughts:

        * discussions happen as they do now on the mailing list
        * a small group of developers (5-11) constitute the board and major 
 decisions are made by vote of that group (not just simple majority --- needs 
 at least 2/3 +1 votes).
        * votes are +1/+0/-0/-1
        * if a topic is difficult to resolve it is moved off the main list and 
 discussed on a separate board mailing list --- these should be rare, but 
 parts of the NA discussion would probably qualify
        * This board mailing list is publically viewable but only board 
 members may post.
        * The board is renewed and adjusted each year --- based on nomination 
 and 2/3 vote of the current board until board is at 11.
        * The chairman of the board is voted by a majority of the board and 
 has veto power unless over-ridden by 3/4 of the board.
        * Petitions to remove people off the board can be made by 50+ 
 independent reverse nominations (hopefully people will just withdraw if they 
 are no longer active).

Thanks very much for starting this discussion.

You have probably seen that my preference would be for all discussions
to be public - in the sense that all can contribute.  So, it seems
reasonable to me to have 'board' as you describe, but that the board
should vote on the same mailing list as the rest of the discussion.
Having a separate mailing list for discussion makes the separation
overt between those with a granted voice and those without, and I
would hope for a structure which emphasized discussion in an open
forum.

Put another way, what advantage would having a separate public mailing
list have?

How does this governance compare to that of - say - Linux or Python or Debian?

My worry will be that it will be too tempting to terminate discussions
and proceed to resolve by vote, when voting (as Karl Vogel describes)
may still do harm.

What will be the position - maybe I mean your position - on consensus
as Nathaniel has described it?  I feel the masked array discussion
would have been more productive (and maybe shorter and more to the
point) if there had been some rule-of-thumb that every effort is made
to reach consensus before proceeding to implementation - or a vote.

For example, in the masked array discussion, I would have liked to be
able to say 'hold on, we have a rule that we try our best to reach
consensus; I do not feel we have done that yet'.

See you,

Matthew

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] scipy.org still says source in some subversion repo -- should be git !?

2011-12-01 Thread Matthew Brett
Yo,

On Thu, Dec 1, 2011 at 8:01 PM, Jarrod Millman mill...@berkeley.edu wrote:
 On Mon, Nov 28, 2011 at 1:19 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Maybe the content could be put in
 http://github.com/scipy/scipy.github.com so we can make pull requests
 there?


 The source is here:
   https://github.com/scipy/scipy.org-new

Are you then the person to ask about merging pull requests and
uploading the docs?

See you (literally),

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] scipy.org still says source in some subversion repo -- should be git !?

2011-11-28 Thread Matthew Brett
Hi,

On Mon, Nov 28, 2011 at 1:01 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Fri, Nov 25, 2011 at 7:20 PM, Sebastian Haase seb.ha...@gmail.com
 wrote:

 google search for:  numpy browse source

 points  here:  http://new.scipy.org/download.html

 which talks about:
 svn co http://svn.scipy.org/svn/numpy/trunk numpy

 The problem is that new.scipy.org duplicates content from scipy.org, and
 is not so new anymore. I suspect that there's more out of date info (like
 installation instructions). Is anyone still working on this, or planning to
 do so in the near future? If not, it may be better to disable this site
 until someone volunteers to spend time on it again.


Who controls the new.scipy.org site?

Maybe the content could be put in
http://github.com/scipy/scipy.github.com so we can make pull requests
there?

And redirect new.scipy.org to http://scipy.github.com ?

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-15 Thread Matthew Brett
Hi,

On Tue, Nov 15, 2011 at 12:51 AM, David Cournapeau courn...@gmail.com wrote:
 On Tue, Nov 15, 2011 at 6:22 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Mon, Nov 14, 2011 at 10:08 PM, David Cournapeau courn...@gmail.com 
 wrote:
 On Mon, Nov 14, 2011 at 9:01 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sun, Nov 13, 2011 at 5:03 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Sun, Nov 13, 2011 at 3:56 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sun, Nov 13, 2011 at 1:34 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sun, Nov 13, 2011 at 2:25 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sun, Nov 13, 2011 at 8:21 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
  
   On Sun, Nov 13, 2011 at 12:57 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
Sorry for my continued confusion here.  This is numpy 1.6.1 on
windows
XP 32 bit.
   
In [2]: np.finfo(np.float96).nmant
Out[2]: 52
   
In [3]: np.finfo(np.float96).nexp
Out[3]: 15
   
In [4]: np.finfo(np.float64).nmant
Out[4]: 52
   
In [5]: np.finfo(np.float64).nexp
Out[5]: 11
   
If there are 52 bits of precision, 2**53+1 should not be
representable, and sure enough:
   
In [6]: np.float96(2**53)+1
Out[6]: 9007199254740992.0
   
In [7]: np.float64(2**53)+1
Out[7]: 9007199254740992.0
   
If the nexp is right, the max should be higher for the float96
type:
   
In [9]: np.finfo(np.float64).max
Out[9]: 1.7976931348623157e+308
   
In [10]: np.finfo(np.float96).max
Out[10]: 1.#INF
   
I see that long double in C is 12 bytes wide, and double is the
usual
8
bytes.
  
   Sorry - sizeof(long double) is 12 using mingw.  I see that long
   double
   is the same as double in MS Visual C++.
  
   http://en.wikipedia.org/wiki/Long_double
  
   but, as expected from the name:
  
   In [11]: np.dtype(np.float96).itemsize
   Out[11]: 12
  
  
   Hmm, good point. There should not be a float96 on Windows using the
   MSVC
   compiler, and the longdouble types 'gG' should return float64 and
   complex128
   respectively. OTOH, I believe the mingw compiler has real float96
   types
   but
   I wonder about library support. This is really a build issue and it
   would be
   good to have some feedback on what different platforms are doing so
   that
   we
   know if we are doing things right.
 
  Is it possible that numpy is getting confused by being compiled with
  mingw on top of a visual studio python?
 
  Some further forensics seem to suggest that, despite the fact the math
   suggests float96 is float64, the storage format is in fact 80-bit
  extended precision:
 
 
  Yes, extended precision is the type on Intel hardware with gcc, the
  96/128
  bits comes from alignment on 4 or 8 byte boundaries. With MSVC, double
  and
  long double are both ieee double, and on SPARC, long double is ieee 
  quad
  precision.

 Right - but I think my researches are showing that the longdouble
 numbers are being _stored_ as 80 bit, but the math on those numbers is
 64 bit.

  Is there a reason that numpy can't do 80-bit math on these guys?  If
 there is, is there any point in having a float96 on windows?

 It's a compiler/architecture thing and depends on how the compiler
 interprets the long double c type. The gcc compiler does do 80 bit math on
 Intel/AMD hardware. MSVC doesn't, and probably never will. MSVC shouldn't
 produce float96 numbers, if it does, it is a bug. Mingw uses the gcc
 compiler, so the numbers are there, but if it uses the MS library it will
 have to convert them to double to do computations like sin(x) since there
 are no microsoft routines for extended precision. I suspect that gcc/ms
 combo is what is producing the odd results you are seeing.

 I think we might be talking past each other a bit.

 It seems to me that, if float96 must use float64 math, then it should
 be removed from the numpy namespace, because

 If we were to do so, it would break too much code.

 David - please - obviously I'm not suggesting removing it without
 deprecating it.

 Let's say I find it debatable that removing it (with all the
 deprecations) would be a good use of effort, especially given that
 there is no obviously better choice to be made.


 a) It implies higher precision than float64 but does not provide it
 b) It uses more memory to no obvious advantage

 There is an obvious advantage: to handle memory blocks which use long
 double, created outside numpy (or even python).

 Right - but that's a bit arcane, and I would have thought
 np.longdouble would be a good enough name for that.   Of course, the
 users may be surprised, as I was, that memory allocated for higher
 precision is using float64, and that may take them some time to work
 out.  I'll say

Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-14 Thread Matthew Brett
Hi,

On Sun, Nov 13, 2011 at 5:03 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sun, Nov 13, 2011 at 3:56 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sun, Nov 13, 2011 at 1:34 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sun, Nov 13, 2011 at 2:25 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sun, Nov 13, 2011 at 8:21 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
  
   On Sun, Nov 13, 2011 at 12:57 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
Sorry for my continued confusion here.  This is numpy 1.6.1 on
windows
XP 32 bit.
   
In [2]: np.finfo(np.float96).nmant
Out[2]: 52
   
In [3]: np.finfo(np.float96).nexp
Out[3]: 15
   
In [4]: np.finfo(np.float64).nmant
Out[4]: 52
   
In [5]: np.finfo(np.float64).nexp
Out[5]: 11
   
If there are 52 bits of precision, 2**53+1 should not be
representable, and sure enough:
   
In [6]: np.float96(2**53)+1
Out[6]: 9007199254740992.0
   
In [7]: np.float64(2**53)+1
Out[7]: 9007199254740992.0
   
If the nexp is right, the max should be higher for the float96
type:
   
In [9]: np.finfo(np.float64).max
Out[9]: 1.7976931348623157e+308
   
In [10]: np.finfo(np.float96).max
Out[10]: 1.#INF
   
I see that long double in C is 12 bytes wide, and double is the
usual
8
bytes.
  
   Sorry - sizeof(long double) is 12 using mingw.  I see that long
   double
   is the same as double in MS Visual C++.
  
   http://en.wikipedia.org/wiki/Long_double
  
   but, as expected from the name:
  
   In [11]: np.dtype(np.float96).itemsize
   Out[11]: 12
  
  
   Hmm, good point. There should not be a float96 on Windows using the
   MSVC
   compiler, and the longdouble types 'gG' should return float64 and
   complex128
   respectively. OTOH, I believe the mingw compiler has real float96
   types
   but
   I wonder about library support. This is really a build issue and it
   would be
   good to have some feedback on what different platforms are doing so
   that
   we
   know if we are doing things right.
 
  Is it possible that numpy is getting confused by being compiled with
  mingw on top of a visual studio python?
 
  Some further forensics seem to suggest that, despite the fact the math
   suggests float96 is float64, the storage format is in fact 80-bit
  extended precision:
 
 
  Yes, extended precision is the type on Intel hardware with gcc, the
  96/128
  bits comes from alignment on 4 or 8 byte boundaries. With MSVC, double
  and
  long double are both ieee double, and on SPARC, long double is ieee quad
  precision.

 Right - but I think my researches are showing that the longdouble
 numbers are being _stored_ as 80 bit, but the math on those numbers is
 64 bit.

  Is there a reason that numpy can't do 80-bit math on these guys?  If
 there is, is there any point in having a float96 on windows?

 It's a compiler/architecture thing and depends on how the compiler
 interprets the long double c type. The gcc compiler does do 80 bit math on
 Intel/AMD hardware. MSVC doesn't, and probably never will. MSVC shouldn't
 produce float96 numbers, if it does, it is a bug. Mingw uses the gcc
 compiler, so the numbers are there, but if it uses the MS library it will
 have to convert them to double to do computations like sin(x) since there
 are no microsoft routines for extended precision. I suspect that gcc/ms
 combo is what is producing the odd results you are seeing.

I think we might be talking past each other a bit.

It seems to me that, if float96 must use float64 math, then it should
be removed from the numpy namespace, because

a) It implies higher precision than float64 but does not provide it
b) It uses more memory to no obvious advantage

On the other hand, it seems to me that raw gcc does use higher
precision for basic math on long double, as expected.  For example,
this guy passes:

#include <math.h>
#include <assert.h>

int main(int argc, char** argv) {
    double d;
    long double ld;
    d = pow(2, 53);
    ld = d;
    assert(d == ld);
    d += 1;
    ld += 1;
    /* double rounds down because it doesn't have enough precision */
    assert(d != ld);
    assert(d == ld - 1);
    return 0;
}

whereas numpy does not use the higher precision:

In [10]: a = np.float96(2**53)

In [11]: a
Out[11]: 9007199254740992.0

In [12]: b = np.float64(2**53)

In [13]: b
Out[13]: 9007199254740992.0

In [14]: a == b
Out[14]: True

In [15]: (a + 1) == (b + 1)
Out[15]: True

So maybe there is a way of picking up the gcc math in numpy?

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-14 Thread Matthew Brett
Hi,

On Mon, Nov 14, 2011 at 10:08 PM, David Cournapeau courn...@gmail.com wrote:
 On Mon, Nov 14, 2011 at 9:01 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sun, Nov 13, 2011 at 5:03 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Sun, Nov 13, 2011 at 3:56 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sun, Nov 13, 2011 at 1:34 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sun, Nov 13, 2011 at 2:25 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sun, Nov 13, 2011 at 8:21 AM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
  
   On Sun, Nov 13, 2011 at 12:57 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
Sorry for my continued confusion here.  This is numpy 1.6.1 on
windows
XP 32 bit.
   
In [2]: np.finfo(np.float96).nmant
Out[2]: 52
   
In [3]: np.finfo(np.float96).nexp
Out[3]: 15
   
In [4]: np.finfo(np.float64).nmant
Out[4]: 52
   
In [5]: np.finfo(np.float64).nexp
Out[5]: 11
   
If there are 52 bits of precision, 2**53+1 should not be
representable, and sure enough:
   
In [6]: np.float96(2**53)+1
Out[6]: 9007199254740992.0
   
In [7]: np.float64(2**53)+1
Out[7]: 9007199254740992.0
   
If the nexp is right, the max should be higher for the float96
type:
   
In [9]: np.finfo(np.float64).max
Out[9]: 1.7976931348623157e+308
   
In [10]: np.finfo(np.float96).max
Out[10]: 1.#INF
   
I see that long double in C is 12 bytes wide, and double is the
usual
8
bytes.
  
   Sorry - sizeof(long double) is 12 using mingw.  I see that long
   double
   is the same as double in MS Visual C++.
  
   http://en.wikipedia.org/wiki/Long_double
  
   but, as expected from the name:
  
   In [11]: np.dtype(np.float96).itemsize
   Out[11]: 12
  
  
   Hmm, good point. There should not be a float96 on Windows using the
   MSVC
   compiler, and the longdouble types 'gG' should return float64 and
   complex128
   respectively. OTOH, I believe the mingw compiler has real float96
   types
   but
   I wonder about library support. This is really a build issue and it
   would be
   good to have some feedback on what different platforms are doing so
   that
   we
   know if we are doing things right.
 
  Is it possible that numpy is getting confused by being compiled with
  mingw on top of a visual studio python?
 
  Some further forensics seem to suggest that, despite the fact the math
   suggests float96 is float64, the storage format is in fact 80-bit
  extended precision:
 
 
  Yes, extended precision is the type on Intel hardware with gcc, the
  96/128
  bits comes from alignment on 4 or 8 byte boundaries. With MSVC, double
  and
  long double are both ieee double, and on SPARC, long double is ieee quad
  precision.

 Right - but I think my researches are showing that the longdouble
 numbers are being _stored_ as 80 bit, but the math on those numbers is
 64 bit.

  Is there a reason that numpy can't do 80-bit math on these guys?  If
 there is, is there any point in having a float96 on windows?

 It's a compiler/architecture thing and depends on how the compiler
 interprets the long double c type. The gcc compiler does do 80 bit math on
 Intel/AMD hardware. MSVC doesn't, and probably never will. MSVC shouldn't
 produce float96 numbers, if it does, it is a bug. Mingw uses the gcc
 compiler, so the numbers are there, but if it uses the MS library it will
 have to convert them to double to do computations like sin(x) since there
 are no microsoft routines for extended precision. I suspect that gcc/ms
 combo is what is producing the odd results you are seeing.

 I think we might be talking past each other a bit.

 It seems to me that, if float96 must use float64 math, then it should
 be removed from the numpy namespace, because

 If we were to do so, it would break too much code.

David - please - obviously I'm not suggesting removing it without
deprecating it.

 a) It implies higher precision than float64 but does not provide it
 b) It uses more memory to no obvious advantage

 There is an obvious advantage: to handle memory blocks which use long
 double, created outside numpy (or even python).

Right - but that's a bit arcane, and I would have thought
np.longdouble would be a good enough name for that.   Of course, the
users may be surprised, as I was, that memory allocated for higher
precision is using float64, and that may take them some time to work
out.  I'll say again that 'longdouble' says to me 'something specific
to the compiler' and 'float96' says 'something standard in numpy', and
that I was surprised when I found out what it was.

 Otherwise, while gcc indeed supports long double, the fact that the C
 runtime doesn't really mean it is hopeless to reach any kind of
 consistency.

I'm sorry for my ignorance

Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-13 Thread Matthew Brett
Hi,

On Sun, Nov 13, 2011 at 8:21 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sun, Nov 13, 2011 at 12:57 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  Sorry for my continued confusion here.  This is numpy 1.6.1 on windows
  XP 32 bit.
 
  In [2]: np.finfo(np.float96).nmant
  Out[2]: 52
 
  In [3]: np.finfo(np.float96).nexp
  Out[3]: 15
 
  In [4]: np.finfo(np.float64).nmant
  Out[4]: 52
 
  In [5]: np.finfo(np.float64).nexp
  Out[5]: 11
 
  If there are 52 bits of precision, 2**53+1 should not be
  representable, and sure enough:
 
  In [6]: np.float96(2**53)+1
  Out[6]: 9007199254740992.0
 
  In [7]: np.float64(2**53)+1
  Out[7]: 9007199254740992.0
 
  If the nexp is right, the max should be higher for the float96 type:
 
  In [9]: np.finfo(np.float64).max
  Out[9]: 1.7976931348623157e+308
 
  In [10]: np.finfo(np.float96).max
  Out[10]: 1.#INF
 
  I see that long double in C is 12 bytes wide, and double is the usual 8
  bytes.

 Sorry - sizeof(long double) is 12 using mingw.  I see that long double
 is the same as double in MS Visual C++.

 http://en.wikipedia.org/wiki/Long_double

 but, as expected from the name:

 In [11]: np.dtype(np.float96).itemsize
 Out[11]: 12


 Hmm, good point. There should not be a float96 on Windows using the MSVC
 compiler, and the longdouble types 'gG' should return float64 and complex128
 respectively. OTOH, I believe the mingw compiler has real float96 types but
 I wonder about library support. This is really a build issue and it would be
 good to have some feedback on what different platforms are doing so that we
 know if we are doing things right.

Is it possible that numpy is getting confused by being compiled with
mingw on top of a visual studio python?

Some further forensics seem to suggest that, despite the fact the math
suggests float96 is float64, the storage format is in fact 80-bit
extended precision:

On OSX 32-bit where float128 is definitely 80 bit precision we see the
sign bit being flipped to show us the beginning of the number:

In [33]: bigbin(np.float128(2**53)-1)
Out[33]: 
'1011011100111000'

In [34]: bigbin(-np.float128(2**53)+1)
Out[34]: 
'111100111000'

I think that's 48 bits of padding followed by the number (bit 49 is
being flipped with the sign).

On windows (well, wine, but I think it's the same):

bigbin(np.float96(2**53)-1)
Out[14]: 
'011100111000'
bigbin(np.float96(-2**53)+1)
Out[15]: 
'111100111000'

Thanks,

Matthew

bigbin definition:
import sys
LE = sys.byteorder == 'little'

import numpy as np

def bigbin(val):
    # View the scalar as its raw bytes and print each byte as 8 binary digits
    val = np.asarray(val)
    nbytes = val.dtype.itemsize
    dt = [('f', np.uint8, nbytes)]
    out = [np.binary_repr(el, 8) for el in val.view(dt)['f']]
    if LE:  # put the most significant byte first on little-endian machines
        out = out[::-1]
    return ''.join(out)
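
As a sanity check (output here is for a little-endian machine): IEEE double
1.0 should show sign bit 0, eleven exponent bits 01111111111, and 52 zero
fraction bits:

In [35]: bigbin(np.float64(1.0))
Out[35]: '0011111111110000000000000000000000000000000000000000000000000000'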
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-13 Thread Matthew Brett
Hi,

On Sun, Nov 13, 2011 at 1:34 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sun, Nov 13, 2011 at 2:25 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sun, Nov 13, 2011 at 8:21 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sun, Nov 13, 2011 at 12:57 AM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   Sorry for my continued confusion here.  This is numpy 1.6.1 on
   windows
   XP 32 bit.
  
   In [2]: np.finfo(np.float96).nmant
   Out[2]: 52
  
   In [3]: np.finfo(np.float96).nexp
   Out[3]: 15
  
   In [4]: np.finfo(np.float64).nmant
   Out[4]: 52
  
   In [5]: np.finfo(np.float64).nexp
   Out[5]: 11
  
   If there are 52 bits of precision, 2**53+1 should not be
   representable, and sure enough:
  
   In [6]: np.float96(2**53)+1
   Out[6]: 9007199254740992.0
  
   In [7]: np.float64(2**53)+1
   Out[7]: 9007199254740992.0
  
   If the nexp is right, the max should be higher for the float96 type:
  
   In [9]: np.finfo(np.float64).max
   Out[9]: 1.7976931348623157e+308
  
   In [10]: np.finfo(np.float96).max
   Out[10]: 1.#INF
  
   I see that long double in C is 12 bytes wide, and double is the usual
   8
   bytes.
 
  Sorry - sizeof(long double) is 12 using mingw.  I see that long double
  is the same as double in MS Visual C++.
 
  http://en.wikipedia.org/wiki/Long_double
 
  but, as expected from the name:
 
  In [11]: np.dtype(np.float96).itemsize
  Out[11]: 12
 
 
  Hmm, good point. There should not be a float96 on Windows using the MSVC
  compiler, and the longdouble types 'gG' should return float64 and
  complex128
  respectively. OTOH, I believe the mingw compiler has real float96 types
  but
  I wonder about library support. This is really a build issue and it
  would be
  good to have some feedback on what different platforms are doing so that
  we
  know if we are doing things right.

 Is it possible that numpy is getting confused by being compiled with
 mingw on top of a visual studio python?

 Some further forensics seem to suggest that, despite the fact the math
  suggests float96 is float64, the storage format is in fact 80-bit
 extended precision:


 Yes, extended precision is the type on Intel hardware with gcc, the 96/128
 bits comes from alignment on 4 or 8 byte boundaries. With MSVC, double and
 long double are both ieee double, and on SPARC, long double is ieee quad
 precision.

Right - but I think my researches are showing that the longdouble
numbers are being _stored_ as 80 bit, but the math on those numbers is
64 bit.

Is there a reason that numpy can't do 80-bit math on these guys?  If
there is, is there any point in having a float96 on windows?
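
For reference - not from this Windows build, but what I believe 32-bit
Linux reports, where gcc's long double math really is used - finfo shows
the true 80-bit layout there:

In [16]: np.finfo(np.float96).nmant
Out[16]: 63

In [17]: np.finfo(np.float96).nexp
Out[17]: 15

against the nmant of 52 shown above for the mingw/MSVC combination.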

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-12 Thread Matthew Brett
Hi,

Sorry for my continued confusion here.  This is numpy 1.6.1 on windows
XP 32 bit.

In [2]: np.finfo(np.float96).nmant
Out[2]: 52

In [3]: np.finfo(np.float96).nexp
Out[3]: 15

In [4]: np.finfo(np.float64).nmant
Out[4]: 52

In [5]: np.finfo(np.float64).nexp
Out[5]: 11

If there are 52 bits of precision, 2**53+1 should not be
representable, and sure enough:

In [6]: np.float96(2**53)+1
Out[6]: 9007199254740992.0

In [7]: np.float64(2**53)+1
Out[7]: 9007199254740992.0

If the nexp is right, the max should be higher for the float96 type:

In [9]: np.finfo(np.float64).max
Out[9]: 1.7976931348623157e+308

In [10]: np.finfo(np.float96).max
Out[10]: 1.#INF

I see that long double in C is 12 bytes wide, and double is the usual 8 bytes.

So - now I am not sure what this float96 is.  I was expecting 80 bit
extended precision, but it doesn't look right for that...

Does anyone know what representation this is?

Thanks a lot,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Odd-looking long double on windows 32 bit

2011-11-12 Thread Matthew Brett
Hi,

On Sat, Nov 12, 2011 at 11:35 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Sorry for my continued confusion here.  This is numpy 1.6.1 on windows
 XP 32 bit.

 In [2]: np.finfo(np.float96).nmant
 Out[2]: 52

 In [3]: np.finfo(np.float96).nexp
 Out[3]: 15

 In [4]: np.finfo(np.float64).nmant
 Out[4]: 52

 In [5]: np.finfo(np.float64).nexp
 Out[5]: 11

 If there are 52 bits of precision, 2**53+1 should not be
 representable, and sure enough:

 In [6]: np.float96(2**53)+1
 Out[6]: 9007199254740992.0

 In [7]: np.float64(2**53)+1
 Out[7]: 9007199254740992.0

 If the nexp is right, the max should be higher for the float96 type:

 In [9]: np.finfo(np.float64).max
 Out[9]: 1.7976931348623157e+308

 In [10]: np.finfo(np.float96).max
 Out[10]: 1.#INF

 I see that long double in C is 12 bytes wide, and double is the usual 8 bytes.

Sorry - sizeof(long double) is 12 using mingw.  I see that long double
is the same as double in MS Visual C++.

http://en.wikipedia.org/wiki/Long_double

but, as expected from the name:

In [11]: np.dtype(np.float96).itemsize
Out[11]: 12

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Int casting different across platforms

2011-11-05 Thread Matthew Brett
Hi,

On Sat, Nov 5, 2011 at 6:24 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Fri, Nov 4, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 I noticed this:

 (Intel Mac):

 In [2]: np.int32(np.float32(2**31))
 Out[2]: -2147483648

 (PPC):

 In [3]: np.int32(np.float32(2**31))
 Out[3]: 2147483647

 I assume what is happening is that the casting is handing off to the c
 library, and that behavior of the c library differs on these
 platforms?  Should we expect or hope that this behavior would be the
 same across platforms?

 Heh. I think the conversion is basically undefined because 2**31 won't fit
 in int32. The Intel example just takes the bottom 32 bits of 2**31 expressed
 as a binary integer, the PPC throws up its hands and returns the maximum
 value supported by int32. Numpy supports casts from unsigned to signed 32
 bit numbers by using the same bits, as does C, and that would comport with
 the Intel example. It would probably be useful to have a Numpy convention
 for this so that the behavior was consistent across platforms. Maybe for
 float types we should raise an error if the value is out of bounds.

Just to see what happens:

#include <stdio.h>
#include <math.h>

int main(int argc, char** argv) {
    double x;
    int y;
    x = pow(2, 31);
    y = (int)x;
    printf("%d, %d\n", (int)sizeof(int), y);
    return 0;
}

Intel, gcc:
4, -2147483648
PPC, gcc:
4, 2147483647

I think that's what you predicted.  Is it strange that the same
compiler gives different results?

It would be good if the behavior was the same across platforms - the
unexpected negative overflow caught me out at least.  An error sounds
sensible to me.  Would it cost lots of cycles?

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Int casting different across platforms

2011-11-05 Thread Matthew Brett
Hi,

On Sun, Nov 6, 2011 at 2:39 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Nov 5, 2011 at 7:35 PM, Nathaniel Smith n...@pobox.com wrote:

 On Sat, Nov 5, 2011 at 4:07 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Intel, gcc:
  4, -2147483648
  PPC, gcc:
  4, 2147483647
 
  I think that's what you predicted.  Is it strange that the same
  compiler gives different results?
 
  It would be good if the behavior was the same across platforms - the
  unexpected negative overflow caught me out at least.  An error sounds
  sensible to me.  Would it cost lots of cycles?

 C99 says (section F.4):

 If the floating value is infinite or NaN or if the integral part of
 the floating value exceeds the range of the integer type, then the
 ‘‘invalid’’ floating-point exception is raised and the resulting value
 is unspecified. Whether conversion of non-integer floating values
 whose integral part is within the range of the integer type raises the
 ‘‘inexact’’ floating-point exception is unspecified.

 So it sounds like the compiler is allowed to return whatever nonsense
 it likes in this case. But, you should be able to cause this to raise
 an exception by fiddling with np.seterr.

 However, that doesn't seem to work for me with numpy 1.5.1 on x86-64 linux
 :-(

  np.int32(np.float32(2**31))
 -2147483648
  np.seterr(all="raise")
  np.int32(np.float32(2**31))
 -2147483648

 I think this must be a numpy or compiler bug?


 I don't believe the floating point status is checked in the numpy conversion
 routines. That looks like a nice small project for someone interested in
  learning the numpy internals.

To my shame I doubt that I will have the time to do this, but just in
case I or someone does get time, is there a good place to start to
look?

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Int casting different across platforms

2011-11-04 Thread Matthew Brett
Hi,

I noticed this:

(Intel Mac):

In [2]: np.int32(np.float32(2**31))
Out[2]: -2147483648

(PPC):

In [3]: np.int32(np.float32(2**31))
Out[3]: 2147483647

I assume what is happening is that the casting is handing off to the c
library, and that behavior of the c library differs on these
platforms?  Should we expect or hope that this behavior would be the
same across platforms?

Thanks for any pointers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float64 / int comparison different from float / int comparison

2011-11-01 Thread Matthew Brett
Hi,

On Tue, Nov 1, 2011 at 8:39 AM, Chris.Barker chris.bar...@noaa.gov wrote:
 On 10/31/11 6:38 PM, Stéfan van der Walt wrote:
 On Mon, Oct 31, 2011 at 6:25 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
 Oh, dear, I'm suffering now:

 In [12]: res > 2**31-1
 Out[12]: array([False], dtype=bool)

 I'm seeing:
 ...

 Your result seems very strange, because the numpy scalars should
 perform exactly the same inside and outside an array.

 I get what Stéfan  gets:

 In [32]: res = np.array((2**31,), dtype=np.float32)

 In [33]: res > 2**31-1
 Out[33]: array([ True], dtype=bool)

 In [34]: res[0] > 2**31-1
 Out[34]: True

 In [35]: res[0].dtype
 Out[35]: dtype('float32')


 In [36]: np.__version__
 Out[36]: '1.6.1'

 (OS-X, Intel, Python2.7)


 Something is very odd with your build!

Well - numpy 1.4.1 on Debian squeeze.  I get the same as you with
current numpy trunk.

Stefan and I explored the issue a bit further and concluded that, in
numpy trunk, the current behavior is explicable by upcasting to
float64 during the comparison:

In [86]: np.array(2**63, dtype=np.float) > 2**63 - 1
Out[86]: False

In [87]: np.array(2**31, dtype=np.float) > 2**31 - 1
Out[87]: True

because 2**31 and 2**31-1 are both exactly representable in float64,
but 2**31-1 is not exactly representable in float32.

Maybe this:

In [88]: np.promote_types('f4', int)
Out[88]: dtype('float64')

tells us this information.  The command is not available for numpy 1.4.1.

I suppose it's possible that the upcasting rules were different in
1.4.1 and that is the cause of the different behavior.
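
A quick check of the float32 rounding at work (numpy >= 1.6):

In [89]: np.float32(2**31 - 1) == 2**31   # 2**31 - 1 rounds up in float32
Out[89]: True

In [90]: np.float64(2**31 - 1) == 2**31   # but is exact in float64
Out[90]: False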

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Float128 integer comparison

2011-11-01 Thread Matthew Brett
Hi,

On Sat, Oct 15, 2011 at 1:34 PM, Derek Homeier
de...@astro.physik.uni-goettingen.de wrote:
 On 15.10.2011, at 9:42PM, Aronne Merrelli wrote:


 On Sat, Oct 15, 2011 at 1:12 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 Continuing the exploration of float128 - can anyone explain this behavior?

  np.float64(9223372036854775808.0) == 9223372036854775808L
 True
  np.float128(9223372036854775808.0) == 9223372036854775808L
 False
  int(np.float128(9223372036854775808.0)) == 9223372036854775808L
 True
  np.round(np.float128(9223372036854775808.0)) == 
  np.float128(9223372036854775808.0)
 True


 I know little about numpy internals, but while fiddling with this, I noticed 
 a possible clue:

  np.float128(9223372036854775808.0) == 9223372036854775808L
 False
  np.float128(4611686018427387904.0) == 4611686018427387904L
 True
  np.float128(9223372036854775808.0) - 9223372036854775808L
 Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
 TypeError: unsupported operand type(s) for -: 'numpy.float128' and 'long'
  np.float128(4611686018427387904.0) - 4611686018427387904L
 0.0


 My speculation - 9223372036854775808L is the first integer that is too big 
 to fit into a signed 64 bit integer. Python is OK with this but that means 
 it must be containing that value in some more complicated object. Since you 
 don't get the type error between float64() and long:

  np.float64(9223372036854775808.0) - 9223372036854775808L
 0.0

 Maybe there are some unimplemented pieces in numpy for dealing with 
 operations between float128 and python arbitrary longs? I could see the == 
 test just producing false in that case, because it defaults back to some 
 object equality test which isn't actually looking at the numbers.

 That seems to make sense, since even upcasting from a np.float64 still lets 
 the test fail:
 np.float128(np.float64(9223372036854775808.0)) == 9223372036854775808L
 False
 while
 np.float128(9223372036854775808.0) == np.uint64(9223372036854775808L)
 True

 and
 np.float128(9223372036854775809) == np.uint64(9223372036854775809L)
 False
 np.float128(np.uint(9223372036854775809L) == 
 np.uint64(9223372036854775809L)
 True

 Showing again that the normal casting to, or reading in of, a np.float128 
 internally inevitably
 calls the python float(), as already suggested in one of the parallel threads 
 (I think this
 also came up with some of the tests for precision) - leading to different 
 results than
 when you can convert from a np.int64 - this makes the outcome look even 
 weirder:

 np.float128(9223372036854775807.0) - 
 np.float128(np.int64(9223372036854775807))
 1.0
 np.float128(9223372036854775296.0) - 
 np.float128(np.int64(9223372036854775807))
 1.0
 np.float128(9223372036854775295.0) - 
 np.float128(np.int64(9223372036854775807))
 -1023.0
 np.float128(np.int64(9223372036854775296)) - 
 np.float128(np.int64(9223372036854775807))
 -511.0

 simply due to the nearest np.float64 always being equal to MAX_INT64 in the 
 first two cases
 above (or anything in between)...
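
To make the spacing argument above concrete - a quick check, using
np.spacing:

import numpy as np

# float64 spacing is 1024 just below 2**63 and 2048 at 2**63 and up, so
# 2**63 - 512 is a round-to-even tie that resolves up to 2**63, while
# 2**63 - 513 rounds down to 2**63 - 1024:
print(np.spacing(np.float64(2**62)))  # 1024.0
print(int(np.float64(2**63 - 512)))   # 9223372036854775808
print(int(np.float64(2**63 - 513)))   # 9223372036854774784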

Right - just for the record, I think there are four relevant problems.

1: values being cast to float128 appear to go through float64
--

In [119]: np.float128(2**54-1)
Out[119]: 18014398509481984.0

In [120]: np.float128(2**54)-1
Out[120]: 18014398509481983.0

2: values being cast from float128 to int appear to go through float64 again
---

In [121]: int(np.float128(2**54-1))
Out[121]: 18014398509481984

http://projects.scipy.org/numpy/ticket/1395

3: comparison to python long ints is always unequal
---

In [139]: 2**63  # 2**63 correctly represented in float128
Out[139]: 9223372036854775808L

In [140]: int(np.float64(2**63))
Out[140]: 9223372036854775808L

In [141]: int(np.float128(2**63))
Out[141]: 9223372036854775808L

In [142]: np.float128(2**63) == 2**63
Out[142]: False

In [143]: np.float128(2**63)-1 == 2**63-1
Out[143]: True

In [144]: np.float128(2**63) == np.float128(2**63)
Out[144]: True

Probably because, as y'all are saying, numpy tries to convert to
np.int64, fails, and falls back to an object array:

In [145]: np.array(2**63)
Out[145]: array(9223372036854775808L, dtype=object)

In [146]: np.array(2**63-1)
Out[146]: array(9223372036854775807L)

4 : any other operation of float128 with python long ints fails
--

In [148]: np.float128(0) + 2**63
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/mb312/<ipython-input-148-5cc20524867d> in <module>()
----> 1 np.float128(0) + 2**63

TypeError: unsupported operand type(s) for +: 'numpy.float128' and 'long'

In [149
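
For the record, one workaround is to keep the integer on the numpy side -
a sketch, using the float128 / uint64 path shown earlier in the thread
(it only helps for values that fit in uint64):

import numpy as np

# Mixing float128 with np.uint64 stays on numpy's promotion path, so
# there is no TypeError and no trip through Python float:
x = np.float128(0) + np.uint64(2**63)
print(x == np.float128(2)**63)  # True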

Re: [Numpy-discussion] Nice float - integer conversion?

2011-11-01 Thread Matthew Brett
Hi,

On Sat, Oct 15, 2011 at 12:20 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Oct 11, 2011 at 7:32 PM, Benjamin Root ben.r...@ou.edu wrote:
 On Tue, Oct 11, 2011 at 2:06 PM, Derek Homeier
 de...@astro.physik.uni-goettingen.de wrote:

 On 11 Oct 2011, at 20:06, Matthew Brett wrote:

  Have I missed a fast way of doing nice float to integer conversion?
 
  By nice I mean, rounding to the nearest integer, converting NaN to 0,
  inf, -inf to the max and min of the integer range?  The astype method
  and cast functions don't do what I need here:
 
  In [40]: np.array([1.6, np.nan, np.inf, -np.inf]).astype(np.int16)
  Out[40]: array([1, 0, 0, 0], dtype=int16)
 
  In [41]: np.cast[np.int16](np.array([1.6, np.nan, np.inf, -np.inf]))
  Out[41]: array([1, 0, 0, 0], dtype=int16)
 
  Have I missed something obvious?

 np.[a]round comes closer to what you wish (is there consensus
 that NaN should map to 0?), but not quite there, and it's not really
 consistent either!


 In a way, there is already consensus in the code.  np.nan_to_num() by
 default converts nans to zero, and the infinities go to very large and very
 small.

 >>> np.set_printoptions(precision=8)
 >>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
 >>> np.nan_to_num(x)
 array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
         -1.28000000e+002,   1.28000000e+002])

 Right - but - we'd still need to round, and take care of the nasty
 issue of thresholding:

 >>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
 >>> x
 array([  inf,  -inf,   nan, -128.,  128.])
 >>> nnx = np.nan_to_num(x)
 >>> nnx
 array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])
 >>> np.rint(nnx).astype(np.int8)
 array([   0,    0,    0, -128, -128], dtype=int8)

 So, I think nice_round would look something like:

 def nice_round(arr, out_type):
    in_type = arr.dtype.type
    mx = floor_exact(np.iinfo(out_type).max, in_type)
    mn = floor_exact(np.iinfo(out_type).min, in_type)
    nans = np.isnan(arr)
    out = np.rint(np.clip(arr, mn, mx)).astype(out_type)
    out[nans] = 0
    return out

 with floor_exact being something like:

 https://github.com/matthew-brett/nibabel/blob/range-dtype-conversions/nibabel/floating.py

In case anyone is interested or for the sake of anyone later googling
this thread -

I made a working version of nice_round:

https://github.com/matthew-brett/nibabel/blob/floating-stash/nibabel/casting.py

Docstring:
def nice_round(arr, int_type, nan2zero=True, infmax=False):
    """ Round floating point array `arr` to type `int_type`

    Parameters
    ----------
    arr : array-like
        Array of floating point type
    int_type : object
        Numpy integer type
    nan2zero : {True, False}
        Whether to convert NaN values to zero.  Default is True.  If False,
        and NaNs are present, raise CastingError
    infmax : {False, True}
        If True, set np.inf values in `arr` to be the `int_type` integer
        maximum value, -np.inf as the `int_type` integer minimum.  If False,
        merely set infs to be numbers at or near the maximum / minimum number
        in `arr` that can be contained in `int_type`.  Therefore False gives
        faster conversion at the expense of infs that are further from
        infinity.

    Returns
    -------
    iarr : ndarray
        of type `int_type`

    Examples
    --------
    >>> nice_round([np.nan, np.inf, -np.inf, 1.1, 6.6], np.int16)
    array([     0,  32767, -32768,      1,      7], dtype=int16)
    """

It wasn't straightforward to find the right place to clip the array to
stop overflow on casting, but I think it's working and tested now.
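
For anyone who wants the idea without chasing the link, here is a minimal
self-contained sketch along the same lines - not the nibabel
implementation, and floor_exact here is a simplified stand-in for the
nibabel version linked above:

import numpy as np

def floor_exact(val, ftype):
    # Largest ftype value that is <= the integer `val`.
    f = ftype(val)
    if int(f) <= val:
        return f
    return np.nextafter(f, ftype(-np.inf))

def nice_round_sketch(arr, int_type):
    # Round floats to int_type; NaN -> 0, +/-inf -> int_type max / min.
    arr = np.asarray(arr, dtype=np.float64)
    info = np.iinfo(int_type)
    # int_type's min is a power of two, so exact in float64; the max must
    # be floored to an exactly representable float to avoid overflow on
    # casting - the "right place to clip" problem above.
    mx = floor_exact(info.max, np.float64)
    mn = np.float64(info.min)
    arr = np.where(np.isnan(arr), 0, np.clip(arr, mn, mx))
    return np.rint(arr).astype(int_type)

print(nice_round_sketch([np.nan, np.inf, -np.inf, 1.1, 6.6], np.int16))
# [     0  32767 -32768      1      7]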

See y'all,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] float64 / int comparison different from float / int comparison

2011-10-31 Thread Matthew Brett
Hi,

I just ran into this confusing difference between np.float and np.float64:

In [8]: np.float(2**63) == 2**63
Out[8]: True

In [9]: np.float(2**63) > 2**63-1
Out[9]: True

In [10]: np.float64(2**63) == 2**63
Out[10]: True

In [11]: np.float64(2**63) > 2**63-1
Out[11]: False

In [16]: np.float64(2**63-1) == np.float(2**63-1)
Out[16]: True

I believe values above 2**52 are all represented as integers in float64.

http://matthew-brett.github.com/pydagogue/floating_point.html
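
For example (a quick check of that claim):

import numpy as np

# From 2**52 upward the float64 spacing is at least 1, so every
# representable value is an integer; from 2**53 the spacing is 2 and
# adding 1 can be lost entirely:
print(np.float64(2**52) + 1 == np.float64(2**52))  # False - 2**52 + 1 exists
print(np.float64(2**53) + 1 == np.float64(2**53))  # True - rounds back down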

Is this the int64 issue that came up earlier in the float128 comparison?
Why the difference between np.float and np.float64?

Thanks for any insight,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float64 / int comparison different from float / int comparison

2011-10-31 Thread Matthew Brett
Hi,

2011/10/31 Stéfan van der Walt ste...@sun.ac.za:
 On Mon, Oct 31, 2011 at 11:23 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 In [8]: np.float(2**63) == 2**63
 Out[8]: True

 In [9]: np.float(2**63) > 2**63-1
 Out[9]: True

 In [10]: np.float64(2**63) == 2**63
 Out[10]: True

 In [11]: np.float64(2**63) > 2**63-1
 Out[11]: False

 In [16]: np.float64(2**63-1) == np.float(2**63-1)
 Out[16]: True

 Interesting.  Turns out that np.float(x) returns a Python float
 object.  If you change the experiment to only use numpy array scalars,
 things are more consistent:

 In [36]: np.array(2**63, dtype=np.float) > 2**63 - 1
 Out[36]: False

 In [37]: np.array(2**63, dtype=np.float32) > 2**63 - 1
 Out[37]: False

 In [38]: np.array(2**63, dtype=np.float64) > 2**63 - 1

Oh, dear, I'm suffering now:

In [11]: res = np.array((2**31,), dtype=np.float32)

In [12]: res > 2**31-1
Out[12]: array([False], dtype=bool)

OK - that's what I was expecting from the above, but now:

In [13]: res[0] > 2**31-1
Out[13]: True

In [14]: res[0].dtype
Out[14]: dtype('float32')

Sorry, maybe I'm not thinking straight, but I'm confused...

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-30 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 10:02 PM, Travis Oliphant
oliph...@enthought.com wrote:

 Here are my needs:

 1) How NAs are implemented cannot be end user visible. Having to pass
 maskna=True is a problem. I suppose a solution is to set the flag to
 true on every array inside of pandas so the user never knows (you
 mentioned someone else had some other solution, i could go back and
 dig it up?)

 I guess this would be the same with bitpatterns, in that the user
 would have to specify a custom dtype.

 Is it possible to add a bitpattern NA (in the NaN values) to the
 current floating point types, at least in principle?  So that np.float
 etc would have bitpattern NAs without a custom dtype?

 That is an interesting idea.   It's essentially what people like Wes McKinney 
 are doing now.    However, the issue is going to be whether or not you do 
 something special or not with the NA values in the low-level C function the 
 dtype dispatches to.  This is the reason for the special bit-pattern dtype.

 I've always thought that requiring NA checks for code that doesn't want to 
 worry about it would slow things down un-necessarily for those use-cases.
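
For concreteness - a bit-pattern NA can live in a NaN payload, as R does;
this is a toy sketch of such an encoding, not anything numpy provides
(R marks NA_real_ with the payload 1954):

import struct

import numpy as np

NA_PAYLOAD = 1954  # the payload R uses for NA_real_
_nan_bits = struct.unpack('<Q', struct.pack('<d', np.nan))[0]
NA = struct.unpack('<d', struct.pack('<Q', _nan_bits | NA_PAYLOAD))[0]

def is_na(x):
    # NA is a NaN, but not every NaN is NA - check the payload bits.
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    return np.isnan(x) and (bits & 0xFFFFFFFF) == NA_PAYLOAD

print(is_na(NA), is_na(np.nan), np.isnan(NA))  # True False True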

Right - now that the caffeine has run through my system adequately, I
have a few glasses of wine to disrupt my logic and / or social skills
but:

Is there any way you could imagine something like this?:

In [3]: a = np.arange(10, dtype=np.float)

In [4]: a.flags
Out[4]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
  MAYBE_NA : False

In [5]: a[0] = np.NA

In [6]: a.flags
Out[6]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
  MAYBE_NA : True

Obviously extension writers would have to keep the flag maintained...
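
A toy pure-Python sketch of that flag idea - purely hypothetical, numpy
has no MAYBE_NA flag:

import numpy as np

NA = object()  # stand-in NA singleton for the sketch

class FlaggedArray(object):
    # Toy version of the proposed dirty flag: set on any NA write, so
    # NA-oblivious code can skip the NA machinery whenever it is False.
    def __init__(self, arr):
        self.arr = np.asarray(arr, dtype=float)
        self.maybe_na = False

    def __setitem__(self, idx, value):
        if value is NA:
            self.maybe_na = True
            self.arr[idx] = np.nan  # bit-pattern stand-in for NA
        else:
            self.arr[idx] = value

a = FlaggedArray(np.arange(10, dtype=float))
print(a.maybe_na)  # False
a[0] = NA
print(a.maybe_na)  # True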

Sorry if that doesn't make sense, I do not claim to be in full
possession of my faculties,

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-30 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 11:19 PM, Travis Oliphant
oliph...@enthought.com wrote:

Thanks again for your email, I'm sure I'm not the only one who
breathes a deep sigh of relief when I see your posts.

 I appreciate Nathaniel's idea to pull the changes and I can respect his 
 desire to do that.   It seemed like there was a lot more heat than light in 
 the discussion this summer.   The differences seemed to be enflamed by the 
 discussion instead of illuminated by it.  Perhaps, that is why Nathaniel felt 
 like merging Mark's pull request was too strong-armed and not a proper 
 resolution.

 However, I did not interpret Matthew or Nathaniel's explanations of their 
 position as manipulative or inappropriate.  Nonetheless, I don't think 
 removing Mark's changes are a productive direction to take at this point.   I 
 agree, it would have been much better to reach a rough consensus before the 
 code was committed.  At least, those who felt like their ideas were not 
 accounted for should have felt like there was some plan to either accommodate 
 them, or some explanation of why that was not a good idea.  The only thing I 
 recall being said was that there was nobody to implement their ideas.   I 
 wish that weren't the case.   I think we can still continue to discuss their 
 concerns and look for ways to reasonably incorporate their use-cases if 
 possible.

 I have probably contributed in the past to the idea that he who writes the 
 code gets the final say.    In early-stage efforts, this is approximately 
 right, but success of anything relies on satisfied users and as projects 
 mature the voice of users becomes more relevant than the voice of 
 contributors in my mind.   I've certainly had to learn that in terms of ABI 
 changes to NumPy.

I think that's right though - that the person who wrote the code has
the final say.  But that's the final say.   The question I wanted to
ask was the one Nathaniel brought up at the beginning of the thread,
which is, before the final say, how hard do we try for consensus?  Is
that - the numpy way?   Here Chuck was saying 'I listen to you in
proportion to your code contribution' (I hope I'm not misrepresenting
him).   I think that's a different way of working than the consensus
building that Karl Fogel describes.  But maybe that is just the numpy
way.  I would feel happier to know what that way is.   Then, when we
get into this kind of dispute Chuck can say 'Matthew, change the numpy
constitution or accept the situation because that's how we've agreed
to work'.   And I'll say - 'OK - I don't like it, but I agree those
are the rules'.  And we'll get on with it.  But at the moment, it
feels as if it isn't clear, and, as Ben pointed out, that means we are
having a discussion and a discussion about the discussion at the same
time.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Large numbers into float128

2011-10-30 Thread Matthew Brett
Hi,

On Sun, Oct 30, 2011 at 2:38 AM, Berthold Höllmann
berth...@xn--hllmanns-n4a.de wrote:
 Matthew Brett matthew.br...@gmail.com writes:

 Hi,

 Can anyone think of a good way to set a float128 value to an
 arbitrarily large number?

 As in

 v = int_to_float128(some_value)

 ?

 I'm trying things like

 v = np.float128(2**64+2)

 but, because (in other threads) the float128 seems to be going through
 float64 on assignment, this loses precision, so although 2**64+2 is
 representable in float128, in fact I get:

 In [35]: np.float128(2**64+2)
 Out[35]: 18446744073709551616.0

 In [36]: 2**64+2
 Out[36]: 18446744073709551618L

 So - can anyone think of another way to assign values to float128 that
 will keep the precision?

 Just use float128 all the was through, and avoid casting to float in
 between:

 >>> "%20.1f" % float(2**64+2)
 '18446744073709551616.0'
 >>> np.float128(np.float128(2)**64+2)
 18446744073709551618.0

Ah yes - sorry - that would work in this example where I know the
component parts of the number, but I was thinking in the general case
where I have been given any int.

I think my code works for that, by casting to float64 to break up the
number into parts:

In [35]: def int_to_float128(val):
   ....:     f64 = np.float64(val)
   ....:     res = val - int(f64)
   ....:     return np.float128(f64) + np.float128(res)
   ....:

In [36]: int_to_float128(2**64)
Out[36]: 18446744073709551616.0

In [37]: int_to_float128(2**64+2)
Out[37]: 18446744073709551618.0
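
One caveat - for very large ints, the single float64 remainder can itself
lose precision, so a chunked version may be safer (a sketch; it assumes
the value is within float64's exponent range):

import numpy as np

def int_to_float128_chunked(val):
    # Peel off float64-sized chunks until nothing is left, so ints too
    # wide for one float64 plus a float64 remainder also convert well.
    result = np.float128(0)
    remainder = val
    while remainder != 0:
        f64 = np.float64(remainder)
        result = result + np.float128(f64)
        remainder = remainder - int(f64)
    return result

print(int_to_float128_chunked(2**64 + 2))  # 18446744073709551618.0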

Thanks,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus

2011-10-30 Thread Matthew Brett
Hi,

On Sun, Oct 30, 2011 at 11:37 AM, Chris Barker chris.bar...@noaa.gov wrote:
 On 10/29/11 2:59 PM, Charles R Harris wrote:

 I'm much opposed to ripping the current code out. It isn't like it is
 (known to be) buggy, nor has anyone made the case that it isn't a basis
 on which build other options. It also smacks of gratuitous violence
 committed by someone yet to make a positive contribution to the project.

 1) contributing to the discussion IS a positive contribution to the project.

Yes, but, personally I'd rather the discussion was not about who was
saying something, but what they were saying.

That is, if someone proposes something, or offers a discussion, we
don't first ask 'who are you', but try and engage with the substance
of the argument.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-30 Thread Matthew Brett
Hi,

On Sun, Oct 30, 2011 at 12:24 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 11:55 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
 matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 

 No, that's not what Nathaniel and I are saying at all.
 Nathaniel
 was
 pointing to links for projects that care that everyone agrees
 before
 they go ahead.

 It looked to me like there was a serious intent to come to an
 agreement,
 or
 at least closer together. The discussion in the summer was
 going
 around
 in
 circles though, and was too abstract and complex to follow.
 Therefore
 Mark's
 choice of implementing something and then asking for feedback
 made
 sense
 to
 me.
   
I should point out that the implementation hasn't - as far as I
can
see - changed the discussion.  The discussion was about the API.
   
Implementations are useful for agreed APIs because they can
point
out
where the API does not make sense or cannot be implemented.  In
this
case, the API Mark said he was going to implement - he did
implement -
at least as far as I can see.  Again, I'm happy to be corrected.
   
Implementations can also help the discussion along, by allowing
people
to
try out some of the proposed changes. It also allows to construct
examples
that show weaknesses, possibly to be solved by an alternative
API.
Maybe
you
can hold the complete history of this topic in your head and
comprehend
it,
but for me it would be very helpful if someone said:
- here's my dataset
- this is what I want to do with it
- this is the best I can do with the current implementation
- here's how API X would allow me to solve this better or simpler
This can be done much better with actual data and an actual
implementation
than with a design proposal. You seem to disagree with this
statement.
That's fine. I would hope though that you recognize that concrete
examples
help people like me, and construct one or two to help us out.
   That's what use-cases are for in designing APIs.  There are
   examples
   of use in the NEP:
  
  
   https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
  
   the alterNEP:
  
   https://gist.github.com/1056379
  
   and my longer email to Travis:
  
  
  
  
   http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
  
   Mark has done a nice job of documentation:
  
   http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
  
   If you want to understand what the alterNEP case is, I'd suggest
   the
   email, just because it's the most recent and I think the
   terminology
   is slightly clearer.
  
   Doing the same examples on a larger array won't make the point
   easier
   to understand.  The discussion is about what the right concepts
   are,
   and you can help by looking at the snippets of code in those
   documents, and deciding for yourself whether you think the current
   masking / NA implementation seems natural and easy to explain, or
   rather forced and difficult to explain, and then email back trying
   to
   explain your impression (which is not always easy).
  
   If you seriously believe that looking at a few snippets is as
   helpful
   and
   instructive as being able to play around with them in IPython and
   modify
   them, then I guess we won't make progress in this part of the
   discussion.
   You're just telling me to go back and re-read things I'd already
   read.
  
   The snippets are in ipython or doctest format - aren't they?
 
  Oops - 10 minute rule.  Now I see that you mean that you can't
  experiment with the alternative implementation without working code.
 
  Indeed.
 
 
  That's true, but I am hoping that the difference between - say:
 
  a[0:2] = np.NA
 
  and
 
  a.mask[0:2] = False
 
  would be easy enough to imagine.
 
  It is in this case. I agree

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 8:38 PM, Benjamin Root ben.r...@ou.edu wrote:
 Matt,

 On Friday, October 28, 2011, Matthew Brett matthew.br...@gmail.com wrote:

 Forget about rudeness or decision processes.

 No, that's a common mistake, which is to assume that any conversation
 about things which aren't technical, is not important.   Nathaniel's
 point is important.  Rudeness is important. The reason we've got into
 this mess is because we clearly don't have an agreed way of making
 decisions.  That's why countries and open-source projects have
 constitutions, so this doesn't happen.

 Don't get me wrong. In general, you are right.  And maybe we all should
 discuss something to that effect for numpy.  But I would rather do that when
 there isn't such contention and tempers.

That's a reasonable point.

 As for allegations of rudeness, I believe that we are actually very close to
 consensus that I immediately wanted to squelch any sort of
 meta-meta-disagreements about who was being rude to who.  As a quick
 band-aide, anybody who felt slighted by me gets a drink on me at the next
 scipy conference.  From this point on, let's institute a 10 minute rule --
 write your email, wait ten minutes, read it again and edit it.

Good offer.  I make the same one.

 I will start by saying that I am willing to separate ignore and absent,
 but
 only on the write side of things.  On read, I want a single way to
 identify
 the missing values.  I also want only a single way to perform
 calculations
 (either skip or propagate).

 Thank you - that is very helpful.

 Are you saying that you'd be OK setting missing values like this?

 a.mask[0:2] = False


 Probably not that far, because that would be an attribute that may or may
 not exist.  Rather, I might like the idea of a NA to always mean absent
 (and destroys - even through views), and MA (or some other name) which
 always means ignore (and has the masking behavior with views). This makes
 specific behaviors tied distinctly to specific objects.

Ah - yes - thank you.  I think you and I at least have somewhere to go
for agreement, but, I don't know how to work towards a numpy-wide
agreement.  Do you have any thoughts?

 For the read side, do you mean you're OK with this

 a.isna()

 To identify the missing values, as is currently the case?  Or something
 else?


 Yes.  A missing value is a missing value, regardless of it being absent or
 marked as ignored.  But it is a bit more subtle than that.  I should just be
 able to add two arrays together and the data should know what to do. When
 the core ufuncs get this right (like min, max, sum, cumsum, diff, etc), then
 I don't have to do much to prepare higher level funcs for missing data.

 If so, then I think we're very close, it's just a discussion about names.


 And what does ignore + absent equal? ;-)

ignore + absent == special_value_of_some_sort :)

Just joking,

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
 
  No, that's not what Nathaniel and I are saying at all. Nathaniel was
  pointing to links for projects that care that everyone agrees before
  they go ahead.
 
  It looked to me like there was a serious intent to come to an agreement,
  or
  at least closer together. The discussion in the summer was going around
  in
  circles though, and was too abstract and complex to follow. Therefore
  Mark's
  choice of implementing something and then asking for feedback made sense
  to
  me.

 I should point out that the implementation hasn't - as far as I can
 see - changed the discussion.  The discussion was about the API.

 Implementations are useful for agreed APIs because they can point out
 where the API does not make sense or cannot be implemented.  In this
 case, the API Mark said he was going to implement - he did implement -
 at least as far as I can see.  Again, I'm happy to be corrected.

 Implementations can also help the discussion along, by allowing people to
 try out some of the proposed changes. It also allows to construct examples
 that show weaknesses, possibly to be solved by an alternative API. Maybe you
 can hold the complete history of this topic in your head and comprehend it,
 but for me it would be very helpful if someone said:
 - here's my dataset
 - this is what I want to do with it
 - this is the best I can do with the current implementation
 - here's how API X would allow me to solve this better or simpler
 This can be done much better with actual data and an actual implementation
 than with a design proposal. You seem to disagree with this statement.
 That's fine. I would hope though that you recognize that concrete examples
 help people like me, and construct one or two to help us out.
That's what use-cases are for in designing APIs.  There are examples
of use in the NEP:

https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

the alterNEP:

https://gist.github.com/1056379

and my longer email to Travis:

http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored

Mark has done a nice job of documentation:

http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

If you want to understand what the alterNEP case is, I'd suggest the
email, just because it's the most recent and I think the terminology
is slightly clearer.

Doing the same examples on a larger array won't make the point easier
to understand.  The discussion is about what the right concepts are,
and you can help by looking at the snippets of code in those
documents, and deciding for yourself whether you think the current
masking / NA implementation seems natural and easy to explain, or
rather forced and difficult to explain, and then email back trying to
explain your impression (which is not always easy).

  In saying that we are insisting on our way, you are saying, implicitly,
  'I
  am not going to negotiate'.
 
  That is only your interpretation. The observation that Mark compromised
  quite a bit while you didn't seems largely correct to me.

 The problem here stems from our inability to work towards agreement,
 rather than standing on set positions.  I set out what changes I think
 would make the current implementation OK.  Can we please, please have
 a discussion about those points instead of trying to argue about who
 has given more ground.

  That commitment would of course be good. However, even if that were
  possible
  before writing code and everyone agreed that the ideas of you and
  Nathaniel
  should be implemented in full, it's still not clear that either of you
  would
  be willing to write any code. Agreement without code still doesn't help
  us
  very much.

 I'm going to return to Nathaniel's point - it is a highly valuable
 thing to set ourselves the target of resolving substantial discussions
 by consensus.   The route you are endorsing here is 'implementor
 wins'.

 I'm not. All I want to point out is is that design and implementation are
 not completely separated either.

No, they often interact.  I was trying to explain why, in this case,
the implementation hasn't changed the issues substantially, as far as
I can see.   If you think otherwise, then that is helpful information,
because you can feed back about where the initial discussion has been
overtaken by the implementation, and so we can strip down the
discussion to its essential parts.

 We don't need to do it that way.  We're a mature sensible
 bunch of adults

 Agreed:)

Ah - if only it was that easy :)

 who can talk out the issues until we agree

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 12:19 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 1:04 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
   charlesr.har...@gmail.com wrote:
   
  
   No, that's not what Nathaniel and I are saying at all. Nathaniel was
   pointing to links for projects that care that everyone agrees before
   they go ahead.
  
   It looked to me like there was a serious intent to come to an
   agreement,
   or
   at least closer together. The discussion in the summer was going
   around
   in
   circles though, and was too abstract and complex to follow. Therefore
   Mark's
   choice of implementing something and then asking for feedback made
   sense
   to
   me.
 
  I should point out that the implementation hasn't - as far as I can
  see - changed the discussion.  The discussion was about the API.
 
  Implementations are useful for agreed APIs because they can point out
  where the API does not make sense or cannot be implemented.  In this
  case, the API Mark said he was going to implement - he did implement -
  at least as far as I can see.  Again, I'm happy to be corrected.
 
  Implementations can also help the discussion along, by allowing people
  to
  try out some of the proposed changes. It also allows to construct
  examples
  that show weaknesses, possibly to be solved by an alternative API. Maybe
  you
  can hold the complete history of this topic in your head and comprehend
  it,
  but for me it would be very helpful if someone said:
  - here's my dataset
  - this is what I want to do with it
  - this is the best I can do with the current implementation
  - here's how API X would allow me to solve this better or simpler
  This can be done much better with actual data and an actual
  implementation
  than with a design proposal. You seem to disagree with this statement.
  That's fine. I would hope though that you recognize that concrete
  examples
  help people like me, and construct one or two to help us out.
 That's what use-cases are for in designing APIs.  There are examples
 of use in the NEP:

 https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

 the alterNEP:

 https://gist.github.com/1056379

 and my longer email to Travis:


 http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored

 Mark has done a nice job of documentation:

 http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

 If you want to understand what the alterNEP case is, I'd suggest the
 email, just because it's the most recent and I think the terminology
 is slightly clearer.

 Doing the same examples on a larger array won't make the point easier
 to understand.  The discussion is about what the right concepts are,
 and you can help by looking at the snippets of code in those
 documents, and deciding for yourself whether you think the current
 masking / NA implementation seems natural and easy to explain, or
 rather forced and difficult to explain, and then email back trying to
 explain your impression (which is not always easy).

   In saying that we are insisting on our way, you are saying,
   implicitly,
   'I
   am not going to negotiate'.
  
   That is only your interpretation. The observation that Mark
   compromised
   quite a bit while you didn't seems largely correct to me.
 
  The problem here stems from our inability to work towards agreement,
  rather than standing on set positions.  I set out what changes I think
  would make the current implementation OK.  Can we please, please have
  a discussion about those points instead of trying to argue about who
  has given more ground.
 
   That commitment would of course be good. However, even if that were
   possible
   before writing code and everyone agreed that the ideas of you and
   Nathaniel
   should be implemented in full, it's still not clear that either of
   you
   would
   be willing to write any code. Agreement without code still doesn't
   help
   us
   very much.
 
  I'm going to return to Nathaniel's point - it is a highly valuable
  thing to set ourselves the target of resolving substantial discussions
  by consensus.   The route you are endorsing here is 'implementor
  wins'.
 
  I'm not. All I want to point out is is that design and implementation
  are
  not completely separated either.

 No, they often interact.  I was trying to explain why, in this case,
 the implementation hasn't changed the issues substantially, as far as
 I can see.   If you think otherwise, then that is helpful information,
 because you can feed back

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 12:41 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 1:26 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 12:19 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 1:04 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

   
No, that's not what Nathaniel and I are saying at all. Nathaniel
was
pointing to links for projects that care that everyone agrees
before
they go ahead.
   
It looked to me like there was a serious intent to come to an
agreement,
or
at least closer together. The discussion in the summer was going
around
in
circles though, and was too abstract and complex to follow.
Therefore
Mark's
choice of implementing something and then asking for feedback made
sense
to
me.
  
   I should point out that the implementation hasn't - as far as I can
   see - changed the discussion.  The discussion was about the API.
  
   Implementations are useful for agreed APIs because they can point
   out
   where the API does not make sense or cannot be implemented.  In this
   case, the API Mark said he was going to implement - he did implement
   -
   at least as far as I can see.  Again, I'm happy to be corrected.
  
   Implementations can also help the discussion along, by allowing
   people
   to
   try out some of the proposed changes. It also allows to construct
   examples
   that show weaknesses, possibly to be solved by an alternative API.
   Maybe
   you
   can hold the complete history of this topic in your head and
   comprehend
   it,
   but for me it would be very helpful if someone said:
   - here's my dataset
   - this is what I want to do with it
   - this is the best I can do with the current implementation
   - here's how API X would allow me to solve this better or simpler
   This can be done much better with actual data and an actual
   implementation
   than with a design proposal. You seem to disagree with this
   statement.
   That's fine. I would hope though that you recognize that concrete
   examples
   help people like me, and construct one or two to help us out.
  That's what use-cases are for in designing APIs.  There are examples
  of use in the NEP:
 
  https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
 
  the alterNEP:
 
  https://gist.github.com/1056379
 
  and my longer email to Travis:
 
 
 
  http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
 
  Mark has done a nice job of documentation:
 
  http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
 
  If you want to understand what the alterNEP case is, I'd suggest the
  email, just because it's the most recent and I think the terminology
  is slightly clearer.
 
  Doing the same examples on a larger array won't make the point easier
  to understand.  The discussion is about what the right concepts are,
  and you can help by looking at the snippets of code in those
  documents, and deciding for yourself whether you think the current
  masking / NA implementation seems natural and easy to explain, or
  rather forced and difficult to explain, and then email back trying to
  explain your impression (which is not always easy).
 
In saying that we are insisting on our way, you are saying,
implicitly,
'I
am not going to negotiate'.
   
That is only your interpretation. The observation that Mark
compromised
quite a bit while you didn't seems largely correct to me.
  
   The problem here stems from our inability to work towards agreement,
   rather than standing on set positions.  I set out what changes I
   think
   would make the current implementation OK.  Can we please, please
   have
   a discussion about those points instead of trying to argue about who
   has given more ground.
  
That commitment would of course be good. However, even if that
were
possible
before writing code and everyone agreed that the ideas of you and
Nathaniel
should be implemented in full, it's still not clear that either of
you
would
be willing to write any code. Agreement without code still doesn't
help
us
very much.
  
   I'm going to return to Nathaniel's point - it is a highly valuable
   thing to set ourselves the target of resolving substantial
   discussions
   by consensus.   The route you are endorsing here is 'implementor
   wins

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 1:05 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 1:41 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Saturday, October 29, 2011, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
  Who is counted in building a consensus? I tend to pay attention to those
  who have made consistent contributions over the years, reviewed code, fixed
  bugs, and have generally been active in numpy development. In any group
  participation is important, people who just walk in the door and demand
  things be done their way aren't going to get a lot of respect. I'll happily
  listen to politely expressed feedback, especially if the feedback comes 
  from
  someone who shows up to work, but that hasn't been my impression of the
  disagreements in this case. Heck, Nathaniel wasn't even tracking the Numpy
  pull requests or Mark's repository. That doesn't spell participant in my
  dictionary.
 
  Chuck
 

 This is a very good point, but I would highly caution against alienating
 anybody here.  Frankly, I am surprised how much my opinion has been taken
 here given the very little numpy code I have submitted (I think maybe two or
 three patches).  The Numpy community is far more than just those who use the
 core library. There is pandas, bottleneck, mpl, the scikits, and much more.
  Numpy would be nearly useless without them, and certainly vice versa.


 I was quite impressed by your comments on Mark's work, I thought they were
 excellent. It doesn't really take much to make an impact in a small
 community overburdened by work.


 We are all indebted to each other for our works. We must never lose that
 perspective.

 We all seem to have a different set of assumptions of how development
 should work.  Each project follows its own workflow.  Numpy should be free
 to adopt their own procedures, and we are free to discuss them.

 I do agree with chuck that he shouldn't have to make a written invitation
 to each and every person to review each pull.  However, maybe some work can
 be done to bring the pull request and issues discussion down to the mailing
 list. I would like to do something similar with mpl.

 As for voting rights, let's make that a separate discussion.


 With such a small community, I'd rather avoid the whole voting thing if
 possible.

But, if there is one thing worse than voting, it is implicit voting.
Implicit voting is where you ignore people who you don't think should
have a voice.  Unless I'm mistaken, that's what you are suggesting
should be the norm.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
   charlesr.har...@gmail.com wrote:
   
  
   No, that's not what Nathaniel and I are saying at all. Nathaniel was
   pointing to links for projects that care that everyone agrees before
   they go ahead.
  
   It looked to me like there was a serious intent to come to an
   agreement,
   or
   at least closer together. The discussion in the summer was going
   around
   in
   circles though, and was too abstract and complex to follow. Therefore
   Mark's
   choice of implementing something and then asking for feedback made
   sense
   to
   me.
 
  I should point out that the implementation hasn't - as far as I can
  see - changed the discussion.  The discussion was about the API.
 
  Implementations are useful for agreed APIs because they can point out
  where the API does not make sense or cannot be implemented.  In this
  case, the API Mark said he was going to implement - he did implement -
  at least as far as I can see.  Again, I'm happy to be corrected.
 
  Implementations can also help the discussion along, by allowing people
  to
  try out some of the proposed changes. It also allows to construct
  examples
  that show weaknesses, possibly to be solved by an alternative API. Maybe
  you
  can hold the complete history of this topic in your head and comprehend
  it,
  but for me it would be very helpful if someone said:
  - here's my dataset
  - this is what I want to do with it
  - this is the best I can do with the current implementation
  - here's how API X would allow me to solve this better or simpler
  This can be done much better with actual data and an actual
  implementation
  than with a design proposal. You seem to disagree with this statement.
  That's fine. I would hope though that you recognize that concrete
  examples
  help people like me, and construct one or two to help us out.
 That's what use-cases are for in designing APIs.  There are examples
 of use in the NEP:

 https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

 the alterNEP:

 https://gist.github.com/1056379

 and my longer email to Travis:


 http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored

 Mark has done a nice job of documentation:

 http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

 If you want to understand what the alterNEP case is, I'd suggest the
 email, just because it's the most recent and I think the terminology
 is slightly clearer.

 Doing the same examples on a larger array won't make the point easier
 to understand.  The discussion is about what the right concepts are,
 and you can help by looking at the snippets of code in those
 documents, and deciding for yourself whether you think the current
 masking / NA implementation seems natural and easy to explain, or
 rather forced and difficult to explain, and then email back trying to
 explain your impression (which is not always easy).

 If you seriously believe that looking at a few snippets is as helpful and
 instructive as being able to play around with them in IPython and modify
 them, then I guess we won't make progress in this part of the discussion.
 You're just telling me to go back and re-read things I'd already read.

The snippets are in ipython or doctest format - aren't they?

 OK, update: I took Ben's 10 minutes to go back and read the reference doc
 and your email again, just in case. The current implementation still seems
 natural to me to explain. It fits my use-cases. Perhaps that's different for
 you because you and I deal with different kinds of data. I don't have to
 explicitly treat absent and ignored data differently; those two are actually
 mixed and indistinguishable already in much of my data. Therefore the
 current implementation works well for me, having to make a distinction would
 be a needless complication.

OK - I'm not sure that contributes much to the discussion, because the
problem is being able to explain to each other in detail why one
solution is preferable to another.  To follow your own advice, you'd
post some code snippets showing how you'd see the two ideas playing
out and why one is clearer than the other.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
   charlesr.har...@gmail.com wrote:
   
  
   No, that's not what Nathaniel and I are saying at all. Nathaniel was
   pointing to links for projects that care that everyone agrees before
   they go ahead.
  
   It looked to me like there was a serious intent to come to an
   agreement,
   or
   at least closer together. The discussion in the summer was going
   around
   in
   circles though, and was too abstract and complex to follow. Therefore
   Mark's
   choice of implementing something and then asking for feedback made
   sense
   to
   me.
 
  I should point out that the implementation hasn't - as far as I can
  see - changed the discussion.  The discussion was about the API.
 
  Implementations are useful for agreed APIs because they can point out
  where the API does not make sense or cannot be implemented.  In this
  case, the API Mark said he was going to implement - he did implement -
  at least as far as I can see.  Again, I'm happy to be corrected.
 
  Implementations can also help the discussion along, by allowing people
  to
  try out some of the proposed changes. It also allows to construct
  examples
  that show weaknesses, possibly to be solved by an alternative API. Maybe
  you
  can hold the complete history of this topic in your head and comprehend
  it,
  but for me it would be very helpful if someone said:
  - here's my dataset
  - this is what I want to do with it
  - this is the best I can do with the current implementation
  - here's how API X would allow me to solve this better or simpler
  This can be done much better with actual data and an actual
  implementation
  than with a design proposal. You seem to disagree with this statement.
  That's fine. I would hope though that you recognize that concrete
  examples
  help people like me, and construct one or two to help us out.
 That's what use-cases are for in designing APIs.  There are examples
 of use in the NEP:

 https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

 the alterNEP:

 https://gist.github.com/1056379

 and my longer email to Travis:


 http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored

 Mark has done a nice job of documentation:

 http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

 If you want to understand what the alterNEP case is, I'd suggest the
 email, just because it's the most recent and I think the terminology
 is slightly clearer.

 Doing the same examples on a larger array won't make the point easier
 to understand.  The discussion is about what the right concepts are,
 and you can help by looking at the snippets of code in those
 documents, and deciding for yourself whether you think the current
 masking / NA implementation seems natural and easy to explain, or
 rather forced and difficult to explain, and then email back trying to
 explain your impression (which is not always easy).

 If you seriously believe that looking at a few snippets is as helpful and
 instructive as being able to play around with them in IPython and modify
 them, then I guess we won't make progress in this part of the discussion.
 You're just telling me to go back and re-read things I'd already read.

 The snippets are in ipython or doctest format - aren't they?

Oops - 10 minute rule.  Now I see that you mean that you can't
experiment with the alternative implementation without working code.
That's true, but I am hoping that the difference between - say:

a[0:2] = np.NA

and

a.mask[0:2] = False

would be easy enough to imagine.   If it isn't then, let me know,
preferably with something like I can't see exactly how the following
[code snippet] would work in your conception of the problem - and
then I can either try and give fake examples, or write a mock up.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

   
No, that's not what Nathaniel and I are saying at all. Nathaniel
was
pointing to links for projects that care that everyone agrees
before
they go ahead.
   
It looked to me like there was a serious intent to come to an
agreement,
or
at least closer together. The discussion in the summer was going
around
in
circles though, and was too abstract and complex to follow.
Therefore
Mark's
choice of implementing something and then asking for feedback
made
sense
to
me.
  
   I should point out that the implementation hasn't - as far as I can
   see - changed the discussion.  The discussion was about the API.
  
   Implementations are useful for agreed APIs because they can point
   out
   where the API does not make sense or cannot be implemented.  In
   this
   case, the API Mark said he was going to implement - he did
   implement -
   at least as far as I can see.  Again, I'm happy to be corrected.
  
   Implementations can also help the discussion along, by allowing
   people
   to
   try out some of the proposed changes. It also allows to construct
   examples
   that show weaknesses, possibly to be solved by an alternative API.
   Maybe
   you
   can hold the complete history of this topic in your head and
   comprehend
   it,
   but for me it would be very helpful if someone said:
   - here's my dataset
   - this is what I want to do with it
   - this is the best I can do with the current implementation
   - here's how API X would allow me to solve this better or simpler
   This can be done much better with actual data and an actual
   implementation
   than with a design proposal. You seem to disagree with this
   statement.
   That's fine. I would hope though that you recognize that concrete
   examples
   help people like me, and construct one or two to help us out.
  That's what use-cases are for in designing APIs.  There are examples
  of use in the NEP:
 
  https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
 
  the alterNEP:
 
  https://gist.github.com/1056379
 
  and my longer email to Travis:
 
 
 
  http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
 
  Mark has done a nice job of documentation:
 
  http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
 
  If you want to understand what the alterNEP case is, I'd suggest the
  email, just because it's the most recent and I think the terminology
  is slightly clearer.
 
  Doing the same examples on a larger array won't make the point easier
  to understand.  The discussion is about what the right concepts are,
  and you can help by looking at the snippets of code in those
  documents, and deciding for yourself whether you think the current
  masking / NA implementation seems natural and easy to explain, or
  rather forced and difficult to explain, and then email back trying to
  explain your impression (which is not always easy).
 
  If you seriously believe that looking at a few snippets is as helpful
  and
  instructive as being able to play around with them in IPython and
  modify
  them, then I guess we won't make progress in this part of the
  discussion.
  You're just telling me to go back and re-read things I'd already read.
 
  The snippets are in ipython or doctest format - aren't they?

 Oops - 10 minute rule.  Now I see that you mean that you can't
 experiment with the alternative implementation without working code.

 Indeed.


 That's true, but I am hoping that the difference between - say:

 a[0:2] = np.NA

 and

 a.mask[0:2] = False

 would be easy enough to imagine.

 It is in this case. I agree the explicit ``a.mask`` is clearer. This is a
 quite specific point that could be improved in the current implementation.

Thanks - this is helpful.

 It doesn't require ripping everything out.

Nathaniel wasn't proposing 'ripping everything out' - but backing off
until consensus has been reached.  That's different.  If you think
we should not do that, and you are interested, please say why

[Numpy-discussion] Large numbers into float128

2011-10-29 Thread Matthew Brett
Hi,

Can anyone think of a good way to set a float128 value to an
arbitrarily large number?

As in

v = int_to_float128(some_value)

?

I'm trying things like

v = np.float128(2**64+2)

but, because (in other threads) the float128 seems to be going through
float64 on assignment, this loses precision, so although 2**64+2 is
representable in float128, in fact I get:

In [35]: np.float128(2**64+2)
Out[35]: 18446744073709551616.0

In [36]: 2**64+2
Out[36]: 18446744073709551618L

So - can anyone think of another way to assign values to float128 that
will keep the precision?

Thanks a lot,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 2:59 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
 matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 

 No, that's not what Nathaniel and I are saying at all.
 Nathaniel
 was
 pointing to links for projects that care that everyone agrees
 before
 they go ahead.

 It looked to me like there was a serious intent to come to an
 agreement,
 or
 at least closer together. The discussion in the summer was
 going
 around
 in
 circles though, and was too abstract and complex to follow.
 Therefore
 Mark's
 choice of implementing something and then asking for feedback
 made
 sense
 to
 me.
   
I should point out that the implementation hasn't - as far as I
can
see - changed the discussion.  The discussion was about the API.
   
Implementations are useful for agreed APIs because they can
point
out
where the API does not make sense or cannot be implemented.  In
this
case, the API Mark said he was going to implement - he did
implement -
at least as far as I can see.  Again, I'm happy to be corrected.
   
Implementations can also help the discussion along, by allowing
people
to
try out some of the proposed changes. It also allows one to construct
examples
that show weaknesses, possibly to be solved by an alternative
API.
Maybe
you
can hold the complete history of this topic in your head and
comprehend
it,
but for me it would be very helpful if someone said:
- here's my dataset
- this is what I want to do with it
- this is the best I can do with the current implementation
- here's how API X would allow me to solve this better or simpler
This can be done much better with actual data and an actual
implementation
than with a design proposal. You seem to disagree with this
statement.
That's fine. I would hope though that you recognize that concrete
examples
help people like me, and construct one or two to help us out.
   That's what use-cases are for in designing APIs.  There are
   examples
   of use in the NEP:
  
  
   https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
  
   the alterNEP:
  
   https://gist.github.com/1056379
  
   and my longer email to Travis:
  
  
  
  
   http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
  
   Mark has done a nice job of documentation:
  
   http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
  
   If you want to understand what the alterNEP case is, I'd suggest
   the
   email, just because it's the most recent and I think the
   terminology
   is slightly clearer.
  
   Doing the same examples on a larger array won't make the point
   easier
   to understand.  The discussion is about what the right concepts
   are,
   and you can help by looking at the snippets of code in those
   documents, and deciding for yourself whether you think the current
   masking / NA implementation seems natural and easy to explain, or
   rather forced and difficult to explain, and then email back trying
   to
   explain your impression (which is not always easy).
  
   If you seriously believe that looking at a few snippets is as
   helpful
   and
   instructive as being able to play around with them in IPython and
   modify
   them, then I guess we won't make progress in this part of the
   discussion.
   You're just telling me to go back and re-read things I'd already
   read.
  
   The snippets are in IPython or doctest format - aren't they?
 
  Oops - 10 minute rule.  Now I see that you mean that you can't
  experiment with the alternative implementation without working code.
 
  Indeed.
 
 
  That's true, but I am hoping that the difference between - say:
 
  a[0:2] = np.NA
 
  and
 
  a.mask[0:2] = False
 
  would be easy enough to imagine.
 
  It is in this case. I agree

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 4:18 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 5:11 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 2:59 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
 matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
 
  No, that's not what Nathaniel and I are saying at all.
  Nathaniel
  was
  pointing to links for projects that care that everyone
  agrees
  before
  they go ahead.
 
  It looked to me like there was a serious intent to come to
  an
  agreement,
  or
  at least closer together. The discussion in the summer was
  going
  around
  in
  circles though, and was too abstract and complex to follow.
  Therefore
  Mark's
  choice of implementing something and then asking for
  feedback
  made
  sense
  to
  me.

 I should point out that the implementation hasn't - as far as
 I
 can
 see - changed the discussion.  The discussion was about the
 API.

 Implementations are useful for agreed APIs because they can
 point
 out
 where the API does not make sense or cannot be implemented.
  In
 this
 case, the API Mark said he was going to implement - he did
 implement -
 at least as far as I can see.  Again, I'm happy to be
 corrected.

 Implementations can also help the discussion along, by
 allowing
 people
 to
 try out some of the proposed changes. It also allows one to
 construct
 examples
 that show weaknesses, possibly to be solved by an alternative
 API.
 Maybe
 you
 can hold the complete history of this topic in your head and
 comprehend
 it,
 but for me it would be very helpful if someone said:
 - here's my dataset
 - this is what I want to do with it
 - this is the best I can do with the current implementation
 - here's how API X would allow me to solve this better or
 simpler
 This can be done much better with actual data and an actual
 implementation
 than with a design proposal. You seem to disagree with this
 statement.
 That's fine. I would hope though that you recognize that
 concrete
 examples
 help people like me, and construct one or two to help us out.
That's what use-cases are for in designing APIs.  There are
examples
of use in the NEP:
   
   
   
https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
   
the alterNEP:
   
https://gist.github.com/1056379
   
and my longer email to Travis:
   
   
   
   
   
http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
   
Mark has done a nice job of documentation:
   
http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
   
If you want to understand what the alterNEP case is, I'd suggest
the
email, just because it's the most recent and I think the
terminology
is slightly clearer.
   
Doing the same examples on a larger array won't make the point
easier
to understand.  The discussion is about what the right concepts
are,
and you can help by looking at the snippets of code in those
documents, and deciding for yourself whether you think the
current
masking / NA implementation seems natural and easy to explain,
or
rather forced and difficult to explain, and then email back
trying
to
explain your impression (which is not always easy).
   
If you seriously believe that looking at a few snippets is as
helpful
and
instructive as being able to play around with them in IPython and
modify
them, then I guess we won't make progress in this part of the
discussion.
You're just telling me to go back and re-read things I'd already
read

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 4:28 PM, Han Genuit hangen...@gmail.com wrote:
 To be honest, you have been slandering a lot, also in previous
 discussions, to get what you wanted. This is not a healthy way to hold a
 discussion, nor does it help in any way.

That's a severe accusation.  Please quote something I said that was
false, or unfair.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 11:14 AM, Wes McKinney wesmck...@gmail.com wrote:
 On Fri, Oct 28, 2011 at 9:32 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Fri, Oct 28, 2011 at 6:45 PM, Wes McKinney wesmck...@gmail.com wrote:

 On Fri, Oct 28, 2011 at 7:53 PM, Benjamin Root ben.r...@ou.edu wrote:
 
 
  On Friday, October 28, 2011, Matthew Brett matthew.br...@gmail.com
  wrote:
  Hi,
 
  On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
  
   On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
   
   
On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith
n...@pobox.com
wrote:
   
On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
oliph...@enthought.com
wrote:
 I think Nathaniel and Matthew provided very
 specific feedback that was helpful in understanding other
 perspectives
 of a
 difficult problem.     In particular, I really wanted
 bit-patterns
 implemented.    However, I also understand that Mark did
 quite
 a
 bit
 of
 work
 and altered his original designs quite a bit in response to
 community
 feedback.   I wasn't a major part of the pull request
 discussion,
 nor
 did I
 merge the changes, but I support Charles if he reviewed the
 code
 and
 felt
 like it was the right thing to do.  I likely would have done
 the
 same
 thing
 rather than let Mark Wiebe's work languish.
   
My connectivity is spotty this week, so I'll stay out of the
technical
discussion for now, but I want to share a story.
   
Maybe a year ago now, Jonathan Taylor and I were debating what
the
best API for describing statistical models would be -- whether
we
wanted something like R's formulas (which I supported), or
another
approach based on sympy (his idea). To summarize, I thought
his
API
was confusing, pointlessly complicated, and didn't actually
solve
the
problem; he thought R-style formulas were superficially
simpler
but
hopelessly confused and inconsistent underneath. Now,
obviously,
I
was
right and he was wrong. Well, obvious to me, anyway... ;-) But
it
wasn't like I could just wave a wand and make his arguments go
away,
no I should point out that the implementation hasn't - as far
as
I can
  see - changed the discussion.  The discussion was about the API.
  Implementations are useful for agreed APIs because they can point out
  where the API does not make sense or cannot be implemented.  In this
  case, the API Mark said he was going to implement - he did implement -
  at least as far as I can see.  Again, I'm happy to be corrected.
 
  In saying that we are insisting on our way, you are saying,
  implicitly,
  'I
  am not going to negotiate'.
 
  That is only your interpretation. The observation that Mark
  compromised
  quite a bit while you didn't seems largely correct to me.
 
  The problem here stems from our inability to work towards agreement,
  rather than standing on set positions.  I set out what changes I think
  would make the current implementation OK.  Can we please, please have
  a discussion about those points instead of trying to argue about who
  has given more ground?
 
  That commitment would of course be good. However, even if that were
  possible
  before writing code and everyone agreed that the ideas of you and
  Nathaniel
  should be implemented in full, it's still not clear that either of you
  would
  be willing to write any code. Agreement without code still doesn't
  help
  us
  very much.
 
  I'm going to return to Nathaniel's point - it is a highly valuable
  thing to set ourselves the target of resolving substantial discussions
  by consensus.   The route you are endorsing here is 'implementor
  wins'.   We don't need to do it that way.  We're a mature sensible
  bunch of adults who can talk out the issues until we agree they are
  ready for implementation, and then implement.  That's all Nathaniel is
  saying.  I think he's obviously right, and I'm sad that it isn't as
  clear to y'all as it is to me.
 
  Best,
 
  Matthew
 
 
  Everyone, can we please not do this?! I've had enough of adults doing
  finger-pointing back over the summer during the whole debt ceiling debate.
  I think we can all agree that we are better than the US Congress?
 
  Forget about rudeness or decision processes.
 
  I will start by saying that I am willing to separate ignore and absent,
  but
  only on the write

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 4:11 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Oct 29, 2011 at 2:59 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
   ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
 matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 

 No, that's not what Nathaniel and I are saying at all.
 Nathaniel
 was
 pointing to links for projects that care that everyone agrees
 before
 they go ahead.

 It looked to me like there was a serious intent to come to an
 agreement,
 or
 at least closer together. The discussion in the summer was
 going
 around
 in
 circles though, and was too abstract and complex to follow.
 Therefore
 Mark's
 choice of implementing something and then asking for feedback
 made
 sense
 to
 me.
   
I should point out that the implementation hasn't - as far as I
can
see - changed the discussion.  The discussion was about the API.
   
Implementations are useful for agreed APIs because they can
point
out
where the API does not make sense or cannot be implemented.  In
this
case, the API Mark said he was going to implement - he did
implement -
at least as far as I can see.  Again, I'm happy to be corrected.
   
Implementations can also help the discussion along, by allowing
people
to
try out some of the proposed changes. It also allows one to construct
examples
that show weaknesses, possibly to be solved by an alternative
API.
Maybe
you
can hold the complete history of this topic in your head and
comprehend
it,
but for me it would be very helpful if someone said:
- here's my dataset
- this is what I want to do with it
- this is the best I can do with the current implementation
- here's how API X would allow me to solve this better or simpler
This can be done much better with actual data and an actual
implementation
than with a design proposal. You seem to disagree with this
statement.
That's fine. I would hope though that you recognize that concrete
examples
help people like me, and construct one or two to help us out.
   That's what use-cases are for in designing APIs.  There are
   examples
   of use in the NEP:
  
  
   https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
  
   the alterNEP:
  
   https://gist.github.com/1056379
  
   and my longer email to Travis:
  
  
  
  
   http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
  
   Mark has done a nice job of documentation:
  
   http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
  
   If you want to understand what the alterNEP case is, I'd suggest
   the
   email, just because it's the most recent and I think the
   terminology
   is slightly clearer.
  
   Doing the same examples on a larger array won't make the point
   easier
   to understand.  The discussion is about what the right concepts
   are,
   and you can help by looking at the snippets of code in those
   documents, and deciding for yourself whether you think the current
   masking / NA implementation seems natural and easy to explain, or
   rather forced and difficult to explain, and then email back trying
   to
   explain your impression (which is not always easy).
  
   If you seriously believe that looking at a few snippets is as
   helpful
   and
   instructive as being able to play around with them in IPython and
   modify
   them, then I guess we won't make progress in this part of the
   discussion.
   You're just telling me to go back and re-read things I'd already
   read.
  
   The snippets are in IPython or doctest format - aren't they?
 
  Oops - 10 minute rule.  Now I see that you mean that you can't
  experiment with the alternative implementation without working code.
 
  Indeed.
 
 
  That's true, but I am hoping that the difference between - say:
 
  a[0:2] = np.NA
 
  and
 
  a.mask[0:2

Re: [Numpy-discussion] Large numbers into float128

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Can anyone think of a good way to set a float128 value to an
 arbitrarily large number?

 As in

 v = int_to_float128(some_value)

 ?

 I'm trying things like

 v = np.float128(2**64+2)

 but, because (in other threads) the float128 seems to be going through
 float64 on assignment, this loses precision, so although 2**64+2 is
 representable in float128, in fact I get:

 In [35]: np.float128(2**64+2)
 Out[35]: 18446744073709551616.0

 In [36]: 2**64+2
 Out[36]: 18446744073709551618L

 So - can anyone think of another way to assign values to float128 that
 will keep the precision?

To answer my own question - I found an unpleasant way of doing this.

Basically it is this:

import numpy as np

def int_to_float128(val):
    # Round-trip through float64, then add back the residual that
    # float64 could not represent (computed exactly as Python ints).
    f64 = np.float64(val)
    res = val - int(f64)
    return np.float128(f64) + np.float128(res)

Used in various places here:

https://github.com/matthew-brett/nibabel/blob/e18e94c5b0f54775c46b1c690491b8bd6f07eb49/nibabel/floating.py
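
A quick sanity check (assuming a platform where float128 carries at least
64 significand bits, e.g. x87 80-bit extended precision):

v = int_to_float128(2**64 + 2)
assert int(v) == 2**64 + 2   # the residual +2 survives the round trip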

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-29 Thread Matthew Brett
Hi,

On Sat, Oct 29, 2011 at 7:48 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Oct 29, 2011 at 7:47 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Oct 29, 2011 at 4:11 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Sat, Oct 29, 2011 at 2:59 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
  ralf.gomm...@googlemail.com wrote:
  
  
   On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
  
   Hi,
  
   On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
   matthew.br...@gmail.com
   wrote:
Hi,
   
On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:
   
   
On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
matthew.br...@gmail.com
wrote:
   
Hi,
   
On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
 matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:
 
 
  On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
  
 
  No, that's not what Nathaniel and I are saying at all.
  Nathaniel
  was
  pointing to links for projects that care that everyone
  agrees
  before
  they go ahead.
 
  It looked to me like there was a serious intent to come to
  an
  agreement,
  or
  at least closer together. The discussion in the summer was
  going
  around
  in
  circles though, and was too abstract and complex to
  follow.
  Therefore
  Mark's
  choice of implementing something and then asking for
  feedback
  made
  sense
  to
  me.

 I should point out that the implementation hasn't - as far
 as I
 can
 see - changed the discussion.  The discussion was about the
 API.

 Implementations are useful for agreed APIs because they can
 point
 out
 where the API does not make sense or cannot be implemented.
  In
 this
 case, the API Mark said he was going to implement - he did
 implement -
 at least as far as I can see.  Again, I'm happy to be
 corrected.

 Implementations can also help the discussion along, by
 allowing
 people
 to
 try out some of the proposed changes. It also allows one to
 construct
 examples
 that show weaknesses, possibly to be solved by an alternative
 API.
 Maybe
 you
 can hold the complete history of this topic in your head and
 comprehend
 it,
 but for me it would be very helpful if someone said:
 - here's my dataset
 - this is what I want to do with it
 - this is the best I can do with the current implementation
 - here's how API X would allow me to solve this better or
 simpler
 This can be done much better with actual data and an actual
 implementation
 than with a design proposal. You seem to disagree with this
 statement.
 That's fine. I would hope though that you recognize that
 concrete
 examples
 help people like me, and construct one or two to help us out.
That's what use-cases are for in designing APIs.  There are
examples
of use in the NEP:
   
   
   
https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
   
the alterNEP:
   
https://gist.github.com/1056379
   
and my longer email to Travis:
   
   
   
   
   
http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
   
Mark has done a nice job of documentation:
   
http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
   
If you want to understand what the alterNEP case is, I'd
suggest
the
email, just because it's the most recent and I think the
terminology
is slightly clearer.
   
Doing the same examples on a larger array won't make the point
easier
to understand.  The discussion is about what the right concepts
are,
and you can help by looking at the snippets of code in those
documents, and deciding for yourself whether you think the
current
masking / NA implementation seems natural and easy to explain,
or
rather forced and difficult to explain, and then email back
trying
to
explain your impression (which is not always easy).
   
If you seriously believe that looking at a few snippets is as
helpful
and
instructive as being able to play around with them in IPython
and
modify
them, then I guess we won't make progress

Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Thu, Oct 27, 2011 at 10:56 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Thursday, October 27, 2011, Charles R Harris charlesr.har...@gmail.com
 wrote:


 On Thu, Oct 27, 2011 at 7:16 PM, Travis Oliphant oliph...@enthought.com
 wrote:

 That is a pretty good explanation.   I find myself convinced by Matthew's
 arguments.    I think that being able to separate ABSENT from IGNORED is a
 good idea.   I also like being able to control SKIP and PROPAGATE (but I
 think the current implementation allows this already).

 What is the counter-argument to this proposal?


 What exactly do you find convincing? The current masks propagate by
 default:

 In [1]: a = ones(5, maskna=1)

 In [2]: a[2] = NA

 In [3]: a
 Out[3]: array([ 1.,  1.,  NA,  1.,  1.])

 In [4]: a + 1
 Out[4]: array([ 2.,  2.,  NA,  2.,  2.])

 In [5]: a[2] = 10

 In [6]: a
 Out[6]: array([  1.,   1.,  10.,   1.,   1.], maskna=True)


 I don't see an essential difference between the implementation using masks
 and one using bit patterns, the mask when attached to the original array
 just adds a bit pattern by extending all the types by one byte, an approach
 that easily extends to all existing and future types, which is why Mark went
 that way for the first implementation given the time available. The masks
 are hidden because folks wanted something that behaved more like R and also
 because of the desire to combine the missing, ignore, and later possibly bit
 patterns in a unified manner. Note that the pseudo assignment was also meant
 to look like R. Adding true bit patterns to numpy isn't trivial and I
 believe Mark was thinking of parametrized types for that.

 The main problems I see with masks are unified storage and possibly memory
 use. The rest is just behavior and desired API and that can be adjusted
 within the current implementation. There is nothing essentially masky about
 masks.

 Chuck



 I think Chuck sums it up quite nicely.  The implementation detail about
 using mask versus bit patterns can still be discussed and addressed.
 Personally, I just don't see how parameterized dtypes would be easier to use
 than the pseudo assignment.

 The elegance of Mark's solution was to consider the treatment of missing
 data in a unified manner.  This puts missing data in a more prominent spot
 for extension builders, which should greatly improve support throughout the
 ecosystem.

Are extension builders then required to use the numpy C API to get
their data?  Speaking as an extension builder, I would rather you gave
me the mask and the bitpattern information and let me do that myself.

 By letting there be a single missing data framework (instead of
 two) all that users need to figure out is when they want nan-like behavior
 (propagate) or to be more like masks (skip).  Numpy takes care of the rest.
  There is a reason why I like using masked arrays: I don't have to
 use nansum in my library functions to guard against the possibility of
 receiving nans.  Duck-typing is a good thing.

 My argument against separating IGNORE and PROPAGATE is that it becomes too
 tempting to want to mix these in an array, but the desired behavior would
 likely become ambiguous.

 There is one other problem that I just thought of that I don't think has
 been outlined in either NEP.  What if I perform an operation between an
 array set up with propagate NAs and an array with skip NAs?

These are explicitly covered in the alterNEP:

https://gist.github.com/1056379/

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 9:21 AM, Chris.Barker chris.bar...@noaa.gov wrote:
 On 10/27/11 7:51 PM, Travis Oliphant wrote:
 As I mentioned. I find the ability to separate an ABSENT idea from an
 IGNORED idea convincing. In other words, I think distinguishing between
 masks and bit-patterns is not just an implementation detail, but
 provides a useful concept for multiple use-cases.

 Exactly -- while one can implement ABSENT with a mask, one can not
 implement IGNORE with a bit-pattern. So it is not an implementation detail.

 I also think bit-patterns are a bit of a dead end:

 - there is only a standard for one data type family: i.e. NaN for IEEE
 float types

 - So we would be coming up with our own standard (or adopting an
 existing one, but I don't think there is one widely supported) for other
 types. This means:
   1) a lot of work to do

Largest possible negative integer for ints / largest integer for uints
/ not allowed for bool?
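
A minimal sketch of what such a sentinel convention could look like for
int64 (the sentinel choice and the helper below are illustrations only,
not part of either NEP):

import numpy as np

INT64_NA = np.iinfo(np.int64).min   # hypothetical NA bit-pattern for int64

def na_sum(a):
    # Skip sentinel-coded NAs; a real implementation would live in the
    # ufunc machinery rather than in user code like this.
    return a[a != INT64_NA].sum()

a = np.array([1, 2, INT64_NA, 4], dtype=np.int64)
na_sum(a)   # -> 7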

   2) a binary format incompatible with other code, compilers, etc. This
 is a BIG deal -- a major strength of numpy is that it serves as a
 wrapper for a data block that is compatible with C, Fortran or whatever
 code -- special bit patterns would make this a lot harder.

Extension code is going to get harder.   At the moment, as far as I
understand it, our extension code can receive a masked array and
(without an explicit check from us) ignore the mask and process all
the values.  Then you're in the unfortunate situation of caring what's
under the mask.

Bitpatterns would - I imagine - be safer in that respect in that they
would be new dtypes and thus extension code would by default reject
them as unknown.

 We also talked about the fact that an 8-bit mask provides the ability to
 carry other information in the mask -- not just missing or ignored,
 but a handful of other possible reasons for masking. I think that has a
 lot of possibilities.
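
For instance, a minimal sketch of a multi-valued mask (the payload codes
here are purely hypothetical, just to illustrate the idea):

import numpy as np

# Hypothetical mask payload: 0 = valid, non-zero = a reason for masking.
VALID, MISSING, IGNORED, SENSOR_FAIL = 0, 1, 2, 3

data = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([VALID, MISSING, SENSOR_FAIL, VALID], dtype=np.uint8)

visible = data[mask == VALID]   # -> array([ 1.,  4.])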

 On 10/28/11 2:11 AM, Stéfan van der Walt wrote:
 Another data point:  I've been spending some time on scikits-image
 recently, and although masked values would be highly useful in that
 context, the cost of doubling memory use (for uint8 images, e.g.) is
 too high.

 2) that we make a concerted effort to implement the bitmask mode of
 operation as soon as possible.

 I wonder if that might be handled as a scikits-image extension, rather
 than core numpy?

I think Stefan and Nathaniel and Gary Strangman and others are saying
we don't want to pay the price of a large memory hike for masking.   I
suspect that Nathaniel is right, and that a large majority of those of
us who want 'missing data' functionality, also want what we've called
ABSENT missing values, and care about memory.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 11:16 AM, Benjamin Root ben.r...@ou.edu wrote:
 On Fri, Oct 28, 2011 at 12:39 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Thu, Oct 27, 2011 at 10:56 PM, Benjamin Root ben.r...@ou.edu wrote:
 
 
  On Thursday, October 27, 2011, Charles R Harris
  charlesr.har...@gmail.com
  wrote:
 
 
  On Thu, Oct 27, 2011 at 7:16 PM, Travis Oliphant
  oliph...@enthought.com
  wrote:
 
  That is a pretty good explanation.   I find myself convinced by
  Matthew's
  arguments.    I think that being able to separate ABSENT from IGNORED
  is a
  good idea.   I also like being able to control SKIP and PROPAGATE (but
  I
  think the current implementation allows this already).
 
  What is the counter-argument to this proposal?
 
 
  What exactly do you find convincing? The current masks propagate by
  default:
 
  In [1]: a = ones(5, maskna=1)
 
  In [2]: a[2] = NA
 
  In [3]: a
  Out[3]: array([ 1.,  1.,  NA,  1.,  1.])
 
  In [4]: a + 1
  Out[4]: array([ 2.,  2.,  NA,  2.,  2.])
 
  In [5]: a[2] = 10
 
  In [6]: a
  Out[6]: array([  1.,   1.,  10.,   1.,   1.], maskna=True)
 
 
  I don't see an essential difference between the implementation using
  masks
  and one using bit patterns, the mask when attached to the original
  array
  just adds a bit pattern by extending all the types by one byte, an
  approach
  that easily extends to all existing and future types, which is why Mark
  went
  that way for the first implementation given the time available. The
  masks
  are hidden because folks wanted something that behaved more like R and
  also
  because of the desire to combine the missing, ignore, and later
  possibly bit
  patterns in a unified manner. Note that the pseudo assignment was also
  meant
  to look like R. Adding true bit patterns to numpy isn't trivial and I
  believe Mark was thinking of parametrized types for that.
 
  The main problems I see with masks are unified storage and possibly
  memory
  use. The rest is just behavior and desired API and that can be adjusted
  within the current implementation. There is nothing essentially masky
  about
  masks.
 
  Chuck
 
 
 
  I think Chuck sums it up quite nicely.  The implementation detail about
  using mask versus bit patterns can still be discussed and addressed.
  Personally, I just don't see how parameterized dtypes would be easier to
  use
  than the pseudo assignment.
 
  The elegance of Mark's solution was to consider the treatment of missing
  data in a unified manner.  This puts missing data in a more prominent
  spot
  for extension builders, which should greatly improve support throughout
  the
  ecosystem.

 Are extension builders then required to use the numpy C API to get
 their data?  Speaking as an extension builder, I would rather you gave
 me the mask and the bitpattern information and let me do that myself.


 Forgive me, I wasn't clear.  What I am speaking of is more about a typical
 human failing.  If a programmer for a module never encounters masked arrays,
 then when they code up a function to operate on numpy data, it is quite
 likely that they would never take it into consideration.  Notice the
 prolific use of np.asarray() even within the numpy codebase, which
 destroys masked arrays.

Hmm - that sounds like it could cause some surprises.
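
For example, with today's numpy.ma (a two-line illustration of the
failure mode Ben describes):

import numpy as np
import numpy.ma as ma

m = ma.masked_array([1, 2, 3], mask=[False, True, False])
np.asarray(m)   # -> array([1, 2, 3]); the mask is silently dropped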

So, what you were saying was just that it was good that masked arrays
were now closer to the core?   That's reasonable, but I don't think
it's relevant to the current discussion.  I think we all agree it is
nice to have masked arrays in the core.

 However, by making missing data support more integral to the core of
 numpy, it is far more likely that a programmer would take it into
 consideration when designing their algorithm, or at least explicitly
 document that their module does not support missing data.  Both NEPs do
 this by making missing data front-and-center.  However, my belief is that
 Mark's approach is easier to comprehend and is cleaner.  Cleaner features
 mean they are more likely to be used.

The main motivation for the alterNEP was our strong feeling that
separating ABSENT and IGNORE was easier to comprehend and cleaner.  I
think it would be hard to argue that the alterNEP idea is not more
explicit.

  By letting there be a single missing data framework (instead of
  two) all that users need to figure out is when they want nan-like
  behavior
  (propagate) or to be more like masks (skip).  Numpy takes care of the
  rest.
   There is a reason why I like using masked arrays: I don't have
  to
  use nansum in my library functions to guard against the possibility of
  receiving nans.  Duck-typing is a good thing.
 
  My argument against separating IGNORE and PROPAGATE is that it becomes
  too
  tempting to want to mix these in an array, but the desired behavior
  would
  likely become ambiguous.
 
  There is one other problem that I just thought of that I don't think has
  been outlined in either NEP.  What if I perform

Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 12:15 PM, Lluís xscr...@gmx.net wrote:

 Summarizing: let's forget for a moment that mask has a meaning in English:

This is at the core of the problem.  You and I know what's really
going on - there's a mask over the data.   But in what follows we're
going to try and pretend that is not what is going on.  The result is
something that is rather hard to understand, and, when you do
understand it, it's surprising and inconvenient.   This is all because
we tried to conceal what was really going on.

             - maskna corresponds to ABSENT
             - ownmaskna corresponds to IGNORED

 The problem here is that of the two implementation mechanisms (masks and
 bitpatterns), only the first can provide both semantics.

But let's be clear.   The current masked array implementation is made
so it looks like ABSENT, and makes IGNORED hard to get to.

 Let's start with an array that already supports NAs:

 In [1]: a = np.array([1, 2, 3], maskna = True)



 ABSENT (destructive NA assignment)
 --

 Once you assign NA, even if you're using NA masks, the value seems to be lost
 forever (i.e., the assignment is destructive regardless of the value):

 In [2]: b = a.view()
 In [3]: c = a.view(maskna = True)
 In [4]: b[0] = np.NA
 In [5]: a
 Out[5]: array([NA, 2, 3])
 In [6]: b
 Out[6]: array([NA, 2, 3])
 In [7]: c
 Out[7]: array([NA, 2, 3])

Right - the mask (fundamentally an IGNORED signal) is pretending to
implement ABSENT.  But - as you point out below - I'm pasting it here
- in fact it's IGNORED.

 In [21]: a = np.array([1, 2, 3])
 Out[21]: array([1, 2, 3])
 In [22]: b = a.view(maskna = True)
 In [23]: b[0] = np.NA
 In [24]: a
 Out[24]: array([1, 2, 3])
 In [25]: b
 Out[25]: array([NA, 2, 3])

But now - I've done this:

 a = np.array([99, 100, 3], maskna=True)
 a[0:2] = np.NA

You and I know that I've got an array with values [99, 100, 3] and a
mask with values [False, False, True].  So maybe I'd like to see what
happens if I take off the mask from the second value.   I know that's
what I want to do, but I don't know how to do it, because you won't
let me manipulate the mask, because I'm not allowed to know that the
NA values come from the mask.

The alterNEP is just saying - please - be straight with me.   If
you're doing masking, show me the mask, and don't try and hide that
there are stored values underneath.
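
That is roughly what numpy.ma already does - a minimal sketch of the
'explicit mask' style (using numpy.ma, not the proposed maskna API; note
that in numpy.ma's convention mask=True means hidden):

import numpy.ma as ma

a = ma.masked_array([99, 100, 3], mask=[True, True, False])
a.mask[1] = False   # unmask the second value explicitly
print(a)            # [-- 100 3]: the stored 100 reappears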

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 1:14 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Fri, Oct 28, 2011 at 3:02 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 You and I know that I've got an array with values [99, 100, 3] and a
 mask with values [False, False, True].  So maybe I'd like to see what
 happens if I take off the mask from the second value.   I know that's
 what I want to do, but I don't know how to do it, because you won't
 let me manipulate the mask, because I'm not allowed to know that the
 NA values come from the mask.

 The alterNEP is just saying - please - be straight with me.   If
 you're doing masking, show me the mask, and don't try and hide that
 there are stored values underneath.


 Considering that you have admitted before to not regularly using masked
 arrays, I seriously doubt that you would be able to judge whether this is a
 significant detriment or not.  My entire point that I have been making is
 that Mark's implementation is not the same as the current masked arrays.
 Instead, it is a cleaner, more mature implementation that gets rid of
 extraneous features.

This may explain why we don't seem to be getting anywhere.  I am sure
that Mark's implementation of masking is great.   We're not talking
about that.  We're talking about whether it's a good idea to make
masking look as though it is implementing the ABSENT idea.   That's
what I think is confusing, and that's the conversation I have been
trying to pursue.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 1:52 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Fri, Oct 28, 2011 at 3:22 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 1:14 PM, Benjamin Root ben.r...@ou.edu wrote:
 
 
  On Fri, Oct 28, 2011 at 3:02 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  You and I know that I've got an array with values [99, 100, 3] and a
  mask with values [False, False, True].  So maybe I'd like to see what
  happens if I take off the mask from the second value.   I know that's
  what I want to do, but I don't know how to do it, because you won't
  let me manipulate the mask, because I'm not allowed to know that the
  NA values come from the mask.
 
  The alterNEP is just saying - please - be straight with me.   If
  you're doing masking, show me the mask, and don't try and hide that
  there are stored values underneath.
 
 
  Considering that you have admitted before to not regularly using masked
  arrays, I seriously doubt that you would be able to judge whether this
  is a
  significant detriment or not.  My entire point that I have been making
  is
  that Mark's implementation is not the same as the current masked arrays.
  Instead, it is a cleaner, more mature implementation that gets rid of
  extraneous features.

 This may explain why we don't seem to be getting anywhere.  I am sure
 that Mark's implementation of masking is great.   We're not talking
 about that.  We're talking about whether it's a good idea to make
 masking look as though it is implementing the ABSENT idea.   That's
 what I think is confusing, and that's the conversation I have been
 trying to pursue.

 Best,

 Matthew

 Sorry if I came across too strongly there. No disrespect was intended.

I wasn't worried about the disrespect.  It's just I feel the
discussion has not been to the point.

 Personally, I think we are getting somewhere.  We have been whittling away
 what it is that we do agree upon, and have begun to specify *exactly* what
 it is that we disagree on.  I understand your concern, and -- like I
 said in my previous email -- it makes sense from the perspective that numpy.ma
 users have had up to now.

But I'm not a numpy.ma user, I'm just someone who knows that what you
are doing is masking out values.  The fact that I do not use numpy.ma
points out that it's possible to find this highly counter-intuitive
without prior bias.

 But, I re-raise my point that I have been making
 about the need to re-think masked arrays.  If we consider masks as advanced
 slicing or boolean indexing, then being unable to access the underlying
 values actually makes a lot of sense.

 Consider it a contract when I pass a set of data with only certain values
 exposed.  Because I passed the data with only those values exposed, then it
 must have been entirely my intention to let the function know of only those
 values.  It would be a violation of that contract if the function obtained
 those masked values.  If I want to communicate both the original values and
 a particular mask, then I pass the array and a view with a particular mask.

This is the old discussion about what Python users expect.  I think
they expect to be treated as adults.  That is, breaking the contract
should not be easy to do by accident, but it should be allowed.

 Maybe it would be helpful if an array could never have its own mask, but
 rather, only views can carry masks?

 In conclusion, I submit that this is largely a problem that can be solved
 with the proper documentation.  New users who never used numpy.ma before do
 not have to concern themselves with the old way of thinking and are just
 simply taught what masked arrays are.  Meanwhile, a special section of the
 documentation should be made that teaches numpy.ma users how masked arrays
 should be.

I don't think documentation will solve it.  In a way, the ideal user
is someone who doesn't know what's going on, because, for a while,
they may not realize that when they thought they were doing
assignment, in fact they are doing masking.  Unfortunately, I suspect
almost everyone using these things will start to realize that, and
then they will start getting confused.  I find it confusing, and I
believe myself to understand the issues pretty well, and be of
numpy-user-range comprehension powers.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 2:16 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant oliph...@enthought.com 
 wrote:
 I think Nathaniel and Matthew provided very
 specific feedback that was helpful in understanding other perspectives of a
 difficult problem.     In particular, I really wanted bit-patterns
 implemented.    However, I also understand that Mark did quite a bit of work
 and altered his original designs quite a bit in response to community
 feedback.   I wasn't a major part of the pull request discussion, nor did I
 merge the changes, but I support Charles if he reviewed the code and felt
 like it was the right thing to do.  I likely would have done the same thing
 rather than let Mark Wiebe's work languish.

 My connectivity is spotty this week, so I'll stay out of the technical
 discussion for now, but I want to share a story.

 Maybe a year ago now, Jonathan Taylor and I were debating what the
 best API for describing statistical models would be -- whether we
 wanted something like R's formulas (which I supported), or another
 approach based on sympy (his idea). To summarize, I thought his API
 was confusing, pointlessly complicated, and didn't actually solve the
 problem; he thought R-style formulas were superficially simpler but
 hopelessly confused and inconsistent underneath. Now, obviously, I was
 right and he was wrong. Well, obvious to me, anyway... ;-) But it
 wasn't like I could just wave a wand and make his arguments go away,
 no matter how annoying and wrong-headed I thought they were... I could
 write all the code I wanted but no-one would use it unless I could
 convince them it's actually the right solution, so I had to engage
 with him, and dig deep into his arguments.

 What I discovered was that (as I thought) R-style formulas *do* have a
 solid theoretical basis -- but (as he thought) all the existing
 implementations *are* broken and inconsistent! I'm still not sure I
 can actually convince Jonathan to go my way, but, because of his
 stubbornness, I had to invent a better way of handling these formulas,
 and so my library[1] is actually the first implementation of these
 things that has a rigorous theory behind it, and in the process it
 avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
 folks can fix either of them at this point without breaking a ton of
 code, since they both have API consequences.)

 --

 It's extremely common for healthy FOSS projects to insist on consensus
 for almost all decisions, where consensus means something like every
 interested party has a veto[2]. This seems counterintuitive, because
 if everyone's vetoing all the time, how does anything get done? The
 trick is that if anyone *can* veto, then vetoes turn out to actually
 be very rare. Everyone knows that they can't just ignore alternative
 points of view -- they have to engage with them if they want to get
 anything done. So you get buy-in on features early, and no vetoes are
 necessary. And by forcing people to engage with each other, like me
 with Jonathan, you get better designs.

 But what about the cost of all that code that doesn't get merged, or
 written, because everyone's spending all this time debating instead?
 Better designs are nice and all, but how does that justify letting
 working code languish?

 The greatest risk for a FOSS project is that people will ignore you.
 Projects and features live and die by community buy-in. Consider the
 NA mask feature right now. It works (at least the parts of it that
 are implemented). It's in mainline. But IIRC, Pierre said last time
 that he doesn't think the current design will help him improve or
 replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
 this feature in favor of his library pandas' current hacky NA support.
 Members of the neuroimaging crowd are saying that the memory overhead
 is too high and the benefits too marginal, so they'll stick with NaNs.
 Together these folks are a huge proportion of this feature's target
 audience. So what have we actually accomplished by merging this to
 mainline? Are we going to be stuck supporting a feature that only a
 fraction of the target audience actually uses? (Maybe they're being
 dumb, but if people are ignoring your code for dumb reasons... they're
 still ignoring your code.)

 The consensus rule forces everyone to do the hardest and riskiest part
 -- building buy-in -- up front. Because you *have* to do it sooner or
 later, and doing it sooner doesn't just generate better designs. It
 drastically reduces the risk of ending up in a huge trainwreck.

 --

 In my story at the beginning, I wished I had a magic wand to skip this
 annoying debate and political stuff. But giving it to me would have
 been a bad idea. I think that's what went wrong with the NA discussion in
 the first place. Mark's an excellent programmer, and he tried his best
 to act for the good of everyone in the project -- but in the end, he
 did have 

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
On Fri, Oct 28, 2011 at 2:32 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Oct 28, 2011 at 2:16 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant oliph...@enthought.com 
 wrote:
 I think Nathaniel and Matthew provided very
 specific feedback that was helpful in understanding other perspectives of a
 difficult problem.     In particular, I really wanted bit-patterns
 implemented.    However, I also understand that Mark did quite a bit of work
 and altered his original designs quite a bit in response to community
 feedback.   I wasn't a major part of the pull request discussion, nor did I
 merge the changes, but I support Charles if he reviewed the code and felt
 like it was the right thing to do.  I likely would have done the same thing
 rather than let Mark Wiebe's work languish.

 My connectivity is spotty this week, so I'll stay out of the technical
 discussion for now, but I want to share a story.

 Maybe a year ago now, Jonathan Taylor and I were debating what the
 best API for describing statistical models would be -- whether we
 wanted something like R's formulas (which I supported), or another
 approach based on sympy (his idea). To summarize, I thought his API
 was confusing, pointlessly complicated, and didn't actually solve the
 problem; he thought R-style formulas were superficially simpler but
 hopelessly confused and inconsistent underneath. Now, obviously, I was
 right and he was wrong. Well, obvious to me, anyway... ;-) But it
 wasn't like I could just wave a wand and make his arguments go away,
 no matter how annoying and wrong-headed I thought they were... I could
 write all the code I wanted but no-one would use it unless I could
 convince them it's actually the right solution, so I had to engage
 with him, and dig deep into his arguments.

 What I discovered was that (as I thought) R-style formulas *do* have a
 solid theoretical basis -- but (as he thought) all the existing
 implementations *are* broken and inconsistent! I'm still not sure I
 can actually convince Jonathan to go my way, but, because of his
 stubbornness, I had to invent a better way of handling these formulas,
 and so my library[1] is actually the first implementation of these
 things that has a rigorous theory behind it, and in the process it
 avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
 folks can fix either of them at this point without breaking a ton of
 code, since they both have API consequences.)

 --

 It's extremely common for healthy FOSS projects to insist on consensus
 for almost all decisions, where consensus means something like every
 interested party has a veto[2]. This seems counterintuitive, because
 if everyone's vetoing all the time, how does anything get done? The
 trick is that if anyone *can* veto, then vetoes turn out to actually
 be very rare. Everyone knows that they can't just ignore alternative
 points of view -- they have to engage with them if they want to get
 anything done. So you get buy-in on features early, and no vetoes are
 necessary. And by forcing people to engage with each other, like me
 with Jonathan, you get better designs.

 But what about the cost of all that code that doesn't get merged, or
 written, because everyone's spending all this time debating instead?
 Better designs are nice and all, but how does that justify letting
 working code languish?

 The greatest risk for a FOSS project is that people will ignore you.
 Projects and features live and die by community buy-in. Consider the
 NA mask feature right now. It works (at least the parts of it that
 are implemented). It's in mainline. But IIRC, Pierre said last time
 that he doesn't think the current design will help him improve or
 replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
 this feature in favor of his library pandas' current hacky NA support.
 Members of the neuroimaging crowd are saying that the memory overhead
 is too high and the benefits too marginal, so they'll stick with NaNs.
  Together these folks are a huge proportion of this feature's target
 audience. So what have we actually accomplished by merging this to
 mainline? Are we going to be stuck supporting a feature that only a
 fraction of the target audience actually uses? (Maybe they're being
 dumb, but if people are ignoring your code for dumb reasons... they're
 still ignoring your code.)

 The consensus rule forces everyone to do the hardest and riskiest part
 -- building buy-in -- up front. Because you *have* to do it sooner or
 later, and doing it sooner doesn't just generate better designs. It
 drastically reduces the risk of ending up in a huge trainwreck.

 --

 In my story at the beginning, I wished I had a magic wand to skip this
 annoying debate and political stuff. But giving it to me would have
  been a bad idea. I think that's what went wrong with the NA discussion in
 the first place. Mark's an excellent programmer, and he tried his best

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant oliph...@enthought.com
 wrote:
  I think Nathaniel and Matthew provided very
  specific feedback that was helpful in understanding other perspectives
  of a
  difficult problem.     In particular, I really wanted bit-patterns
  implemented.    However, I also understand that Mark did quite a bit of
  work
  and altered his original designs quite a bit in response to community
  feedback.   I wasn't a major part of the pull request discussion, nor
  did I
  merge the changes, but I support Charles if he reviewed the code and
  felt
  like it was the right thing to do.  I likely would have done the same
  thing
  rather than let Mark Wiebe's work languish.

 My connectivity is spotty this week, so I'll stay out of the technical
 discussion for now, but I want to share a story.

 Maybe a year ago now, Jonathan Taylor and I were debating what the
 best API for describing statistical models would be -- whether we
 wanted something like R's formulas (which I supported), or another
 approach based on sympy (his idea). To summarize, I thought his API
 was confusing, pointlessly complicated, and didn't actually solve the
 problem; he thought R-style formulas were superficially simpler but
 hopelessly confused and inconsistent underneath. Now, obviously, I was
 right and he was wrong. Well, obvious to me, anyway... ;-) But it
 wasn't like I could just wave a wand and make his arguments go away,
 no matter how annoying and wrong-headed I thought they were... I could
 write all the code I wanted but no-one would use it unless I could
 convince them it's actually the right solution, so I had to engage
 with him, and dig deep into his arguments.

 What I discovered was that (as I thought) R-style formulas *do* have a
 solid theoretical basis -- but (as he thought) all the existing
 implementations *are* broken and inconsistent! I'm still not sure I
 can actually convince Jonathan to go my way, but, because of his
 stubbornness, I had to invent a better way of handling these formulas,
 and so my library[1] is actually the first implementation of these
 things that has a rigorous theory behind it, and in the process it
 avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
 folks can fix either of them at this point without breaking a ton of
 code, since they both have API consequences.)

 --

 It's extremely common for healthy FOSS projects to insist on consensus
 for almost all decisions, where consensus means something like every
 interested party has a veto[2]. This seems counterintuitive, because
 if everyone's vetoing all the time, how does anything get done? The
 trick is that if anyone *can* veto, then vetoes turn out to actually
 be very rare. Everyone knows that they can't just ignore alternative
 points of view -- they have to engage with them if they want to get
 anything done. So you get buy-in on features early, and no vetoes are
 necessary. And by forcing people to engage with each other, like me
 with Jonathan, you get better designs.

 But what about the cost of all that code that doesn't get merged, or
 written, because everyone's spending all this time debating instead?
 Better designs are nice and all, but how does that justify letting
 working code languish?

 The greatest risk for a FOSS project is that people will ignore you.
 Projects and features live and die by community buy-in. Consider the
 NA mask feature right now. It works (at least the parts of it that
 are implemented). It's in mainline. But IIRC, Pierre said last time
 that he doesn't think the current design will help him improve or
 replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
 this feature in favor of his library pandas' current hacky NA support.
 Members of the neuroimaging crowd are saying that the memory overhead
 is too high and the benefits too marginal, so they'll stick with NaNs.
 Together these folk are a huge proportion of this feature's target
 audience. So what have we actually accomplished by merging this to
 mainline? Are we going to be stuck supporting a feature that only a
 fraction of the target audience actually uses? (Maybe they're being
 dumb, but if people are ignoring your code for dumb reasons... they're
 still ignoring your code.)

 The consensus rule forces everyone to do the hardest and riskiest part
 -- building buy-in -- up front. Because you *have* to do it sooner or
 later, and doing it sooner doesn't just generate better designs. It
 drastically reduces the risk of ending up in a huge trainwreck.

 --

 In my story at the beginning, I wished I had a magic wand to skip this
 annoying debate and political stuff. But giving it to me would have
 been a bad idea. I think that's what went wrong with the NA discussion in
 the first place. Mark's an excellent 

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant oliph...@enthought.com
 wrote:
  I think Nathaniel and Matthew provided very
  specific feedback that was helpful in understanding other perspectives
  of a
  difficult problem.     In particular, I really wanted bit-patterns
  implemented.    However, I also understand that Mark did quite a bit of
  work
  and altered his original designs quite a bit in response to community
  feedback.   I wasn't a major part of the pull request discussion, nor
  did I
  merge the changes, but I support Charles if he reviewed the code and
  felt
  like it was the right thing to do.  I likely would have done the same
  thing
  rather than let Mark Wiebe's work languish.

 My connectivity is spotty this week, so I'll stay out of the technical
 discussion for now, but I want to share a story.

 Maybe a year ago now, Jonathan Taylor and I were debating what the
 best API for describing statistical models would be -- whether we
 wanted something like R's formulas (which I supported), or another
 approach based on sympy (his idea). To summarize, I thought his API
 was confusing, pointlessly complicated, and didn't actually solve the
 problem; he thought R-style formulas were superficially simpler but
 hopelessly confused and inconsistent underneath. Now, obviously, I was
 right and he was wrong. Well, obvious to me, anyway... ;-) But it
 wasn't like I could just wave a wand and make his arguments go away,
 no matter how annoying and wrong-headed I thought they were... I could
 write all the code I wanted but no-one would use it unless I could
 convince them it's actually the right solution, so I had to engage
 with him, and dig deep into his arguments.

 What I discovered was that (as I thought) R-style formulas *do* have a
 solid theoretical basis -- but (as he thought) all the existing
 implementations *are* broken and inconsistent! I'm still not sure I
 can actually convince Jonathan to go my way, but, because of his
 stubbornness, I had to invent a better way of handling these formulas,
 and so my library[1] is actually the first implementation of these
 things that has a rigorous theory behind it, and in the process it
 avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
 folks can fix either of them at this point without breaking a ton of
 code, since they both have API consequences.)

 --

 It's extremely common for healthy FOSS projects to insist on consensus
 for almost all decisions, where consensus means something like every
 interested party has a veto[2]. This seems counterintuitive, because
 if everyone's vetoing all the time, how does anything get done? The
 trick is that if anyone *can* veto, then vetoes turn out to actually
 be very rare. Everyone knows that they can't just ignore alternative
 points of view -- they have to engage with them if they want to get
 anything done. So you get buy-in on features early, and no vetoes are
 necessary. And by forcing people to engage with each other, like me
 with Jonathan, you get better designs.

 But what about the cost of all that code that doesn't get merged, or
 written, because everyone's spending all this time debating instead?
 Better designs are nice and all, but how does that justify letting
 working code languish?

 The greatest risk for a FOSS project is that people will ignore you.
 Projects and features live and die by community buy-in. Consider the
 NA mask feature right now. It works (at least the parts of it that
 are implemented). It's in mainline. But IIRC, Pierre said last time
 that he doesn't think the current design will help him improve or
 replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
 this feature in favor of his library pandas' current hacky NA support.
 Members of the neuroimaging crowd are saying that the memory overhead
 is too high and the benefits too marginal, so they'll stick with NaNs.
 Together these folk are a huge proportion of this feature's target
 audience. So what have we actually accomplished by merging this to
 mainline? Are we going to be stuck supporting a feature that only a
 fraction of the target audience actually uses? (Maybe they're being
 dumb, but if people are ignoring your code for dumb reasons... they're
 still ignoring your code.)

 The consensus rule forces everyone to do the hardest and riskiest part
 -- building buy-in -- up front. Because you *have* to do it sooner or
 later, and doing it sooner doesn't just generate better designs. It
 drastically reduces the risk of ending up in a huge trainwreck.

 --

 In my story at the beginning, I wished I had a magic wand to skip this
 annoying debate and political stuff. But giving it to me would have
 been a bad idea. I think

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
  charlesr.har...@gmail.com wrote:
 
 
  On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith n...@pobox.com wrote:
 
  On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
  oliph...@enthought.com
  wrote:
   I think Nathaniel and Matthew provided very
   specific feedback that was helpful in understanding other
   perspectives
   of a
   difficult problem.     In particular, I really wanted bit-patterns
   implemented.    However, I also understand that Mark did quite a bit
   of
   work
   and altered his original designs quite a bit in response to
   community
   feedback.   I wasn't a major part of the pull request discussion,
   nor
   did I
   merge the changes, but I support Charles if he reviewed the code and
   felt
   like it was the right thing to do.  I likely would have done the
   same
   thing
   rather than let Mark Wiebe's work languish.
 
  My connectivity is spotty this week, so I'll stay out of the technical
  discussion for now, but I want to share a story.
 
  Maybe a year ago now, Jonathan Taylor and I were debating what the
  best API for describing statistical models would be -- whether we
  wanted something like R's formulas (which I supported), or another
  approach based on sympy (his idea). To summarize, I thought his API
  was confusing, pointlessly complicated, and didn't actually solve the
  problem; he thought R-style formulas were superficially simpler but
  hopelessly confused and inconsistent underneath. Now, obviously, I was
  right and he was wrong. Well, obvious to me, anyway... ;-) But it
  wasn't like I could just wave a wand and make his arguments go away,
  no matter how annoying and wrong-headed I thought they were... I could
  write all the code I wanted but no-one would use it unless I could
  convince them it's actually the right solution, so I had to engage
  with him, and dig deep into his arguments.
 
  What I discovered was that (as I thought) R-style formulas *do* have a
  solid theoretical basis -- but (as he thought) all the existing
  implementations *are* broken and inconsistent! I'm still not sure I
  can actually convince Jonathan to go my way, but, because of his
  stubbornness, I had to invent a better way of handling these formulas,
  and so my library[1] is actually the first implementation of these
  things that has a rigorous theory behind it, and in the process it
  avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
  folks can fix either of them at this point without breaking a ton of
  code, since they both have API consequences.)
 
  --
 
  It's extremely common for healthy FOSS projects to insist on consensus
  for almost all decisions, where consensus means something like every
  interested party has a veto[2]. This seems counterintuitive, because
  if everyone's vetoing all the time, how does anything get done? The
  trick is that if anyone *can* veto, then vetoes turn out to actually
  be very rare. Everyone knows that they can't just ignore alternative
  points of view -- they have to engage with them if they want to get
  anything done. So you get buy-in on features early, and no vetoes are
  necessary. And by forcing people to engage with each other, like me
  with Jonathan, you get better designs.
 
  But what about the cost of all that code that doesn't get merged, or
  written, because everyone's spending all this time debating instead?
  Better designs are nice and all, but how does that justify letting
  working code languish?
 
  The greatest risk for a FOSS project is that people will ignore you.
  Projects and features live and die by community buy-in. Consider the
  NA mask feature right now. It works (at least the parts of it that
  are implemented). It's in mainline. But IIRC, Pierre said last time
  that he doesn't think the current design will help him improve or
  replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
  this feature in favor of his library pandas' current hacky NA support.
  Members of the neuroimaging crowd are saying that the memory overhead
  is too high and the benefits too marginal, so they'll stick with NaNs.
  Together these folk are a huge proportion of this feature's target
  audience. So what have we actually accomplished by merging this to
  mainline? Are we going to be stuck supporting a feature that only a
  fraction of the target audience actually uses? (Maybe they're being
  dumb, but if people are ignoring your code for dumb reasons... they're
  still ignoring your code.)
 
  The consensus rule forces everyone to do the hardest and riskiest part
  -- building buy-in -- up front. Because you *have* to do it sooner or
  later, and doing it sooner

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
On Fri, Oct 28, 2011 at 3:49 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 2011/10/28 Stéfan van der Walt ste...@sun.ac.za

 On Fri, Oct 28, 2011 at 3:21 PM, Benjamin Root ben.r...@ou.edu wrote:
  The space issues was never ignored and Mark left room for that to be
  addressed.  Parameterized dtypes can still be added (and isn't all that
  different from multi-na). Perhaps I could be convinced of having np.MA
  assignments mean ignore and np.NA mean absent.  How far off are we
  really from consensus?

 Do you know whether Mark is around?  I think his feedback would be
 useful at this point; having written the code, he'll be able to
 evaluate some of the technical suggestions made.


 Yes, Mark is around, but I assume he is interested in his school work at
 this point. And he might not be inclined to get back into this particular
 discussion. I don't feel he was treated very well by some last time around.

We have not always been good at separating the concept of disagreement
from that of rudeness.

As I've said before, one form of rudeness (and not disagreement) is
ignoring people.

We should all be careful to point out - respectfully, and with reasons
- when we find our colleagues replies (or non-replies) to be rude,
because rudeness is very bad for the spirit of open discussion.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 4:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Fri, Oct 28, 2011 at 5:09 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Fri, Oct 28, 2011 at 3:49 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  2011/10/28 Stéfan van der Walt ste...@sun.ac.za
 
  On Fri, Oct 28, 2011 at 3:21 PM, Benjamin Root ben.r...@ou.edu wrote:
   The space issues was never ignored and Mark left room for that to be
   addressed.  Parameterized dtypes can still be added (and isn't all
   that
   different from multi-na). Perhaps I could be convinced of having
   np.MA
   assignments mean ignore and np.NA mean absent.  How far off are
   we
   really from consensus?
 
  Do you know whether Mark is around?  I think his feedback would be
  useful at this point; having written the code, he'll be able to
  evaluate some of the technical suggestions made.
 
 
  Yes, Mark is around, but I assume he is interested in his school work at
  this point. And he might not be inclined to get back into this
  particular
  discussion. I don't feel he was treated very well by some last time
  around.

 We have not always been good at separating the concept of disagreement
 from that of rudeness.

 As I've said before, one form of rudeness (and not disagreement) is
 ignoring people.

 We should all be careful to point out - respectfully, and with reasons
 - when we find our colleagues replies (or non-replies) to be rude,
 because rudeness is very bad for the spirit of open discussion.


 Trying things out in preparation for discussion is also a mark of respect.
 Have you worked with the current implementation?

OK - this seems to me to be rude.  Why?  Because you have presumably
already read what my concerns were, and my discussion of the current
implementation in my reply to Travis.  You haven't made any effort to
point out to me where I may be wrong or failing to understand.  I
infer that you are merely saying 'go away and come back later'.  And
that is rude.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
   charlesr.har...@gmail.com wrote:
  
  
   On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith n...@pobox.com
   wrote:
  
   On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
   oliph...@enthought.com
   wrote:
I think Nathaniel and Matthew provided very
specific feedback that was helpful in understanding other
perspectives
of a
difficult problem.     In particular, I really wanted
bit-patterns
implemented.    However, I also understand that Mark did quite a
bit
of
work
and altered his original designs quite a bit in response to
community
feedback.   I wasn't a major part of the pull request discussion,
nor
did I
merge the changes, but I support Charles if he reviewed the code
and
felt
like it was the right thing to do.  I likely would have done the
same
thing
rather than let Mark Wiebe's work languish.
  
   My connectivity is spotty this week, so I'll stay out of the
   technical
   discussion for now, but I want to share a story.
  
   Maybe a year ago now, Jonathan Taylor and I were debating what the
   best API for describing statistical models would be -- whether we
   wanted something like R's formulas (which I supported), or
   another
   approach based on sympy (his idea). To summarize, I thought his API
   was confusing, pointlessly complicated, and didn't actually solve
   the
   problem; he thought R-style formulas were superficially simpler but
   hopelessly confused and inconsistent underneath. Now, obviously, I
   was
   right and he was wrong. Well, obvious to me, anyway... ;-) But it
   wasn't like I could just wave a wand and make his arguments go
   away,
   no matter how annoying and wrong-headed I thought they were... I
   could
   write all the code I wanted but no-one would use it unless I could
   convince them it's actually the right solution, so I had to engage
   with him, and dig deep into his arguments.
  
   What I discovered was that (as I thought) R-style formulas *do*
   have a
   solid theoretical basis -- but (as he thought) all the existing
   implementations *are* broken and inconsistent! I'm still not sure I
   can actually convince Jonathan to go my way, but, because of his
   stubbornness, I had to invent a better way of handling these
   formulas,
   and so my library[1] is actually the first implementation of these
   things that has a rigorous theory behind it, and in the process it
   avoids two fundamental, decades-old bugs in R. (And I'm not sure
   the R
   folks can fix either of them at this point without breaking a ton
   of
   code, since they both have API consequences.)
  
   --
  
   It's extremely common for healthy FOSS projects to insist on
   consensus
   for almost all decisions, where consensus means something like
   every
   interested party has a veto[2]. This seems counterintuitive,
   because
   if everyone's vetoing all the time, how does anything get done? The
   trick is that if anyone *can* veto, then vetoes turn out to
   actually
   be very rare. Everyone knows that they can't just ignore
   alternative
   points of view -- they have to engage with them if they want to get
   anything done. So you get buy-in on features early, and no vetoes
   are
   necessary. And by forcing people to engage with each other, like me
   with Jonathan, you get better designs.
  
   But what about the cost of all that code that doesn't get merged,
   or
   written, because everyone's spending all this time debating
   instead?
   Better designs are nice and all, but how does that justify letting
   working code languish?
  
   The greatest risk for a FOSS project is that people will ignore
   you.
   Projects and features live and die by community buy-in. Consider
   the
   NA mask feature right now. It works (at least the parts of it
   that
   are implemented). It's in mainline. But IIRC, Pierre said last time
   that he doesn't think the current design will help him improve or
   replace numpy.ma. Up-thread, Wes McKinney is leaning towards
   ignoring
   this feature in favor of his library pandas' current hacky NA
   support.
   Members of the neuroimaging crowd are saying that the memory
   overhead
   is too high and the benefits too marginal, so they'll stick with
   NaNs.
   Together these folk are a huge proportion of this feature's target
   audience. So what have we actually accomplished by merging this to
   mainline? Are we going to be stuck supporting a feature

Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)

2011-10-28 Thread Matthew Brett
Hi,

On Fri, Oct 28, 2011 at 4:53 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Friday, October 28, 2011, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
 ralf.gomm...@googlemail.com wrote:


 On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett
  matthew.br...@gmail.com
  wrote:
   Hi,
  
   On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
   charlesr.har...@gmail.com wrote:
  
  
   On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith n...@pobox.com
   wrote:
  
   On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
   oliph...@enthought.com
   wrote:
I think Nathaniel and Matthew provided very
specific feedback that was helpful in understanding other
perspectives
of a
difficult problem. In particular, I really wanted
bit-patterns
implemented.  However, I also understand that Mark did quite
a
bit
of
work
and altered his original designs quite a bit in response to
community
feedback.   I wasn't a major part of the pull request
discussion,
nor
did I
merge the changes, but I support Charles if he reviewed the
code
and
felt
like it was the right thing to do.  I likely would have done
the
same
thing
rather than let Mark Wiebe's work languish.
  
   My connectivity is spotty this week, so I'll stay out of the
   technical
   discussion for now, but I want to share a story.
  
   Maybe a year ago now, Jonathan Taylor and I were debating what
   the
   best API for describing statistical models would be -- whether we
   wanted something like R's formulas (which I supported), or
   another
   approach based on sympy (his idea). To summarize, I thought his
   API
   was confusing, pointlessly complicated, and didn't actually solve
   the
   problem; he thought R-style formulas were superficially simpler
   but
   hopelessly confused and inconsistent underneath. Now, obviously,
   I
   was
   right and he was wrong. Well, obvious to me, anyway... ;-) But it
   wasn't like I could just wave a wand and make his arguments go
   away, no matter how annoying and wrong-headed I thought they were...

 I should point out that the implementation hasn't - as far as I can
 see - changed the discussion.  The discussion was about the API.
 Implementations are useful for agreed APIs because they can point out
 where the API does not make sense or cannot be implemented.  In this
 case, the API Mark said he was going to implement - he did implement -
 at least as far as I can see.  Again, I'm happy to be corrected.

 In saying that we are insisting on our way, you are saying, implicitly,
 'I
 am not going to negotiate'.

 That is only your interpretation. The observation that Mark compromised
 quite a bit while you didn't seems largely correct to me.

 The problem here stems from our inability to work towards agreement,
 rather than standing on set positions.  I set out what changes I think
 would make the current implementation OK.  Can we please, please have
 a discussion about those points instead of trying to argue about who
 has given more ground.

 That commitment would of course be good. However, even if that were
 possible
 before writing code and everyone agreed that the ideas of you and
 Nathaniel
 should be implemented in full, it's still not clear that either of you
 would
 be willing to write any code. Agreement without code still doesn't help
 us
 very much.

 I'm going to return to Nathaniel's point - it is a highly valuable
 thing to set ourselves the target of resolving substantial discussions
 by consensus.   The route you are endorsing here is 'implementor
 wins'.   We don't need to do it that way.  We're a mature sensible
 bunch of adults who can talk out the issues until we agree they are
 ready for implementation, and then implement.  That's all Nathaniel is
 saying.  I think he's obviously right, and I'm sad that it isn't as
 clear to y'all as it is to me.

 Best,

 Matthew


 Everyone, can we please not do this?! I had enough of adults doing finger
 pointing back over the summer during the whole debt ceiling debate.  I think
 we can all agree that we are better than the US congress?

Yes, please.

 Forget about rudeness or decision processes.

No, that's a common mistake, which is to assume that any conversation
about things which aren't technical is not important.   Nathaniel's
point is important.  Rudeness is important. The reason we've got into
this mess is because we clearly don't have an agreed way of making
decisions.  That's why countries and open-source projects have
constitutions, so this doesn't happen.

 I will start by saying that I am willing to separate ignore and absent, but
 only on the write side of things.  On read, I want

Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-27 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 7:56 PM, Travis Oliphant oliph...@enthought.com wrote:
 So, I am very interested in making sure I remember the details of the 
 counterproposal.    What I recall is that you wanted to be able to 
 differentiate between a bit-pattern mask and a boolean-array mask in the 
 API.   I believe currently even when bit-pattern masks are implemented the 
 difference will be hidden from the user on the Python level.

 I am sure to be missing other parts of the discussion as I have been in and 
 out of it.

The ideas
---------

The question that we were addressing in the alter-NEP was: should
missing values implemented as bitpatterns appear to be the same as
missing values implemented with masks?  We said no, and Mark said yes.

To restate the argument in brief: Nathaniel and I and some others
thought that there were two separable ideas in play:

1) A value that is finally and completely missing. == ABSENT
2) A value that we would like to ignore for the moment but might want
back at some future time == IGNORED

(I'm using the adjectives ABSENT and IGNORED here to be short for the
objects 'absent value'  and 'ignored value'.  This is to distinguish
from the verbs below).

We thought bitpatterns were a good match for the former, and masking
was a good match for the latter.

We all agreed there were two things you might like to do with values
that were missing in both senses above:

A) PROPAGATE; V + 1 == V
B) SKIP; K + 1 == 1

(Note verbs for the behaviors).

I believe the original np.ma masked arrays always SKIP.

In [2]: a = np.ma.masked_array?
In [3]: a = np.ma.masked_array([99, 2], mask=[True, False])
In [4]: a
Out[4]:
masked_array(data = [-- 2],
             mask = [ True False],
       fill_value = 99)
In [5]: a.sum()
Out[5]: 2
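
For contrast, plain NaNs always PROPAGATE under a plain sum, and you
need np.nansum to get SKIP - an illustrative session (not from the
original machine):

In [6]: b = np.array([np.nan, 2.0])

In [7]: b.sum()   # PROPAGATE
Out[7]: nan

In [8]: np.nansum(b)  # SKIP
Out[8]: 2.0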

There was some discussion as to whether there was a reason to think
that ABSENT should always or by default PROPAGATE, and IGNORED should
always or by default SKIP.  Chuck is referring to this idea when he
said further up this thread:

 For instance, I'm thinking skipna=1 is the natural default for the masked 
 arrays.

The current implementation
--------------------------

What we have now is an implementation of masked arrays, but more
tightly integrated into the numpy core.  In our language we have an
implementation of IGNORED that is tuned to be nearly indistinguishable
from the behavior we are expecting of ABSENT.

Specifically, once you have done this:

In [9]: a = np.array([99, 2], maskna=True)

you can get something representing the mask:

In [11]: np.isna(a)
Out[11]: array([False, False], dtype=bool)

but I believe there is no way of setting the mask directly.  In order
to set the mask, you have to do what looks like an assignment:

In [12]: a[0] = np.NA
In [14]: a
Out[14]: array([NA, 2])

In fact, what has happened is the mask has changed, but the underlying
value has not:

In [18]: orig = np.array([99, 2])

In [19]: a = orig.view(maskna=True)

In [20]: a[0] = np.NA

In [21]: a
Out[21]: array([NA, 2])

In [22]: orig
Out[22]: array([99,  2])

This is different from real assignment:

In [23]: a[0] = 0

In [24]: a
Out[24]: array([0, 2], maskna=True)

In [25]: orig
Out[25]: array([0, 2])

Some effort has gone into making it difficult to pull off the mask:

In [30]: a.view(np.int64)
Out[30]: array([NA, 2])

In [31]: a.view(np.int64).flags
Out[31]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  MASKNA : True
  OWNMASKNA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [32]: a.astype(np.int64)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/mb312/<ipython-input-32-e7f3381c9692> in <module>()
----> 1 a.astype(np.int64)

ValueError: Cannot assign NA to an array which does not support NAs

The default behavior of the masked values is PROPAGATE, but they can
be individually made to SKIP:

In [28]: a.sum() # PROPAGATE
Out[28]: NA(dtype='int64')

In [29]: a.sum(skipna=True) # SKIP
Out[29]: 2

Where's the beef?
-----------------

I personally still think that it is confusing to fuse the concept of:

1) Masked arrays
2) Arrays with bitpattern codes for missing

and the concepts of

A) ABSENT and
B) IGNORED

Consequences for current code
-----------------------------

Specifically, it still seems to me to make sense to prefer this:

>>> a = np.array([99, 2], masking=True)
>>> a.mask
[ True, True ]
>>> a.sum()
101
>>> a.mask[0] = False
>>> a.sum()
2

It might make sense, as Chuck suggests, to change the default to
'skipna=True', and I'd further suggest renaming np.NA to np.IGNORED
and 'skipna' to 'skipignored' for clarity.

I still think the pseudo-assignment:

In [20]: a[0] = np.NA

is confusing, and should be removed.

Later, should we ever have bitpatterns, there would be something like
np.ABSENT.  This of course would make sense for assignment:

In [20]: a[0] = np.ABSENT

There would be 

Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-26 Thread Matthew Brett
Hi,

On Wed, Oct 26, 2011 at 1:07 AM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Oct 25, 2011 at 4:49 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 I guess from your answer that such a warning would be complicated to
 implement, and if that's the case, I can imagine it would be low
 priority.

 I assume the problem is more that it would be a weirdo check that
 becomes a maintenance burden (what is this doing here? Do we still
 need it? who knows?) than that it would be hard to do.

 You can easily do it yourself as a workaround...

 if not str(np.longdouble(2)**64 - 1).startswith("1844"):
     warn("Printing of longdoubles is fubared! Beware! Beware!")

Thanks - yes - I was only thinking of someone like me getting confused
and thinking badly of us if they run into this.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-26 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 7:56 PM, Travis Oliphant oliph...@enthought.com wrote:
 So, I am very interested in making sure I remember the details of the 
 counterproposal.    What I recall is that you wanted to be able to 
 differentiate between a bit-pattern mask and a boolean-array mask in the 
 API.   I believe currently even when bit-pattern masks are implemented the 
 difference will be hidden from the user on the Python level.

 I am sure to be missing other parts of the discussion as I have been in and 
 out of it.

Nathaniel - are you online today?  Do you have time to review the
current implementation and see if it affects the initial discussion?

I'm running around most of today but I should have time to do some
thinking later this afternoon CA time.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 7:31 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Mon, Oct 24, 2011 at 10:59 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 I just ran into this on a PPC machine:

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '2.0.0.dev-4daf949'

 In [3]: res = np.longdouble(2)**64

 In [4]: res
 Out[4]: 18446744073709551616.0

 In [5]: 2**64
 Out[5]: 18446744073709551616L

 In [6]: res-1
 Out[6]: 36893488147419103231.0

 Same for numpy 1.4.1.

 I don't have a SPARC to test on but I believe it's the same double-double
 type?


 The PPC uses two doubles to represent long doubles, the SPARC uses software
 emulation of ieee quad precision for long doubles, very different.

Yes, thanks - I read more after my post.  I guess from this:

http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.genprogc/doc/genprogc/128bit_long_double_floating-point_datatype.htm

that AIX does use double-double.
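
(As an aside, a quick way to probe which representation a platform
uses - a sketch, assuming np.finfo copes with longdouble there:)

import numpy as np

info = np.finfo(np.longdouble)
# IEEE quad reports nmant == 112; a double-double build typically
# reports around 105-106 mantissa bits (itemsize is 16 in both cases).
print(info.nmant, np.dtype(np.longdouble).itemsize, info.eps)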

 The
 subtraction of 1 working like multiplication by two is strange, perhaps the
 one is getting subtracted from the exponent somehow? It would be interesting
 to see if the same problem happens in pure c.

 As a work around, can I ask what you are trying to do with the long doubles?

I was trying to use them as an intermediate format for high-precision
floating point calculations, before converting to integers.
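
(For that kind of intermediate step, exact Python arithmetic is a
portable, if slower, fallback - a sketch:)

from fractions import Fraction

x = Fraction(2)**64 - 1    # exact, platform-independent intermediate
print(int(x))              # 18446744073709551615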

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 2:43 AM, Pauli Virtanen p...@iki.fi wrote:
 25.10.2011 06:59, Matthew Brett kirjoitti:
 res = np.longdouble(2)**64
 res-1
 36893488147419103231.0

 Can you check if long double works properly (not a given) in C on that
 platform:

        long double x;
        x = powl(2, 64);
        x -= 1;
        printf("%g %Lg\n", (double)x, x);

 or, in case the platform doesn't have powl:

        long double x;
        x = pow(2, 64);
        x -= 1;
        printf("%g %Lg\n", (double)x, x);

Both the same as numpy:

[mb312@jerry ~]$ gcc test.c
test.c: In function 'main':
test.c:5: warning: incompatible implicit declaration of built-in function 'powl'
[mb312@jerry ~]$ ./a.out
1.84467e+19 3.68935e+19

Thanks,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 8:04 AM, Lluís xscr...@gmx.net wrote:
 Matthew Brett writes:
 I'm afraid I find this whole thread very unpleasant.

 I have the odd impression of being back at high school.  Some of the
 big kids are pushing me around and then the other kids join in.

 It didn't have to be this way.

 Someone could have replied like this to Nathaniel:

 Oh - yes - I'm sorry -  we actually had the discussion on the pull
 request.  Looking back, I see that we didn't flag this up on the
 mailing list and maybe we should have.  Thanks for pointing that out.
  Maybe we could start another discussion of the API in view of the
 changes that have gone in.

 But that didn't happen.

 Well, I really thought that all the interested parties would take a look at 
 [1].

 While it's true that the pull requests are not obvious if you're not using the
 functionalities of the github web (or unless announced in this list), I think
 that Mark's announcement was precisely directed at having a new round of
 discussions after having some code to play around with and see how intuitive 
 or
 counter-intuitive the implemented concepts could be.

I just wanted to be clear what I meant.

The key point is not whether or not the pull-request or request for
testing was in fact the right place for the discussion that Travis
suggested.   I guess you can argue that either way.   I'd say no, but
I can see how you would disagree on that.

The key point is - how much do we value constructive disagreement?

If we do value constructive disagreement then we'll go out of our way
to talk through the points of contention, and make sure that the
people who disagree, especially the minority, feel that they have been
fully heard.

If we don't value constructive disagreement then we'll let the other
side know that further disagreement will be taken as a sign of bad
faith.

Now - what do you see here?  I see the second and that worries me.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 10:52 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Tue, Oct 25, 2011 at 11:45 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Tue, Oct 25, 2011 at 2:43 AM, Pauli Virtanen p...@iki.fi wrote:
  25.10.2011 06:59, Matthew Brett kirjoitti:
  res = np.longdouble(2)**64
  res-1
  36893488147419103231.0
 
  Can you check if long double works properly (not a given) in C on that
  platform:
 
         long double x;
         x = powl(2, 64);
         x -= 1;
         printf("%g %Lg\n", (double)x, x);
 
  or, in case the platform doesn't have powl:
 
         long double x;
         x = pow(2, 64);
         x -= 1;
         printf("%g %Lg\n", (double)x, x);

 Both the same as numpy:

 [mb312@jerry ~]$ gcc test.c
 test.c: In function 'main':
 test.c:5: warning: incompatible implicit declaration of built-in function
 'powl'

 I think implicit here means that that the arguments and the return values
 are treated as integers. Did you #include math.h?

Ah - you've detected my severe ignorance of c.   But with math.h, the
result is the same,

#include <stdio.h>
#include <math.h>

int main(int argc, char* argv[]) {
    long double x;
    x = pow(2, 64);
    x -= 1;
    printf("%g %Lg\n", (double)x, x);
}

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
On Tue, Oct 25, 2011 at 11:05 AM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Oct 25, 2011 at 10:52 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:


 On Tue, Oct 25, 2011 at 11:45 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Tue, Oct 25, 2011 at 2:43 AM, Pauli Virtanen p...@iki.fi wrote:
  25.10.2011 06:59, Matthew Brett kirjoitti:
  res = np.longdouble(2)**64
  res-1
  36893488147419103231.0
 
  Can you check if long double works properly (not a given) in C on that
  platform:
 
         long double x;
         x = powl(2, 64);
         x -= 1;
         printf("%g %Lg\n", (double)x, x);
 
  or, in case the platform doesn't have powl:
 
         long double x;
         x = pow(2, 64);
         x -= 1;
         printf("%g %Lg\n", (double)x, x);

 Both the same as numpy:

 [mb312@jerry ~]$ gcc test.c
 test.c: In function 'main':
 test.c:5: warning: incompatible implicit declaration of built-in function
 'powl'

 I think implicit here means that that the arguments and the return values
 are treated as integers. Did you #include math.h?

 Ah - you've detected my severe ignorance of c.   But with math.h, the
 result is the same,

 #include <stdio.h>
 #include <math.h>

 int main(int argc, char* argv[]) {
       long double x;
       x = pow(2, 64);
       x -= 1;
       printf("%g %Lg\n", (double)x, x);
 }

By the way - if you want a login to this machine, let me know - it's
always on and we're using it as a buildslave already.

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 11:14 AM, Pauli Virtanen p...@iki.fi wrote:
 25.10.2011 19:45, Matthew Brett kirjoitti:
 [clip]
 or, in case the platform doesn't have powl:

         long double x;
         x = pow(2, 64);
         x -= 1;
         printf("%g %Lg\n", (double)x, x);

 Both the same as numpy:

 [mb312@jerry ~]$ gcc test.c
 test.c: In function 'main':
 test.c:5: warning: incompatible implicit declaration of built-in function 
 'powl'
 [mb312@jerry ~]$ ./a.out
 1.84467e+19 3.68935e+19

 This result may indicate that it's the *printing* of long doubles that's
 broken. Note how the value cast as double prints the correct result,
 whereas the %Lg format code gives something wrong.

Ah - sorry - I see now what you were trying to do.

 Can you try to check this by doing something like:

 - do some set of calculations using np.longdouble in Numpy
   (that requires the extra accuracy)

 - at the end, cast the result back to double

In [1]: import numpy as np

In [2]: res = np.longdouble(2)**64

In [6]: res / 2**32
Out[6]: 4294967296.0

In [7]: (res-1) / 2**32
Out[7]: 8589934591.98

In [8]: np.float((res-1) / 2**32)
Out[8]: 4294967296.0

In [9]: np.float((res) / 2**32)
Out[9]: 4294967296.0

Thanks,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 11:24 AM, Benjamin Root ben.r...@ou.edu wrote:
 On Tue, Oct 25, 2011 at 1:03 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Tue, Oct 25, 2011 at 8:04 AM, Lluís xscr...@gmx.net wrote:
  Matthew Brett writes:
  I'm afraid I find this whole thread very unpleasant.
 
  I have the odd impression of being back at high school.  Some of the
  big kids are pushing me around and then the other kids join in.
 
  It didn't have to be this way.
 
  Someone could have replied like this to Nathaniel:
 
  Oh - yes - I'm sorry -  we actually had the discussion on the pull
  request.  Looking back, I see that we didn't flag this up on the
  mailing list and maybe we should have.  Thanks for pointing that out.
   Maybe we could start another discussion of the API in view of the
  changes that have gone in.
 
  But that didn't happen.
 
  Well, I really thought that all the interested parties would take a look
  at [1].
 
  While it's true that the pull requests are not obvious if you're not
  using the
  functionalities of the github web (or unless announced in this list), I
  think
  that Mark's announcement was precisely directed at having a new round of
  discussions after having some code to play around with and see how
  intuitive or
  counter-intuitive the implemented concepts could be.

 I just wanted to be clear what I meant.

 The key point is not whether or not the pull-request or request for
 testing was in fact the right place for the discussion that Travis
 suggested.   I guess you can argue that either way.   I'd say no, but
 I can see how you would disagree on that.


 This is getting very meta... a disagreement about the disagreement.

Yes, the important point is a social one.  The other points are details.

 The key point is - how much do we value constructive disagreement?


 Personally, I value it very much.

Well - I think everyone believes that that they value constructive
discussion, but the question is, what happens when people really
disagree?

 My impression of the discussion we all
 had at the beginning was that the needs of the two distinct communities
 (R-users and masked array users) were both heard and largely addressed.
 Aspects of both approaches were used, and the final result is, IMHO,
 inspired and elegant.  Is it perfect? No.  Are there ways to improve it?
 Absolutely, and I fully expect that to happen.

To be clear once more, I personally feel we don't need to discuss:

1) Whether Mark did a good job on the code (I have high bias to imagine so).
2) Whether something along these lines would be good to have in numpy

 If we do value constructive disagreement then we'll go out of our way
 to talk through the points of contention, and make sure that the
 people who disagree, especially the minority, feel that they have been
 fully heard.

 If we don't value constructive disagreement then we'll let the other
 side know that further disagreement will be taken as a sign of bad
 faith.

 Now - what do you see here?  I see the second and that worries me.


 It is disappointing that you choose not to participate in the thread linked
 above or in the pull request itself.  If I remember correctly, you were
 working on finishing up your dissertation, so I fully understand the time
 constraints involved there.  However, the pull request and the email
 notification is the de facto method of staging and discussing changes in any
 development project.  No objections were raised in that pull request, so it
 went in after some time passed.  To hold off the merge, all one would need
 to do is fire off a quick comment requesting a delay to have a chance to
 review the pull request.

I think the pull-request was not the right vehicle for the discussion,
you think it was, that's fine, I don't think we need to rehearse that.

My question (if you are answering my question) is: if you put yourself
in my or Nathaniel's shoes, would you feel that you had been warmly
encouraged to express disagreement, or would you feel something else?

 Luckily, git is a VCS, so we are fully capable of reverting any necessary
 changes if warranted.  If you have any concerns or suggestions for changes
 in the current implementation, feel free to raise them and open additional
 pull requests.  There is no ganging up here or any other subterfuge.  Tell
 us exactly what are your issues with the current setup, provide example code
 demonstrating the issues, and we can certainly discuss ways to improve this.

Has the situation changed since the counter-NEP that Nathaniel and I wrote up?

 Remember, we *all* have a common agreement here.  NumPy needs better support
 for missing data (in whatever form).  Let's work from that assumption and
 make NumPy a better library to use for everybody!

I remember walking past a church in a small town in the California
desert.  It had a sign outside saying 'People who are busy rowing do
not have time to rock the boat'.  This seemed to me a total failure

Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 12:01 PM, Derek Homeier
de...@astro.physik.uni-goettingen.de wrote:
 On 25 Oct 2011, at 20:05, Matthew Brett wrote:

 Both the same as numpy:

 [mb312@jerry ~]$ gcc test.c
 test.c: In function 'main':
 test.c:5: warning: incompatible implicit declaration of built-in function
 'powl'

 I think implicit here means that that the arguments and the return values
 are treated as integers. Did you #include math.h?

 Ah - you've detected my severe ignorance of c.   But with math.h, the
 result is the same,

 #include <stdio.h>
 #include <math.h>

 int main(int argc, char* argv[]) {
       long double x;
       x = pow(2, 64);
       x -= 1;
       printf("%g %Lg\n", (double)x, x);
 }

 What system/compiler is this? I am getting
 ./ldouble
 1.84467e+19 1.84467e+19

 and

 res = np.longdouble(2)**64
 res
 18446744073709551616.0
 2**64
 18446744073709551616L
 res-1
 18446744073709551615.0
 np.__version__
 '1.6.1'

 as well as with

 np.__version__
 '2.0.0.dev-3d06f02'
 [yes, not very up to date]

 and for all gcc versions
 /usr/bin/gcc -v
 Using built-in specs.
 Target: powerpc-apple-darwin9
 Configured with: /var/tmp/gcc/gcc-5493~1/src/configure --disable-checking 
 -enable-werror --prefix=/usr --mandir=/share/man 
 --enable-languages=c,objc,c++,obj-c++ 
 --program-transform-name=/^[cg][^.-]*$/s/$/-4.0/ 
 --with-gxx-include-dir=/include/c++/4.0.0 --with-slibdir=/usr/lib 
 --build=i686-apple-darwin9 --program-prefix= --host=powerpc-apple-darwin9 
 --target=powerpc-apple-darwin9
 Thread model: posix
 gcc version 4.0.1 (Apple Inc. build 5493)

 to

 /sw/bin/gcc-fsf-4.6 -v
 Using built-in specs.
 COLLECT_GCC=/sw/bin/gcc-fsf-4.6
 COLLECT_LTO_WRAPPER=/sw/lib/gcc4.6/libexec/gcc/powerpc-apple-darwin9.8.0/4.6.1/lto-wrapper
 Target: powerpc-apple-darwin9.8.0
 Configured with: ../gcc-4.6.1/configure --prefix=/sw --prefix=/sw/lib/gcc4.6 
 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.6/info 
 --enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw 
 --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw 
 --with-system-zlib --x-includes=/usr/X11R6/include 
 --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.6 
 --enable-cloog-backend=isl --disable-libjava-multilib --disable-libquadmath
 Thread model: posix
 gcc version 4.6.1 (GCC)

 uname -a
 Darwin osiris.astro.physik.uni-goettingen.de 9.8.0 Darwin Kernel Version 
 9.8.0: Wed Jul 15 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC Power 
 Macintosh

mb312@jerry ~]$ gcc -v
Using built-in specs.
Target: powerpc-apple-darwin8
Configured with: /var/tmp/gcc/gcc-5370~2/src/configure
--disable-checking -enable-werror --prefix=/usr --mandir=/share/man
--enable-languages=c,objc,c++,obj-c++
--program-transform-name=/^[cg][^.-]*$/s/$/-4.0/
--with-gxx-include-dir=/include/c++/4.0.0 --with-slibdir=/usr/lib
--build=powerpc-apple-darwin8 --host=powerpc-apple-darwin8
--target=powerpc-apple-darwin8
Thread model: posix
gcc version 4.0.1 (Apple Computer, Inc. build 5370)
[mb312@jerry ~]$ uname -a
Darwin jerry.bic.berkeley.edu 8.11.0 Darwin Kernel Version 8.11.0: Wed
Oct 10 18:26:00 PDT 2007; root:xnu-792.24.17~1/RELEASE_PPC Power
Macintosh powerpc

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 12:14 PM, Pauli Virtanen p...@iki.fi wrote:
 25.10.2011 20:29, Matthew Brett kirjoitti:
 [clip]
 In [7]: (res-1) / 2**32
 Out[7]: 8589934591.98

 In [8]: np.float((res-1) / 2**32)
 Out[8]: 4294967296.0

 Looks like a bug in the C library installed on the machine, then.

 It's either in wontfix territory for us, or in the cast to doubles
 before formatting one. In the latter case, one would have to maintain a
 list of broken C libraries (ugh).

How about a check at import time and a warning when printing?  Is that
hard to do?
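
(Something along the lines of Nathaniel's earlier workaround, run once
at import time - a sketch, catching only this particular known-bad
pattern:)

import warnings
import numpy as np

def _check_longdouble_printing():
    # 2**64 - 1 == 18446744073709551615, so its string form should
    # start with "1844"; the broken C library prints a doubled value.
    if not str(np.longdouble(2)**64 - 1).startswith("1844"):
        warnings.warn("str() of np.longdouble looks broken on this "
                      "platform; printed long double values may be wrong")

_check_longdouble_printing()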

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-25 Thread Matthew Brett
Hi,

On Tue, Oct 25, 2011 at 2:58 PM, David Cournapeau courn...@gmail.com wrote:
 On Tue, Oct 25, 2011 at 8:22 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Tue, Oct 25, 2011 at 12:14 PM, Pauli Virtanen p...@iki.fi wrote:
 25.10.2011 20:29, Matthew Brett kirjoitti:
 [clip]
 In [7]: (res-1) / 2**32
 Out[7]: 8589934591.98

 In [8]: np.float((res-1) / 2**32)
 Out[8]: 4294967296.0

 Looks like a bug in the C library installed on the machine, then.

 It's either in wontfix territory for us, or in the cast to doubles
 before formatting one. In the latter case, one would have to maintain a
 list of broken C libraries (ugh).

 How about a check at import time and a warning when printing?  Is that
 hard to do?

 That's fragile IMO. I think that Chuck summed it well: long double are
 not portable, don't use them unless you have to or you can rely on
 platform-specificities.

That reminds me of the old joke about the Irishman giving directions -
If I were you, I wouldn't start from here.

 I would rather spend some time on implementing/integrating portable
 quad precision in software,

I guess from your answer that such a warning would be complicated to
implement, and if that's the case, I can imagine it would be low
priority.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NA masks in the next numpy release?

2011-10-25 Thread Matthew Brett
Hi,

Thank you for your gracious email.

On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant oliph...@enthought.com wrote:
 It is a shame that Nathaniel and perhaps Matthew do not feel like their
 voice was heard.   I wish I could have participated more fully in some of
 the discussions.  I don't know if I could have really helped, but I would
 have liked to have tried to perhaps work alongside Mark to integrate some of
 the other ideas that had been expressed during the discussion.
 Unfortunately,  I was traveling in NYC most of the time that Mark was
 working on this project and did not get a chance to interact with him as
 much as I would have liked.
 My view is that we didn't get quite to where I thought we would get, nor
 where I think we could be.  I think Nathaniel and Matthew provided very
 specific feedback that was helpful in understanding other perspectives of a
 difficult problem.     In particular, I really wanted bit-patterns
 implemented.    However, I also understand that Mark did quite a bit of work
 and altered his original designs quite a bit in response to community
 feedback.   I wasn't a major part of the pull request discussion, nor did I
 merge the changes, but I support Charles if he reviewed the code and felt
 like it was the right thing to do.  I likely would have done the same thing
 rather than let Mark Wiebe's work languish.
 Merging Mark's code does not mean there is not more work to be done, but it
 is consistent with the reality that currently development on NumPy happens
 when people have the time to do it.    I have not seen anything to convince
 me that there is not still time to make specific API changes that address
 some of the concerns.
 Perhaps, Nathaniel and or Matthew could summarize their concerns again and
 if desired submit a pull request to revert the changes.   However, there is
 a definite bias against removing working code unless the arguments are very
 strong and receive a lot of support from others.

Honestly - I am not sure whether there is any interest now, in the
arguments we made before.   If there is, who is interested?  I mean,
past politeness.

I wasn't trying to restart that discussion, because I didn't know what
good it could do.   At first I was hoping that we could ask whether
there was a better way of dealing with disagreements like this.
Later it seemed to me that the atmosphere was getting bad, and I
wanted to say that because I thought it was important.

 Thank you for continuing to voice your opinions even when it may feel that
 the tide is against you.   My view is that we only learn from people who
 disagree with us.

Thank you for saying that.   I hope that y'all will tell me if I am
making it harder for you to disagree,  and I am sorry if I did so
here.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] float128 / longdouble on PPC - is it broken?

2011-10-24 Thread Matthew Brett
Hi,

I just ran into this on a PPC machine:

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '2.0.0.dev-4daf949'

In [3]: res = np.longdouble(2)**64

In [4]: res
Out[4]: 18446744073709551616.0

In [5]: 2**64
Out[5]: 18446744073709551616L

In [6]: res-1
Out[6]: 36893488147419103231.0

Same for numpy 1.4.1.

I don't have a SPARC to test on but I believe it's the same double-double type?

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

