Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-24 Thread Luis Pedro Coelho
On one of my papers, we put up the code online. Years afterwards, I still get
emails every six months or so because the version of the code which was used
for the paper now returns the wrong result!

The problem is that it was written for the old histogram and, although I have
a new version of the code, sometimes people still download the old version.

just my two cent,
Luis

On Wednesday, May 23, 2012 12:06:45 AM Travis Oliphant wrote:
 I just realized that the pull request doesn't do what I thought it did which
 is just add the flag to warn users who are writing to an array that is a
 view when it used to be a copy. It's more cautious and also copies
 the data for 1.7.

 Is this really a necessary step?   I guess it depends on how many use-cases
 there are where people are relying on .diagonal() being a copy.   Given
 that this is such an easy thing for people who encounter the warning to fix
 their code, it seems overly cautious to *also* make a copy (especially for
 a rare code-path like this --- although I admit that I don't have any
 reproducible data to support that assertion that it's a rare code-path).

 I think we have a mixed record of being cautious (not cautious enough in
 some changes), but this seems like swinging in the other direction of being
 overly cautious on a minor point.

 I wonder if I'm the only one who feels that way about this PR.   This is not
 a major issue, so I am fine with the current strategy, but the drawback of
 being this cautious on this point is 1) it is not really reflective of
 other changes and 2) it does mean that someone who wants to fix their code
 for the future will end up with two copies for 1.7.

 -Travis

 On May 16, 2012, at 3:51 PM, Travis Oliphant wrote:
  This Pull Request looks like a good idea to me as well.
 
  -Travis
 
  On May 16, 2012, at 3:10 PM, Ralf Gommers wrote:
  On Wed, May 16, 2012 at 3:55 PM, Nathaniel Smith n...@pobox.com wrote:
 
  On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org
wrote:
   Hi,
  
   In fact, I would arg to never change the current behavior, but add the
   flag for people that want to use it.
  
   Why?
  
   1) There is probably 10k script that use it that will need to be
   checked for correctness. There won't be easy to see crash or error
   that allow user to see it.
 
  My suggestion is that we follow the scheme, which I think gives ample
  opportunity for people to notice problems:
 
  1.7: works like 1.6, except that a DeprecationWarning is produced if
  (and only if) someone writes to an array returned by np.diagonal (or
  friends). This gives a pleasant heads-up for those who pay attention
  to DeprecationWarnings.
 
  1.8: return a view, but mark this view read-only. This causes crashes
  for anyone who ignored the DeprecationWarnings, guaranteeing that
  they'll notice the issue.
 
  1.9: return a writeable view, transition complete.
 
  I've written a pull request implementing the first part of this; I
 
  hope everyone interested will take a look:
   https://github.com/numpy/numpy/pull/280
 
  Thanks for doing that. Seems like a good way forward.
 
  When the PR gets merged, can you please also open a ticket for this with
  Milestone 1.8? Then we won't forget to make the required changes for
  that release.
 
  Ralf
 
   2) This is a globally not significant speed up by this change. Due to
   1), i think it is not work it. Why this is not a significant speed up?
   First, the user already create and use the original tensor. Suppose a
   matrix of size n x n. If it don't fit in the cache, creating it will
   cost n * n. But coping it will cost cst * n. The cst is the price of
   loading a full cache line. But if you return a view, you will pay this
   cst price later when you do the computation. But it all case, this is
   cheap compared to the cost of creating the matrix. Also, you will do
   work on the matrix and this work will be much more costly then the
   price of the copy.
  
   In the case the matrix fix in the cache, the price of the copy is even
   lower.
  
   So in conclusion, optimizing the diagonal won't give speed up in the
   global user script, but will break many of them.
 
  I agree that the speed difference is small. I'm more worried about the
  cost to users of having to remember odd inconsistencies like this, and
  to think about whether there actually is a speed difference or not,
  etc. (If we do add a copyúlse option, then I guarantee many people
  will use it religiously just in case the speed difference is enough
  to matter! And that would suck for them.)
 
  Returning a view makes the API slightly nicer, cleaner, more
  consistent, more useful. (I believe the reason this was implemented in
  the first place was that providing a convenient way to *write* to the
  diagonal of an arbitrary array made it easier to implement numpy.eye
  for masked arrays.) And the whole point of numpy is to trade off a
  little speed in favor of having a simple, easy-to-work 

Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Nathaniel Smith
On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant tra...@continuum.io wrote:
 I just realized that the pull request doesn't do what I thought it did which
 is just add the flag to warn users who are writing to an array that is a
 view when it used to be a copy.     It's more cautious and also copies the
 data for 1.7.

 Is this really a necessary step?   I guess it depends on how many use-cases
 there are where people are relying on .diagonal() being a copy.   Given that
 this is such an easy thing for people who encounter the warning to fix their
 code, it seems overly cautious to *also* make a copy (especially for a rare
 code-path like this --- although I admit that I don't have any reproducible
 data to support that assertion that it's a rare code-path).

 I think we have a mixed record of being cautious (not cautious enough in
 some changes), but this seems like swinging in the other direction of being
 overly cautious on a minor point.

The reason this isn't a minor point is that if we just switched it
then it's possible that existing, working code would start returning
incorrect answers, and the only indication would be some console spew.
I think that such changes should be absolutely verboten for a library
like numpy. I'm already paranoid enough about my own code!

That's why people up-thread were arguing that we just shouldn't risk
the change at all, ever.

I admit to some ulterior motive here: I'd like to see numpy be able to
continue to evolve, but I am also, like I said, completely paranoid
about fundamental libraries changing under me. So this is partly my
attempt to find a way to make a potentially dangerous change in a
responsible way. If we can't learn to do this, then honestly I think
the only responsible alternative going forward would be to never
change any existing API except in trivial ways (like removing
deprecated functions).

Basically my suggestion is that every time we alter the behaviour of
existing, working code, there should be (a) a period when that
existing code produces a warning, and (b) a period when that existing
code produces an error. For a change like removing a function, this is
easy. For something like this diagonal change, it's trickier, but
still doable.

- N
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Frédéric Bastien
+1

Don't forget that many user always update to each version. So they
will skip many version. This is especially true for people that rely
on the distribution package that skip many version when they update.
So this is not just a question of how many version we warn/err, but
also how many times we wait that those warning/error get propagated to
all user.

Fred

On Wed, May 23, 2012 at 9:02 AM, Olivier Delalleau sh...@keba.be wrote:

 /agree with Nathaniel. Overly cautious is good!

 -=- Olivier

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Nathaniel Smith
On Wed, May 23, 2012 at 2:11 PM, Frédéric Bastien no...@nouiz.org wrote:
 +1

 Don't forget that many user always update to each version. So they
 will skip many version. This is especially true for people that rely
 on the distribution package that skip many version when they update.
 So this is not just a question of how many version we warn/err, but
 also how many times we wait that those warning/error get propagated to
 all user.

Probably a question that we should revisit if/when numpy starts doing
more than 1 release per year...

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Travis Oliphant

On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:

 2012/5/23 Nathaniel Smith n...@pobox.com
 On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant tra...@continuum.io wrote:
  I just realized that the pull request doesn't do what I thought it did which
  is just add the flag to warn users who are writing to an array that is a
  view when it used to be a copy. It's more cautious and also copies the
  data for 1.7.
 
  Is this really a necessary step?   I guess it depends on how many use-cases
  there are where people are relying on .diagonal() being a copy.   Given that
  this is such an easy thing for people who encounter the warning to fix their
  code, it seems overly cautious to *also* make a copy (especially for a rare
  code-path like this --- although I admit that I don't have any reproducible
  data to support that assertion that it's a rare code-path).
 
  I think we have a mixed record of being cautious (not cautious enough in
  some changes), but this seems like swinging in the other direction of being
  overly cautious on a minor point.
 
 The reason this isn't a minor point is that if we just switched it
 then it's possible that existing, working code would start returning
 incorrect answers, and the only indication would be some console spew.
 I think that such changes should be absolutely verboten for a library
 like numpy. I'm already paranoid enough about my own code!
 
 That's why people up-thread were arguing that we just shouldn't risk
 the change at all, ever.
 
 I admit to some ulterior motive here: I'd like to see numpy be able to
 continue to evolve, but I am also, like I said, completely paranoid
 about fundamental libraries changing under me. So this is partly my
 attempt to find a way to make a potentially dangerous change in a
 responsible way. If we can't learn to do this, then honestly I think
 the only responsible alternative going forward would be to never
 change any existing API except in trivial ways (like removing
 deprecated functions).
 
 Basically my suggestion is that every time we alter the behaviour of
 existing, working code, there should be (a) a period when that
 existing code produces a warning, and (b) a period when that existing
 code produces an error. For a change like removing a function, this is
 easy. For something like this diagonal change, it's trickier, but
 still doable.
 
 /agree with Nathaniel. Overly cautious is good!
 

Then are you suggesting that we need to back out the changes to the casting 
rules as well, because this will also cause code to stop working.   This is 
part of my point.   We are not being consistently cautious. 

-Travis



 -=- Olivier
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Dag Sverre Seljebotn
On 05/23/2012 07:29 PM, Travis Oliphant wrote:

 On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:

 2012/5/23 Nathaniel Smith n...@pobox.com mailto:n...@pobox.com

 On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant
 tra...@continuum.io mailto:tra...@continuum.io wrote:
  I just realized that the pull request doesn't do what I thought
 it did which
  is just add the flag to warn users who are writing to an array
 that is a
  view when it used to be a copy. It's more cautious and also
 copies the
  data for 1.7.
 
  Is this really a necessary step? I guess it depends on how many
 use-cases
  there are where people are relying on .diagonal() being a copy.
 Given that
  this is such an easy thing for people who encounter the warning
 to fix their
  code, it seems overly cautious to *also* make a copy (especially
 for a rare
  code-path like this --- although I admit that I don't have any
 reproducible
  data to support that assertion that it's a rare code-path).
 
  I think we have a mixed record of being cautious (not cautious
 enough in
  some changes), but this seems like swinging in the other
 direction of being
  overly cautious on a minor point.

 The reason this isn't a minor point is that if we just switched it
 then it's possible that existing, working code would start returning
 incorrect answers, and the only indication would be some console spew.
 I think that such changes should be absolutely verboten for a library
 like numpy. I'm already paranoid enough about my own code!

 That's why people up-thread were arguing that we just shouldn't risk
 the change at all, ever.

 I admit to some ulterior motive here: I'd like to see numpy be able to
 continue to evolve, but I am also, like I said, completely paranoid
 about fundamental libraries changing under me. So this is partly my
 attempt to find a way to make a potentially dangerous change in a
 responsible way. If we can't learn to do this, then honestly I think
 the only responsible alternative going forward would be to never
 change any existing API except in trivial ways (like removing
 deprecated functions).

 Basically my suggestion is that every time we alter the behaviour of
 existing, working code, there should be (a) a period when that
 existing code produces a warning, and (b) a period when that existing
 code produces an error. For a change like removing a function, this is
 easy. For something like this diagonal change, it's trickier, but
 still doable.


 /agree with Nathaniel. Overly cautious is good!


 Then are you suggesting that we need to back out the changes to the
 casting rules as well, because this will also cause code to stop
 working. This is part of my point. We are not being consistently cautious.

Two wrongs doesn't make one right?

I'd think the inconvenience to users is mostly per unwarned breakage, 
so that even one unwarned breakage less translates into fewer minutes 
wasted for users scratching their heads.

In the end it's a tradeoff between inconvenience to NumPy developers and 
inconvenience to NumPy users -- not inconveniencing the developers 
further is an argument for not being consistent; but for diagonal() the 
work is already done.

Dag
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Dag Sverre Seljebotn
On 05/23/2012 10:00 PM, Dag Sverre Seljebotn wrote:
 On 05/23/2012 07:29 PM, Travis Oliphant wrote:

 On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:

 2012/5/23 Nathaniel Smithn...@pobox.commailto:n...@pobox.com

  On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant
  tra...@continuum.iomailto:tra...@continuum.io  wrote:
I just realized that the pull request doesn't do what I thought
  it did which
is just add the flag to warn users who are writing to an array
  that is a
view when it used to be a copy. It's more cautious and also
  copies the
data for 1.7.
  
Is this really a necessary step? I guess it depends on how many
  use-cases
there are where people are relying on .diagonal() being a copy.
  Given that
this is such an easy thing for people who encounter the warning
  to fix their
code, it seems overly cautious to *also* make a copy (especially
  for a rare
code-path like this --- although I admit that I don't have any
  reproducible
data to support that assertion that it's a rare code-path).
  
I think we have a mixed record of being cautious (not cautious
  enough in
some changes), but this seems like swinging in the other
  direction of being
overly cautious on a minor point.

  The reason this isn't a minor point is that if we just switched it
  then it's possible that existing, working code would start returning
  incorrect answers, and the only indication would be some console spew.
  I think that such changes should be absolutely verboten for a library
  like numpy. I'm already paranoid enough about my own code!

  That's why people up-thread were arguing that we just shouldn't risk
  the change at all, ever.

  I admit to some ulterior motive here: I'd like to see numpy be able to
  continue to evolve, but I am also, like I said, completely paranoid
  about fundamental libraries changing under me. So this is partly my
  attempt to find a way to make a potentially dangerous change in a
  responsible way. If we can't learn to do this, then honestly I think
  the only responsible alternative going forward would be to never
  change any existing API except in trivial ways (like removing
  deprecated functions).

  Basically my suggestion is that every time we alter the behaviour of
  existing, working code, there should be (a) a period when that
  existing code produces a warning, and (b) a period when that existing
  code produces an error. For a change like removing a function, this is
  easy. For something like this diagonal change, it's trickier, but
  still doable.


 /agree with Nathaniel. Overly cautious is good!


 Then are you suggesting that we need to back out the changes to the
 casting rules as well, because this will also cause code to stop
 working. This is part of my point. We are not being consistently cautious.

 Two wrongs doesn't make one right?

 I'd think the inconvenience to users is mostly per unwarned breakage,
 so that even one unwarned breakage less translates into fewer minutes
 wasted for users scratching their heads.

 In the end it's a tradeoff between inconvenience to NumPy developers and
 inconvenience to NumPy users -- not inconveniencing the developers
 further is an argument for not being consistent; but for diagonal() the
 work is already done.

...and, I missed the point about a future-compatible fix implying 
double-copy.

Dag
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Nathaniel Smith
On Wed, May 23, 2012 at 6:29 PM, Travis Oliphant tra...@continuum.io wrote:
 Then are you suggesting that we need to back out the changes to the casting
 rules as well, because this will also cause code to stop working.   This is
 part of my point.   We are not being consistently cautious.

I never understood exactly what changed with the casting rules, but
yeah, maybe. Still, the question of what our deprecation rules
*should* be is somewhat separate from the question of what we've
actually done (or even will do). You have to have ideals before you
can ask whether you're living up to them :-).

Didn't the casting rules become strictly stricter, i.e. some
questionable operations that used to succeed now throw an error? If so
then that's not a *major* violation of my suggested rules, but yeah, I
guess it'd probably be better if they did warn. I imagine it wouldn't
be terribly difficult to implement (add a new
NPY_WARN_UNSAFE_CASTING_INTERNAL value, use it everywhere that used to
be UNSAFE but now will be SAFE?), but someone who understands better
what actually changed (Mark?) would have do it.

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Olivier Delalleau
2012/5/23 Nathaniel Smith n...@pobox.com

 On Wed, May 23, 2012 at 6:29 PM, Travis Oliphant tra...@continuum.io
 wrote:
  Then are you suggesting that we need to back out the changes to the
 casting
  rules as well, because this will also cause code to stop working.   This
 is
  part of my point.   We are not being consistently cautious.

 I never understood exactly what changed with the casting rules, but
 yeah, maybe. Still, the question of what our deprecation rules
 *should* be is somewhat separate from the question of what we've
 actually done (or even will do). You have to have ideals before you
 can ask whether you're living up to them :-).

 Didn't the casting rules become strictly stricter, i.e. some
 questionable operations that used to succeed now throw an error? If so
 then that's not a *major* violation of my suggested rules, but yeah, I
 guess it'd probably be better if they did warn. I imagine it wouldn't
 be terribly difficult to implement (add a new
 NPY_WARN_UNSAFE_CASTING_INTERNAL value, use it everywhere that used to
 be UNSAFE but now will be SAFE?), but someone who understands better
 what actually changed (Mark?) would have do it.


It wasn't just stricter rules. Some operations involving in particular
mixed scalar / array computations resulted in different outputs (with no
warning).

-=- Olivier
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Travis Oliphant
To be clear, I'm not opposed to the change, and it looks like we should go 
forward.   

In my mind it's not about developers vs. users as satisfying users is the whole 
point.   The purpose of NumPy is not to make its developers happy :-).   But, 
users also want there to *be* developers on NumPy so developer happiness is not 
irrelevant.  

In this case, though, there are consequences for users because of the double 
copy if a user wants to make their code future proof.   We are always trading 
off predicted user-experiences.I hope that we all don't have the same 
perspective on every issue or more than likely their aren't enough voices being 
heard from real users. 

-Travis



On May 23, 2012, at 3:01 PM, Dag Sverre Seljebotn wrote:

 On 05/23/2012 10:00 PM, Dag Sverre Seljebotn wrote:
 On 05/23/2012 07:29 PM, Travis Oliphant wrote:
 
 On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:
 
 2012/5/23 Nathaniel Smithn...@pobox.commailto:n...@pobox.com
 
 On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant
 tra...@continuum.iomailto:tra...@continuum.io  wrote:
 I just realized that the pull request doesn't do what I thought
 it did which
 is just add the flag to warn users who are writing to an array
 that is a
 view when it used to be a copy. It's more cautious and also
 copies the
 data for 1.7.
 
 Is this really a necessary step? I guess it depends on how many
 use-cases
 there are where people are relying on .diagonal() being a copy.
 Given that
 this is such an easy thing for people who encounter the warning
 to fix their
 code, it seems overly cautious to *also* make a copy (especially
 for a rare
 code-path like this --- although I admit that I don't have any
 reproducible
 data to support that assertion that it's a rare code-path).
 
 I think we have a mixed record of being cautious (not cautious
 enough in
 some changes), but this seems like swinging in the other
 direction of being
 overly cautious on a minor point.
 
 The reason this isn't a minor point is that if we just switched it
 then it's possible that existing, working code would start returning
 incorrect answers, and the only indication would be some console spew.
 I think that such changes should be absolutely verboten for a library
 like numpy. I'm already paranoid enough about my own code!
 
 That's why people up-thread were arguing that we just shouldn't risk
 the change at all, ever.
 
 I admit to some ulterior motive here: I'd like to see numpy be able to
 continue to evolve, but I am also, like I said, completely paranoid
 about fundamental libraries changing under me. So this is partly my
 attempt to find a way to make a potentially dangerous change in a
 responsible way. If we can't learn to do this, then honestly I think
 the only responsible alternative going forward would be to never
 change any existing API except in trivial ways (like removing
 deprecated functions).
 
 Basically my suggestion is that every time we alter the behaviour of
 existing, working code, there should be (a) a period when that
 existing code produces a warning, and (b) a period when that existing
 code produces an error. For a change like removing a function, this is
 easy. For something like this diagonal change, it's trickier, but
 still doable.
 
 
 /agree with Nathaniel. Overly cautious is good!
 
 
 Then are you suggesting that we need to back out the changes to the
 casting rules as well, because this will also cause code to stop
 working. This is part of my point. We are not being consistently cautious.
 
 Two wrongs doesn't make one right?
 
 I'd think the inconvenience to users is mostly per unwarned breakage,
 so that even one unwarned breakage less translates into fewer minutes
 wasted for users scratching their heads.
 
 In the end it's a tradeoff between inconvenience to NumPy developers and
 inconvenience to NumPy users -- not inconveniencing the developers
 further is an argument for not being consistent; but for diagonal() the
 work is already done.
 
 ...and, I missed the point about a future-compatible fix implying 
 double-copy.
 
 Dag
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-23 Thread Nathaniel Smith
On Wed, May 23, 2012 at 10:53 PM, Travis Oliphant tra...@continuum.io wrote:
 To be clear, I'm not opposed to the change, and it looks like we should go 
 forward.

 In my mind it's not about developers vs. users as satisfying users is the 
 whole point.   The purpose of NumPy is not to make its developers happy :-).  
  But, users also want there to *be* developers on NumPy so developer 
 happiness is not irrelevant.

 In this case, though, there are consequences for users because of the double 
 copy if a user wants to make their code future proof.   We are always trading 
 off predicted user-experiences.    I hope that we all don't have the same 
 perspective on every issue or more than likely their aren't enough voices 
 being heard from real users.

I'm not really worried about users who have a problem with the
double-copy. It's a totally legitimate concern, but anyone who has
that concern has already understood the issues well enough to be able
to take care of themselves, and decided that it's worth the effort to
special-case this. They can check whether the returned array has .base
set to tell whether it's an array or a view, use a temporary hack to
check for the secret warning flag in arr.flags.num, check the numpy
version, all sorts of things to get them through the one version where
this matters. The suggestion in the docs to make a copy is not exactly
binding :-).

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-22 Thread Travis Oliphant
I just realized that the pull request doesn't do what I thought it did which is 
just add the flag to warn users who are writing to an array that is a view when 
it used to be a copy. It's more cautious and also copies the data for 
1.7.   

Is this really a necessary step?   I guess it depends on how many use-cases 
there are where people are relying on .diagonal() being a copy.   Given that 
this is such an easy thing for people who encounter the warning to fix their 
code, it seems overly cautious to *also* make a copy (especially for a rare 
code-path like this --- although I admit that I don't have any reproducible 
data to support that assertion that it's a rare code-path). 

I think we have a mixed record of being cautious (not cautious enough in some 
changes), but this seems like swinging in the other direction of being overly 
cautious on a minor point. 

I wonder if I'm the only one who feels that way about this PR.   This is not a 
major issue, so I am fine with the current strategy, but the drawback of being 
this cautious on this point is 1) it is not really reflective of other changes 
and 2) it does mean that someone who wants to fix their code for the future 
will end up with two copies for 1.7. 

-Travis



On May 16, 2012, at 3:51 PM, Travis Oliphant wrote:

 This Pull Request looks like a good idea to me as well. 
 
 -Travis
 
 On May 16, 2012, at 3:10 PM, Ralf Gommers wrote:
 
 
 
 On Wed, May 16, 2012 at 3:55 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.
 
 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:
 
 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.
 
 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.
 
 1.9: return a writeable view, transition complete.
 
 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280
 
 Thanks for doing that. Seems like a good way forward.
 
 When the PR gets merged, can you please also open a ticket for this with 
 Milestone 1.8? Then we won't forget to make the required changes for that 
 release.
 
 Ralf
  
 
  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even 
  lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.
 
 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)
 
 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).
 
 -- Nathaniel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org

Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Nathaniel Smith
On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
 Hi,

 In fact, I would arg to never change the current behavior, but add the
 flag for people that want to use it.

 Why?

 1) There is probably 10k script that use it that will need to be
 checked for correctness. There won't be easy to see crash or error
 that allow user to see it.

My suggestion is that we follow the scheme, which I think gives ample
opportunity for people to notice problems:

1.7: works like 1.6, except that a DeprecationWarning is produced if
(and only if) someone writes to an array returned by np.diagonal (or
friends). This gives a pleasant heads-up for those who pay attention
to DeprecationWarnings.

1.8: return a view, but mark this view read-only. This causes crashes
for anyone who ignored the DeprecationWarnings, guaranteeing that
they'll notice the issue.

1.9: return a writeable view, transition complete.

I've written a pull request implementing the first part of this; I
hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280

 2) This is a globally not significant speed up by this change. Due to
 1), i think it is not work it. Why this is not a significant speed up?
 First, the user already create and use the original tensor. Suppose a
 matrix of size n x n. If it don't fit in the cache, creating it will
 cost n * n. But coping it will cost cst * n. The cst is the price of
 loading a full cache line. But if you return a view, you will pay this
 cst price later when you do the computation. But it all case, this is
 cheap compared to the cost of creating the matrix. Also, you will do
 work on the matrix and this work will be much more costly then the
 price of the copy.

 In the case the matrix fix in the cache, the price of the copy is even lower.

 So in conclusion, optimizing the diagonal won't give speed up in the
 global user script, but will break many of them.

I agree that the speed difference is small. I'm more worried about the
cost to users of having to remember odd inconsistencies like this, and
to think about whether there actually is a speed difference or not,
etc. (If we do add a copy=False option, then I guarantee many people
will use it religiously just in case the speed difference is enough
to matter! And that would suck for them.)

Returning a view makes the API slightly nicer, cleaner, more
consistent, more useful. (I believe the reason this was implemented in
the first place was that providing a convenient way to *write* to the
diagonal of an arbitrary array made it easier to implement numpy.eye
for masked arrays.) And the whole point of numpy is to trade off a
little speed in favor of having a simple, easy-to-work with high-level
API :-).

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Benjamin Root
On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.

 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:

 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.

 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.

 1.9: return a writeable view, transition complete.

 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280

  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even
 lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.

 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)

 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).

 -- Nathaniel


Just as a sanity check, do the scipy tests run without producing any such
messages?

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Nathaniel Smith
On Wed, May 16, 2012 at 3:04 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.

 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:

 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.

 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.

 1.9: return a writeable view, transition complete.

 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280

  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even
  lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.

 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)

 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).

 -- Nathaniel


 Just as a sanity check, do the scipy tests run without producing any such
 messages?

I tried checking this before, actually, but can't figure out how to
build scipy against a copy of numpy that is installed in either a
virtualenv or just on PYTHONPATH. (Basically, I just don't want to
install some random development numpy into my system python.) Any
suggestions?

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Robert Kern
On Wed, May 16, 2012 at 3:10 PM, Nathaniel Smith n...@pobox.com wrote:
 On Wed, May 16, 2012 at 3:04 PM, Benjamin Root ben.r...@ou.edu wrote:

 Just as a sanity check, do the scipy tests run without producing any such
 messages?

 I tried checking this before, actually, but can't figure out how to
 build scipy against a copy of numpy that is installed in either a
 virtualenv or just on PYTHONPATH. (Basically, I just don't want to
 install some random development numpy into my system python.) Any
 suggestions?

scipy will build against whatever numpy the python executable that
runs the setup.py manages to import. So if you are using virtualenv,
just make sure that the virtualenv is activated and python refers to
the virtualenv's python executable.

-- 
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Nathaniel Smith
On Wed, May 16, 2012 at 3:04 PM, Benjamin Root ben.r...@ou.edu wrote:


 On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.

 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:

 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.

 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.

 1.9: return a writeable view, transition complete.

 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280

  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even
  lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.

 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)

 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).

 -- Nathaniel


 Just as a sanity check, do the scipy tests run without producing any such
 messages?

Yeah, neither the 0.10.1 nor current scipy master tests trigger the
DeprecationWarning when run against my branch.

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Ralf Gommers
On Wed, May 16, 2012 at 3:55 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.

 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:

 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.

 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.

 1.9: return a writeable view, transition complete.

 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280


Thanks for doing that. Seems like a good way forward.

When the PR gets merged, can you please also open a ticket for this with
Milestone 1.8? Then we won't forget to make the required changes for that
release.

Ralf



  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even
 lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.

 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)

 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).

 -- Nathaniel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Travis Oliphant

On May 13, 2012, at 3:11 AM, Nathaniel Smith wrote:

 On Sun, May 13, 2012 at 3:28 AM, Travis Oliphant tra...@continuum.io wrote:
 Another approach would be to introduce a method:
 
 a.diag(copy=False)
 
 and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
 2-3 releases.
 
 This would be a good idea if we didn't already have both
 np.diagonal(a) (which is an alias for a.diagonal()) *and* np.diag(a),
 which do different things. And the new a.diag() would be different
 from the existing np.diag(a)...

I don't see how the new a.diag() would be different than np.diag(a) except for 
view semantics for 2-d arrays.   Is this really a problem? 

-Travis

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Travis Oliphant
This Pull Request looks like a good idea to me as well. 

-Travis

On May 16, 2012, at 3:10 PM, Ralf Gommers wrote:

 
 
 On Wed, May 16, 2012 at 3:55 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien no...@nouiz.org wrote:
  Hi,
 
  In fact, I would arg to never change the current behavior, but add the
  flag for people that want to use it.
 
  Why?
 
  1) There is probably 10k script that use it that will need to be
  checked for correctness. There won't be easy to see crash or error
  that allow user to see it.
 
 My suggestion is that we follow the scheme, which I think gives ample
 opportunity for people to notice problems:
 
 1.7: works like 1.6, except that a DeprecationWarning is produced if
 (and only if) someone writes to an array returned by np.diagonal (or
 friends). This gives a pleasant heads-up for those who pay attention
 to DeprecationWarnings.
 
 1.8: return a view, but mark this view read-only. This causes crashes
 for anyone who ignored the DeprecationWarnings, guaranteeing that
 they'll notice the issue.
 
 1.9: return a writeable view, transition complete.
 
 I've written a pull request implementing the first part of this; I
 hope everyone interested will take a look:
  https://github.com/numpy/numpy/pull/280
 
 Thanks for doing that. Seems like a good way forward.
 
 When the PR gets merged, can you please also open a ticket for this with 
 Milestone 1.8? Then we won't forget to make the required changes for that 
 release.
 
 Ralf
  
 
  2) This is a globally not significant speed up by this change. Due to
  1), i think it is not work it. Why this is not a significant speed up?
  First, the user already create and use the original tensor. Suppose a
  matrix of size n x n. If it don't fit in the cache, creating it will
  cost n * n. But coping it will cost cst * n. The cst is the price of
  loading a full cache line. But if you return a view, you will pay this
  cst price later when you do the computation. But it all case, this is
  cheap compared to the cost of creating the matrix. Also, you will do
  work on the matrix and this work will be much more costly then the
  price of the copy.
 
  In the case the matrix fix in the cache, the price of the copy is even 
  lower.
 
  So in conclusion, optimizing the diagonal won't give speed up in the
  global user script, but will break many of them.
 
 I agree that the speed difference is small. I'm more worried about the
 cost to users of having to remember odd inconsistencies like this, and
 to think about whether there actually is a speed difference or not,
 etc. (If we do add a copy=False option, then I guarantee many people
 will use it religiously just in case the speed difference is enough
 to matter! And that would suck for them.)
 
 Returning a view makes the API slightly nicer, cleaner, more
 consistent, more useful. (I believe the reason this was implemented in
 the first place was that providing a convenient way to *write* to the
 diagonal of an arbitrary array made it easier to implement numpy.eye
 for masked arrays.) And the whole point of numpy is to trade off a
 little speed in favor of having a simple, easy-to-work with high-level
 API :-).
 
 -- Nathaniel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Fernando Perez
On Wed, May 16, 2012 at 7:10 AM, Nathaniel Smith n...@pobox.com wrote:
 I tried checking this before, actually, but can't figure out how to
 build scipy against a copy of numpy that is installed in either a
 virtualenv or just on PYTHONPATH. (Basically, I just don't want to
 install some random development numpy into my system python.) Any
 suggestions?

manage PYTHONPATH as a stack: push install, play/test, pop stack:

Here's my bashrc machinery to do it, crib at will:

https://gist.github.com/2716714

Usage is trivial:

cd numpy  ./setup.py install --prefix=~/tmp/junk
cd scipy  ./setup.py install --prefix=~/tmp/junk
# play/test
rm -rf ~/tmp/junk


Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Fernando Perez
On Fri, May 11, 2012 at 4:54 PM, Nathaniel Smith n...@pobox.com wrote:
 I
 have lying around my homedir that it would generally be a free speed
 win

Don't forget the case where the copy semantics may actually provide an
*improvement* in performance by allowing a potentially large array to
get deallocated if it was local.  People often forget that a *single
element* that is a view can 'pin' a huge array to memory by INCREF-ing
it.  If that large array trashes your cache (or worse, makes you go
into swapping), the time savings of using a view will be quickly
obliterated.

There are cases where a copy can be the fast solution...

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-15 Thread Frédéric Bastien
Hi,

In fact, I would arg to never change the current behavior, but add the
flag for people that want to use it.

Why?

1) There is probably 10k script that use it that will need to be
checked for correctness. There won't be easy to see crash or error
that allow user to see it.

2) This is a globally not significant speed up by this change. Due to
1), i think it is not work it. Why this is not a significant speed up?
First, the user already create and use the original tensor. Suppose a
matrix of size n x n. If it don't fit in the cache, creating it will
cost n * n. But coping it will cost cst * n. The cst is the price of
loading a full cache line. But if you return a view, you will pay this
cst price later when you do the computation. But it all case, this is
cheap compared to the cost of creating the matrix. Also, you will do
work on the matrix and this work will be much more costly then the
price of the copy.

In the case the matrix fix in the cache, the price of the copy is even lower.

So in conclusion, optimizing the diagonal won't give speed up in the
global user script, but will break many of them.

I'm sure there is corner case where speed up of diag will be
significant, but I don't think they happen in real code. And if they
happen, asking them to add a keyword is better then breaking make
script I my opinion.

Fred

On Sun, May 13, 2012 at 4:11 AM, Nathaniel Smith n...@pobox.com wrote:
 On Sun, May 13, 2012 at 3:28 AM, Travis Oliphant tra...@continuum.io wrote:
 Another approach would be to introduce a method:

 a.diag(copy=False)

 and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
 2-3 releases.

 This would be a good idea if we didn't already have both
 np.diagonal(a) (which is an alias for a.diagonal()) *and* np.diag(a),
 which do different things. And the new a.diag() would be different
 from the existing np.diag(a)...

 -- Nathaniel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-13 Thread Sebastian Haase
+1

On Sun, May 13, 2012 at 4:28 AM, Travis Oliphant tra...@continuum.io wrote:
 Another approach would be to introduce a method:

 a.diag(copy=False)

 and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
 2-3 releases.

 -Travis


 On May 12, 2012, at 8:31 AM, Ralf Gommers wrote:



 On Sat, May 12, 2012 at 1:54 AM, Nathaniel Smith n...@pobox.com wrote:

 On Fri, May 11, 2012 at 9:26 PM, T J tjhn...@gmail.com wrote:
  On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:
 
  On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:
 
  11.05.2012 17:54, Frédéric Bastien kirjoitti:
   In Theano we use a view, but that is not relevant as it is the
   compiler that tell what is inplace. So this is invisible to the
   user.
  
   What about a parameter to diagonal() that default to return a view
   not
   writable as you said. The user can then choose what it want and this
   don't break the inferface.
  [clip]
 
  Agreed, it seems this is the appropriate way to go on here
  `diagonal(copy=True)`. A more obscure alternative would be to add a
  separate method that returns a view.
 
 
  This looks like the best way to deal with it, yes.
 
  Cheers,
  Mark
 
 
 
  I don't think changing the default behavior in a later release is a
  good
  idea. It's a sort of an API wart, but IMHO better that than subtle
  code
  breakage.
 
 
 
  copy=True seems fine, but is this the final plan?   What about long
  term,
  should diag() eventually be brought in line with transpose() and
  reshape()
  so that it is a view by default?  Changing default behavior is certainly
  not
  something that should be done all the time, but it *can* be done if
  deprecated appropriately.  A more consistent API is better than one with
  warts (if this particular issue is actually seen as a wart).

 This is my question as well. Adding copy=True default argument is
 certainly a fine solution for 1.7, but IMHO in the long run it would
 be better for diagonal() to return a view by default.


 If you want to get to a situation where the behavior is changed, adding a
 temporary new keyword is not a good solution in general. Because this forces
 you to make the change over 3 different releases:
   1. introduce copy=True
   2. change to copy=False
   3. remove copy kw
 and step 3 is still breaking existing code, because people started using
 copy=False.

 See the histogram new_behavior keyword as an example of this.

 (Aside from it
 seeming generally more numpythonic, I see from auditing the code I
 have lying around my homedir that it would generally be a free speed
 win, and having to remember to type a.diagonal(copy=False) all the
 time in order to get full performance seems a bit annoying.)

 I mean, I'm all for conservatism in these things, which is why I
 raised the issue in the first place :-). But it also seems like there
 should be *some* mechanism for getting there from here (assuming
 others agree we want to). There's been grumblings about trying to do
 more evolving of numpy in-place instead of putting everything off to
 the legendary 2.0, right?

 Unfortunately just putting a deprecation warning on everyone who calls
 diagonal() without an explicit copy= argument seems like it would be
 *really* obnoxious, though. If necessary we could get more creative...
 add a special-purpose ndarray flag like WARN_ON_WRITE so that code
 which writes to the returned array continues to work like now, but
 also triggers a deprecation warning? I dunno.


 Something like this could be a solution. Otherwise, just living with the
 copy is imho much better than introducing a copy kw.

 Ralf

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-13 Thread Nathaniel Smith
On Sun, May 13, 2012 at 3:28 AM, Travis Oliphant tra...@continuum.io wrote:
 Another approach would be to introduce a method:

 a.diag(copy=False)

 and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
 2-3 releases.

This would be a good idea if we didn't already have both
np.diagonal(a) (which is an alias for a.diagonal()) *and* np.diag(a),
which do different things. And the new a.diag() would be different
from the existing np.diag(a)...

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-12 Thread Ralf Gommers
On Sat, May 12, 2012 at 1:54 AM, Nathaniel Smith n...@pobox.com wrote:

 On Fri, May 11, 2012 at 9:26 PM, T J tjhn...@gmail.com wrote:
  On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:
 
  On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:
 
  11.05.2012 17:54, Frédéric Bastien kirjoitti:
   In Theano we use a view, but that is not relevant as it is the
   compiler that tell what is inplace. So this is invisible to the user.
  
   What about a parameter to diagonal() that default to return a view
 not
   writable as you said. The user can then choose what it want and this
   don't break the inferface.
  [clip]
 
  Agreed, it seems this is the appropriate way to go on here
  `diagonal(copy=True)`. A more obscure alternative would be to add a
  separate method that returns a view.
 
 
  This looks like the best way to deal with it, yes.
 
  Cheers,
  Mark
 
 
 
  I don't think changing the default behavior in a later release is a
 good
  idea. It's a sort of an API wart, but IMHO better that than subtle code
  breakage.
 
 
 
  copy=True seems fine, but is this the final plan?   What about long term,
  should diag() eventually be brought in line with transpose() and
 reshape()
  so that it is a view by default?  Changing default behavior is certainly
 not
  something that should be done all the time, but it *can* be done if
  deprecated appropriately.  A more consistent API is better than one with
  warts (if this particular issue is actually seen as a wart).

 This is my question as well. Adding copy=True default argument is
 certainly a fine solution for 1.7, but IMHO in the long run it would
 be better for diagonal() to return a view by default.


If you want to get to a situation where the behavior is changed, adding a
temporary new keyword is not a good solution in general. Because this
forces you to make the change over 3 different releases:
  1. introduce copy=True
  2. change to copy=False
  3. remove copy kw
and step 3 is still breaking existing code, because people started using
copy=False.

See the histogram new_behavior keyword as an example of this.

(Aside from it
 seeming generally more numpythonic, I see from auditing the code I
 have lying around my homedir that it would generally be a free speed
 win, and having to remember to type a.diagonal(copy=False) all the
 time in order to get full performance seems a bit annoying.)

 I mean, I'm all for conservatism in these things, which is why I
 raised the issue in the first place :-). But it also seems like there
 should be *some* mechanism for getting there from here (assuming
 others agree we want to). There's been grumblings about trying to do
 more evolving of numpy in-place instead of putting everything off to
 the legendary 2.0, right?

 Unfortunately just putting a deprecation warning on everyone who calls
 diagonal() without an explicit copy= argument seems like it would be
 *really* obnoxious, though. If necessary we could get more creative...
 add a special-purpose ndarray flag like WARN_ON_WRITE so that code
 which writes to the returned array continues to work like now, but
 also triggers a deprecation warning? I dunno.


Something like this could be a solution. Otherwise, just living with the
copy is imho much better than introducing a copy kw.

Ralf
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-12 Thread Travis Oliphant
Another approach would be to introduce a method: 

a.diag(copy=False)

and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over 2-3 
releases. 

-Travis


On May 12, 2012, at 8:31 AM, Ralf Gommers wrote:

 
 
 On Sat, May 12, 2012 at 1:54 AM, Nathaniel Smith n...@pobox.com wrote:
 On Fri, May 11, 2012 at 9:26 PM, T J tjhn...@gmail.com wrote:
  On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:
 
  On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:
 
  11.05.2012 17:54, Frédéric Bastien kirjoitti:
   In Theano we use a view, but that is not relevant as it is the
   compiler that tell what is inplace. So this is invisible to the user.
  
   What about a parameter to diagonal() that default to return a view not
   writable as you said. The user can then choose what it want and this
   don't break the inferface.
  [clip]
 
  Agreed, it seems this is the appropriate way to go on here
  `diagonal(copy=True)`. A more obscure alternative would be to add a
  separate method that returns a view.
 
 
  This looks like the best way to deal with it, yes.
 
  Cheers,
  Mark
 
 
 
  I don't think changing the default behavior in a later release is a good
  idea. It's a sort of an API wart, but IMHO better that than subtle code
  breakage.
 
 
 
  copy=True seems fine, but is this the final plan?   What about long term,
  should diag() eventually be brought in line with transpose() and reshape()
  so that it is a view by default?  Changing default behavior is certainly not
  something that should be done all the time, but it *can* be done if
  deprecated appropriately.  A more consistent API is better than one with
  warts (if this particular issue is actually seen as a wart).
 
 This is my question as well. Adding copy=True default argument is
 certainly a fine solution for 1.7, but IMHO in the long run it would
 be better for diagonal() to return a view by default.
 
 If you want to get to a situation where the behavior is changed, adding a 
 temporary new keyword is not a good solution in general. Because this forces 
 you to make the change over 3 different releases:
   1. introduce copy=True
   2. change to copy=False
   3. remove copy kw
 and step 3 is still breaking existing code, because people started using 
 copy=False. 
 
 See the histogram new_behavior keyword as an example of this.
 
 (Aside from it
 seeming generally more numpythonic, I see from auditing the code I
 have lying around my homedir that it would generally be a free speed
 win, and having to remember to type a.diagonal(copy=False) all the
 time in order to get full performance seems a bit annoying.)
 
 I mean, I'm all for conservatism in these things, which is why I
 raised the issue in the first place :-). But it also seems like there
 should be *some* mechanism for getting there from here (assuming
 others agree we want to). There's been grumblings about trying to do
 more evolving of numpy in-place instead of putting everything off to
 the legendary 2.0, right?
 
 Unfortunately just putting a deprecation warning on everyone who calls
 diagonal() without an explicit copy= argument seems like it would be
 *really* obnoxious, though. If necessary we could get more creative...
 add a special-purpose ndarray flag like WARN_ON_WRITE so that code
 which writes to the returned array continues to work like now, but
 also triggers a deprecation warning? I dunno.
 
 Something like this could be a solution. Otherwise, just living with the copy 
 is imho much better than introducing a copy kw.
 
 Ralf
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-12 Thread Benjamin Root
On Saturday, May 12, 2012, Travis Oliphant wrote:

 Another approach would be to introduce a method:

 a.diag(copy=False)

 and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
 2-3 releases.

 -Travis



+1

Ben Root


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Nathaniel Smith
I've been trying to sort through the changes that landed in master
from the missingdata branch to figure out how to separate out changes
related to NA support from those that aren't, and noticed that one of
them should probably be flagged to the list. Traditionally,
arr.diagonal() and np.diagonal(arr) return a *copy* of the diagonal.
Now, they return a view onto the original array.

On the one hand, this seems like it's clearly the way it should have
been since the beginning -- I'd expect .diagonal() to be a cheap
operation, like .transpose() and .reshape(). But, it's a potential
compatibility break if there is code out there that assumes diagonal()
returns a copy and can be scribbled on without affecting the original
array:

# 1.6:
 a = np.ones((2, 2))
 d = a.diagonal()
 d[0] = 3
 a
array([[ 1.,  1.],
   [ 1.,  1.]])

# current master/1.7:
 a = np.ones((2, 2))
 d = a.diagonal()
 d[0] = 3
 a
array([[ 3.,  1.],
   [ 1.,  1.]])

This is dangerous, obviously, and tricky to handle, since there's no
clear way to detect it and give a DeprecationWarning.

One option might be to keep the new behavior, but mark the returned
view as not WRITEABLE, and then flip to WRITEABLE=True in 1.8. Going
from read-only to writeable would be a compatible change, so that way
we end up on the behaviour we want eventually (in 1.8), and have only
one backwards compatibility break (1.6 - 1.7), but that break is
clean and obvious.

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Frédéric Bastien
In Theano we use a view, but that is not relevant as it is the
compiler that tell what is inplace. So this is invisible to the user.

What about a parameter to diagonal() that default to return a view not
writable as you said. The user can then choose what it want and this
don't break the inferface.

You suggest that in 1.7 we return a view that is not writable, but
tell this is an interface change. I don't see an interface change here
as this change nothing for the user(if the writable flag is respected.
Can we rely on this?). The user will need to update their code in 1.8
if we return a writable view. So I think the interface change is in
1.8.

Why not change the interface in numpy 2.0? Otherwise, we have a high
risk of people just updating numpy without checking the release note
and have bad result. The parameter to diagonal will allow people to
have a view.

Fred

On Fri, May 11, 2012 at 11:42 AM, Nathaniel Smith n...@pobox.com wrote:
 I've been trying to sort through the changes that landed in master
 from the missingdata branch to figure out how to separate out changes
 related to NA support from those that aren't, and noticed that one of
 them should probably be flagged to the list. Traditionally,
 arr.diagonal() and np.diagonal(arr) return a *copy* of the diagonal.
 Now, they return a view onto the original array.

 On the one hand, this seems like it's clearly the way it should have
 been since the beginning -- I'd expect .diagonal() to be a cheap
 operation, like .transpose() and .reshape(). But, it's a potential
 compatibility break if there is code out there that assumes diagonal()
 returns a copy and can be scribbled on without affecting the original
 array:

 # 1.6:
 a = np.ones((2, 2))
 d = a.diagonal()
 d[0] = 3
 a
 array([[ 1.,  1.],
       [ 1.,  1.]])

 # current master/1.7:
 a = np.ones((2, 2))
 d = a.diagonal()
 d[0] = 3
 a
 array([[ 3.,  1.],
       [ 1.,  1.]])

 This is dangerous, obviously, and tricky to handle, since there's no
 clear way to detect it and give a DeprecationWarning.

 One option might be to keep the new behavior, but mark the returned
 view as not WRITEABLE, and then flip to WRITEABLE=True in 1.8. Going
 from read-only to writeable would be a compatible change, so that way
 we end up on the behaviour we want eventually (in 1.8), and have only
 one backwards compatibility break (1.6 - 1.7), but that break is
 clean and obvious.

 -- Nathaniel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Pauli Virtanen
11.05.2012 17:54, Frédéric Bastien kirjoitti:
 In Theano we use a view, but that is not relevant as it is the
 compiler that tell what is inplace. So this is invisible to the user.
 
 What about a parameter to diagonal() that default to return a view not
 writable as you said. The user can then choose what it want and this
 don't break the inferface.
[clip]

Agreed, it seems this is the appropriate way to go on here
`diagonal(copy=True)`. A more obscure alternative would be to add a
separate method that returns a view.

I don't think changing the default behavior in a later release is a good
idea. It's a sort of an API wart, but IMHO better that than subtle code
breakage.

Pauli

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Travis Oliphant

On May 11, 2012, at 2:18 PM, Pauli Virtanen wrote:

 11.05.2012 17:54, Frédéric Bastien kirjoitti:
 In Theano we use a view, but that is not relevant as it is the
 compiler that tell what is inplace. So this is invisible to the user.
 
 What about a parameter to diagonal() that default to return a view not
 writable as you said. The user can then choose what it want and this
 don't break the inferface.
 [clip]
 
 Agreed, it seems this is the appropriate way to go on here
 `diagonal(copy=True)`. A more obscure alternative would be to add a
 separate method that returns a view.
 
 I don't think changing the default behavior in a later release is a good
 idea. It's a sort of an API wart, but IMHO better that than subtle code
 breakage.


Yes,  I think this is true.   We should add the copy keyword, but not change 
the API. 

-Travis



 
   Pauli
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Mark Wiebe
On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:

 11.05.2012 17:54, Frédéric Bastien kirjoitti:
  In Theano we use a view, but that is not relevant as it is the
  compiler that tell what is inplace. So this is invisible to the user.
 
  What about a parameter to diagonal() that default to return a view not
  writable as you said. The user can then choose what it want and this
  don't break the inferface.
 [clip]

 Agreed, it seems this is the appropriate way to go on here
 `diagonal(copy=True)`. A more obscure alternative would be to add a
 separate method that returns a view.


This looks like the best way to deal with it, yes.

Cheers,
Mark



 I don't think changing the default behavior in a later release is a good
 idea. It's a sort of an API wart, but IMHO better that than subtle code
 breakage.

Pauli

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread T J
On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:

 On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:

 11.05.2012 17:54, Frédéric Bastien kirjoitti:
  In Theano we use a view, but that is not relevant as it is the
  compiler that tell what is inplace. So this is invisible to the user.
 
  What about a parameter to diagonal() that default to return a view not
  writable as you said. The user can then choose what it want and this
  don't break the inferface.
 [clip]

 Agreed, it seems this is the appropriate way to go on here
 `diagonal(copy=True)`. A more obscure alternative would be to add a
 separate method that returns a view.


 This looks like the best way to deal with it, yes.

 Cheers,
 Mark



 I don't think changing the default behavior in a later release is a good
 idea. It's a sort of an API wart, but IMHO better that than subtle code
 breakage.



copy=True seems fine, but is this the final plan?   What about long term,
should diag() eventually be brought in line with transpose() and reshape()
so that it is a view by default?  Changing default behavior is certainly
not something that should be done all the time, but it *can* be done if
deprecated appropriately.  A more consistent API is better than one with
warts (if this particular issue is actually seen as a wart).
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-11 Thread Nathaniel Smith
On Fri, May 11, 2012 at 9:26 PM, T J tjhn...@gmail.com wrote:
 On Fri, May 11, 2012 at 1:12 PM, Mark Wiebe mwwi...@gmail.com wrote:

 On Fri, May 11, 2012 at 2:18 PM, Pauli Virtanen p...@iki.fi wrote:

 11.05.2012 17:54, Frédéric Bastien kirjoitti:
  In Theano we use a view, but that is not relevant as it is the
  compiler that tell what is inplace. So this is invisible to the user.
 
  What about a parameter to diagonal() that default to return a view not
  writable as you said. The user can then choose what it want and this
  don't break the inferface.
 [clip]

 Agreed, it seems this is the appropriate way to go on here
 `diagonal(copy=True)`. A more obscure alternative would be to add a
 separate method that returns a view.


 This looks like the best way to deal with it, yes.

 Cheers,
 Mark



 I don't think changing the default behavior in a later release is a good
 idea. It's a sort of an API wart, but IMHO better that than subtle code
 breakage.



 copy=True seems fine, but is this the final plan?   What about long term,
 should diag() eventually be brought in line with transpose() and reshape()
 so that it is a view by default?  Changing default behavior is certainly not
 something that should be done all the time, but it *can* be done if
 deprecated appropriately.  A more consistent API is better than one with
 warts (if this particular issue is actually seen as a wart).

This is my question as well. Adding copy=True default argument is
certainly a fine solution for 1.7, but IMHO in the long run it would
be better for diagonal() to return a view by default. (Aside from it
seeming generally more numpythonic, I see from auditing the code I
have lying around my homedir that it would generally be a free speed
win, and having to remember to type a.diagonal(copy=False) all the
time in order to get full performance seems a bit annoying.)

I mean, I'm all for conservatism in these things, which is why I
raised the issue in the first place :-). But it also seems like there
should be *some* mechanism for getting there from here (assuming
others agree we want to). There's been grumblings about trying to do
more evolving of numpy in-place instead of putting everything off to
the legendary 2.0, right?

Unfortunately just putting a deprecation warning on everyone who calls
diagonal() without an explicit copy= argument seems like it would be
*really* obnoxious, though. If necessary we could get more creative...
add a special-purpose ndarray flag like WARN_ON_WRITE so that code
which writes to the returned array continues to work like now, but
also triggers a deprecation warning? I dunno.

-- Nathaniel
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion