Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

Ralf Gommers Wed, 16 May 2012 13:11:06 -0700

On Wed, May 16, 2012 at 3:55 PM, Nathaniel Smith <n...@pobox.com> wrote:


> On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien <no...@nouiz.org> wrote:
> > Hi,
> >
> > In fact, I would arg to never change the current behavior, but add the
> > flag for people that want to use it.
> >
> > Why?
> >
> > 1) There is probably >10k script that use it that will need to be
> > checked for correctness. There won't be easy to see crash or error
> > that allow user to see it.
>
> My suggestion is that we follow the scheme, which I think gives ample
> opportunity for people to notice problems:
>
> 1.7: works like 1.6, except that a DeprecationWarning is produced if
> (and only if) someone writes to an array returned by np.diagonal (or
> friends). This gives a pleasant heads-up for those who pay attention
> to DeprecationWarnings.
>
> 1.8: return a view, but mark this view read-only. This causes crashes
> for anyone who ignored the DeprecationWarnings, guaranteeing that
> they'll notice the issue.
>
> 1.9: return a writeable view, transition complete.
>
> I've written a pull request implementing the first part of this; I
> hope everyone interested will take a look:
>  https://github.com/numpy/numpy/pull/280
>

Thanks for doing that. Seems like a good way forward.

When the PR gets merged, can you please also open a ticket for this with
Milestone 1.8? Then we won't forget to make the required changes for that
release.

Ralf


>
> > 2) This is a globally not significant speed up by this change. Due to
> > 1), i think it is not work it. Why this is not a significant speed up?
> > First, the user already create and use the original tensor. Suppose a
> > matrix of size n x n. If it don't fit in the cache, creating it will
> > cost n * n. But coping it will cost cst * n. The cst is the price of
> > loading a full cache line. But if you return a view, you will pay this
> > cst price later when you do the computation. But it all case, this is
> > cheap compared to the cost of creating the matrix. Also, you will do
> > work on the matrix and this work will be much more costly then the
> > price of the copy.
> >
> > In the case the matrix fix in the cache, the price of the copy is even
> lower.
> >
> > So in conclusion, optimizing the diagonal won't give speed up in the
> > global user script, but will break many of them.
>
> I agree that the speed difference is small. I'm more worried about the
> cost to users of having to remember odd inconsistencies like this, and
> to think about whether there actually is a speed difference or not,
> etc. (If we do add a copy=False option, then I guarantee many people
> will use it religiously "just in case" the speed difference is enough
> to matter! And that would suck for them.)
>
> Returning a view makes the API slightly nicer, cleaner, more
> consistent, more useful. (I believe the reason this was implemented in
> the first place was that providing a convenient way to *write* to the
> diagonal of an arbitrary array made it easier to implement numpy.eye
> for masked arrays.) And the whole point of numpy is to trade off a
> little speed in favor of having a simple, easy-to-work with high-level
> API :-).
>
> -- Nathaniel
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

Reply via email to