On Wed, May 16, 2012 at 3:04 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>
> On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith <n...@pobox.com> wrote:
>>
>> On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien <no...@nouiz.org> wrote:
>> > Hi,
>> >
>> > In fact, I would argue that we should never change the current
>> > behavior, but add the flag for people who want to use it.
>> >
>> > Why?
>> >
>> > 1) There are probably >10k scripts that use it and would need to be
>> > checked for correctness. There is no easy-to-see crash or error that
>> > would let users notice the change.
>>
>> My suggestion is that we follow this scheme, which I think gives ample
>> opportunity for people to notice problems:
>>
>> 1.7: works like 1.6, except that a DeprecationWarning is produced if
>> (and only if) someone writes to an array returned by np.diagonal (or
>> friends). This gives a pleasant heads-up for those who pay attention
>> to DeprecationWarnings.
>>
>> 1.8: return a view, but mark this view read-only. This causes crashes
>> for anyone who ignored the DeprecationWarnings, guaranteeing that
>> they'll notice the issue.
>>
>> 1.9: return a writeable view, transition complete.
>>
>> I've written a pull request implementing the first part of this; I
>> hope everyone interested will take a look:
>> https://github.com/numpy/numpy/pull/280
>>
>> > 2) The change gives no globally significant speed-up. Given 1), I
>> > don't think it is worth it. Why is it not a significant speed-up?
>> > First, the user has already created and used the original tensor.
>> > Suppose a matrix of size n x n. If it doesn't fit in the cache,
>> > creating it costs n * n. But copying the diagonal costs cst * n,
>> > where cst is the price of loading a full cache line. If you return a
>> > view instead, you pay that cst price later, when you do the
>> > computation. In all cases, this is cheap compared to the cost of
>> > creating the matrix.
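[Editorial note: for concreteness, here is a minimal sketch of how a user
script could probe which of the three behaviors in the scheme above the
installed numpy implements. It assumes only the public np.diagonal API; the
variable names are illustrative, not from the proposal.]

```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)
d = a.diagonal()  # copy in 1.6 and earlier; read-only view in later releases

if d.flags.writeable:
    # 1.6-style behavior (or the proposed 1.9 endpoint): writes succeed.
    d[0] = 99.0
    # If the write is visible in `a`, d is a writeable view; otherwise a copy.
    behavior = "writeable view" if a[0, 0] == 99.0 else "copy"
else:
    # 1.8-style behavior: d is a read-only view, so writing would raise
    # a ValueError -- the guaranteed-to-be-noticed "crash" described above.
    behavior = "read-only view"
```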
>> > Also, you will do work on the matrix, and that work will be much more
>> > costly than the price of the copy.
>> >
>> > In the case where the matrix fits in the cache, the price of the copy
>> > is even lower.
>> >
>> > So in conclusion, optimizing the diagonal won't speed up users'
>> > scripts overall, but will break many of them.
>>
>> I agree that the speed difference is small. I'm more worried about the
>> cost to users of having to remember odd inconsistencies like this, and
>> to think about whether there actually is a speed difference or not,
>> etc. (If we do add a copy=False option, then I guarantee many people
>> will use it religiously "just in case" the speed difference is enough
>> to matter! And that would suck for them.)
>>
>> Returning a view makes the API slightly nicer, cleaner, more
>> consistent, more useful. (I believe the reason this was implemented in
>> the first place was that providing a convenient way to *write* to the
>> diagonal of an arbitrary array made it easier to implement numpy.eye
>> for masked arrays.) And the whole point of numpy is to trade off a
>> little speed in favor of having a simple, easy-to-work-with high-level
>> API :-).
>>
>> -- Nathaniel
>
> Just as a sanity check, do the scipy tests run without producing any such
> messages?
I tried checking this before, actually, but can't figure out how to build
scipy against a copy of numpy that is installed in either a virtualenv or
just on PYTHONPATH. (Basically, I just don't want to install some random
development numpy into my system python.) Any suggestions?

-- Nathaniel
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion