> Zachary Pincus wrote:
>> Hello folks,
>>
>> I recently was trying to write code to modify an array in-place (so
>> as not to invalidate any references to that array)
>
> I'm not sure what this means exactly.

Say one wants to keep two different variables referencing a single
in-memory list, like so:

a = [1,2,3]
b = a

Now, if 'b' and 'a' go to live in different places (different class
instances or whatever) but we want 'b' and 'a' to always refer to the
same in-memory object, so that 'id(a) == id(b)', we need to make sure
not to assign a brand new list to either one.

That is, if we do something like 'a = [i + 1 for i in a]' then
'id(a) != id(b)'. However, we can do 'a[:] = [i + 1 for i in a]' to
modify 'a' in-place.

This is not super-common, but it's also not an uncommon Python idiom.
In my email I was simply pointing out that naïvely translating that
idiom to the numpy case can cause unexpected behavior in the case of
views.

I think that this is unquestionably a bug -- isn't the point of views
that the user shouldn't need to care if a particular array object is a
view or not? Given the lack of methods to query whether an array is a
view, or what it might be a view on, this seems like a reasonable
perspective...

I mean, if certain operations produce completely different results
when one of the operands is a view, that *seems* like a bug. It might
not be worth fixing, but I can't see how that behavior would be
considered a feature.

However, I do think there's a legitimate question about whether it
would be worth fixing -- there could be a lot of complicated checks to
catch these kinds of corner cases.

>> via the standard
>> python idiom for lists, e.g.:
>>
>> a[:] = numpy.flipud(a)
>>
>> Now, flipud returns a view on 'a', so assigning that to 'a[:]'
>> provides pretty strange results as the buffer that is being read (the
>> view) is simultaneously modified.
>
> yes, weird. So why not just:
>
> a = numpy.flipud(a)
>
> Since flipud returns a view, the new "a" will still be using the same
> data array. Does this satisfy your need above?

Nope -- though 'a' and 'numpy.flipud(a)' share the same data, the
actual ndarray instances are different.
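
As a rough sketch of the distinction between modifying in place and
rebinding (the names 'x', 'y' and 'flipped' are only illustrative, and
numpy.may_share_memory is used here as a conservative overlap check,
not as proof that two arrays are views of each other):

```python
import numpy as np

# Plain lists: slice assignment keeps the object, rebinding replaces it.
a = [1, 2, 3]
b = a
a[:] = [i + 1 for i in a]          # in place: b sees the change
same_after_slice = a is b
a = [i + 1 for i in a]             # rebinding: a now names a new list
same_after_rebind = a is b

# NumPy: the flipped view shares a buffer with x but is a separate ndarray.
x = np.array([[0, 1], [2, 3], [4, 5]])
y = x                              # second reference to the same ndarray
flipped = np.flipud(x)
shares_buffer = np.may_share_memory(flipped, x)
same_ndarray = flipped is x

x = flipped                        # rebinding, as in 'a = numpy.flipud(a)'
y_follows_x = y is x               # y still names the unflipped ndarray
```
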
This means that any other references to the 'a' array (made via
'b = a' or whatever) now refer to the old 'a', not the flipped one.

The only other option for sharing arrays is to encapsulate them as
attributes of *another* object, which itself won't change. That seems
a bit clumsy.

> It's too bad that to do this you need to know that flipud created a
> view, rather than a copy of the data, as if it were a copy, you would
> need to do the a[:] trick to make sure a kept the same data, but
> that's the price we pay for the flexibility and power of numpy -- the
> alternative is to have EVERYTHING create a copy, but there would be a
> substantial performance hit for that.

Well, Anne's email suggests another alternative -- each time a view is
created, keep track of the original array from whence it came, and
then only make a copy when collisions like the above would take place.

And actually, I suspect that views already need to keep a reference to
their original array in order to keep that array from being deleted
before the view is. But I don't know the guts of numpy well enough to
say for sure.

> NOTE: the docstring doesn't make it clear that a view is created:
>
> >>> help(numpy.flipud)
> Help on function flipud in module numpy.lib.twodim_base:
>
> flipud(m)
>     returns an array with the columns preserved and rows flipped in
>     the up/down direction. Works on the first dimension of m.
>
> NOTE2: Maybe these kinds of functions should have an optional flag
> that specified whether you want a view or a copy -- I'd have expected
> a copy in this case!

Well, it seems like in most cases one does not need to care whether
one is looking at a view or an array. The only exception that comes to
mind is when you're attempting to modify the array in-place, e.g.

a[<something>] = <something else>

Even if the maybe-bug above were easily fixable (again, not sure about
that), you might *still* want to be able to figure out if 'a' were a
view before such a modification.
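
For what it's worth, ndarray does expose a '.base' attribute, and
numpy.may_share_memory gives a conservative overlap test. A minimal
sketch of a guarded in-place assignment built on those two follows.
The helpers 'is_view' and 'assign_in_place' are hypothetical names,
not numpy functions, and checking '.base' is only a rough heuristic:
an array wrapping an external buffer also has a non-None base.

```python
import numpy as np

def is_view(arr):
    """Hypothetical helper: True if arr borrows its buffer from another object."""
    return arr.base is not None

def assign_in_place(target, values):
    """Hypothetical helper: copy values first if they may overlap target."""
    values = np.asarray(values)
    if np.may_share_memory(target, values):
        values = values.copy()            # read from a private snapshot
    target[...] = values                  # write into target's existing buffer

a = np.array([[0, 1], [2, 3], [4, 5]])    # owns its data
b = a                                     # second reference to the same ndarray
v = np.flipud(a)                          # new ndarray object over a's buffer

owner_is_view = is_view(a)                # a owns its data, so base is None
flip_is_view = is_view(v)
view_keeps_owner = v.base is a            # the view holds a reference to a

assign_in_place(a, v)                     # flips the rows without rebinding a
still_same_object = b is a
```

The copy costs one temporary array, but it is only made when the two
operands might overlap, which is roughly the "copy only on collision"
idea mentioned above.
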
Whether this needs a runtime 'is_view' method, or just consistent
documentation about what returns a view, isn't clear to me. Certainly
the latter couldn't hurt.

> QUESTION:
> How do you tell if two arrays are views on the same data: is
> checking if they have the same .base reliable?
>
> >>> a = numpy.array((1,2,3,4))
> >>> b = a.view()
> >>> a.base is b.base
> False
>
> No, I guess not. Maybe .base should return self if it's the originator
> of the data.
>
> Is there a reliable way? I usually just test by changing a value in
> one to see if it changes in the other, but that's one heck of a kludge!
>
> >>> a.__array_interface__['data'][0] == b.__array_interface__['data'][0]
> True
>
> seems to work, but that's pretty ugly!

Good question. As I mentioned above, I assume that this information is
tracked internally to prevent the 'original' array data from being
deleted before any views have been; however, I really don't know how
it is exposed.

Zach
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion