Re: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+

Allan Haldane Sat, 27 Jan 2018 20:49:26 -0800

On 01/25/2018 03:56 PM, josef.p...@gmail.com wrote:

On Thu, Jan 25, 2018 at 1:49 PM, Marten van Kerkwijk <m.h.vankerkw...@gmail.com> wrote:


    On Thu, Jan 25, 2018 at 1:16 PM, Stefan van der Walt
    <stef...@berkeley.edu> wrote:
    > On Mon, 22 Jan 2018 10:11:08 -0500, Marten van Kerkwijk wrote:
    >>
    >> I think the consistency argument is perhaps the most important:
    >> views are very powerful and in many ways one *counts* on them
    >> happening, especially in working with large arrays.
    >
    >
    > I had the same gut feeling, but the fancy indexing example made me
    > pause:
    >
    > In [9]: x = np.arange(12, dtype=float).reshape((3, 4))
    >
    > In [10]: p = x[[0, 1]]  # copy of data
    >
    > Then:
    >
    > In [11]: x = np.array([(0, 1), (2, 3)], dtype=[('a', int), ('b', int)])
    >
    > In [12]: p = x[['a', 'b']]  # copy of data, but proposal will change that


What does this do?
p = x[['a', 'b']].copy()


In 1.14.0 this creates an exact copy of what was returned by
`x[['a', 'b']]`, including any padding bytes.
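To make the point about `.copy()` concrete, here is a minimal sketch. It assumes a NumPy recent enough to ship `numpy.lib.recfunctions.repack_fields`, which is not mentioned in this thread and is used here only as one way to drop padding. Whether `p` itself is a padded view or a packed copy depends on the NumPy version, so the sketch checks only what holds either way: `.copy()` keeps whatever dtype layout the index returned.

```python
import numpy as np
from numpy.lib.recfunctions import repack_fields  # assumption: available in your NumPy

x = np.zeros(3, dtype='u1,u1,u1,u1,f4')

# Multi-field index; the result's layout (padded or packed) is version dependent.
p = x[['f0', 'f4']]

# .copy() owns its data but keeps p's dtype as-is, padding bytes included.
c = p.copy()
same_layout = (c.dtype == p.dtype)

# repack_fields returns an array whose dtype has the padding removed.
packed = repack_fields(p)

print(p.dtype.itemsize, c.dtype.itemsize, packed.dtype.itemsize)
```

So if the goal is a compact copy rather than an exact one, `.copy()` alone may not be enough when the indexed result carries padding.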

My impression is that the problems with the view are because the padded view doesn't behave like a "standard" dtype or array, i.e. the follow-up behavior is the problematic part.

I think the padded view is a "standard array" in the sense that you can easily create structured arrays with padding bytes, for example by using the `align=True` option.


    >>> np.zeros(3, dtype=np.dtype('u1,f4', align=True))
    array([(0, 0.), (0, 0.), (0, 0.)],
          dtype={'names':['f0','f1'], 'formats':['u1','<f4'], 'offsets':[0,4], 'itemsize':8, 'aligned':True})


Compare to

    >>> np.zeros(3, dtype='u1,u1,u1,u1,f4')[['f0', 'f4']]
    array([(0, 0.), (0, 0.), (0, 0.)],
          dtype={'names':['f0','f4'], 'formats':['u1','<f4'], 'offsets':[0,4], 'itemsize':8})
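The offsets and itemsize in the two reprs above can also be inspected directly on the dtype. A small sketch, using only the `'u1,f4'` dtype from the first example (the variable names are illustrative):

```python
import numpy as np

aligned = np.dtype('u1,f4', align=True)  # padding inserted so f4 starts at offset 4
packed = np.dtype('u1,f4')               # default layout: fields are contiguous

# dtype.fields maps each name to a (dtype, offset) tuple.
offsets_aligned = [aligned.fields[name][1] for name in aligned.names]
offsets_packed = [packed.fields[name][1] for name in packed.names]

print(offsets_aligned, aligned.itemsize)  # [0, 4] 8
print(offsets_packed, packed.itemsize)    # [0, 1] 5
```

The three bytes between `f0` and `f1` in the aligned dtype are the same kind of padding that shows up between `f0` and `f4` in the multi-field result.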



There are still bugs in numpy that occur for arrays with padding.

Allan

Josef

    >
    > We're not doing the same kind of indexing here exactly (in one case we
    > grab elements, in the other parts of elements), but the view behavior
    > may still break the "mental expectation".
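For reference, the existing copy-versus-view split for plain arrays that the quoted example relies on can be checked with `np.shares_memory`. This sketch covers only integer-list (fancy) indexing versus slicing; it makes no claim about multi-field indexing, whose behavior is the subject of this thread.

```python
import numpy as np

x = np.arange(12, dtype=float).reshape((3, 4))

fancy = x[[0, 1]]   # index with a list of integers: copy
basic = x[0:2]      # index with a slice: view

fancy[0, 0] = -1.0  # does not touch x
basic[0, 1] = -2.0  # writes through to x

print(np.shares_memory(x, fancy), np.shares_memory(x, basic))  # False True
print(x[0, 0], x[0, 1])                                        # 0.0 -2.0
```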

    A bit off-topic, but maybe this is another argument to just allow
    `x['a', 'b']` -- I never understood why a tuple was not the
    appropriate iterable for getting multiple items from a record.

    -- Marten
    _______________________________________________
    NumPy-Discussion mailing list
    NumPy-Discussion@python.org
    https://mail.python.org/mailman/listinfo/numpy-discussion



