Re: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+

Allan Haldane Mon, 22 Jan 2018 08:16:07 -0800

On 01/22/2018 10:53 AM, josef.p...@gmail.com wrote:

This is similar to the above example
a[['a', 'c']].view('i8')
but it doesn't try to combine fields.
In many examples where I used structured dtypes a long time ago, I switched between consistent views as either a standard array of subsets or as structured dtypes. For this use case it wouldn't matter whether a[['a', 'c']] returns a view or copy, as long as we can get the second view that is consistent with the selected part of the memory. This would also be independent of whether numpy pads internally and adjusts the strides if possible or not.
>>> np.__version__
'1.11.2'
>>> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')])
>>> a
array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0),
       (1, 1.0, 1.0)],
      dtype=[('a', '<i8'), ('b', '<f8'), ('c', '<f8')])
>>> a[['b', 'c']].view(('f8', 2)).mean(0)
array([ 1.,  1.])
>>> a[['b', 'c']].view(('f8', 2)).dtype
dtype('float64')

Hmm, this did not raise a FutureWarning in 1.11.2, so I was not quite right in my message. It looks like this particular line only started raising FutureWarnings in 1.12.0.
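
For what it's worth, here is a minimal sketch of one way to get the plain float array out of the selected fields without relying on whether a[['b', 'c']] is a view or a copy. It assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available; it is not what the session above used, and the name `sub` is only illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same array as in the session above.
a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')])

# Convert the two selected fields into an ordinary (5, 2) float64 array.
# This does not depend on a[['b', 'c']] being a view or a copy, nor on
# any padding left between the selected fields.
sub = rfn.structured_to_unstructured(a[['b', 'c']])

print(sub.shape)    # (5, 2)
print(sub.dtype)    # float64
print(sub.mean(0))  # column means of 'b' and 'c'
```

The result has one column per selected field, so reductions such as mean(0) behave as they did with the old .view(('f8', 2)) idiom.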

Aside: the plan is that statsmodels will drop all usage and support for rec_arrays/structured dtypes
in the following release (0.10).
Then structured dtypes are free (from our perspective) to provide low-level struct support
instead of pretending to be dataframe_like.

Your use of structured arrays is "pandas-like", i.e. you are using it for tabular data manipulation. In numpy 1.13 we updated the structured docs to discourage this. Of course users can do what they want, but here is what the new docs say:


    Structured arrays are designed for low-level
    manipulation of structured data, for example, for
    interpreting binary blobs. Structured datatypes are
    designed to mimic 'structs' in the C language, making
    them also useful for interfacing with C code. For these
    purposes, numpy supports specialized features such as
    subarrays and nested datatypes, and allows manual
    control over the memory layout of the structure.

    For simple manipulation of tabular data other pydata
    projects, such as pandas, xarray, or DataArray, provide
    higher-level interfaces that may be more suitable. These
    projects may also give better performance for tabular
    data analysis because the C-struct-like memory layout of
    structured arrays can lead to poor cache behavior.
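
As a rough illustration of the "interpreting binary blobs" use the docs describe, here is a small sketch. The record layout (an int32 `id` followed by two float64 values `x` and `y`) and the names `record`, `blob`, and `recs` are made up for the example; the blob is built with the standard struct module only so that the snippet is self-contained.

```python
import struct
import numpy as np

# Hypothetical packed record: little-endian int32 followed by two float64s.
# Without align=True, NumPy packs the fields with no padding (20 bytes).
record = np.dtype([('id', '<i4'), ('x', '<f8'), ('y', '<f8')])

# Build a binary blob of three records with the same packed layout.
# The '<' prefix tells struct to use standard sizes and no alignment.
blob = b''.join(struct.pack('<idd', i, i * 0.5, i * 2.0) for i in range(3))

# Interpret the raw bytes as an array of records, without copying.
recs = np.frombuffer(blob, dtype=record)

print(record.itemsize)  # 20
print(recs['id'])       # the integer field of each record
print(recs['x'])        # the first float field of each record
```

The point is that the dtype describes the bytes exactly as they are laid out, which is the struct-like role the docs have in mind, rather than a table of columns.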

Allan


