On 01/22/2018 10:53 AM, josef.p...@gmail.com wrote:

This is similar to the above example
a[['a', 'c']].view('i8')
but it doesn't try to combine fields.

In many examples where I used structured dtypes a long time ago, I switched between consistent views, either as a standard array over a subset of fields or as structured dtypes. For this use case it wouldn't matter whether a[['a', 'c']] returns a view or a copy, as long as we can get a second view that is consistent with the selected part of the memory. This would also be independent of whether or not numpy pads internally and adjusts the strides where possible.

np.__version__
'1.11.2'

a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')])
a
array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0),
        (1, 1.0, 1.0)],
       dtype=[('a', '<i8'), ('b', '<f8'), ('c', '<f8')])

a[['b', 'c']].view(('f8', 2)).mean(0)
array([ 1.,  1.])
a[['b', 'c']].view(('f8', 2)).dtype
dtype('float64')

Hmm, this did not raise a FutureWarning in 1.11.2, so I was not quite right in my message. It looks like this particular line only started raising FutureWarnings in 1.12.0.
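
As a minimal sketch of the "second consistent view" idea (not taken from this thread), one way to get a plain float view over the adjacent fields 'b' and 'c' without going through multi-field indexing plus .view() is to build the strides by hand. The names bc_view and bc_unstructured are only illustrative, the as_strided approach assumes the two fields are adjacent and share a dtype, and the recfunctions helper shown at the end only exists in NumPy releases newer than the ones discussed here (1.16 and later).

```python
import numpy as np
from numpy.lib import recfunctions as rfn
from numpy.lib.stride_tricks import as_strided

a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')])

# Single-field indexing returns a view whose stride is one full record.
b = a['b']

# Assumption: 'b' and 'c' are adjacent f8 fields. Starting at 'b', step one
# record per row and one f8 (8 bytes) per column to reach 'c'.
bc_view = as_strided(b, shape=(a.shape[0], 2),
                     strides=(b.strides[0], b.dtype.itemsize))
col_means = bc_view.mean(0)

# NumPy 1.16+ only: a helper that performs the conversion and copes with
# padded multi-field views.
bc_unstructured = rfn.structured_to_unstructured(a[['b', 'c']])
```

Because bc_view shares memory with a, writing to bc_view[:, 1] changes a['c']. as_strided does no bounds checking, so the shape and strides have to be worked out carefully from the dtype layout.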

Aside: The plan is that statsmodels will drop all usage and support for rec_arrays/structured dtypes
in the following release (0.10).
Then structured dtypes are free (from our perspective) to provide low level struct support
instead of pretending to be dataframe_like.

Your use of structured arrays is "pandas-like", i.e. you are using them for tabular data manipulation. In numpy 1.13 we updated the structured docs to discourage this. Of course users can do what they want, but here is what the new docs say:

    Structured arrays are designed for low-level
    manipulation of structured data, for example, for
    interpreting binary blobs. Structured datatypes are
    designed to mimic 'structs' in the C language, making
    them also useful for interfacing with C code. For these
    purposes, numpy supports specialized features such as
    subarrays and nested datatypes, and allows manual
    control over the memory layout of the structure.

    For simple manipulation of tabular data other pydata
    projects, such as pandas, xarray, or DataArray, provide
    higher-level interfaces that may be more suitable. These
    projects may also give better performance for tabular
    data analysis because the C-struct-like memory layout of
    structured arrays can lead to poor cache behavior.
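
To illustrate the "interpreting binary blobs" use the docs describe, here is a minimal sketch. The record layout (a little-endian uint16 id followed by a float64 value, with no padding) and the names record_dtype, blob and records are made up for the example.

```python
import struct
import numpy as np

# Hypothetical packed record: little-endian uint16 id, then float64 value.
record_dtype = np.dtype([('id', '<u2'), ('value', '<f8')])

# Build a small binary blob the way a C program or file format might.
blob = struct.pack('<Hd', 7, 1.5) + struct.pack('<Hd', 9, -2.0)

# Reinterpret the raw bytes as an array of records without copying.
records = np.frombuffer(blob, dtype=record_dtype)
ids = records['id']
values = records['value']
```

The dtype here is packed (itemsize 10) to match the '<Hd' struct format; a C struct with default alignment would usually need align=True or explicit offsets instead.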

Allan


