On Mon, Jan 29, 2018 at 5:50 PM, <josef.p...@gmail.com> wrote:

>
> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane <allanhald...@gmail.com>
> wrote:
>
>> On 01/29/2018 04:02 PM, josef.p...@gmail.com wrote:
>> >
>> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root
>> > <ben.v.r...@gmail.com> wrote:
>> >
>> >     I <3 structured arrays. I love the fact that I can access data by
>> >     row and then by fieldname, or vice versa. There are times when I
>> >     need to pass just a column into a function, and there are times when
>> >     I need to process things row by row. Yes, pandas is nice if you want
>> >     the specialized indexing features, but it becomes a bear to deal
>> >     with if all you want is normal indexing, or even the ability to
>> >     easily loop over the dataset.
>> >
>> > I don't think there is a doubt that structured arrays, arrays with
>> > structured dtypes, are a useful container. The question is whether they
>> > should be more or the foundation for more.
>> >
>> > For example, computing a mean, or reduce operation, over numeric element
>> > ("columns"). Before padded views it was possible to index by selecting
>> > the relevant "columns" and view them as standard array. With padded
>> > views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute
>> > a mean of some "columns". (I don't have numpy 1.14 to try or find a
>> > workaround, like maybe looping over all relevant columns.)
>> >
>> > Josef
>>
>> Just to clarify, structured types have always had padding bytes, that
>> isn't new.
>>
>> What *is* new (which we are pushing to 1.15, I think) is that it may be
>> somewhat more common to end up with padding than before, and only if you
>> are specifically using multi-field indexing, which is a fairly
>> specialized case.
>>
>> I think recfunctions already account properly for padding bytes. Except
>> for the bug in #8100, which we will fix, padding-bytes in recarrays are
>> more or less invisible to a non-expert who only cares about
>> dataframe-like behavior.
>>
>> In other words, padding is no obstacle at all to computing a mean over a
>> column, and single-field indexes in 1.15 behave identically as before.
>> The only thing that will change in 1.15 is multi-field indexing, and it
>> has never been possible to compute a mean (or any binary operation) on
>> multiple fields.
>>
>
> from the example in the other thread
> a[['b', 'c']].view(('f8', 2)).mean(0)
>
>
> (from the statsmodels usecase:
> read csv with genfromtext to get recarray or structured array
> select/index the numeric columns
> view them as standard array
> do whatever we can do with standard numpy arrays
> )
>
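For concreteness, here is a minimal sketch of the "looping over all relevant
columns" workaround mentioned in the quoted text, assuming a copy of the data
is acceptable. It relies only on single-field indexing, which the quoted
reply says behaves as before. The array `a`, its field names and values, and
the helper `fields_to_array` are made up for illustration, and this has not
been checked against numpy 1.14.0 specifically.

```python
import numpy as np

# Hypothetical data: field names and values are made up for illustration.
a = np.array(
    [(1, 2.0, 3.0), (4, 5.0, 6.0), (7, 8.0, 9.0)],
    dtype=[("a", "i8"), ("b", "f8"), ("c", "f8")],
)


def fields_to_array(arr, names, dtype=float):
    """Copy the named fields of a 1-D structured array into a plain 2-D array."""
    # Single-field indexing returns an ordinary 1-D array per column, so this
    # does not depend on how multi-field indexing lays out padding bytes.
    return np.column_stack([np.asarray(arr[name], dtype=dtype) for name in names])


plain = fields_to_array(a, ["b", "c"])  # shape (n_rows, n_selected_fields)
col_means = plain.mean(0)               # mean of each selected "column"
```

Unlike the `.view(('f8', 2))` approach, this makes a copy and converts each
field to a common dtype, so it also works when the selected fields do not all
share one dtype.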
Or, to phrase it as a question:

How do we get a standard array with homogeneous dtype from the
corresponding elements of a structured dtype in numpy 1.14.0?

Josef

>
> Josef
>
>
>>
>> Allan
>>
>> >
>> >     Cheers!
>> >     Ben Root
>> >
>> >     On Mon, Jan 29, 2018 at 3:24 PM, <josef.p...@gmail.com> wrote:
>> >
>> >         On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt
>> >         <stef...@berkeley.edu> wrote:
>> >
>> >             On Mon, 29 Jan 2018 14:10:56 -0500,
>> >             josef.p...@gmail.com wrote:
>> >
>> >                 Given that there is pandas, xarray, dask and more,
>> >                 numpy could as well drop any pretense of supporting
>> >                 dataframe_likes. Or, adjust the recfunctions so we
>> >                 can still work dataframe_like with structured
>> >                 dtypes/recarrays/recfunctions.
>> >
>> >             I haven't been following the duckarray discussion
>> >             carefully, but could this be an opportunity for a
>> >             dataframe protocol, so that we can have libraries ingest
>> >             structured arrays, record arrays, pandas dataframes,
>> >             etc. without too much specialized code?
>> >
>> >         AFAIU while not being in the data handling area, pandas
>> >         defines the interface and other libraries provide pandas
>> >         compatible interfaces or implementations.
>> >
>> >         statsmodels currently still has recarray support and usage.
>> >         In some interfaces we support pandas, recarrays and plain
>> >         arrays, or anything where asarray works correctly.
>> >
>> >         But recarrays became messy to support, one rewrite of some
>> >         functions last year converts recarrays to pandas, does the
>> >         manipulation and then converts back to recarrays.
>> >         Also we need to adjust our recarray usage with new numpy
>> >         versions. But there is no real benefit because I doubt that
>> >         statsmodels still has any recarray/structured dtype users.
>> >         So, we only have to remove our own uses in the datasets and
>> >         unit tests.
>> >
>> >         Josef
>> >
>> >             Stéfan
>> >
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion