Hi Stephan, thanks for the note. The progress over the last two years wasn't impressive IMO, but I hope you'll manage.
As you suggest, I'll have a look at xarray too, now that I see it has xarray.Dataset. I was sure that it doesn't work with non-homogeneous data at all; clearly I need to refresh my opinion.

> On 22 Feb 2017, at 20:55, Stephan Hoyer <sho...@gmail.com> wrote:
>
> On Wed, Feb 22, 2017 at 8:57 AM, Alex Rogozhnikov <alex.rogozhni...@yandex.ru> wrote:
> > Pandas may be nice if you need a report and you need to get it done tomorrow.
> > Then you'll throw away the code. When we initially used pandas as the main data
> > storage in yandex/rep, it looked like a good idea, but a year later it was
> > obvious this was the wrong decision. When you build a data pipeline / research
> > that should still be working several years later (on some other installation,
> > run by someone else), usage of pandas should be minimal.
>
> The pandas development team (myself included) is well aware of these issues.
> There are long term plans/hopes to fix this, but there's a lot of work to be
> done and some hard choices to make:
> https://github.com/pandas-dev/pandas/issues/10000
> https://github.com/pandas-dev/pandas/issues/13862
>
> > That's why I am looking for a reliable pandas substitute, which should be:
> > - completely consistent with numpy, and should fail when something isn't
> >   implemented / is impossible
> > - fewer new abstractions; nobody wants to learn
> >   one-more-way-to-manipulate-the-data, specifically other researchers
> > - it may be less convenient for interactive data munging
> >   - in particular, fewer methods is ok
> > - written code should be interpretable, and can hardly be misinterpreted
> > - not super slow; 1-10 gigabyte datasets are a normal situation
>
> This has some overlap with our motivations for writing Xarray
> (http://xarray.pydata.org), so I encourage you to take a look. It still might
> be more complex than you're looking for, but we did try to clean up the really
> ambiguous APIs from pandas like indexing.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion