Hi Stephan,
thanks for the note. Progress over the last two years hasn't been impressive, IMO,
but I hope you'll manage.

As you suggest, I'll have a look at xarray too, since I see it has xarray.Dataset.
I was sure that it didn't work with non-homogeneous data at all; clearly I
need to update my opinion.
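
For reference, here is a minimal sketch of what holding non-homogeneous data in an xarray.Dataset might look like. It assumes numpy and xarray are installed; the variable names ("name", "age", "height") and the dimension name "row" are made up for illustration and do not come from any real dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical table-like data: three "columns" of different dtypes
# that share a single dimension, here called "row".
ds = xr.Dataset(
    {
        "name": ("row", np.array(["a", "b", "c"])),
        "age": ("row", np.array([25, 41, 33], dtype=np.int64)),
        "height": ("row", np.array([1.70, 1.82, 1.65], dtype=np.float64)),
    }
)

# Each variable keeps its own numpy dtype; nothing is upcast to a common type.
dtypes = {var: ds[var].dtype for var in ds.data_vars}

# Positional selection is spelled out explicitly: isel() takes integer positions.
older = ds.isel(row=np.flatnonzero(ds["age"].values > 30))
```

Here `dtypes` maps each variable to its own dtype (a unicode string type, int64 and float64), and `older` keeps only the rows where "age" exceeds 30.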



> On 22 Feb 2017, at 20:55, Stephan Hoyer <sho...@gmail.com> wrote:
> 
> On Wed, Feb 22, 2017 at 8:57 AM, Alex Rogozhnikov <alex.rogozhni...@yandex.ru> wrote:
> Pandas may be nice if you need a report and you need to get it done tomorrow, 
> after which you'll throw away the code. When we initially used pandas as the main data 
> storage in yandex/rep, it looked like a good idea, but a year later it was 
> obvious that this was the wrong decision. If you are building a data pipeline or 
> research code that should still work several years later (on some other 
> installation, run by someone else), use of pandas should be minimal. 
> 
> The pandas development team (myself included) is well aware of these issues. 
> There are long term plans/hopes to fix this, but there's a lot of work to be 
> done and some hard choices to make:
> https://github.com/pandas-dev/pandas/issues/10000
> https://github.com/pandas-dev/pandas/issues/13862
> 
> That's why I am looking for a reliable pandas substitute, which should be: 
> - completely consistent with numpy, and should fail when something isn't 
> implemented or is impossible
> - fewer new abstractions; nobody wants to learn 
> one-more-way-to-manipulate-the-data, especially other researchers
> - it may be less convenient for interactive data munging
>   - in particular, fewer methods is ok
> - written code should be easy to interpret and hard to misinterpret
> - not super slow; 1-10 gigabyte datasets are a normal situation
> 
> This has some overlap with our motivations for writing Xarray 
> (http://xarray.pydata.org), so I encourage you to 
> take a look. It still might be more complex than you're looking for, but we 
> did try to clean up the really ambiguous APIs from pandas like indexing.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion

