On Tue, Dec 6, 2011 at 6:02 AM, Olivier Grisel <olivier.gri...@ensta.org> wrote:
> Hi all,
>
> My tutorial on scikit-learn at PyCon has been accepted. Would anybody
> be interested in sprinting there? The sprint days are Mar. 12-15.
>
> http://us.pycon.org/2012/
>
> I think Wes has submitted a talk on Pandas too.
>
> I would be very interested in sprinting on machine learning & data
> analytics in the cloud using partitioned memory mapped arrays to
> prototype a low overhead alternative to the Hadoop MapReduce runtime
> optimized for numerical data and in-memory iterative processing,
> probably leveraging IPython.parallel and POSIX sendfile [1].
>
> Some Pandas idioms like groupBy and alignment would be interesting to
> investigate in a distributed setting IMHO.

I don't know whether my talk has been accepted, but I am very
interested in that topic, so count me in for that discussion.

One thing I think is crucial as soon as we talk about Hadoop is some
kind of story for handling machine failure (which is the most
interesting part of Hadoop, IMO).

cheers,

David

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general