Re: PySpark / scikit-learn integration sprint at Cloudera - Strata Conference Friday 14th Feb 2014

2013-12-05 Olivier Grisel
2013/12/4 Josh Rosen rosenvi...@gmail.com: Thanks for organizing this! I'll definitely be attending. Great. Looking forward to meeting you too. Uri, you might want to register as well on the wiki :) -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel

PySpark - Dill serialization

2013-12-05 Nick Pentreath
Hi devs, I came across Dill (http://trac.mystic.cacr.caltech.edu/project/pathos/wiki/dill) for Python serialization. I was wondering if it could be a replacement for the cloudpickle stuff (and remove that piece of code that needs to be maintained within PySpark)? Josh, have you looked into Dill? Any

IntelliJ Scala Import Organizer Plugin

2013-12-05 Aaron Davidson
Hi guys, just wanted to share a little plugin I wrote for IntelliJ to help auto-organize Scala imports. Anyone who has submitted a patch to Spark has probably felt the exhilaration of manually sorting and bucketing your imports. Well, now you can let your IDE have some fun! It's in the plugin

Re: PySpark - Dill serialization

2013-12-05 Matei Zaharia
Looks cool! Josh, if you replace CloudPickle with this, make sure to also update the LICENSE file, which is supposed to contain third-party licenses. Matei On Dec 5, 2013, at 8:02 PM, Josh Rosen rosenvi...@gmail.com wrote: Thanks for the link! I wasn't aware of Dill, but it looks like a nice