The problem seems to be that unpicklable RDD objects are being pulled into
function closures. In your failing doctests, it looks like the RDD created
through sc.parallelize is being pulled into the map lambda’s function closure.
I opened a new Dill bug with a small test case that reproduces this
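
A minimal sketch of the closure-capture problem described above, in plain Python so it runs without Spark or Dill. FakeRDD, unpicklable_captures, and make_mappers are hypothetical names made up for illustration; FakeRDD only imitates an object that refuses to be pickled and is not how a real RDD behaves internally. The helper inspects a function's closure cells and reports which captured variables fail to pickle.

```python
import pickle


class FakeRDD:
    """Hypothetical stand-in for an RDD: it refuses to be pickled."""

    def __init__(self, data):
        self.data = data

    def __reduce__(self):
        raise pickle.PicklingError("FakeRDD cannot be pickled")


def unpicklable_captures(fn):
    """Return the names of closure variables of fn that fail to pickle."""
    names = fn.__code__.co_freevars      # names of captured variables
    cells = fn.__closure__ or ()         # matching closure cells, same order
    bad = []
    for name, cell in zip(names, cells):
        try:
            pickle.dumps(cell.cell_contents)
        except Exception:
            bad.append(name)
    return bad


def make_mappers():
    rdd = FakeRDD([1, 2, 3])
    total = sum(rdd.data)                # plain int, safe to ship
    leaky = lambda x: x + len(rdd.data)  # captures rdd itself
    clean = lambda x: x + total          # captures only the int
    return leaky, clean


leaky, clean = make_mappers()
print(unpicklable_captures(leaky))  # ['rdd']
print(unpicklable_captures(clean))  # []
```

The usual workaround is the one the second lambda shows: compute the plain value you need outside the lambda, so that only that value ends up in the closure.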
Well, as you said, MLlib already supports GLM in a sense, except that it only
supports two link functions: identity (linear regression) and logit
(logistic regression). It should not be too hard to add other link
functions, as all you have to do is add a different gradient function for
Poisson/Gamma,
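
To make the "different gradient function" idea concrete, here is a rough sketch in plain Python. It is not MLlib's API; INVERSE_LINKS, dot, and gradient are hypothetical names. It assumes each link is the canonical one for its family (identity for Gaussian, logit for Bernoulli, log for Poisson), in which case the per-example gradient of the negative log-likelihood has the same form, (mu - y) * x, and only the inverse link changes. Gamma is left out because its canonical link is the inverse link and its gradient does not reduce to this form.

```python
import math


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Inverse link functions: map the linear predictor eta = w.x to the mean mu.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                        # linear regression
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # logistic regression
    "log": lambda eta: math.exp(eta),                   # Poisson regression
}


def gradient(link, w, x, y):
    """Per-example gradient of the negative log-likelihood.

    Assumes a canonical link, so the gradient is (mu - y) * x.
    """
    mu = INVERSE_LINKS[link](dot(w, x))
    return [(mu - y) * xi for xi in x]


# With zero weights, eta = 0, so the Poisson mean is exp(0) = 1.
print(gradient("log", [0.0, 0.0], [1.0, 2.0], 3.0))  # [-2.0, -4.0]
```

Under these assumptions, supporting Poisson amounts to adding one entry to the table of inverse links.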
Hey,
On Mon, Jun 23, 2014 at 5:27 PM, Mark Baker wrote:
> Thanks for the context, Josh.
>
> I've gone ahead and created a new test case and just opened a new issue;
>
> https://github.com/uqfoundation/dill/issues/49
So that one's dealt with; it was a sys.prefix issue with me using a
virtualenv a
Yep, exactly! I’m not sure how complicated it would be to pull off. If someone
wouldn’t mind pointing me in the right direction, I would be happy
to look into it and contribute this functionality. I imagine this would be
implemented in the scheduler codebase and there would be some s