s for you around these
> Date: Wed, 26 Mar 2014 10:31:38 -0700
> Subject: Re: Mahout on Spark
> From: dlie...@gmail.com
> To: dev@mahout.apache.org
>
> No, we probably don't want to create them unless we have someone to assign
> them to. You are more than welcome to create on
es? I'd like to
> volunteer to take on the shell and the R bindings. Should I create JIRA
> items for these?
>
> > Date: Wed, 26 Mar 2014 10:12:01 -0700
> > Subject: Re: Mahout on Spark
> > From: dlie...@gmail.com
> > To: sxk1...@hotmail.com
> > CC: dev@m
@Dmitry: Are there JIRA items created for the wanted pieces? I'd like to
volunteer to take on the shell and the R bindings. Should I create JIRA items
for these?
> Date: Wed, 26 Mar 2014 10:12:01 -0700
> Subject: Re: Mahout on Spark
> From: dlie...@gmail.com
> To: sxk1...@hotm
Sure.
@Saikat et al:
Check out the http://mahout.apache.org/users/sparkbindings/home.html "Wanted"
section.
Of course, data frames and vectorization (feature prep) standardization are
very high priority there.
Another high priority is an interactive shell/scripting (just like the Spark
shell). Something
+1. In fact, I would be very much indebted if someone (namely Dmitry :) ) could
do a Google Hangout focused on Spark where folks can ask questions and learn
more. To this end, I want to bring up something else: it'd be great if Mahout
itself, either through the Apache project foundation or through
MLlib may be less production-tested than Mahout, that is true, but I would
say Spark is heavily production-tested and getting close to a true 1.0
release. Why do you favour Hadoop for "sturdiness"? Spark can use HDFS (or any
Hadoop InputFormat) as an input source, so it benefits from the same fault
toleran
On Wednesday, February 19, 2014 7:22 PM, Ted Dunning
wrote:
On Wed, Feb 19, 2014 at 1:55 PM, peng wrote:
> But maybe mahout can include contribs that M/R is not fit for, like
> downpour SGD or graph-based algorithms?
>
Yes. Absolutely.
Downpour SGD is #1 on my list of features for 1.
It was suggested that I switch to MLlib for its performance, but I doubt that
it is production-ready; even if it is, I would still favour Hadoop's
sturdiness and self-healing.
But maybe mahout can include contribs that M/R is not fit for, like
downpour SGD or graph-based algorithms?
On Wed 19 Feb 20
To set expectations appropriately, I think it's important to point out
this is completely infeasible short of a total rewrite, and I can't
imagine that will happen. It may not be obvious if you haven't looked
at the code how completely dependent on M/R it is.
You can swap out M/R and Spark if you
I imagine Mahout offering users an option to select from different
execution engines (just as we currently do with the M/R and sequential
options), starting with Spark. I am not sure what changes are
needed in the codebase, though. Maybe following MLI (or alike) and
implementing some mo