+1 to GLMs




-------- Original message --------
From: Trevor Grant <trevor.d.gr...@gmail.com>
Date: 02/17/2017 6:56 AM (GMT-08:00)
To: dev@mahout.apache.org
Subject: Re: Contributing an algorithm for samsara

Jim is right, and I would take it one step further: it would be best to
implement GLMs (https://en.wikipedia.org/wiki/Generalized_linear_model);
from there, logistic regression is a trivial extension.
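
To make the "trivial extension" point concrete, here is a minimal sketch in
plain Python. It is not Mahout or Samsara code, and the names fit_glm,
fit_linear and fit_logistic are made up for illustration. It assumes a
canonical link, in which case the log-likelihood gradient is X^T (y - mu) and
the only thing that changes between model families is the inverse link:

```python
import math


def fit_glm(X, y, inverse_link, lr=0.1, iters=5000):
    """Fit a GLM by batch gradient ascent (illustrative sketch only).

    Assumes a canonical link, so the log-likelihood gradient is
    X^T (y - mu) with mu = inverse_link(X . beta).
    X is a list of rows; include a column of 1.0 for an intercept.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for row, target in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, row))  # linear predictor
            resid = target - inverse_link(eta)           # y - mu
            for j in range(p):
                grad[j] += resid * row[j]
        # Step along the gradient averaged over the n rows.
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def identity(eta):
    return eta


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def fit_linear(X, y, **kwargs):
    """Gaussian family, identity link: ordinary linear regression."""
    return fit_glm(X, y, identity, **kwargs)


def fit_logistic(X, y, **kwargs):
    """Binomial family, logit link: logistic regression."""
    return fit_glm(X, y, sigmoid, **kwargs)
```

Under the same assumptions, passing math.exp as the inverse link would in
principle give a Poisson regression, although a fixed step size may need
tuning there. A real implementation would more likely use IRLS than plain
gradient ascent.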

Buyer beware: GLMs will be a bit of work. Doable, but that would be jumping
in neck first for both Jim and Saikat...

MAHOUT-1928 and MAHOUT-1929

https://issues.apache.org/jira/browse/MAHOUT-1925?jql=project%20%3D%20MAHOUT%20AND%20component%20%3D%20Algorithms%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC

^^ Currently open JIRAs for the Algorithms component; you'll see that logistic
regression and GLMs are in there.

If there is an algorithm you know particularly well, or explicitly need or
want, feel free to open a JIRA and assign it to yourself.

There is also a case to be made for implementing ALS:

1) It's a much better 'beginner' project.
2) Mahout has some world-class recommenders, and a toy ALS implementation might
help us think through how the other recommenders (e.g. CCO) will 'fit' into
the framework, i.e. ALS would be the toy prototype recommender that helps us
think through building out that section of the framework.



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Fri, Feb 17, 2017 at 7:59 AM, Jim Jagielski <j...@jagunet.com> wrote:

> My own thoughts are that logistic regression seems a more "generalized"
> and hence more useful algo to be factored in... At least in the
> use cases that I've been toying with.
>
> So I'd like to help out with that if wanted...
>
> > On Feb 9, 2017, at 3:59 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
> >
> > Trevor et al,
> >
> > I'd like to contribute an algorithm or two in Samsara using Spark, as I
> > would like to compare and contrast Mahout with R server for a data science
> > pipeline / machine learning repo that I'm working on. Looking at the list
> > of algorithms (https://mahout.apache.org/users/basics/algorithms.html), is
> > there an algorithm for Spark that would be beneficial for the community? My
> > use cases would typically be around clustering or real-time machine
> > learning for building recommendations on the fly. The algorithms I see that
> > could potentially be useful are: 1) Matrix Factorization with ALS,
> > 2) Logistic regression with SVD.
> >
> > Any thoughts/guidance or recommendations would be very helpful.
> > Thanks in advance.
>
>
