Jim, what do you say we start with ALS and then tackle GLM?
Sent from my iPhone

> On Feb 17, 2017, at 6:56 AM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>
> Jim is right, and I would take it one step further and say it would be best
> to implement GLMs https://en.wikipedia.org/wiki/Generalized_linear_model ;
> from there, logistic regression is a trivial extension.
>
> Buyer beware- GLMs will be a bit of work- doable, but that would be jumping
> in neck first for both Jim and Saikat...
>
> MAHOUT-1928 and MAHOUT-1929
>
> https://issues.apache.org/jira/browse/MAHOUT-1925?jql=project%20%3D%20MAHOUT%20AND%20component%20%3D%20Algorithms%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
>
> ^^ currently open JIRAs around Algorithms- you'll see Logistic and GLMs are
> in there.
>
> If you have an algorithm you are particularly intimate with, or explicitly
> need/want- feel free to open a JIRA and assign it to yourself.
>
> There is also a case to be made for implementing ALS...
>
> 1) It's a much better 'beginner' project.
> 2) Mahout has some world-class recommenders; a toy ALS implementation might
> help us think through how the other recommenders (e.g. CCO) will 'fit' into
> the framework, i.e. ALS being the toy-prototype recommender that helps us
> think through building out that section of the framework.
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things." -Virgil*
>
>
>> On Fri, Feb 17, 2017 at 7:59 AM, Jim Jagielski <j...@jagunet.com> wrote:
>>
>> My own thoughts are that logistic regression seems a more "generalized"
>> and hence more useful algo to be factored in... At least in the
>> use cases that I've been toying with.
>>
>> So I'd like to help out with that if wanted...
>>
>>> On Feb 9, 2017, at 3:59 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>>>
>>> Trevor et al,
>>>
>>> I'd like to contribute an algorithm or two in Samsara using Spark, as I
>>> would like to compare and contrast Mahout with R Server for a data
>>> science pipeline and machine learning repo that I'm working on. Looking
>>> at the list of algorithms
>>> (https://mahout.apache.org/users/basics/algorithms.html), is there an
>>> algorithm for Spark that would be beneficial for the community? My use
>>> cases would typically be around clustering or real-time machine learning
>>> for building recommendations on the fly. The algorithms I see that could
>>> potentially be useful are:
>>> 1) Matrix Factorization with ALS
>>> 2) Logistic regression with SVD.
>>>
>>> Any thoughts/guidance or recommendations would be very helpful.
>>> Thanks in advance.