Hello Trevor,

The Apache Software Foundation has been accepted as a mentoring organization for GSoC '17. Congratulations on that!
When you mentioned GLM, I assume you were referring to this issue: https://issues.apache.org/jira/browse/MAHOUT-1941 ? I will get in touch with Saikat through the mailing list.

Also, I've been giving some thought to implementing a new algorithm. I am familiar with most data mining algorithms (Apriori, decision trees, various types of clustering algorithms) as well as basic machine learning algorithms such as regression and neural networks.

One clarification regarding new algorithm development: is there a plan to incorporate new algorithms (other than the Generalized Linear Model) in the near future, or to make any improvements / optimizations to existing algorithms?

Regards,
Aditya

On Sun, Feb 26, 2017 at 5:03 AM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:

> You are correct, Java and Scala are very similar. In fact you can import
> Java into Scala (and theoretically the other way around too).
>
> Documentation... yea- that would be great... I'm trying to get the website
> migrated from CMS (current system) to a Jekyll based system similar to
> Zeppelin or Flink... in the meantime, everyone I think is kind of
> standing by on writing new docs. Including me.
>
> The only named algorithm in the road map right now is GLM. Saikat and Jim,
> I think, are taking a crack at that, so talk to them.
>
> You are free (and encouraged) to implement algorithms you already
> understand well.
>
> As far as ASF in GSoC, candidly I don't know for sure. I'm not super
> familiar with that initiative. You could ask on d...@community.apache.org -
> I remember seeing some chatter over there.
>
> best,
> tg
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things." -Virgil*
>
> On Sat, Feb 25, 2017 at 3:36 AM, Aditya <adityasarma...@gmail.com> wrote:
>
> > Hello Trevor,
> >
> > I have gone through the two links that you sent me.
> > Although I am not familiar with Scala, I was able to figure out that the
> > files Fitter.scala, Model.scala, UnsupervisedFitter.scala contain traits
> > (which are similar to interfaces in Java) and that
> > LinearRegressionModel.scala contains the core code for regression. I
> > wasn't able to understand specific syntactic terms like
> > *trait LinearRegressionModel[K] extends RegressionModel[K]*. What is K
> > here?
> >
> > As for my knowledge of Scala, I've never had the opportunity to learn or
> > work in it, but I got a sense that its model is similar to that of Java.
> > Having worked in Java, I could see some basic similarities between the
> > two languages' models. I've read that Scala is a language where OOP meets
> > the functional paradigm.
> >
> > Also, could you let me know where I could find the list of algorithms
> > that Mahout implements, along with their documentation, and which
> > algorithms are planned to be implemented soon?
> > The main web page only lists the names.
> >
> > Thanks,
> > Aditya
> >
> > On Thu, Feb 23, 2017 at 6:57 PM, Trevor Grant <trevor.d.gr...@gmail.com>
> > wrote:
> >
> > > Hey Aditya-
> > >
> > > First of all, welcome to the community. We'd love to have you help
> > > contribute.
> > >
> > > The new algorithms framework is certainly a 'target rich environment'.
> > >
> > > Since you already are familiar with DBSCAN, why not start there?
> > >
> > > If you check out:
> > > https://github.com/apache/mahout/tree/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms
> > >
> > > You'll see in general what our framework looks like..
> > >
> > > You'll need to create a ClassificationModel trait similar to:
> > > https://github.com/apache/mahout/blob/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms/regression/RegressorModel.scala
> > >
> > > Then you'll extend the ClassificationModel with DBSCAN (or possibly
> > > some intermediate trait, as LinearRegressionModel does before OLS).
> > >
> > > Perhaps I should have started by asking- how well do you know Scala?
> > >
> > > Anyway, those are good places to get started! Let me know if I can
> > > help.
> > >
> > > tg
> > >
> > > Trevor Grant
> > > Data Scientist
> > > https://github.com/rawkintrevo
> > > http://stackexchange.com/users/3002022/rawkintrevo
> > > http://trevorgrant.org
> > >
> > > *"Fortunate is he, who is able to know the causes of things." -Virgil*
> > >
> > > On Wed, Feb 22, 2017 at 4:23 PM, Aditya <adityasarma...@gmail.com>
> > > wrote:
> > >
> > > > Hello everyone!
> > > >
> > > > I'm a senior year computer science student from Birla Institute of
> > > > Technology and Science, India. I have experience in fields like Data
> > > > Mining and Machine Learning. Apart from doing basic coursework, which
> > > > included Data Mining, Parallel Computing, and Machine Learning, I have
> > > > also worked on research projects where I built scalable DBSCAN-like
> > > > clustering algorithms.
> > > >
> > > > I have gone through the Apache Mahout website and was wondering if I
> > > > could *contribute to Mahout in terms of algorithm development /
> > > > improving existing algorithms.*
> > > >
> > > > I would be grateful if you could provide me with a starting point
> > > > from where I can pick up and understand the Mahout ecosystem.
> > > > I have no previous experience in working with Apache Mahout or Spark
> > > > but I have worked with the MapReduce model before (but haven't used
> > > > Hadoop).
> > > >
> > > > I wish to work full time during summer and take part in the Google
> > > > Summer of Code 2017 program by contributing to Apache Mahout.
> > > >
> > > > Awaiting your replies!
> > > >
> > > > Cheers!
> > > > Aditya
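
On the question raised in this thread about `K` in *trait LinearRegressionModel[K] extends RegressionModel[K]*: `K` is a type parameter, comparable to `<K>` on a generic Java interface. The sketch below is a minimal, self-contained illustration of that idea and of a trait being extended by a concrete model, along the lines suggested for a ClassificationModel. It is not Mahout code: `Model`, `ClassificationModel`, `ThresholdModel`, and the `classify` method are hypothetical stand-ins, and treating `K` as the type of a row key is an assumption made for this example.

```scala
// Toy stand-ins only; these are NOT Mahout's actual traits or signatures.

// K is a type parameter, like <K> on a generic Java interface.
// In this sketch it is assumed to be the type of the key labelling each row.
trait Model[K] {
  def summary: String
}

// A trait can extend another trait and pass the type parameter along,
// just as `LinearRegressionModel[K] extends RegressionModel[K]` does.
trait ClassificationModel[K] extends Model[K] {
  // Map each keyed row of features to a class label.
  def classify(rows: Map[K, Vector[Double]]): Map[K, Int]
}

// A trivial concrete model: label 1 when the feature sum exceeds a threshold.
class ThresholdModel[K](threshold: Double) extends ClassificationModel[K] {
  def summary: String = s"ThresholdModel(threshold=$threshold)"

  def classify(rows: Map[K, Vector[Double]]): Map[K, Int] =
    rows.map { case (key, features) =>
      key -> (if (features.sum > threshold) 1 else 0)
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The same model code works with String keys...
    val byName = new ThresholdModel[String](1.0)
      .classify(Map("a" -> Vector(0.2, 0.3), "b" -> Vector(0.9, 0.8)))
    // ...and with Int keys, because the caller chooses K.
    val byIndex = new ThresholdModel[Int](1.0)
      .classify(Map(0 -> Vector(2.0), 1 -> Vector(0.5)))
    println(byName)
    println(byIndex)
  }
}
```

The point of the type parameter is that the model logic is written once and the caller decides what the keys are; a real algorithm would replace the threshold rule, and the actual trait definitions in the linked repository remain the authority on what the framework expects.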