Hello Trevor,

The Apache Software Foundation has been accepted as a mentoring organization
for GSoC '17. Congratulations on that!

When you mentioned GLM, were you referring to this issue?
https://issues.apache.org/jira/browse/MAHOUT-1941
I will get in touch with Saikat through the mailing list.

Also, I've been giving some thought to implementing a new algorithm. I am
familiar with most data mining algorithms (Apriori, decision trees, various
types of clustering algorithms) as well as basic machine learning
algorithms such as regression and neural networks.

One clarification regarding new algorithm development: is there a plan to
incorporate new algorithms (other than the Generalized Linear Model) in the
near future, or any improvements / optimizations to existing algorithms?

Regards,
Aditya

On Sun, Feb 26, 2017 at 5:03 AM, Trevor Grant <trevor.d.gr...@gmail.com>
wrote:

> You are correct, Java and Scala are very similar.  In fact you can import
> Java into Scala (and theoretically the other way around too).
>
> Documentation... yea, that would be great... I'm trying to get the website
> migrated from CMS (current system) to a Jekyll-based system similar to
> Zeppelin or Flink... in the meantime, I think everyone is kind of
> standing by on writing new docs.  Including me.
>
> The only named algorithm on the roadmap right now is GLM. Talk to Saikat
> and Jim, who I think are taking a crack at that.
>
> You are free (and encouraged) to implement algorithms you already
> understand well.
>
> As far as ASF in GSoC, candidly I don't know for sure.  I'm not super
> familiar with that initiative.  You could ask on d...@community.apache.org;
> I remember seeing some chatter over there.
>
> best,
> tg
>
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Sat, Feb 25, 2017 at 3:36 AM, Aditya <adityasarma...@gmail.com> wrote:
>
> > Hello Trevor,
> >
> > I have gone through the two links that you sent me. Although I am not
> > familiar with Scala, I was able to figure out that the files Fitter.scala,
> > Model.scala, and UnsupervisedFitter.scala contain traits (which are similar
> > to interfaces in Java) and that LinearRegressionModel.scala contains the
> > core code for regression. I wasn't able to understand specific syntactic
> > terms like *trait LinearRegressionModel[K] extends RegressionModel[K]*;
> > what is K here?
> >
> > As for my knowledge of Scala, I've never had the opportunity to learn or
> > work in Scala, but I got the sense that its model is similar to that of
> > Java. Having worked in Java, I could see some basic similarities in both
> > languages' models. I've read that Scala is a language where OOP meets
> > the functional paradigm.
> >
> > Also, could you let me know where I could find the list of algorithms
> > that Mahout implements, along with their documentation, and which
> > algorithms are planned to be implemented soon?
> > The main web page just lists the names.
> >
> > Thanks,
> > Aditya
> >
> >
> > On Thu, Feb 23, 2017 at 6:57 PM, Trevor Grant <trevor.d.gr...@gmail.com>
> > wrote:
> >
> > > Hey Aditya-
> > >
> > > First of all, welcome to the community.  We'd love to have you help
> > > contribute.
> > >
> > > The new algorithms framework is certainly a 'target rich environment'.
> > >
> > > Since you already are familiar with DBSCAN, why not start there?
> > >
> > > If you check out:
> > > https://github.com/apache/mahout/tree/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms
> > >
> > > You'll see in general what our framework looks like.
> > >
> > > You'll need to create a ClassificationModel trait similar to:
> > > https://github.com/apache/mahout/blob/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms/regression/RegressorModel.scala
> > >
> > > Then you'll extend the ClassificationModel with DBSCAN (or possibly
> > > some intermediate trait, as LinearRegressionModel does before OLS).
> > >
> > > Perhaps I should have started by asking: how well do you know Scala?
> > >
> > > Anyway, those are good places to get started! Let me know if I can help.
> > >
> > > tg
> > >
> > > Trevor Grant
> > > Data Scientist
> > > https://github.com/rawkintrevo
> > > http://stackexchange.com/users/3002022/rawkintrevo
> > > http://trevorgrant.org
> > >
> > > *"Fortunate is he, who is able to know the causes of things."  -Virgil*
> > >
> > >
> > > On Wed, Feb 22, 2017 at 4:23 PM, Aditya <adityasarma...@gmail.com> wrote:
> > >
> > > > Hello everyone!
> > > >
> > > > I'm a senior-year computer science student at the Birla Institute of
> > > > Technology and Science, India. I have experience in fields like data
> > > > mining and machine learning. Apart from basic coursework, which
> > > > included Data Mining, Parallel Computing, and Machine Learning, I have
> > > > also worked on research projects building scalable DBSCAN-like
> > > > clustering algorithms.
> > > >
> > > > I have gone through the Apache Mahout website and was wondering if I
> > > > could *contribute to Mahout in terms of algorithm development /
> > > > improving existing algorithms.*
> > > >
> > > > I would be grateful if you could provide me with a starting point from
> > > > which I can pick up and understand the Mahout ecosystem. I have no
> > > > previous experience working with Apache Mahout or Spark, but I have
> > > > worked with the MapReduce model before (though I haven't used Hadoop).
> > > >
> > > > I wish to work full time during the summer and take part in the Google
> > > > Summer of Code 2017 program by contributing to Apache Mahout.
> > > >
> > > >
> > > > Awaiting your replies!
> > > >
> > > > Cheers!
> > > > Aditya
> > > >
> > >
> >
>
