Hello Saikat,

#1 and #2 above are already implemented. #4 is tricky, so I would not recommend it without a strong knowledge of the codebase, and #5 is now deprecated (I've just updated the algorithms grid to reflect this). The algorithms page includes both algorithms implemented in the math-scala library and algorithms which have CLI drivers written for them.
Please see: http://mahout.apache.org/developers/how-to-contribute.html

Please note that, per that documentation, it is in everybody's best interest to keep messages on list; contacting committers directly is discouraged.

The best way to contribute (if you have not found a new bug or issue) would be to pick a single open issue in the Mahout JIRA which is not already assigned, and start work on it. When your work is ready for review, just open up a PR and the committers will review it. Please note that if you do pick up an issue to work on, we do expect some amount of responsibility and reliability, and a tangible amount of satisfactory work, since once you've marked a JIRA as something you're working on, others will pass on it.

Another good way to contribute would be to look for enhancements that could be made to existing code, not necessarily open JIRAs that need to be assigned to you. For example, please see the recent contribution and workflow on: https://issues.apache.org/jira/browse/MAHOUT-1833 .

If you have something new that you'd like to implement, simply start a new JIRA issue and begin work on it. In this case, when you have some code that is ready for review, you can simply open up a PR for it and committers will review it. For new implementations, we generally say that you should do this when you are at least 70-80% finished with your coding.

Thank You,
Andy

________________________________________
From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Tuesday, April 26, 2016 7:17 PM
To: dev@mahout.apache.org
Subject: RE: Mahout contributions

Hello,

Following up on my last email with more specifics, I've looked through the wiki (https://mahout.apache.org/users/basics/algorithms.html) and I'm interested in implementing one or more of the following algorithms with Mahout using Spark:

1) Matrix Factorization with ALS
2) Naive Bayes
3) Weighted Matrix Factorization, SVD++
4) Sparse TF-IDF Vectors from Text
5) Lucene integration
Had a few questions:

1) Which of these should I start with, and where is the greatest need?
2) Should I fork the repo and create branches for each of the above implementations?
3) Should I go ahead and create some JIRAs for these?

Would love to have some pointers to get started.

Regards

From: sxk1...@hotmail.com
To: dev@mahout.apache.org
Subject: Mahout contributions
Date: Wed, 30 Mar 2016 10:23:45 -0700

Hello Committers,

I was looking through the current JIRA tickets and was wondering if there's a particular area of Mahout that needs more help than others. Should I focus on contributing some algorithms using the DSL, or on Samsara-related efforts? I've finally got some bandwidth to do some work and would love some guidance before assigning myself some tickets.

Regards