Thanks, this helps. I hope to send a proposal to dev@ outlining some use cases in the next few weeks.
> From: ap....@outlook.com
> To: dev@mahout.apache.org
> Subject: Re: Mahout contributions
> Date: Fri, 29 Apr 2016 00:03:41 +0000
>
> One last thing, Saikat, in answer to your question below. To clarify, for
> proposed smaller scale Mahout contributions (not on the roadmap or in
> currently open JIRAs), a good workflow would be as follows:
>
> 1. Investigate your idea independently.
> 2. Float the proposal to dev@.
> 3. Allow some time for feedback.
> 4. Sketch out the problem independently.
> 5. If you decide to go on with your work, create a JIRA.
> 6. Begin work.
> 7. When you're 70%-80% (or even 100%) finished with your work, open a PR for
>    review.
>
> I only mention this as it seems better to open the JIRA _before_ you begin
> your work rather than after, as you mention below. It would also probably
> be best not to open multiple JIRAs.
>
> Also you might want to take a look at:
> http://www.apache.org/foundation/voting.html
>
> These are ways that people can vote and give feedback, as well as rules for
> committers voting in finished code.
>
> I think that should cover it.
>
> Andy
>
> ________________________________________
> From: Saikat Kanjilal <sxk1...@hotmail.com>
> Sent: Thursday, April 28, 2016 12:08 PM
> To: dev@mahout.apache.org
> Subject: RE: Mahout contributions
>
> This is great information, thank you. Based on this recommendation I won't
> create a JIRA but start work on my project, and when the code approaches the
> percentages you are describing I will create the appropriate JIRAs and put
> together a proposal to send to the list, sound ok? Based on your latest
> updates to the wiki I will work on a handful of the clustering algorithms,
> since I see that the Spark implementations for these are not yet complete.
> Thank you again
>
> > From: ap....@outlook.com
> > To: dev@mahout.apache.org
> > Subject: Re: Mahout contributions
> > Date: Thu, 28 Apr 2016 01:31:09 +0000
> >
> > Saikat,
> >
> > One other thing that I should say is that you do not need clearance or
> > input from the committers to begin work on your project, and the interest
> > can and should come from the community as a whole. You can write a proposal
> > as you've done, and if you don't see any "+1"s or responses from the
> > community as a whole within a few days, you may want to explain in more
> > detail, give examples and use cases. If you are still not seeing +1s or
> > any responses from others then I think you can assume that there may not be
> > interest; this is usually how things work.
> >
> > However, if it's something that you're passionate about and you feel like
> > you can deliver, this should not stop you. People do not always read the
> > dev@ emails or have time to respond. You can still move forward with your
> > proposed contribution by following the steps laid out in my previous email;
> > follow the protocol at:
> >
> > http://mahout.apache.org/developers/how-to-contribute.html
> >
> > and create a JIRA. When you have reached a significant amount of
> > completion (around 70-80%), open a PR for review; this way you can explain
> > in more detail.
> >
> > But please realize that when you open a JIRA for a new issue there is some
> > expectation of a commitment on your part to complete it.
> >
> > For example, I am currently investigating some new plotting features. I
> > have spent a good deal of time this week and last already and am even
> > mocking up code as a sketch of what may become an implementation before I
> > open a "New Feature" JIRA for it.
> >
> > My point is absolutely not to discourage you or anybody else from opening
> > JIRAs for new features, rather to let you know that when you open a JIRA
> > for a new issue, it tells others that you are working on it, and thus may
> > discourage another with a similar idea from contributing this feature. So
> > it is best to open it once you've begun your work and are committed to it.
> >
> > Andy
> >
> > ________________________________________
> > From: Saikat Kanjilal <sxk1...@hotmail.com>
> > Sent: Wednesday, April 27, 2016 8:24 PM
> > To: dev@mahout.apache.org
> > Subject: RE: Mahout contributions
> >
> > Andrew,
> > Thank you very much for your input. I actually want to start a new
> > set of JIRAs. Here's what I want to work on: I want to build a framework
> > that ties together search/visualization capability with some machine
> > learning algorithms, so essentially think of it as tying Elasticsearch
> > and Kibana into Mahout. The user can search for their data with
> > Elasticsearch, and for deeper analysis they can feed that data
> > into one or more Mahout backends. Another interesting tie-in
> > might be to hack Kibana to render ggplot-like graphics based on the output
> > of Mahout algorithms (assuming this can be a Kibana plugin).
> > Before I go hog wild creating a bunch of JIRAs I'd like to know if
> > there's interest in this initiative. The tool will bring together the ELK
> > stack with dynamic machine learning algorithms. I can go into a lot more
> > detail around use cases if there's enough interest.
> > Looking forward to your and other committers' input.
> > Thanks
> >
> > > From: ap....@outlook.com
> > > To: dev@mahout.apache.org
> > > Subject: Re: Mahout contributions
> > > Date: Wed, 27 Apr 2016 20:16:38 +0000
> > >
> > > Hello Saikat,
> > >
> > > #1 and #2 above are already implemented. #4 is tricky so I would not
> > > recommend it without a strong knowledge of the codebase, and #5 is now
> > > deprecated.
> > > (I've just updated the algorithms grid to reflect this).
> > > The algorithms page includes both algorithms implemented in the
> > > math-scala library and algorithms which have CLI drivers written for them.
> > >
> > > Please see: http://mahout.apache.org/developers/how-to-contribute.html
> > >
> > > And please note that per that documentation, it is in everybody's best
> > > interest to keep messages on list; contacting committers directly is
> > > discouraged.
> > >
> > > The best way to contribute (if you have not found a new bug or issue)
> > > would be for you to pick a single open issue in the Mahout JIRA which is
> > > not already assigned, and start work on it. When your work is ready for
> > > review, just open up a PR and the committers will review it. Please note
> > > that if you do pick up an issue to work on, we do expect some amount of
> > > responsibility and reliability and a tangible amount of satisfactory work,
> > > since once you've marked a JIRA as something you're working on, others
> > > will pass on it.
> > >
> > > Another good way to contribute would be to look for enhancements that
> > > you could make to existing code, not necessarily open JIRAs that need to
> > > be assigned to you. For example please see the recent contribution and
> > > workflow on: https://issues.apache.org/jira/browse/MAHOUT-1833 .
> > >
> > > If you have something new that you'd like to implement, simply start a
> > > new JIRA issue and begin work on it. In this case, when you have some
> > > code that is ready for review, you can simply open up a PR for it and
> > > committers will review it. For new implementations, we generally say
> > > that you should do this when you are at least 70-80% finished with your
> > > coding.
> > >
> > > Thank You,
> > >
> > > Andy
> > >
> > > ________________________________________
> > > From: Saikat Kanjilal <sxk1...@hotmail.com>
> > > Sent: Tuesday, April 26, 2016 7:17 PM
> > > To: dev@mahout.apache.org
> > > Subject: RE: Mahout contributions
> > >
> > > Hello,
> > > Following up on my last email with more specifics, I've looked
> > > through the wiki (https://mahout.apache.org/users/basics/algorithms.html)
> > > and I'm interested in implementing one or more of the following
> > > algorithms with Mahout using Spark:
> > > 1) Matrix Factorization with ALS
> > > 2) Naive Bayes
> > > 3) Weighted Matrix Factorization, SVD++
> > > 4) Sparse TF-IDF Vectors from Text
> > > 5) Lucene integration
> > > Had a few questions:
> > > 1) Which of these should I start with and where is there the greatest
> > >    need?
> > > 2) Should I fork the repo and create branches for each of the above
> > >    implementations?
> > > 3) Should I go ahead and create some JIRAs for these?
> > > Would love to have some pointers to get started.
> > > Regards
> > >
> > > From: sxk1...@hotmail.com
> > > To: dev@mahout.apache.org
> > > Subject: Mahout contributions
> > > Date: Wed, 30 Mar 2016 10:23:45 -0700
> > >
> > > Hello Committers,
> > > I was looking through the current JIRA tickets and was
> > > wondering if there's a particular area of Mahout that needs some more
> > > help than others. Should I focus on contributing some algorithms using
> > > the DSL or on Samsara-related efforts? I've finally got some bandwidth to
> > > do some work and would love some guidance before assigning myself some
> > > tickets.
> > > Regards