RE: Mahout contributions

Saikat Kanjilal Thu, 28 Apr 2016 09:09:18 -0700

This is great information, thank you. Based on this recommendation I won't 
create a JIRA yet; I will start work on my project, and when the code reaches 
the completion percentage you describe I will create the appropriate JIRAs and 
put together a proposal to send to the list. Does that sound OK?  Based on your 
latest updates to the wiki, I will work on a handful of the clustering 
algorithms, since I see that the Spark implementations for these are not yet 
complete.
Thank you again


> From: [email protected]
> To: [email protected]
> Subject: Re: Mahout contributions
> Date: Thu, 28 Apr 2016 01:31:09 +0000
> 
> Saikat, 
> 
> One other thing that I should say is that you do not need clearance or input 
> from the committers to begin work on your project, and the interest can and 
> should come from the community as a whole. You can write a proposal as you've 
> done, and if you don't see any "+1"s or responses from the community as a 
> whole within a few days, you may want to explain in more detail and give 
> examples and use cases.  If you are still not seeing +1s or any responses from 
> others, then I think you can assume that there may not be interest; this is 
> usually how things work.  
> 
> However, if it's something that you're passionate about and you feel like you 
> can deliver, this should not stop you.  People do not always read the dev@ 
> emails or have time to respond.  You can still move forward with your 
> proposed contribution by following the steps laid out in my previous email; 
> follow the protocol at:
>  
> http://mahout.apache.org/developers/how-to-contribute.html
> 
> and create a JIRA.  When you have reached a significant amount of completion 
> (around 70-80%), open a PR for review; this way you can explain in more 
> detail. 
> 
> But please realize that when you open a JIRA for a new issue there is some 
> expectation of a commitment on your part to complete it. 
> 
> For example, I am currently investigating some new plotting features.  I have 
> spent a good deal of time this week and last already and am even mocking up 
> code as a sketch of what may become an implementation before I open a "New 
> Feature" JIRA for it.    
> 
> My point is absolutely not to discourage you or anybody else from opening 
> JIRAs for new features, but rather to let you know that when you open a JIRA 
> for a new issue, it tells others that you are working on it, and thus may 
> discourage someone else with a similar idea from contributing this feature.  
> So it is best to open it once you've begun your work and are committed to it.
>   
> Andy
> 
> ________________________________________
> From: Saikat Kanjilal <[email protected]>
> Sent: Wednesday, April 27, 2016 8:24 PM
> To: [email protected]
> Subject: RE: Mahout contributions
> 
> Andrew, thank you very much for your input. I actually want to start a new 
> set of JIRAs. Here's what I want to work on: I want to build a framework that 
> ties together search/visualization capability with some machine learning 
> algorithms. Essentially, think of it as tying Elasticsearch and Kibana into 
> Mahout: the user can search for their data with Elasticsearch, and for deeper 
> analysis they can feed that data into one or more Mahout backends.  Another 
> interesting tie-in might be to hack Kibana to render ggplot-like graphics 
> based on the output of Mahout algorithms (assuming this can be a Kibana 
> plugin).
> Before I go hog wild creating a bunch of JIRAs, I'd like to know if there's 
> interest in this initiative.  The tool will bring together the ELK stack with 
> dynamic machine learning algorithms.  I can go into a lot more detail around 
> use cases if there's enough interest.
> Looking forward to your and other committers' input. Thanks
> 
> > From: [email protected]
> > To: [email protected]
> > Subject: Re: Mahout contributions
> > Date: Wed, 27 Apr 2016 20:16:38 +0000
> >
> > Hello Saikat,
> >
> > #1 and #2 above are already implemented.  #4 is tricky, so I would not 
> > recommend it without a strong knowledge of the codebase, and #5 is now 
> > deprecated.  (I've just updated the algorithms grid to reflect this.)  The 
> > algorithms page includes both algorithms implemented in the math-scala 
> > library and algorithms which have CLI drivers written for them.
> >
> > Please see: http://mahout.apache.org/developers/how-to-contribute.html
> >
> > And please note that per that documentation, it is in everybody's best 
> > interest to keep messages on list; contacting committers directly is 
> > discouraged.
> >
> > The best way to contribute (if you have not found a new bug or issue) would 
> > be for you to pick a single open issue in the Mahout JIRA which is not 
> > already assigned, and start work on it.  When your work is ready for 
> > review, just open up a PR and the committers will review it.  Please note 
> > that if you do pick up an issue to work on, we do expect some amount of 
> > responsibility and reliability and a tangible amount of satisfactory work, 
> > since once you've marked a JIRA as something you're working on, others will 
> > pass on it.
> >
> > Another good way to contribute would be to look for enhancements that you 
> > could make to existing code, not necessarily open JIRAs that need to be 
> > assigned to you.  For example, please see the recent contribution and 
> > workflow on: https://issues.apache.org/jira/browse/MAHOUT-1833 .
> >
> > If you have something new that you'd like to implement, simply start a new 
> > JIRA issue and begin work on it.  In this case, when you have some code 
> > that is ready for review,  you can simply open up a PR for it and 
> > committers will review it.  For new implementations, we generally say that 
> > you should do this when you are at least 70-80% finished with your coding.
> >
> > Thank You,
> >
> > Andy
> >
> >
> >
> > ________________________________________
> > From: Saikat Kanjilal <[email protected]>
> > Sent: Tuesday, April 26, 2016 7:17 PM
> > To: [email protected]
> > Subject: RE: Mahout contributions
> >
> > Hello, following up on my last email with more specifics: I've looked 
> > through the wiki (https://mahout.apache.org/users/basics/algorithms.html) 
> > and I'm interested in implementing one or more of the following 
> > algorithms with Mahout using Spark:
> > 1) Matrix Factorization with ALS
> > 2) Naive Bayes
> > 3) Weighted Matrix Factorization, SVD++
> > 4) Sparse TF-IDF Vectors from Text
> > 5) Lucene integration
> > I had a few questions:
> > 1) Which of these should I start with, and where is there the greatest 
> > need?
> > 2) Should I fork the repo and create branches for each of the above 
> > implementations?
> > 3) Should I go ahead and create some JIRAs for these?
> > I would love to have some pointers to get started. Regards
> >
> > From: [email protected]
> > To: [email protected]
> > Subject: Mahout contributions
> > Date: Wed, 30 Mar 2016 10:23:45 -0700
> >
> >
> >
> >
> > Hello Committers, I was looking through the current JIRA tickets and was 
> > wondering if there's a particular area of Mahout that needs more help 
> > than others. Should I focus on contributing some algorithms using the DSL 
> > or on Samsara-related efforts? I've finally got some bandwidth to do some 
> > work and would love some guidance before assigning myself some tickets. 
> > Regards
