subject:"Contribute code to MLlib"

Re: Contribute code to MLlib

2015-05-21 Thread Trevor Grant

Thank you Ram and Joseph. I am also hoping to contribute to MLlib once my Scala gets up to snuff; this is the guidance I needed for how to proceed when ready. Best wishes, Trevor On Wed, May 20, 2015 at 1:55 PM, Joseph Bradley jos...@databricks.com wrote: Hi Trevor, I may be repeating what

Re: Contribute code to MLlib

2015-05-20 Thread Trevor Grant

Hey Ram, I'm not speaking to Tarek's package specifically but to the spirit of MLlib. There are a number of methods/algorithms for PCA; I'm not sure by what criterion the current one is considered 'standard'. It is rare to find ANY machine learning algo that is 'clearly better' than any other.

Re: Contribute code to MLlib

2015-05-20 Thread Ram Sriharsha

Hi Trevor, Good point. I didn't mean that some algorithm has to be clearly better than another in every scenario to be included in MLlib. However, even if someone is willing to be the maintainer of a piece of code, it does not make sense to accept every possible algorithm into the core library.

Re: Contribute code to MLlib

2015-05-20 Thread Ram Sriharsha

Hi Trevor, I'm attaching the MLlib contribution guidelines here: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-MLlib-specificContributionGuidelines It speaks to widely known and accepted algorithms but not to whether an algorithm has to be better than

Re: Contribute code to MLlib

2015-05-20 Thread Joseph Bradley

Hi Trevor, I may be repeating what Ram said, but to second it, a few points: We do want MLlib to become an extensive and rich ML library; as you said, scikit-learn is a great example. To make that happen, we of course need to include important algorithms. "Important" is hazy, but roughly means

Re: Contribute code to MLlib

2015-05-19 Thread Trevor Grant

There are most likely advantages and disadvantages to Tarek's algorithm against the current implementation, and different scenarios where each is more appropriate. Would we not offer multiple PCA algorithms and let the user choose? Trevor Trevor Grant Data Scientist *Fortunate is he, who is

Re: Contribute code to MLlib

2015-05-18 Thread Joseph Bradley

Hi Tarek, Thanks for your interest and for checking the guidelines first! On 2 points: Algorithm: PCA is of course a critical algorithm. The main question is how your algorithm/implementation differs from the current PCA. If it's different and potentially better, I'd recommend opening up a JIRA

Contribute code to MLlib

2015-05-18 Thread Tarek Elgamal

Hi, I would like to contribute an algorithm to the MLlib project. I have implemented a scalable PCA algorithm on Spark. It is scalable for both tall and fat matrices, and the paper describing it has been accepted for publication at the SIGMOD 2015 conference. I looked at the guidelines in the following link:

8 matches
