Re: Any plans for new clustering algorithms?

2014-04-22 Thread Sandy Ryza
Thanks Matei. I added a section to the How to contribute page. On Mon, Apr 21, 2014 at 7:25 PM, Matei Zaharia matei.zaha...@gmail.com wrote: The wiki is actually maintained separately in https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage. We restricted editing of the wiki because bots

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Evan R. Sparks
...@gmail.com wrote: Hi, Spark developers. Are there any plans for implementing new clustering algorithms in MLlib? As far as I understand, the current version of Spark ships with only one clustering algorithm - K-Means. I want to contribute to Spark and I'm thinking of adding more clustering

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Paul Brown
| London On Mon, Apr 21, 2014 at 4:39 PM, Aliaksei Litouka aliaksei.lito...@gmail.com wrote: Hi, Spark developers. Are there any plans for implementing new clustering algorithms in MLlib? As far as I understand, the current version of Spark ships with only one clustering algorithm - K-Means

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Sean Owen
On Mon, Apr 21, 2014 at 6:03 PM, Paul Brown p...@mult.ifario.us wrote: - MLlib as Mahout.next would be unfortunate. There are some gems in Mahout, but there are also lots of rocks. Setting a minimal bar of working, correctly implemented, and documented requires a surprising amount of work.

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Xiangrui Meng
+1 on Sean's comment. MLlib covers the basic algorithms, but we definitely need to spend more time on how to make the design scalable. For example, think about the current ProblemWithAlgorithm naming scheme. That being said, new algorithms are welcome. I hope they are well-established and

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Nick Pentreath
I'd say a section in the how to contribute page would be a good place to put this. In general I'd say that the criteria for inclusion of an algorithm are that it should be high quality, widely known, used and accepted (citations and concrete use cases as examples of this), scalable and

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Xiangrui Meng
Cannot agree more with your words. Could you add one section about how and what to contribute to MLlib's guide? -Xiangrui On Mon, Apr 21, 2014 at 1:41 PM, Nick Pentreath nick.pentre...@gmail.com wrote: I'd say a section in the how to contribute page would be a good place to put this. In

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Sandy Ryza
How do I get permissions to edit the wiki? On Mon, Apr 21, 2014 at 3:19 PM, Xiangrui Meng men...@gmail.com wrote: Cannot agree more with your words. Could you add one section about how and what to contribute to MLlib's guide? -Xiangrui On Mon, Apr 21, 2014 at 1:41 PM, Nick Pentreath

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Xiangrui Meng
The markdown files are under spark/docs. You can submit a PR for changes. -Xiangrui On Mon, Apr 21, 2014 at 6:01 PM, Sandy Ryza sandy.r...@cloudera.com wrote: How do I get permissions to edit the wiki? On Mon, Apr 21, 2014 at 3:19 PM, Xiangrui Meng men...@gmail.com wrote: Cannot agree more

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Sandy Ryza
I thought this might be a good thing to add to the wiki's How to contribute page (https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark), as it's not tied to a release. On Mon, Apr 21, 2014 at 6:09 PM, Xiangrui Meng men...@gmail.com wrote: The markdown files are under

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Nan Zhu
I thought those were the files for spark.apache.org? -- Nan Zhu On Monday, April 21, 2014 at 9:09 PM, Xiangrui Meng wrote: The markdown files are under spark/docs. You can submit a PR for changes. -Xiangrui On Mon, Apr 21, 2014 at 6:01 PM, Sandy Ryza sandy.r...@cloudera.com

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Matei Zaharia
The wiki is actually maintained separately in https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage. We restricted editing of the wiki because bots would automatically add stuff. I’ve given you permissions now. Matei On Apr 21, 2014, at 6:22 PM, Nan Zhu zhunanmcg...@gmail.com wrote: