[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-12 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540268#comment-14540268
 ] 

Apache Spark commented on SPARK-6258:
-

User 'yanboliang' has created a pull request for this issue:
https://github.com/apache/spark/pull/6087

 Python MLlib API missing items: Clustering
 --

 Key: SPARK-6258
 URL: https://issues.apache.org/jira/browse/SPARK-6258
 Project: Spark
 Issue Type: Sub-task
 Components: MLlib, PySpark
 Affects Versions: 1.3.0
 Reporter: Joseph K. Bradley

 This JIRA lists items missing in the Python API for this sub-package of MLlib.
 This list may be incomplete, so please check again when sending a PR to add 
 these features to the Python API.
 Also, please check for major disparities between documentation; some parts of 
 the Python API are less well-documented than their Scala counterparts.  Some 
 items may be listed in the umbrella JIRA linked to this task.
 KMeans
 * setEpsilon
 * setInitializationSteps
 KMeansModel
 * computeCost
 * k
 GaussianMixture
 * setInitialModel
 GaussianMixtureModel
 * k
 Completely missing items which should be fixed in separate JIRAs (which have 
 been created and linked to the umbrella JIRA)
 * LDA
 * PowerIterationClustering
 * StreamingKMeans
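
For readers unfamiliar with the two KMeansModel items listed above: in the Scala API, computeCost returns the sum of squared distances from each point to its nearest cluster center, and k is the number of clusters. The following is a minimal plain-Python sketch of those semantics only. It is not the PySpark implementation, and the function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def compute_cost(points, centers):
    """Sum, over all points, of the squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Illustrative data, not taken from the issue.
centers = [[0.0, 0.0], [10.0, 10.0]]
points = [[0.0, 1.0], [1.0, 0.0], [10.0, 12.0]]

cost = compute_cost(points, centers)  # 1.0 + 1.0 + 4.0
k = len(centers)                      # number of clusters
```

A Python wrapper for these items would presumably delegate to the Scala model rather than recompute anything, so this only shows what the returned values mean.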



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Hrishikesh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530438#comment-14530438
 ] 

Hrishikesh commented on SPARK-6258:
---

[~yanboliang], you can start working on it.




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Yanbo Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530434#comment-14530434
 ] 

Yanbo Liang commented on SPARK-6258:


[~hrishikesh] Are you still working on this issue? If you are not, I can take 
it. [~josephkb]




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531130#comment-14531130
 ] 

Joseph K. Bradley commented on SPARK-6258:
--

[~yanboliang]  That will be great--thanks!




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-04-28 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517470#comment-14517470
 ] 

Joseph K. Bradley commented on SPARK-6258:
--

About a question asked offline:
{quote}How can you pass the GaussianMixtureModel object to the 
trainGaussianMixture method in PythonMLLibAPI.scala?{quote}
It's better to pass simple objects such as native types (float, int, etc.) or 
basic data structures (arrays, etc.).  For this task, only parameters need to 
be passed, which can be done following the many other examples in 
PythonMLLibAPI.scala.  If you had to pass a complex object, it would be best to 
deconstruct it into simple types.
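
The advice above about deconstructing a complex object can be sketched in plain Python. This is only an illustration under stated assumptions: the `Gaussian` and `MixtureModel` classes and the `deconstruct`/`reconstruct` helpers are hypothetical stand-ins, not Spark classes or the code in PythonMLLibAPI.scala. The idea is that weights, means, and flattened covariance matrices are all lists of floats, which are easy to hand across a language boundary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    """Hypothetical stand-in for one mixture component (not a Spark class)."""
    mean: List[float]
    cov: List[List[float]]  # square covariance matrix, row by row


@dataclass
class MixtureModel:
    """Hypothetical stand-in for a Gaussian mixture model (not a Spark class)."""
    weights: List[float]
    gaussians: List[Gaussian]


def deconstruct(model):
    """Split a model into simple types: lists of floats and lists of float lists."""
    weights = [float(w) for w in model.weights]
    means = [[float(x) for x in g.mean] for g in model.gaussians]
    # Flatten each covariance matrix row by row into a single list of floats.
    covs = [[float(x) for row in g.cov for x in row] for g in model.gaussians]
    return weights, means, covs


def reconstruct(weights, means, covs):
    """Rebuild the model on the receiving side from the simple pieces."""
    gaussians = []
    for mean, flat in zip(means, covs):
        d = len(mean)
        # Regroup the flat list into d rows of d values each.
        cov = [flat[i * d:(i + 1) * d] for i in range(d)]
        gaussians.append(Gaussian(mean=mean, cov=cov))
    return MixtureModel(weights=weights, gaussians=gaussians)


# Illustrative two-component model in two dimensions.
model = MixtureModel(
    weights=[0.25, 0.75],
    gaussians=[
        Gaussian(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]]),
        Gaussian(mean=[3.0, 4.0], cov=[[2.0, 0.5], [0.5, 2.0]]),
    ],
)
weights, means, covs = deconstruct(model)
round_trip = reconstruct(weights, means, covs)
```

Only the three lists would need to cross the boundary; the receiving side can rebuild whatever object it needs from them.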




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-03-30 Thread Hrishikesh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386394#comment-14386394
 ] 

Hrishikesh commented on SPARK-6258:
---

Hi [~josephkb],
I am new to Spark and I would like to contribute. Could you assign this 
ticket to me?




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-03-30 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387115#comment-14387115
 ] 

Joseph K. Bradley commented on SPARK-6258:
--

[~hrishikesh], glad to hear you're interested!  I'd recommend picking off one 
of these tasks.  I just created another JIRA for part of this task, which should 
be a good one to start with: [SPARK-6612].  Does that sound good?  Also, please 
check out this guide; we try to follow these guidelines closely: 
[https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark]

If you have implementation questions, we can discuss them on GitHub after you 
send a PR.

Thanks!




[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-03-30 Thread Hrishikesh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388003#comment-14388003
 ] 

Hrishikesh commented on SPARK-6258:
---

[~josephkb] Thank you for your response and valuable suggestions! I will send 
the PR ASAP.
