[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-12-31 Thread FlytxtRnD
Github user FlytxtRnD closed the pull request at: https://github.com/apache/spark/pull/7320 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-12-31 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/7320#issuecomment-168222327 @jkbradley Thank you for the review. I will close this issue and will surely help in testing the other PR. Also it would be great if you could take a look at #6880

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-11-13 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-156399657 @yu-iskw @jkbradley any other review comments, please.

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-11-03 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-153564736 @yu-iskw I didn't get your comment on @Since tags. We will be waiting for further review comments.

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-11-03 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-153329563 @yu-iskw PR is updated. Shall I add @since to the methods? Or is it done after getting merged? Please provide any other suggestions.

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-11-02 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r43622644 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala --- @@ -0,0 +1,279 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-11-01 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-152919710 Thank you @yu-iskw for the review comments. Will update the PR ASAP.

[GitHub] spark pull request: [SPARK-6724] [MLlib] Support model save/load f...

2015-10-26 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/9267#discussion_r42964572 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala --- @@ -20,17 +20,28 @@ package org.apache.spark.mllib.fpm import java.{util

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-10-19 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-149156272 @mengxr Could you please have a look at this?

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-09-27 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-143643138 @mengxr @jkbradley I have incorporated the suggestions and changes and updated the PR. Could you please take another look?

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-09-17 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r39827805 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala --- @@ -0,0 +1,247 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-09-11 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r39255547 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala --- @@ -0,0 +1,247 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-09-10 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-139170303 @mengxr Thank you for all the suggestions. Will update soon.

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-09-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-136956338 @mengxr We have updated the JIRA ticket to include the benchmark results as well. Could you please take a look and give your suggestions?

[GitHub] spark pull request: [SPARK-6517][mllib] Implement the Algorithm of...

2015-07-28 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/5267#discussion_r35728733 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala --- @@ -0,0 +1,645 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-6517][mllib] Implement the Algorithm of...

2015-07-28 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/5267#discussion_r35728653 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala --- @@ -0,0 +1,645 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-07-28 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-125838477 Thank you @mengxr. We will take a look at the PR you mentioned. We are looking forward to having DP-Means in the 1.6 release. Thanks a lot for your kind support

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-07-28 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-125571416 @jkbradley To generate docs, I installed jekyll. The jekyll build command shows an error. `[info] Done updating. [error] (catalyst/compile:compile

[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-07-23 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/7320#issuecomment-124063686 @jkbradley Please have a look at the code. I guess the save function is ok. Now I am facing issues with Loader. Can you please help me with passing the type parameter

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-07-19 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-122671172 @mengxr @jkbradley Gentle reminder.

[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-07-15 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/7320#issuecomment-121588925 @jkbradley We will give it a try and let you know if we are not able to move forward.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-14 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-121505561 @jkbradley Thank you so much for your help and co-operation.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-14 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-121478720 @jkbradley Please merge if everything seems to be fine.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-13 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-120845075 @jkbradley The style error reported above has already been fixed in the previous commit. Is there any other style issue that has to be resolved?

[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-07-10 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/7320#issuecomment-120359249 @jkbradley PR is updated to solve the merge issues. Please have a look at the issue.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-09 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-120220817 @jkbradley Is this PR ready for merge? Please let us know if there is anything more to do.

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-07-09 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-119924897 @mengxr Could you please tell me how to generate the API docs? I ran build/sbt unidoc as mentioned in https://github.com/apache/spark/blob/master/docs/README.md. But

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-07-09 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-119907064 @mengxr I have reduced the PR length so that it would be easier for you to review. The style issues have been fixed wherever they were observed. I will change the

[GitHub] spark pull request: [SPARK-6724][WIP][MLLIB]Model import/export fo...

2015-07-09 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/7320 [SPARK-6724][WIP][MLLIB]Model import/export for FPGrowth You can merge this pull request into a Git repository by running: $ git pull https://github.com/FlytxtRnD/spark FP Alternatively

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-08 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6737#discussion_r34222734 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -193,20 +208,33 @@ class KMeans private ( val

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-08 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-119807853 @jkbradley Sorry for repeating the style errors. I hope the documentation added is also fine.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-07-06 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-119067665 @jkbradley please review

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-30 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-117431033 @jkbradley Thank you for the review. I will make changes soon. I would like to know why pyspark unit tests are failing in this PR. It seems tests in

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-29 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-116645007 @mengxr Could you please share your comments on this PR?

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-25 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r33227267 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-23 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r33119238 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-22 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6880#discussion_r32925721 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-22 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6737#discussion_r32909692 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -193,12 +207,19 @@ class KMeans private ( val

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-19 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-113460607 I am updating the PR with the suggested changes. Only one run condition is handled by adding a `require` in setInitialModel. I will modify it based on further

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-18 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6737#discussion_r32801495 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -193,12 +207,19 @@ class KMeans private ( val

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-18 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/6880 [SPARK-8402][MLLIB] DP Means Clustering DP means is a non-parametric clustering algorithm that uses a scale parameter 'lambda' to control the creation of new clusters. This algorithm
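The PR description above is cut off, so the PR's own distributed implementation is not visible here. As a rough illustration of the idea it describes (a threshold 'lambda' deciding when a new cluster is created), here is a minimal serial sketch of DP-means in plain Python, following the commonly described formulation in which a point whose squared distance to every existing center exceeds lambda starts a new cluster. The names `dp_means`, `sq_dist`, `mean`, `lam` and `max_iter` are hypothetical and are not taken from the PR.

```python
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(points: List[Sequence[float]], lam: float,
             max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Serial DP-means sketch (assumption: squared distance is compared to lam)."""
    centers = [mean(points)]          # start with one cluster at the global mean
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            if dists[j] > lam:
                # Too far from every center: open a new cluster at this point.
                centers.append(list(p))
                j = len(centers) - 1
            if labels[i] != j:
                labels[i] = j
                changed = True
        # Recompute centers as cluster means and drop clusters left empty.
        members = {}
        for lbl, p in zip(labels, points):
            members.setdefault(lbl, []).append(p)
        kept = sorted(members)
        remap = {old: new for new, old in enumerate(kept)}
        centers = [mean(members[old]) for old in kept]
        labels = [remap[lbl] for lbl in labels]
        if not changed:
            break
    return centers, labels
```

A larger `lam` makes new clusters more expensive and so yields fewer of them; a very small `lam` tends toward one cluster per point.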

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-18 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-113063150 @jkbradley Gentle reminder.

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-16 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-112411731 @mengxr @jkbradley Could you please comment on @srowen's note above?

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-14 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111936681 Does this patch look fine to merge?

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-11 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111375429 Can somebody help me with this test failure?

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-11 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111088818 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-10 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6737#discussion_r32189526 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -156,6 +156,26 @@ class KMeans private ( this

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-10 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/6737#discussion_r32103401 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -156,6 +156,26 @@ class KMeans private ( this

[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-09 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/6737 [SPARK-8018][MLlib]KMeans should accept initial cluster centers as param This allows KMeans to be initialized using an existing set of cluster centers provided as a KMeansModel object. This
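To illustrate what starting k-means from a caller-supplied set of centers means, here is a minimal sketch in plain Python. It is not Spark's API or the PR's code: `kmeans_from_centers`, `sq_dist` and `max_iter` are hypothetical names, and the loop is ordinary Lloyd iteration that skips the usual random or k-means|| initialization step.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from_centers(points: List[Sequence[float]],
                        initial_centers: List[Sequence[float]],
                        max_iter: int = 20) -> List[List[float]]:
    """Lloyd iterations seeded with caller-supplied centers (illustrative only)."""
    centers = [list(c) for c in initial_centers]
    for _ in range(max_iter):
        # Assignment step: group points by nearest current center.
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group;
        # a center with no assigned points keeps its previous position.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else old
            for g, old in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Seeding this way is useful when centers from an earlier run are already close to a good solution, since the iteration then typically converges in few steps.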

[GitHub] spark pull request: [SPARK-7651][MLLib][PySpark] GMM predict, pred...

2015-05-17 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6180#issuecomment-102816170 @jkbradley Thank you.

[GitHub] spark pull request: [SPARK-7651][MLLib][PySpark] GMM predict, pred...

2015-05-15 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6180#issuecomment-102362019 @jkbradley Please review the PR and let me know if anything more is required.

[GitHub] spark pull request: [SPARK-7651][MLLib][PySpark] GMM predict, pred...

2015-05-15 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/6180 [SPARK-7651][MLLib][PySpark] GMM predict, predictSoft should raise error on bad input In the Python API for Gaussian Mixture Model, predict() and predictSoft() methods should raise an error when

[GitHub] spark pull request: [SPARK-6258] [MLLIB] GaussianMixture Python AP...

2015-05-13 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6087#issuecomment-101933563 @yanboliang @jkbradley Thank you for clarifying the doubt.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-05 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-99335954 @jkbradley, @mengxr Thanks for the help!

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-05 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-99016123 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-05 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-98990951 @jkbradley, ok we'll try your steps.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-05 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-98981610 "Can you please try rebasing your branch off of the current master? Perhaps that will fix it." We didn't get your point. To which branch shou

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-05 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-98981083 @jkbradley, we merged it with branch-1.4 to get rid of the failed MiMa tests in Jenkins.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-98968516 @jkbradley ok, will close and reopen this. Could you please tell us why this happened?

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
Github user FlytxtRnD closed the pull request at: https://github.com/apache/spark/pull/5647

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-04 Thread FlytxtRnD
GitHub user FlytxtRnD reopened a pull request: https://github.com/apache/spark/pull/5647 [SPARK-6612] [MLLib] [PySpark] Python KMeans parity The following items are added to Python kmeans: kmeans - setEpsilon, setInitializationSteps KMeansModel - computeCost, k You can

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-05-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-98339048 @jkbradley, could you please check if this is ready to merge?

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/5647#discussion_r29402436 --- Diff: python/pyspark/mllib/clustering.py --- @@ -40,11 +40,16 @@ class KMeansModel(Saveable, Loader): >>> data = array([0.0,0.0

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/5647#discussion_r29400538 --- Diff: python/pyspark/mllib/clustering.py --- @@ -40,11 +40,16 @@ class KMeansModel(Saveable, Loader): >>> data = array([0.0,0.0

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-27 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-96901873 @jkbradley, we are facing some issues with Python 3 support. We are working on it and will fix it ASAP.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-23 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-95791587 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-23 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5647#issuecomment-95493721 Could anyone please tell us why this timeout occurred?

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-22 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/5647 [SPARK-6612] [MLLib] [PySpark] Python KMeans parity The following items are added to Python kmeans: kmeans - setEpsilon, setInitializationSteps KMeansModel - computeCost, k You can

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-22 Thread FlytxtRnD
Github user FlytxtRnD closed the pull request at: https://github.com/apache/spark/pull/5391

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-22 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5391#issuecomment-95063943 Hi @jkbradley, @mengxr, is it ok to close this PR and create a new one for the same ticket? We messed up some commits in this branch.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-17 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/5391#issuecomment-94123703 Ok. We'll look into the issue and will update soon.

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-07 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/5391 [SPARK-6612] [MLLib] [PySpark] Python KMeans parity The following items are added to Python kmeans: * kmeans - setEpsilon, setInitializationSteps * KMeansModel - computeCost, k You can
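For readers unfamiliar with the `computeCost` item listed above: in MLlib's KMeansModel it is the k-means cost, meaning the sum of squared distances from each point to its nearest center. The sketch below shows that quantity in plain Python. It is an illustration only and not the PySpark implementation; `compute_cost` and `sq_dist` are hypothetical names.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def compute_cost(points: List[Sequence[float]],
                 centers: List[Sequence[float]]) -> float:
    """Sum over all points of the squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

A lower cost for the same number of centers indicates a tighter clustering, which is why the value is often compared across runs or across choices of `k`.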

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72604505 Thanks @mengxr for your help.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23982975 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23981473 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72458892 Please review and merge.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72445559 Is it possible to start a test build in Jenkins without updating the PR?

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72406403 So I will go with the current approach. I tried to change Array to ArrayBuffer, but it ends up throwing exceptions. So can I go with Array itself?

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72404786 Instead of passing mu & sigma as arrays, I tried to directly pass "gaussians" (Array[MultivariateGaussian]) from PythonMLLibAPI. But I was not able

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-30 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23828921 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754619 --- Diff: python/pyspark/mllib/tests.py --- @@ -167,6 +167,32 @@ def test_kmeans_deterministic(self): # TODO: Allow small numeric

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23752384 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-28 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71978435 The PR is updated according to https://github.com/apache/spark/pull/4088 which modifies GaussianMixtureModel to expose instances of MultivariateGaussian rather than

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71146302 @mengxr Thank you for the review and comments. I am changing the code according to #3923 (tgaloppo).

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70205738 @jkbradley The py4j serialization issue has been solved by the commit https://github.com/apache/spark/commit/8ead999fd627b12837fb2f082a0e76e9d121d269

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/4059 [SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model Python API for the Gaussian Mixture Model clustering algorithm in MLLib. You can merge this pull request into a Git repository by

[GitHub] spark pull request: [SPARK-5223] [MLlib] [PySpark] fix MapConverte...

2015-01-14 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4023#issuecomment-70040193 Ok. Thanks for the explanation.

[GitHub] spark pull request: [SPARK-5223] [MLlib] [PySpark] fix MapConverte...

2015-01-14 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4023#issuecomment-70038625 @davies Could you please explain why the second clause of this condition is added? ```elif isinstance(obj, list) and (obj or isinstance(obj[0

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-29 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/3022#issuecomment-68335194 @tgaloppo Good work. @mengxr Thanks for giving us a chance to be a part of this contribution.

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22163250 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/3022#discussion_r22163213 --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/DenseGmmEM.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-4156 [MLLIB] EM algorithm for GMMs

2014-12-22 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/3022#issuecomment-67816287 Sorry for the late reply. predictLabels() and predictMembership() look fine. But what about moving computeSoftAssignments() to the GaussianMixtureModelEM class (in KMeans