Github user FlytxtRnD closed the pull request at:
https://github.com/apache/spark/pull/7320
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/7320#issuecomment-168222327
@jkbradley Thank you for the review. I will close this issue and will
gladly help test the other PR. Also, it would be great if you could take a
look at #6880.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-156399657
@yu-iskw @jkbradley Any other review comments, please?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-153564736
@yu-iskw I didn't understand your comment on @Since tags. We will wait for
further review comments.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-153329563
@yu-iskw The PR is updated. Shall I add @since to the methods, or is that
done after the merge? Please share any other suggestions.
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r43622644
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala ---
@@ -0,0 +1,279 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-152919710
Thank you @yu-iskw for the review comments. Will update the PR ASAP.
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/9267#discussion_r42964572
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala ---
@@ -20,17 +20,28 @@ package org.apache.spark.mllib.fpm
import java.{util
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-149156272
@mengxr Could you please have a look into this?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-143643138
@mengxr @jkbradley I have incorporated the suggested changes and
updated the PR. Could you please take another look?
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r39827805
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala ---
@@ -0,0 +1,247 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r39255547
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeans.scala ---
@@ -0,0 +1,247 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-139170303
@mengxr Thank you for all the suggestions. Will update soon
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-136956338
@mengxr We have updated the JIRA ticket to include the benchmark results as
well. Could you please take a look and give your suggestions?
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/5267#discussion_r35728733
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala ---
@@ -0,0 +1,645 @@
+/*
+ * Licensed to the Apache
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/5267#discussion_r35728653
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala ---
@@ -0,0 +1,645 @@
+/*
+ * Licensed to the Apache
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-125838477
Thank you @mengxr. We will take a look at the PR you mentioned. We are
looking forward to having DP-Means in the 1.6 release. Thanks a lot for your kind
support
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-125571416
@jkbradley To generate the docs, I installed Jekyll. The jekyll build command
shows an error.
`[info] Done updating.
[error] (catalyst/compile:compile
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/7320#issuecomment-124063686
@jkbradley Please have a look at the code. I guess the save function is OK.
Now I am facing issues with Loader. Can you please help me with passing the
type parameter
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-122671172
@mengxr @jkbradley Gentle reminder.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/7320#issuecomment-121588925
@jkbradley We will give it a try and let you know if we are not able to
move forward.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-121505561
@jkbradley Thank you so much for your help and co-operation.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-121478720
@jkbradley Please merge if everything seems to be fine
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-120845075
@jkbradley The style error reported above has already been fixed in the
previous commit. Is there any other style issue that has to be resolved?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/7320#issuecomment-120359249
@jkbradley PR is updated to solve the merge issues. Please have a look at
the issue.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-120220817
@jkbradley Is this PR ready to merge? Please let us know if there is
anything more to do.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-119924897
@mengxr Could you please tell me how to generate the API docs? I ran
build/sbt unidoc as mentioned in
https://github.com/apache/spark/blob/master/docs/README.md. But
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-119907064
@mengxr I have reduced the PR length so that it would be easier for you to
review. The style issues have been fixed wherever they were observed.
I will change the
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/7320
[SPARK-6724][WIP][MLLIB]Model import/export for FPGrowth
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/FlytxtRnD/spark FP
Alternatively
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r34222734
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -193,20 +208,33 @@ class KMeans private (
val
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-119807853
@jkbradley Sorry for repeating the style errors. I hope the
documentation added is also fine.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-119067665
@jkbradley please review
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-117431033
@jkbradley Thank you for the review. I will make changes soon. I would like
to know why pyspark unit tests are failing in this PR. It seems tests in
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6880#issuecomment-116645007
@mengxr Could you please share your comments on this PR?
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r33227267
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r33119238
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6880#discussion_r32925721
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/DpMeansModel.scala ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r32909692
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -193,12 +207,19 @@ class KMeans private (
val
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-113460607
I am updating the PR with the suggested changes. Only one run condition is
handled by adding a `require` in setInitialModel. I will modify it based on
further
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r32801495
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -193,12 +207,19 @@ class KMeans private (
val
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/6880
[SPARK-8402][MLLIB] DP Means Clustering
DP means is a non-parametric clustering algorithm that uses a scale
parameter 'lambda' to control the creation of new clusters. This algorithm
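For readers unfamiliar with the idea, the behaviour described above can be sketched in a few lines of plain Python. This is only an illustration of the general DP-means procedure, not the code in this pull request: the names `dp_means`, `squared_distance`, and `mean`, the `lam` and `max_iterations` parameters, and the choice to start from a single cluster at the global mean are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(points, lam, max_iterations=100):
    """Illustrative DP-means: a point farther than `lam` (squared distance)
    from every current center starts a new cluster of its own."""
    centers = [mean(points)]          # assumption: begin with one global cluster
    assignments = [0] * len(points)
    for _ in range(max_iterations):
        changed = False
        for i, p in enumerate(points):
            dists = [squared_distance(p, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
            if dists[best] > lam:
                # Too far from every center: open a new cluster at this point.
                centers.append(list(p))
                best = len(centers) - 1
            if assignments[i] != best:
                assignments[i] = best
                changed = True
        # Recompute centers from their members, dropping empty clusters
        # and renumbering the assignments to match.
        groups = {}
        for p, a in zip(points, assignments):
            groups.setdefault(a, []).append(p)
        order = sorted(groups)
        remap = {old: new for new, old in enumerate(order)}
        centers = [mean(groups[old]) for old in order]
        assignments = [remap[a] for a in assignments]
        if not changed:
            break
    return centers, assignments
```

A larger `lam` tolerates points farther from their center and so yields fewer clusters; a smaller `lam` yields more. The number of clusters is never passed in, which is the sense in which the method is non-parametric.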
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-113063150
@jkbradley Gentle reminder.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-112411731
@mengxr @jkbradley Could you please comment on @srowen's note above?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-111936681
Does this patch look fine to merge?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-111375429
Can somebody help me with this test failure?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6737#issuecomment-111088818
Jenkins, retest this please
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r32189526
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -156,6 +156,26 @@ class KMeans private (
this
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/6737#discussion_r32103401
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -156,6 +156,26 @@ class KMeans private (
this
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/6737
[SPARK-8018][MLlib]KMeans should accept initial cluster centers as param
This allows KMeans to be initialized using an existing set of cluster
centers provided as a KMeansModel object. This
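As a rough illustration of what seeding k-means with existing centers means, here is a minimal plain-Python sketch. It does not use or reproduce Spark's API: `kmeans_from_centers`, `sq_dist`, `vec_mean`, and `max_iterations` are hypothetical names chosen for the example, and the loop is ordinary Lloyd iteration started from the supplied centers instead of a random or k-means|| initialization.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vec_mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans_from_centers(points, initial_centers, max_iterations=20):
    """Lloyd iterations that start from caller-supplied centers."""
    centers = [list(c) for c in initial_centers]
    for _ in range(max_iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda k: sq_dist(p, centers[k]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members;
        # a center with no members stays where it is.
        new_centers = [vec_mean(m) if m else c
                       for m, c in zip(clusters, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Starting from centers that are already close to a good solution, for instance those of a model trained on an earlier batch, usually means fewer iterations are needed than with a fresh initialization.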
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6180#issuecomment-102816170
@jkbradley Thank You
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6180#issuecomment-102362019
@jkbradley Please review the PR and let me know if anything more is required
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/6180
[SPARK-7651][MLLib][PySpark] GMM predict, predictSoft should raise error on
bad input
In the Python API for Gaussian Mixture Model, predict() and predictSoft()
methods should raise an error when
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/6087#issuecomment-101933563
@yanboliang @jkbradley Thank you for clarifying the doubt
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-99335954
@jkbradley , @mengxr Thanks for the help!
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-99016123
Jenkins, retest this please
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-98990951
@jkbradley , ok we'll try your steps.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-98981610
"Can you please try rebasing your branch off of the current master? Perhaps
that will fix it."
We didn't get your point. To which branch shou
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-98981083
@jkbradley, we merged it with branch-1.4 to get rid of the failed MiMa
tests in Jenkins.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-98968516
@jkbradley ok, will close and reopen this. Could you please tell us why
this happened?
Github user FlytxtRnD closed the pull request at:
https://github.com/apache/spark/pull/5647
GitHub user FlytxtRnD reopened a pull request:
https://github.com/apache/spark/pull/5647
[SPARK-6612] [MLLib] [PySpark] Python KMeans parity
The following items are added to Python kmeans:
kmeans - setEpsilon, setInitializationSteps
KMeansModel - computeCost, k
You can
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-98339048
@jkbradley , could you please check if this is ready to merge?
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/5647#discussion_r29402436
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -40,11 +40,16 @@ class KMeansModel(Saveable, Loader):
>>> data = array([0.0,0.0
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/5647#discussion_r29400538
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -40,11 +40,16 @@ class KMeansModel(Saveable, Loader):
>>> data = array([0.0,0.0
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-96901873
@jkbradley, we are facing some issues with Python 3 support. We are working
on it and will fix it ASAP.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-95791587
Jenkins, retest this please
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5647#issuecomment-95493721
Could anyone please tell us why this timeout has occurred?
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/5647
[SPARK-6612] [MLLib] [PySpark] Python KMeans parity
The following items are added to Python kmeans:
kmeans - setEpsilon, setInitializationSteps
KMeansModel - computeCost, k
You can
Github user FlytxtRnD closed the pull request at:
https://github.com/apache/spark/pull/5391
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5391#issuecomment-95063943
Hi @jkbradley, @mengxr,
Is it OK to close this PR and create a new one for the same ticket? We
messed up some commits in this branch.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/5391#issuecomment-94123703
Ok. We'll look into the issue and will update soon.
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/5391
[SPARK-6612] [MLLib] [PySpark] Python KMeans parity
The following items are added to Python kmeans:
* kmeans - setEpsilon, setInitializationSteps
* KMeansModel - computeCost, k
You can
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-72604505
Thanks @mengxr for your help.
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/4059#discussion_r23982975
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/4059#discussion_r23981473
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1,
initializationMode="k-
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-72458892
Please review and merge.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-72445559
Is it possible to start a test build in Jenkins without updating the PR?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-72406403
So I will go with the current approach. I tried to change Array to
ArrayBuffer, but it ends up in exceptions. So can I go with Array itself?
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-72404786
Instead of passing mu & sigma as arrays, I tried to directly pass
"gaussians" (Array[MultivariateGaussian]) from PythonMLLibAPI. But I was not
able
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/4059#discussion_r23828921
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/4059#discussion_r23754619
--- Diff: python/pyspark/mllib/tests.py ---
@@ -167,6 +167,32 @@ def test_kmeans_deterministic(self):
# TODO: Allow small numeric
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/4059#discussion_r23752384
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1,
initializationMode="k-
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-71978435
The PR is updated according to https://github.com/apache/spark/pull/4088
which modifies GaussianMixtureModel to expose instances of MultivariateGaussian
rather than
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-71146302
@mengxr Thank you for the review and comments. I am changing the code
according to #3923 (tgaloppo).
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4059#issuecomment-70205738
@jkbradley py4j serialization issue has been solved by the commit
https://github.com/apache/spark/commit/8ead999fd627b12837fb2f082a0e76e9d121d269
GitHub user FlytxtRnD opened a pull request:
https://github.com/apache/spark/pull/4059
[SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model
Python API for the Gaussian Mixture Model clustering algorithm in MLLib.
You can merge this pull request into a Git repository by
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4023#issuecomment-70040193
Ok. Thanks for the explanation.
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/4023#issuecomment-70038625
@davies Could you please explain why the second clause of this condition is
added?
```elif isinstance(obj, list) and (obj or isinstance(obj[0
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/3022#issuecomment-68335194
@tgaloppo Good Work
@mengxr Thanks for giving us a chance to be a part of this contribution
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/3022#discussion_r22163250
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala
---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache
Github user FlytxtRnD commented on a diff in the pull request:
https://github.com/apache/spark/pull/3022#discussion_r22163213
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DenseGmmEM.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software
Github user FlytxtRnD commented on the pull request:
https://github.com/apache/spark/pull/3022#issuecomment-67816287
Sorry for the late reply. predictLabels() and predictMembership() look fine. But
what about moving computeSoftAssignments() to the GaussianMixtureModelEM
class (in KMeans
92 matches