Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2595#issuecomment-57358549
@jkbradley @manishamde
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2548#discussion_r18245962
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -222,20 +283,33 @@ def dot(self, other):
0.0
a.dot(np.array([[1, 1], [2, 2], [3, 3
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2548#discussion_r18245965
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -439,10 +531,11 @@ def toArray(self):
arr = array.array('d', [float(i) for i in range(4
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2548#issuecomment-57393902
test this please
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2548#issuecomment-57402247
Merged into master. Thanks @jkbradley for review!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2595#issuecomment-57415215
@chouqin The performance gain is already significant. The aggregation time
dropped from ~30s to 3s in my experiment. I just want to see whether we can
optimize
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2604#issuecomment-57431357
LGTM. Merged into master. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2595#issuecomment-57434938
@chouqin The trained model only contains a single node in the python test.
Maybe there is a bug that caused early termination.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2595#discussion_r18297053
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala
---
@@ -159,161 +166,15 @@ private[tree] abstract class
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2595#issuecomment-57579497
@buenrostro-oo @tdas We have seen several test failures from
`NetworkReceiverSuite`. Do you have time to take a look? Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2595#issuecomment-57778527
LGTM. Merged into master. Thanks @chouqin, and @jkbradley and @manishamde
for the code review!
Increasing `maxMemoryInMB` also increases the shuffle size. As long
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2636#issuecomment-57785126
@mdagost If you convert `(Int, Array[Double])` to a
`java.util.ListObject` (id the first and features the second (without
converting to string)), you should be able
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2491#issuecomment-57839878
@staple Sorry for the late response, and thank you for working on this JIRA! As
a best practice, before you start working on a JIRA, please first ask on the
JIRA page
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-57841035
@derrickburns The `*ClusterSuite` was created to prevent unnecessary objects
from being referenced in the task closure. You can try to remove `Serializable`
from algorithms
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2491#issuecomment-57865710
@staple The conditional distribution matrix may not be sparse. That is why
we use a dense format to store it. Maybe we can apply hard thresholding to make
it sparse
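A minimal sketch of the hard thresholding idea mentioned in this comment, not taken from the pull request: entries whose magnitude falls below a cutoff are dropped, and the rest are kept in a coordinate-keyed map. The function name `hard_threshold`, the cutoff `tol`, and the sample matrix are all hypothetical.

```python
def hard_threshold(dense, tol):
    """Keep only entries with abs(value) >= tol, keyed by (row, col)."""
    sparse = {}
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if abs(value) >= tol:
                sparse[(i, j)] = value
    return sparse


# Hypothetical 2x3 conditional distribution matrix (each row sums to 1).
cond = [
    [0.90, 0.08, 0.02],
    [0.01, 0.04, 0.95],
]

# Assumed cutoff; a real choice would depend on the data.
kept = hard_threshold(cond, tol=0.05)
print(kept)
```

Note that the rows no longer sum to 1 after thresholding, so a caller that needs proper distributions would have to renormalize.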
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2455#issuecomment-57865917
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2455#issuecomment-57865927
test this please
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423427
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -375,7 +376,9 @@ abstract class RDD[T: ClassTag](
val sum = weights.sum
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423440
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423426
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -43,7 +43,8 @@ import org.apache.spark.partial.PartialResult
import
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423449
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423438
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423429
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423444
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423475
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423477
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423453
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423474
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423459
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423463
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423478
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423454
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423479
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423461
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423433
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423437
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423464
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423484
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423470
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423457
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423473
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423487
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423485
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423495
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423492
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423489
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423468
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423448
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423443
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423499
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423504
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423498
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423491
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423493
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18423500
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -53,56 +89,238 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2455#issuecomment-57874204
@erikerlandson I didn't check the test code. I will try to find time later
to make a pass over the tests. The implementation looks good to me except for
minor inline comments
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2622#issuecomment-58091578
@rezazadeh Could you update the example to use `scopt` to parse parameters?
You can check other example code for its usage. We try to be consistent across
example code
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486326
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,58 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486384
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486359
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
--- End diff --
Please rename the file to `feature.py` to make `Word2Vec
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486387
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486381
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486395
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486413
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486524
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,192 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486706
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,58 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18486738
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,58 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-58104576
@derrickburns The style test doesn't catch everything, unfortunately. The Spark
Code Style Guide is the first place to check. I will mark a few examples inline.
I
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488327
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansParallel.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488318
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -17,429 +17,57 @@
package org.apache.spark.mllib.clustering
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488296
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/GeneralizedKMeansModel.scala
---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488309
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -17,429 +17,57 @@
package org.apache.spark.mllib.clustering
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488331
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansPlusPlus.scala ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488320
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -17,429 +17,57 @@
package org.apache.spark.mllib.clustering
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488293
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/GeneralizedKMeansModel.scala
---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488325
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansModel.scala ---
@@ -25,37 +25,28 @@ import org.apache.spark.mllib.linalg.Vector
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488297
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/GeneralizedKMeansModel.scala
---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488364
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/KMeansSuite.scala ---
@@ -255,4 +253,4 @@ class KMeansClusterSuite extends FunSuite
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488334
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansPlusPlus.scala ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488302
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -17,429 +17,57 @@
package org.apache.spark.mllib.clustering
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488348
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/MultiKMeansClusterer.scala
---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488358
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/metrics/FastEuclideanOps.scala
---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488316
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -17,429 +17,57 @@
package org.apache.spark.mllib.clustering
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488342
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/MultiKMeans.scala ---
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488359
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/package.scala ---
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2634#discussion_r18488352
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/metrics/EuclideanOps.scala
---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-58106056
@derrickburns I marked a few style problems (not all of them). There are
breaking changes in your PR, which we should avoid as much as possible. Even if
we want to remove
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2356#issuecomment-58118924
@Ishiihara Another file to update is `python/docs/pyspark.mllib.rst`. We
need a section for `pyspark.mllib.feature` module.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-58122649
@derrickburns I don't know of any formatter that can do the job nicely. This
has to be done by hand at the moment, unfortunately.
`KMeans` has a public constructor
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2693#issuecomment-58221886
@dbtsai Could you check whether there is any dependency change in
breeze-0.10 and the number of files in the breeze-0.10 jar? Is it compatible with
both Scala 2.10 and 2.11
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536661
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536639
--- Diff:
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -39,13 +42,46 @@ trait RandomSampler[T, U] extends Pseudorandom
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536657
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536653
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536673
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536693
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536679
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536671
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536659
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2455#discussion_r18536697
--- Diff:
core/src/test/scala/org/apache/spark/util/random/RandomSamplerSuite.scala ---
@@ -18,96 +18,547 @@
package org.apache.spark.util.random