Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/916#discussion_r13691539
--- Diff: python/pyspark/rdd.py ---
@@ -400,6 +399,18 @@ def takeSample(self, withReplacement, num, seed=None):
sampler.shuffle(samples
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/916#discussion_r13691536
--- Diff: python/pyspark/rdd.py ---
@@ -400,6 +399,18 @@ def takeSample(self, withReplacement, num, seed=None):
sampler.shuffle(samples
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/916#discussion_r13691546
--- Diff: python/pyspark/rdd.py ---
@@ -400,6 +399,18 @@ def takeSample(self, withReplacement, num, seed=None):
sampler.shuffle(samples
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1028#issuecomment-45939821
LGTM. Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1026#issuecomment-45954737
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1026#issuecomment-45954727
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1026#issuecomment-45954951
@coderxiang The merge was not clean. It contains changes from my PR. Could
you re-merge the latest master and check that the diff is correct on this page?
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/916#discussion_r13732786
--- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala ---
@@ -22,6 +22,9 @@ import scala.reflect.ClassTag
import org.scalatest.FunSuite
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1026#issuecomment-45964506
LGTM. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/490#issuecomment-45965012
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/490#issuecomment-45965023
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/916#issuecomment-45965544
LGTM. Thanks! Waiting for Jenkins ...
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/916#issuecomment-45970681
Merged. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/490#issuecomment-45976550
@codeboyyong I've merged this. Could you please make a patch for
branch-0.9? Thanks!
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742986
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742980
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742995
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
---
@@ -201,6 +202,31 @@ class RowMatrix
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13743003
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
---
@@ -201,6 +202,31 @@ class RowMatrix
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742972
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742973
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13743001
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
---
@@ -201,6 +202,31 @@ class RowMatrix
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742975
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742981
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742970
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742971
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742989
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
---
@@ -201,6 +202,31 @@ class RowMatrix
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742968
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13742976
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-45990973
@vrilleup This implementation looks good to me and thanks for the
experiments! Besides the inline comments, we should think about when to switch
from ARPACK to dense SVD. ARPACK
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/916#issuecomment-45992111
@colorant Tried the following with the new implementation:
~~~
val rdd = sc.parallelize(0 until 10, 1).flatMap(i =>
Iterator.fill(10)(0)) // 10
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1075
[HOTFIX] add math3 version to pom
Passed `mvn package`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mengxr/spark takeSample-fix
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1099#discussion_r13843020
--- Diff:
yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
---
@@ -73,10 +73,18 @@ private[spark] class
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1099#discussion_r13843025
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -130,7 +129,8 @@ class Client(args: ClientArguments, conf
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1099#discussion_r13843041
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -160,15 +160,19 @@ class Client(args: ClientArguments, conf
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1099#discussion_r13843048
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -160,15 +160,19 @@ class Client(args: ClientArguments, conf
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1099#discussion_r13843046
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -160,15 +160,19 @@ class Client(args: ClientArguments, conf
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-46267457
@codeboyyong Thanks for submitting the patch! It looks good to me except for
a few style issues.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1098#issuecomment-46270697
LGTM. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1108#issuecomment-46372243
Verified that it is working now. I'm going to merge this.
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1110
[WIP][SPARK-2174][MLLIB] treeReduce and treeAggregate
In `reduce` and `aggregate`, the driver node spends linear time on the
number of partitions. It becomes a bottleneck when there are many
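The driver-side bottleneck this [SPARK-2174] excerpt describes can be sketched without Spark. The helper below is a hypothetical plain-Python model of the treeReduce idea, not mllib's implementation: instead of the driver combining all partition results in one linear pass, results are merged level by level, so the final pass touches far fewer items.

```python
from functools import reduce

def tree_reduce(partition_results, combine, depth=2):
    # Model of treeReduce: merge partition results in rounds (in Spark the
    # intermediate rounds run on executors), so the final driver-side combine
    # covers roughly len(results) ** (1 / depth) items instead of all of them.
    items = list(partition_results)
    scale = max(2, int(round(len(items) ** (1.0 / depth))))
    while len(items) > scale:
        items = [reduce(combine, items[i:i + scale])
                 for i in range(0, len(items), scale)]
    return reduce(combine, items)

# 100 "partition results" reduced with depth 2: the last round on the
# driver sees only about 10 partial sums.
total = tree_reduce(range(1, 101), lambda a, b: a + b)
```

The real `treeAggregate` also repartitions between levels; the point here is only the reduced fan-in at the driver.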
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/506#issuecomment-46380865
Thanks all for reviewing this PR! I found the butterfly pattern introduces
complex dependencies that slow down the computation. In my tests, a good
approach for Spark is
Github user mengxr closed the pull request at:
https://github.com/apache/spark/pull/506
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13899771
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13899776
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13899792
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13899798
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r1396
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
---
@@ -201,6 +202,31 @@ class RowMatrix
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-46398707
@vrilleup Thanks for updating the PR! I made a comment on the explicit type
checks. I'm a little confused about the new API. If `isDenseSVD` is true, `tol`
doesn
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/964#discussion_r13900396
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/EigenValueDecomposition.scala
---
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-46398850
Btw, we shouldn't use default parameters in method definition. It is
convenient in Scala but it is not Java friendly. Also, this is hard for us to
maintain b
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46411430
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13905413
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -195,4 +195,39 @@ class LBFGSSuite extends FunSuite with
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13905479
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -195,4 +195,39 @@ class LBFGSSuite extends FunSuite with
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1104#discussion_r13905461
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/optimization/LBFGSSuite.scala ---
@@ -195,4 +195,39 @@ class LBFGSSuite extends FunSuite with
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46411779
The change looks good to me. Let us wait for Jenkins and MIMA.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46503707
Jenkins, add to white list.
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1124
[SPARK-1112] use min akka frame size to decide how to send task results
Task results are sent either via akka directly or block manager indirectly,
based on whether the size of the serialized task
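The size-based dispatch this [SPARK-1112] excerpt describes can be sketched as follows. This is a hypothetical model, not Spark's `Executor` code: the function name, the placeholder block id, and the fixed reserved overhead are assumptions for illustration.

```python
def serve_task_result(result_bytes, max_frame_size, reserved=1024):
    # Small serialized results fit in one RPC message and go back to the
    # driver directly; anything larger is stored in the block manager and
    # only a small reference is returned. `reserved` leaves headroom for
    # message framing overhead.
    if len(result_bytes) <= max_frame_size - reserved:
        return ("direct", result_bytes)
    return ("indirect", "taskresult_0")  # placeholder block id

frame = 10 * 1024 * 1024  # a 10M frame size, as discussed in this thread
small = serve_task_result(b"x" * 1024, frame)
large = serve_task_result(b"x" * frame, frame)
```

The boundary case is exactly what the later `- 1024` experiment in this thread probes: results just under the frame size must still account for framing overhead.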
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46517676
@ash211 If there is a way to deliver the configuration consistently to the
backend, we can use `spark.akka.frameSize` consistently. It is then not
necessary to set the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1124#discussion_r13951639
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -212,7 +208,14 @@ private[spark] class Executor(
val
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1125#issuecomment-46524600
Thanks! Merged.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46527308
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46528231
Jenkins, retest this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1124#discussion_r13955220
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -212,7 +208,12 @@ private[spark] class Executor(
val
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1124#discussion_r13955317
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -212,7 +208,12 @@ private[spark] class Executor(
val
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-46529144
@nevillelyh Is there a JIRA for it? Is it fixed in 0.8.1?
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46530221
For the first scenario, it won't make the performance worse because the
system doesn't really work now for serialized task result of size betwee
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46531328
I don't get it. As long as the actor systems are created from
AkkaUtils.createActorSystem, the minimum value of the max frame size is 10M.
All unit tests pass
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46531776
Just tested `- 1024` in `SchedulerBackend`. The system did hang up when the
task size is close to 10M - 1024 ...
~~~
scala> val random = new java.util.Ran
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1132
[SPARK-1112, 2156] Bootstrap to fetch the driver's Spark properties.
This is an alternative solution to #1124 . Before launching the executor
backend, we first fetch driver's spark prop
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46541699
@pwendell @kayousterhout I put an alternative solution in #1132 . Please
let me know which you prefer.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1132#discussion_r13982529
--- Diff:
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
---
@@ -101,26 +106,33 @@ private[spark] object
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1132#discussion_r13984220
--- Diff:
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
---
@@ -101,26 +106,33 @@ private[spark] object
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-46643277
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1104#issuecomment-46694507
Merged. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1172#issuecomment-46769968
Jenkins, retest it please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1172#issuecomment-46769990
LGTM. I will merge it to branch-1.0 if Jenkins is happy.
Github user mengxr closed the pull request at:
https://github.com/apache/spark/pull/1124
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1124#issuecomment-46769996
Closing this in favor of #1132.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1172#issuecomment-46775109
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-46814485
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-46924196
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1178#issuecomment-47134903
This looks good to me. I'm going to merge it since pyspark is broken without
this patch.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1223#discussion_r14223221
--- Diff: mllib/pom.xml ---
@@ -76,5 +76,16 @@
scalatest-maven-plugin
+
+
+src/main/resources
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1215#discussion_r14223272
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/Gradient.scala ---
@@ -37,7 +37,11 @@ abstract class Gradient extends Serializable
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1223#issuecomment-47187655
LGTM and tested with `mvn install`. Thanks for fixing it!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-47190528
No, just want to see Jenkins happy.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-47190546
Jenkins, retest this please.
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1229
[SPARK-2251] fix concurrency issues in random sampler
The following code is very likely to throw an exception:
~~~
val rdd = sc.parallelize(0 until 111, 10).sample(false, 0.1
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1234
fix concurrency issues in random sampler
The following code is very likely to throw an exception:
~~~
val rdd = sc.parallelize(0 until 111, 10).sample(false, 0.1)
rdd.zip(rdd).count
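The failure mode behind this [SPARK-2251] repro can be sketched without Spark. The class below is hypothetical: it models a sampler whose mutable random state is carried across two passes over the same data, which is why the two sides of `rdd.zip(rdd)` can disagree on which elements were kept.

```python
import random

class SharedStateSampler:
    # A Bernoulli sampler with mutable RNG state. Reusing one instance for
    # two passes means the second pass continues the random stream and
    # selects a different subset of elements.
    def __init__(self, fraction, seed=42):
        self.fraction = fraction
        self.rng = random.Random(seed)

    def sample(self, data):
        return [x for x in data if self.rng.random() < self.fraction]

data = list(range(111))
shared = SharedStateSampler(0.1)
first = shared.sample(data)
second = shared.sample(data)  # state was advanced: different selection
fresh = SharedStateSampler(0.1).sample(data)  # fresh sampler reproduces `first`
```

As I understand the PR discussion, the fix was along these lines: give each consumer its own sampler instance (the `fresh` line) rather than sharing one with live state.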
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1099#issuecomment-47291043
I think `PYSPARK_PYTHON` is set to `/usr/local/bin/python2.7` in Jenkins
but it doesn't exist. @pwendell ?
Github user mengxr closed the pull request at:
https://github.com/apache/spark/pull/1234
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1250#issuecomment-47598550
@srowen Thanks for fixing it! LGTM. Merged.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-47681084
@vrilleup Just checked Matlab's svd and svds. I don't remember having
used options.{tol, maxit} before. I wonder whether this is useful to expose to
users. I did use
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1110#issuecomment-47686100
@dbtsai Thanks for testing it! I'm going to move `treeReduce` and
`treeAggregate` to `mllib.rdd.RDDFunctions`. For normal data processing, people
generally use
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1155#issuecomment-47813702
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1155#issuecomment-47813772
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14500775
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -27,8 +27,12 @@ import scala.collection.Map
import
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14500845
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -46,6 +48,8 @@ import org.apache.spark.Partitioner.defaultPartitioner
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-47878720
@yangliuyu What did you set for `k` and how many iterations it took?
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/964#issuecomment-47881287
@vrilleup Both approaches compute the truncated SVD. I still prefer putting
both implementations under `computeSVD` for now. I'm going to implement a
generic Paramet