Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1624#discussion_r15627471
--- Diff: python/pyspark/mllib/regression.py ---
@@ -109,18 +109,45 @@ class
LinearRegressionModel(LinearRegressionModelBase):
True
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1624#discussion_r15627468
--- Diff: python/pyspark/mllib/regression.py ---
@@ -109,18 +109,45 @@ class
LinearRegressionModel(LinearRegressionModelBase):
True
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1624#discussion_r15627480
--- Diff: python/pyspark/mllib/regression.py ---
@@ -109,18 +109,45 @@ class
LinearRegressionModel(LinearRegressionModelBase):
True
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1673#issuecomment-50716977
Jenkins, test this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627517
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -69,25 +73,32 @@ object DecisionTreeRunner {
opt
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627507
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -69,25 +73,32 @@ object DecisionTreeRunner {
opt
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627514
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -69,25 +73,32 @@ object DecisionTreeRunner {
opt
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627527
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +111,57 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627524
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +111,57 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627594
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +111,57 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627787
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +111,57 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627842
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +111,57 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627893
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -48,11 +50,13 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15627927
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -598,9 +598,12 @@ object DecisionTree extends Serializable with Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628159
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -612,27 +615,31 @@ object DecisionTree extends Serializable with Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628197
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -815,20 +822,10 @@ object DecisionTree extends Serializable with Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628238
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -845,33 +842,15 @@ object DecisionTree extends Serializable with Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628435
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/model/Node.scala
---
@@ -91,4 +91,59 @@ class Node
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628444
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/model/Node.scala
---
@@ -91,4 +91,59 @@ class Node
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628483
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/model/Node.scala
---
@@ -91,4 +91,59 @@ class Node
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628492
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/model/Node.scala
---
@@ -91,4 +91,59 @@ class Node
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628533
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/model/Node.scala
---
@@ -91,4 +91,59 @@ class Node
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628610
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala ---
@@ -31,6 +30,18 @@ import
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15628659
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala ---
@@ -602,12 +609,78 @@ class DecisionTreeSuite extends FunSuite
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1671#discussion_r15630475
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/HashingTF.scala ---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15630544
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15630602
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15630645
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegression.scala
---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15630714
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15630905
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15631754
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -598,9 +598,12 @@ object DecisionTree extends Serializable with Logging
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1671#issuecomment-50735679
Jenkins, retest this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15645443
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -66,6 +66,42 @@ class MatrixFactorizationModel
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15645502
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -66,6 +66,42 @@ class MatrixFactorizationModel
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15645648
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -29,6 +29,8 @@
import
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15645652
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -44,21 +46,27 @@ public void tearDown() {
sc = null
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15645878
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -171,4 +180,29 @@ public void runImplicitALSWithNegativeWeight
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1687#issuecomment-50769991
For the API, another option is to return `Array[Rating]` instead of
`Array[(Int, Double)]`. This should help Java users and it is also compatible
with batch predictions
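A rough illustration of the trade-off raised in this comment, written in plain Python rather than against the actual MLlib API. The `Rating` record and both `recommend_*` helpers are hypothetical stand-ins, not code from the PR: one returns bare (product, score) pairs, the other returns self-describing records that also carry the user.

```python
from collections import namedtuple

# Hypothetical stand-in for a (user, product, rating) record; not the MLlib class.
Rating = namedtuple("Rating", ["user", "product", "rating"])


def recommend_as_pairs(scores, num):
    """Return the top `num` (product, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:num]


def recommend_as_ratings(user, scores, num):
    """Return the same top `num` items as Rating records that name each field."""
    return [Rating(user, product, score)
            for product, score in recommend_as_pairs(scores, num)]


# Made-up scores for one user, keyed by product id.
scores = {10: 0.2, 11: 0.9, 12: 0.5}
pairs = recommend_as_pairs(scores, 2)
ratings = recommend_as_ratings(1, scores, 2)
```

The record form lets callers read `r.product` and `r.rating` by name instead of unpacking a positional tuple, which is presumably the convenience the comment has in mind for Java users.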
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1687#issuecomment-50785472
I meant the final `userFeatures` and `productFeatures` stored in the matrix
factorization model. If those two RDDs are kicked out from memory by later
jobs, we have
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15657110
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -66,6 +66,44 @@ class MatrixFactorizationModel
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15657224
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -28,6 +28,8 @@
import org.apache.spark.api.java.JavaRDD
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15657246
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -28,6 +28,8 @@
import org.apache.spark.api.java.JavaRDD
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15657282
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -163,12 +173,42 @@ public void runImplicitALSWithNegativeWeight
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1687#discussion_r15657314
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/recommendation/JavaALSSuite.java ---
@@ -163,12 +173,42 @@ public void runImplicitALSWithNegativeWeight
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1687#issuecomment-50793121
@srowen For changing the storage level, I can submit another PR after this
gets merged and ping you for review.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15657914
--- Diff: python/pyspark/__init__.py ---
@@ -49,6 +49,12 @@
Main entry point for accessing data stored in Apache Hive
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15657959
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -255,4 +255,6 @@ def _test():
exit(-1)
if __name__ == "__main__":
+    import sys
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15658209
--- Diff: python/pyspark/mllib/random.py ---
@@ -0,0 +1,222 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15665285
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -69,25 +73,32 @@ object DecisionTreeRunner {
opt
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1671#issuecomment-50810069
@mateiz Thanks for reviewing the code! I merged this into master.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15667236
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +109,59 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15667261
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +109,59 @@ object DecisionTreeRunner
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15667264
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +109,59 @@ object DecisionTreeRunner
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1698#issuecomment-50814109
@andy327 This is covered in @dbtsai's PR:
https://github.com/apache/spark/pull/1207 , which is in review.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15667925
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -42,8 +42,8 @@ class DecisionTree (private val strategy: Strategy
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1673#issuecomment-50816349
@jkbradley The changes look good to me. Thank you for the bug fixes and
adding more docs! Waiting for @manishamde to make a final pass, and Jenkins.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1700#issuecomment-50816428
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15670511
--- Diff: python/pyspark/mllib/random.py ---
@@ -0,0 +1,182 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15670547
--- Diff: python/pyspark/mllib/random.py ---
@@ -0,0 +1,182 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15670579
--- Diff: python/pyspark/mllib/random.py ---
@@ -0,0 +1,182 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15670583
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -453,4 +455,98 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15670585
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -453,4 +455,98 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50820621
LGTM except minor inline comments.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1710#issuecomment-50845244
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1710#discussion_r15681269
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala
---
@@ -55,20 +55,24 @@ object Statistics {
/**
* Compute
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1710#discussion_r15681299
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmanCorrelation.scala
---
@@ -89,20 +89,17 @@ private[stat] object
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1710#discussion_r15681334
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmanCorrelation.scala
---
@@ -89,20 +89,17 @@ private[stat] object
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50845853
Merged into master. Thanks!!
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1673#discussion_r15681460
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala
---
@@ -100,16 +109,57 @@ object DecisionTreeRunner
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1673#issuecomment-50846546
Merged into master. Thanks!!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1710#issuecomment-50847713
LGTM. Merged into master. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1237#issuecomment-50848559
Sorry, I'm still working on it and will post the design doc on JIRA soon.
Unfortunately, it may not make the v1.1 release.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50848722
@bgreeven The filename
`mllib/src/main/scala/org/apache/spark/mllib/ann/GeneralizedSteepestDescendAlgorithm`
doesn't have `.scala` extension.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1698#issuecomment-50849409
Your implementation calls `reduceByKey` and `cartesian`. Those are not
cheap streamline operations. `map(x => (1, x)).reduceByKey` is the same as
`reduce` except
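A small sketch of the equivalence this comment points at, in plain Python rather than Spark. The `reduce_by_key` helper is a hypothetical local stand-in for `reduceByKey`, and the data is made up. The comment is cut off after "except", so the difference it goes on to name is not reproduced here; the sketch only shows that tagging every element with one constant key and reducing by key yields the same combined value as a plain `reduce`.

```python
from functools import reduce
from operator import add


def reduce_by_key(pairs, func):
    """Combine values that share a key (a local stand-in for reduceByKey)."""
    combined = {}
    for key, value in pairs:
        # First value for a key is stored as-is; later ones are folded in.
        combined[key] = func(combined[key], value) if key in combined else value
    return combined


data = [3, 1, 4, 1, 5]

# Tag every element with the same key, then reduce by key.
keyed = [(1, x) for x in data]
via_key = reduce_by_key(keyed, add)[1]

# Reduce the elements directly.
via_reduce = reduce(add, data)
```

Both paths fold the same values with the same function, so the constant key adds bookkeeping without changing the result.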
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50850308
Yes, it is already a problem with breeze 0.7. But we didn't realize that
hadoop 2.3 depends on commons-math3 in the Spark v1.0 release. If there is a
way to avoid
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50850626
But is it needed for the v1.1 release? Spark v1.1 doesn't support Scala
2.11.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50851243
That sounds good to me but I'm not familiar with the tasks related to Scala
2.11. Please run the discussion on
https://issues.apache.org/jira/browse/SPARK-1812
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50854138
@witgo Could you update the pom to exclude `commons-math3` from
dependencies? I tried it locally and LBFGS works well. It should be safe to
remove `commons-math3
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/940#discussion_r15684374
--- Diff: mllib/pom.xml ---
@@ -60,6 +60,14 @@
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15684424
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala
---
@@ -49,43 +49,48 @@ private[stat] trait Correlation
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15684617
--- Diff: python/pyspark/mllib/stat.py ---
@@ -0,0 +1,103 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15684640
--- Diff: python/pyspark/mllib/stat.py ---
@@ -0,0 +1,103 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15685005
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -456,6 +458,37 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15684998
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -456,6 +458,37 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15685050
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
---
@@ -59,10 +59,25 @@ class PythonMLLibAPISuite extends FunSuite
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1713#discussion_r15685067
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
---
@@ -59,10 +59,25 @@ class PythonMLLibAPISuite extends FunSuite
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50892988
LGTM. Merged into master. Note that Jenkins didn't tell the full story
because we have `commons-math3` in the test scope. I built the assembly jar and
verified LBFGS work
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1703#issuecomment-50893224
@avati #940 is merged. Do you mind closing this PR? Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50896710
Ah, I see. Tests were run against the individual build instead of the assembly jar.
We should have integration tests in the future.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1701#issuecomment-50897364
Jenkins, test this please.
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1718
[HOTFIX] downgrade breeze version to 0.7
breeze-0.8.1 causes dependency issues, as discussed in #940 .
You can merge this pull request into a Git repository by running:
$ git pull https
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1361#issuecomment-50898744
@freeman-lab Could you try to merge the latest master and resolve
conflicts? It may be because of the change to constructors.
---
If your project is set up for it, you
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1698#issuecomment-50899568
What if you have 10M columns? I agree that not sending data to the driver
is a good practice. But the current operations `reduceByKey` and `cartesian`
are not optimized
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1698#issuecomment-50900786
Yes, I tried to implement AllReduce without having driver in the middle in
https://github.com/apache/spark/pull/506 but it introduced complex
dependencies. So I fall back
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1719#issuecomment-50901102
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1719#issuecomment-50901119
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1719#discussion_r15703584
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -0,0 +1,376 @@
+/*
+* Licensed to the Apache Software Foundation
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1719#discussion_r15703588
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -0,0 +1,376 @@
+/*
+* Licensed to the Apache Software Foundation
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1719#discussion_r15703583
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -0,0 +1,376 @@
+/*
+* Licensed to the Apache Software Foundation
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1719#discussion_r15703684
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala ---
@@ -0,0 +1,40 @@
+/*
+* Licensed to the Apache Software