Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15439511
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15439516
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15439518
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1562#issuecomment-50266091
@rxin @mateiz I have one question about using `rdd.id` as a random seed shift
to avoid sampling the same sequence in each partition. It is a constant within
a session. But
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15439751
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15441176
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1562#discussion_r15441180
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -105,24 +108,91 @@ class RangePartitioner[K : Ordering : ClassTag, V
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1520#issuecomment-50289239
LGTM. Merged into master. Thanks for adding random RDD generators!!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50290028
@dorx I removed commons-math3 from dependencies, separated `sampleByKey`
and `sampleByKeyExact`, and corrected the math in waitlisting in sampling with
replacement
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1425#discussion_r15443033
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/KMeansSuite.scala ---
@@ -40,27 +41,51 @@ class KMeansSuite extends FunSuite with
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50291675
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50291855
@witgo Could you merge the latest master? Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50303013
@witgo Thanks for checking the dependencies on the JIRA page! I list the
dependency graph here so other people can see the difference easily. I think we
need to figure out
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50303211
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50303204
There were some problems with pyspark. Let's call Jenkins again.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1110#issuecomment-50303437
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50363809
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1425#issuecomment-50379081
LGTM. I'm merging this into master! Thanks!!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/940#issuecomment-50379548
@dlwh Thanks for the quick reply! The `commons-math3` problem is not which
version to use but how to match the version hadoop depends on. We can switch to
3.1.1 to match
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/929#issuecomment-50380948
@witgo We don't need to checkpoint both users and products, but only the
smaller one. For the initial version, it is fine to checkpoint either of them.
We should al
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1057#issuecomment-50433121
I think we should keep it as it is now and add support for setting
thresholds.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506024
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDGenerators.scala ---
@@ -35,6 +35,9 @@ object RandomRDDGenerators
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506028
--- Diff: python/pyspark/mllib/randomRDD.py ---
@@ -0,0 +1,213 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506026
--- Diff: python/pyspark/mllib/randomRDD.py ---
@@ -0,0 +1,213 @@
+#
--- End diff --
Should the file name match Scala's?
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506022
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -453,4 +454,74 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506030
--- Diff: python/pyspark/mllib/randomRDD.py ---
@@ -0,0 +1,213 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1628#discussion_r15506031
--- Diff: python/pyspark/mllib/randomRDD.py ---
@@ -0,0 +1,213 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50435025
LGTM except minor inline comments. For the file name, it should be possible
to have a package named `random`, for example, `numpy.random`:
http://docs.scipy.org/doc/numpy
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1624#issuecomment-50435465
Jenkins, test this please.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1624#discussion_r15506490
--- Diff: python/pyspark/mllib/regression.py ---
@@ -120,6 +120,23 @@ def train(cls, data, iterations=100, step=1.0,
d._jrdd, iterations, step
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1518#issuecomment-50441485
@dbtsai I thought of another way to do this and would like your opinion. We
can add an optional argument to `appendBias`: `appendBias(bias: Double = 1.0)`.
If this is used
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508694
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508695
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508718
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508725
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508790
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508808
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1110#discussion_r15508803
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala ---
@@ -44,6 +47,65 @@ class RDDFunctions[T: ClassTag](self: RDD[T
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/929#discussion_r15533428
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala ---
@@ -255,6 +260,9 @@ class ALS private (
rank, lambda, alpha
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r15534415
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -88,14 +91,73 @@ private[spark] object SamplingUtils
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/929#discussion_r15534935
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala ---
@@ -255,6 +255,9 @@ class ALS private (
rank, lambda, alpha
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50528615
LGTM. Merged into master. Thanks!!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50529499
Yes, having directories is the way to organize packages in Python. We can
make a folder for `random` and include the Python files in `mllib/pom.xml`.
Otherwise, user
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1269#issuecomment-50557454
Jenkins, retest this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1269#issuecomment-50558162
@akopich The failed tests might be unrelated to this PR. It would be nice
if you could make the public interfaces minimal and provide a summary of them.
For example, you
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15561486
--- Diff:
examples/src/main/scala/org/apache/spark/examples/mllib/StreamingLinearRegression.scala
---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15561528
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegression.scala
---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565489
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565590
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegression.scala
---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565612
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565687
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565686
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565706
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565728
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565768
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565789
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/util/MLStreamingUtils.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565824
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565834
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565864
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565941
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565956
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15565983
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15566035
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegression.scala
---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15566050
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegression.scala
---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15566064
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1361#discussion_r15566077
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingRegression.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50572101
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50572111
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50572128
@bgreeven Jenkins will be automatically triggered for future updates.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/929#discussion_r15566629
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala ---
@@ -255,6 +255,9 @@ class ALS private (
rank, lambda, alpha
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50574345
@dorx I tried `import pyspark.mllib.random` and it failed. It has to be
`from pyspark.mllib import random`. And to use `RandomRDDGenerators`, I need to
call
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1624#discussion_r15566835
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -42,6 +43,16 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/929#issuecomment-50574626
LGTM. Waiting for Jenkins. Btw, @witgo if you have a big dataset to test,
could you try to set the storage level of ratings and user/product in/out links
to
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1493#discussion_r15567017
--- Diff: python/pyspark/mllib/classification.py ---
@@ -63,7 +63,10 @@ class LogisticRegressionModel(LinearModel):
def predict(self, x
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50575901
We pinged Davies today. It seems to be a well-known problem with Python.
There are ways to force import a standard module in Python 2, but they are all
very messy
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/929#issuecomment-50575931
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1652#issuecomment-50634715
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1652#issuecomment-50634700
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1659#issuecomment-50635193
LGTM. Merged into master. Thanks!!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1437#issuecomment-50645071
@lianhuiwang I created a JIRA for it:
https://issues.apache.org/jira/browse/SPARK-2755 . We can serialize the object
to a stream instead of Array[Byte] directly.
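The general idea of serializing into a stream rather than building a byte array first can be illustrated outside Spark. The following is a minimal Python sketch using the standard `pickle` module; it is not the Scala/JVM change tracked in SPARK-2755, and the `payload` object and the in-memory `BytesIO` sinks are assumptions made purely for illustration.

```python
import io
import pickle

# Hypothetical object to serialize; stands in for whatever is being shipped.
payload = {"weights": [0.1, 0.2, 0.3], "intercept": 1.0}

# Approach 1: materialize the whole serialized form as one bytes object,
# then copy it into the output stream.
as_bytes = pickle.dumps(payload)
sink_a = io.BytesIO()
sink_a.write(as_bytes)

# Approach 2: serialize directly into the stream, so the caller never
# holds a separate intermediate bytes object.
sink_b = io.BytesIO()
pickle.dump(payload, sink_b)

# Either way, the data round-trips to an equal object.
sink_b.seek(0)
restored = pickle.load(sink_b)
print(restored == payload)
```

With a real file or socket in place of `BytesIO`, the second approach avoids keeping a full serialized copy in memory alongside the original object.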
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1652#issuecomment-50645844
LGTM. Merged into master. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/929#issuecomment-50654846
Merged into master. Thanks!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50662945
I tried in Python 2.7 but it didn't work:
~~~
Python 2.7.7 (default, Jun 2 2014, 01:41:14)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1663#issuecomment-50663331
LGTM. Waiting for Jenkins.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1663#discussion_r15604802
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala ---
@@ -102,36 +100,14 @@ object MLUtils {
// Convenient methods for
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1663#discussion_r15604829
--- Diff: python/pyspark/mllib/util.py ---
@@ -29,15 +29,18 @@ class MLUtils:
Helper methods to load, save and pre-process data used in MLlib
GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/1671
[SPARK-2511][MLLIB] add HashingTF and IDF
This is roughly the TF-IDF implementation used in the Databricks Cloud
Demo: http://databricks.com/cloud/ .
Both `HashingTF` and `IDF` are
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1663#issuecomment-50689249
Yes, MiMa doesn't recognize package private classes. Please add those
exclusion rules manually:
~~~
[error] * o
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50690034
@JoshRosen Ah ... I should copy it literally. Thanks! Do you know the
oldest version of Python that we support?
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1628#issuecomment-50691239
@JoshRosen If we don't support 2.5, could we use `from __future__ import
absolute_import`?
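As a rough illustration of what `absolute_import` changes, the sketch below builds a throwaway package that contains its own `random.py` and checks which module a plain `import random` resolves to from inside it. The package name `absimport_demo_pkg` and its file layout are hypothetical and unrelated to the actual PySpark tree. On Python 3 absolute imports are already the default, so the `__future__` line is a no-op there; on Python 2.5 to 2.7 it is what prevents the sibling module from shadowing the standard library.

```python
import importlib
import os
import sys
import tempfile

# Create a temporary directory holding a hypothetical package that
# ships its own module named "random".
pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, "absimport_demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")

with open(os.path.join(pkg_dir, "random.py"), "w") as f:
    f.write("MARKER = 'package-local random'\n")

with open(os.path.join(pkg_dir, "user.py"), "w") as f:
    f.write(
        "from __future__ import absolute_import\n"
        "import random  # absolute import: the standard library module\n"
        "from . import random as local_random  # explicit relative import\n"
        "stdlib_has_randint = hasattr(random, 'randint')\n"
        "local_marker = local_random.MARKER\n"
    )

# Make the package importable and load the module that does the imports.
sys.path.insert(0, pkg_root)
importlib.invalidate_caches()
user = importlib.import_module("absimport_demo_pkg.user")

print(user.stdlib_has_randint)
print(user.local_marker)
```

Inside `user.py`, the plain import reaches the standard library, while the sibling module stays reachable through the explicit relative form.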
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1673#issuecomment-50691630
Jenkins, test this please.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1673#issuecomment-50691618
Jenkins, add to whitelist.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1518#issuecomment-50691925
I think this is the approach LIBLINEAR uses. Yes, let's discuss tomorrow.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2435#issuecomment-57116574
LGTM and tested with 1000 trees. I've merged it into master. Thanks
@jkbradley for implementing RF and @codedeft @manishamde @chouqin for reviewing!
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1778#issuecomment-57204030
LGTM. Merged into master! Thanks @rezazadeh !
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18184054
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,54 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18184065
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,54 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18184061
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala ---
@@ -284,6 +285,54 @@ class PythonMLLibAPI extends Serializable
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18184091
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,124 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/2356#discussion_r18184098
--- Diff: python/pyspark/mllib/Word2Vec.py ---
@@ -0,0 +1,151 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more