[GitHub] spark pull request: [MLlib] word2vec: Distributed Representation o...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15703665 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,376 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] word2vec: Distributed Representation o...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15703627 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,376 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] word2vec: Distributed Representation o...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15703716 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala --- @@ -0,0 +1,40 @@ +/* +* Licensed to the Apache Software

[GitHub] spark pull request: [MLlib] word2vec: Distributed Representation o...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15703745 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,376 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] word2vec: Distributed Representation o...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-50901946 @Ishiihara This is great! Could you add the JIRA number to the title `[SPARK-]`? I will ping you after I finish the first pass. --- If your project is set up

[GitHub] spark pull request: [SPARK-1997] update breeze to version 0.8.1

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/940#issuecomment-50908990 I reverted the change in #1718 and asked @marmbrus to take a look at the dependency issues caused by scala-logging.

[GitHub] spark pull request: [HOTFIX] downgrade breeze version to 0.7

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1718#issuecomment-50909062 I merged this PR to master to not break the assembly jar.

[GitHub] spark pull request: [SPARK-2786][mllib] Python correlations

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1713#discussion_r15717240 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -456,6 +458,37 @@ class PythonMLLibAPI extends Serializable

[GitHub] spark pull request: [SPARK-2786][mllib] Python correlations

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1713#discussion_r15717696 --- Diff: python/pyspark/mllib/stat.py --- @@ -0,0 +1,103 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2786][mllib] Python correlations

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1713#discussion_r15718765 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -49,43 +49,48 @@ private[stat] trait Correlation

[GitHub] spark pull request: [SPARK-2786][mllib] Python correlations

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1713#discussion_r15718783 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -456,6 +458,37 @@ class PythonMLLibAPI extends Serializable

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-50934956 @freeman-lab I think the static methods `StreamingLinearRegressionWithSGD.start` are not necessary, and these methods actually do not start anything. Do you mind removing

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15719184 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50938240 @miccagiann In case you didn't know, today is the deadline for merging new features into v1.1. MLlib is less strict than core but we tried to meet the deadline

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15720780 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15720792 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15720822 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15720956 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721094 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721159 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721232 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721254 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721390 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721454 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721563 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721747 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721768 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721765 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721802 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721791 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721821 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721868 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721908 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala --- @@ -0,0 +1,40 @@ +/* +* Licensed to the Apache Software

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721918 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721930 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15721977 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2796] [mllib] DecisionTree bug fix: ord...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1720#discussion_r15722196 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -522,28 +522,36 @@ object DecisionTree extends Serializable with Logging

[GitHub] spark pull request: [SPARK-2796] [mllib] DecisionTree bug fix: ord...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1720#discussion_r15722274 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -42,6 +42,18 @@ class DecisionTreeSuite extends FunSuite

[GitHub] spark pull request: [SPARK-2796] [mllib] DecisionTree bug fix: ord...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1720#issuecomment-50942241 LGTM. Waiting for Jenkins ...

[GitHub] spark pull request: [SPARK-2796] [mllib] DecisionTree bug fix: ord...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1720#issuecomment-50943502 Merged into master. Thanks!

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15723222 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala --- @@ -162,45 +162,55 @@ object GradientDescent extends Logging

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15723261 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala --- @@ -162,45 +162,55 @@ object GradientDescent extends Logging

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15723317 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala --- @@ -0,0 +1,133 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50946926 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50946978 @miccagiann Thanks for updating the PR! LGTM and waiting for Jenkins ...

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15724703 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,375 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15724782 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala --- @@ -174,17 +182,18 @@ object GradientDescent extends Logging

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15724803 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala --- @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15724885 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15724913 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725089 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,92 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725093 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,92 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725100 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,92 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725151 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -19,6 +19,8 @@ package org.apache.spark.mllib.api.python

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725310 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725329 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725364 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725375 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725411 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725482 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -459,6 +466,76 @@ class PythonMLLibAPI extends Serializable

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725504 --- Diff: python/pyspark/mllib/tree.py --- @@ -0,0 +1,219 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1361#discussion_r15725538 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15725584 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,92 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50949076 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50949963 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50949959 Jenkins, add to whitelist.

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1719#issuecomment-50949973 How about making more iterations?

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50951638 Yes, you need to merge the latest master and resolve conflicts first.

[GitHub] spark pull request: [SPARK-1580][MLLIB] Estimate ALS communication...

2014-08-01 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/1731 [SPARK-1580][MLLIB] Estimate ALS communication and computation costs. Continue the work from #493. Closes #493 and Closes #593 You can merge this pull request into a Git repository

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50952102 LGTM. Waiting for Jenkins

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50952132 I added you to the whitelist. Jenkins should be triggered automatically for changes from you.

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50952343 Great! Do you mind adding regularization type and intercept to other linear methods? For example, `LogisticRegressionWithSGD` and `SVMWithSGD`.

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-50952454 LGTM. Merged into master. Thanks a lot for putting Streaming and MLlib together!

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50952492 It should be part of the same JIRA. But let's do that in a separate PR.

[GitHub] spark pull request: [SPARK-2801][MLlib]: DistributionGenerator ren...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1732#issuecomment-50952558 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50953229 Merged into master. Thanks!

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15727075 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,401 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15727095 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,401 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [MLlib] [SPARK-2510]word2vec: Distributed Repr...

2014-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1719#discussion_r15727134 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -0,0 +1,401 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-1580][MLLIB] Estimate ALS communication...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1731#issuecomment-50953592 Merged into master. Thanks @tmyklebu for the work!

[GitHub] spark pull request: Adding OWL-QN optimizer for L1 regularizations...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/840#issuecomment-50953718 @codedeft Could you add `[SPARK-1892][MLLIB]` to the title of this PR? So it shows up in the result if people search for the JIRA or `[MLLIB]`. Thanks!

[GitHub] spark pull request: [SPARK-2550][MLLIB][APACHE SPARK] Support regu...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1624#issuecomment-50953756 I re-opened the JIRA. Please use the same JIRA number for your new PR. Thanks!

[GitHub] spark pull request: [SPARK-2801][MLlib]: DistributionGenerator ren...

2014-08-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1732#issuecomment-50954562 LGTM. Merged into master. Thanks!

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729982 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729983 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729985 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729981 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729990 --- Diff: python/run-tests --- @@ -71,6 +71,7 @@ run_test pyspark/mllib/random.py run_test pyspark/mllib/recommendation.py run_test pyspark/mllib

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729987 --- Diff: python/pyspark/mllib/tests.py --- @@ -127,9 +128,19 @@ def test_classification(self): self.assertTrue(nb_model.predict(features[2]) <= 0

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729986 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729984 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729989 --- Diff: python/pyspark/mllib/tests.py --- @@ -256,9 +276,19 @@ def test_classification(self): self.assertTrue(nb_model.predict(features[2]) <= 0

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15729992 --- Diff: python/pyspark/mllib/util.py --- @@ -29,9 +30,9 @@ class MLUtils: Helper methods to load, save and pre-process data used in MLlib

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1727#issuecomment-50972970 LGTM. Merged into both master and branch-1.1. Thanks!!

[GitHub] spark pull request: [SPARK-2197] [mllib] Java DecisionTree bug fix...

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1740#discussion_r15732026 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala --- @@ -17,6 +17,8 @@ package

[GitHub] spark pull request: [SPARK-2197] [mllib] Java DecisionTree bug fix...

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1740#discussion_r15732025 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala --- @@ -60,4 +62,31 @@ class Strategy ( val

[GitHub] spark pull request: [SPARK-2197] [mllib] Java DecisionTree bug fix...

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1740#discussion_r15732032 --- Diff: mllib/src/test/java/org/apache/spark/mllib/tree/JavaDecisionTreeSuite.java --- @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2197] [mllib] Java DecisionTree bug fix...

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1740#discussion_r15732035 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -17,6 +17,8 @@ package org.apache.spark.mllib.tree

[GitHub] spark pull request: [SPARK-2197] [mllib] Java DecisionTree bug fix...

2014-08-02 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1740#issuecomment-50977449 Thanks! This looks good to me except minor style issues.

[GitHub] spark pull request: [SPARK-2478] [mllib] DecisionTree Python API

2014-08-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1727#discussion_r15732257 --- Diff: examples/src/main/python/mllib/tree.py --- @@ -0,0 +1,129 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more
