[GitHub] spark pull request: [SPARK-4127] [MLlib] [PySpark] Python bindings...

2015-06-25 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6744#issuecomment-115292948 I moved `StreamingLinearAlgorithm` to `regression.py` so that it complies with `LabeledPoint` and (there were circular imports) --- If your project is set up

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6849#discussion_r33152525 --- Diff: python/pyspark/mllib/tests.py --- @@ -1011,6 +1013,137 @@ def collect(rdd): self.assertEqual(predict_results, [[0, 1, 1], [1, 0, 1

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6849#discussion_r33152218 --- Diff: python/pyspark/mllib/tests.py --- @@ -25,7 +25,8 @@ import array as pyarray from time import time, sleep -from numpy import

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114894945 @mengxr Thanks for the updates! I confirm that the tests are very similar

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6849#discussion_r33152107 --- Diff: python/pyspark/mllib/classification.py --- @@ -580,6 +583,102 @@ def train(cls, data, lambda_=1.0): return NaiveBayesModel

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114936616 jenkins test this again

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114939985 test this please

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114969875 Sure, btw the last failure was some random error.

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114949213 I have rerun the tests. (Strange that the tests that failed in the previous run were related to StreamingKMeans)

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-24 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-114977877 @mengxr Could you give this another pass? I can update tomorrow; there is some problem with my Ubuntu.

[GitHub] spark pull request: [SPARK-5989] [MLlib] Model save/load for LDA

2015-06-23 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6948#issuecomment-114577327 ping @hhbyyh @jkbradley Would be great if you could have a look!

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-23 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6849#discussion_r33117673 --- Diff: python/pyspark/mllib/classification.py --- @@ -580,6 +583,102 @@ def train(cls, data, lambda_=1.0): return NaiveBayesModel

[GitHub] spark pull request: [SPARK-5989] Model save/load for LDA

2015-06-23 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6948 [SPARK-5989] Model save/load for LDA Add support for saving and loading LDA both the local and distributed versions. You can merge this pull request into a Git repository by running: $ git

[GitHub] spark pull request: [SPARK-5989] [MLlib] Model save/load for LDA

2015-06-23 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6948#discussion_r33017051 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala --- @@ -354,4 +445,126 @@ class DistributedLDAModel private

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6821#discussion_r32894898 --- Diff: python/pyspark/mllib/feature.py --- @@ -447,6 +454,17 @@ class Word2Vec(object): syms = model.findSynonyms(vec, 2) [s[0] for s

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6821#discussion_r32895115 --- Diff: python/pyspark/mllib/feature.py --- @@ -447,6 +454,17 @@ class Word2Vec(object): syms = model.findSynonyms(vec, 2) [s[0] for s

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6821#discussion_r32891085 --- Diff: python/pyspark/mllib/feature.py --- @@ -447,6 +454,17 @@ class Word2Vec(object): syms = model.findSynonyms(vec, 2) [s[0] for s

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6821#discussion_r32891096 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -594,6 +595,8 @@ private[python] class PythonMLLibAPI extends

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6821#discussion_r32891157 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -594,6 +595,8 @@ private[python] class PythonMLLibAPI extends

[GitHub] spark pull request: [SPARK-7104][MLlib] Support model save/load in...

2015-06-21 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6821#issuecomment-113870617 @yu-iskw Sorry for the phony comments. I am trying to learn by helping others. :+1:

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-20 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-113716917 @mengxr Should I close this then?

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-20 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-113783694 @mengxr Could you review this? This should be the next one to go as I have refactored the code.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113411808 jenkins retest this please.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113523344 jenkins retest this please.

[GitHub] spark pull request: [SPARK-8479] [MLlib] Add numNonzeros and numAc...

2015-06-19 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6904 [SPARK-8479] [MLlib] Add numNonzeros and numActives to linalg.Matrices Matrices allow zeros to be stored in values. Sometimes a method is handy to check if the numNonZeros are same as number
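The difference between the two counts named in the PR title can be sketched in plain Python. This is only an illustration under assumptions: `SparseMatrixSketch` and its field names are hypothetical stand-ins, not the MLlib `Matrices` API, and the matrix is assumed to use a CSC-style layout in which a zero may be stored explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseMatrixSketch:
    """Hypothetical CSC-style sparse matrix; not the MLlib class."""
    num_rows: int
    num_cols: int
    col_ptrs: List[int]
    row_indices: List[int]
    values: List[float]

    def num_actives(self) -> int:
        # Every explicitly stored entry counts, including stored zeros.
        return len(self.values)

    def num_nonzeros(self) -> int:
        # Only stored entries whose value is not zero.
        return sum(1 for v in self.values if v != 0.0)


# A 3x2 matrix with three stored entries, one of which is an explicit zero.
m = SparseMatrixSketch(3, 2, [0, 2, 3], [0, 2, 1], [1.0, 0.0, 4.0])
```

When the two counts differ, as they do for `m`, at least one zero is being stored explicitly.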

[GitHub] spark pull request: [SPARK-6364] [MLlib] Implement equals and hash...

2015-06-19 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/5081#discussion_r32833024 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala --- @@ -100,6 +100,50 @@ sealed trait Matrix extends Serializable

[GitHub] spark pull request: [SPARK-6364] [MLlib] Implement equals and hash...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5081#issuecomment-113522900 @mengxr Please have a look at https://github.com/apache/spark/pull/6904/ This will help in ruling out some cases with minimum code repetition.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113484564 jenkins retest this please.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113395933 @mengxr Updated! Please review.

[GitHub] spark pull request: [SPARK-6364] [MLlib] Implement equals and hash...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5081#issuecomment-113428460 Please do not. We just had a discussion about this yesterday.

[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...

2015-06-19 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r32809380 --- Diff: python/pyspark/mllib/linalg.py --- @@ -821,6 +822,36 @@ def __reduce__(self): self.numRows, self.numCols, self.values.tostring

[GitHub] spark pull request: [SPARK-8479] [MLlib] Add numNonzeros and numAc...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6904#issuecomment-113582010 @mengxr Sorry about that. Btw, when is a foreach preferred as compared to a while loop?

[GitHub] spark pull request: [SPARK-8265] [MLlib] [PySpark] Add LinearDataG...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6715#issuecomment-113615256 Rebased over master.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113615382 Thanks for your reviews and help.

[GitHub] spark pull request: [SPARK-6364] [MLlib] Implement equals and hash...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5081#issuecomment-113599173 There seems to be an unrelated whitespace error in L30 in `/spark/examples/src/main/scala/org/apache/spark/examples/DFSReadWriteTest.scala`

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-19 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5748#issuecomment-113608869 @jkbradley I just had a proper look at this after a long time. I think this PR succeeds in preventing the huge Word2Vec map while constructing the Word2Vec

[GitHub] spark pull request: [SPARK-8265] [MLlib] [PySpark] Add LinearDataG...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6715#issuecomment-113365386 @mengxr Thanks for your comments. I have addressed them.

[GitHub] spark pull request: [SPARK-8265] [MLlib] [PySpark] Add LinearDataG...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6715#issuecomment-113278651 jenkins retest this please

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-18 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6499#discussion_r32759474 --- Diff: docs/mllib-clustering.md --- @@ -593,6 +593,58 @@ ssc.start() ssc.awaitTermination() {% endhighlight %} +/div + +div

[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-113280600 @mengxr The idea is to use __repr__ in order to enable eval(). Sadly we cannot leverage SciPy's methods for matrices since it is an optional dependency.
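The idea of a `__repr__` that enables `eval()` can be sketched without PySpark. The class below is a hypothetical stand-in (it is not `pyspark.mllib.linalg.DenseMatrix`, and the PR's actual output format may differ); it only shows the round-trip property the comment describes.

```python
class DenseMatrixSketch:
    """Hypothetical stand-in for a dense matrix; not the PySpark class."""

    def __init__(self, num_rows, num_cols, values):
        self.num_rows = num_rows
        self.num_cols = num_cols
        self.values = list(values)

    def __repr__(self):
        # Emit a valid constructor call so that eval(repr(m)) rebuilds the matrix.
        return "DenseMatrixSketch(%d, %d, %r)" % (
            self.num_rows, self.num_cols, self.values)

    def __eq__(self, other):
        return (isinstance(other, DenseMatrixSketch)
                and (self.num_rows, self.num_cols, self.values)
                == (other.num_rows, other.num_cols, other.values))


m = DenseMatrixSketch(2, 2, [1.0, 0.0, 0.0, 1.0])
round_tripped = eval(repr(m))  # rebuilds an equal object from its repr
```

A human-oriented `__str__` can then be formatted freely, because only `__repr__` has to remain a valid expression.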

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-113277715 @freeman-lab @mengxr Thanks for your valuable comments. I've updated the PR.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-18 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6499#discussion_r32773505 --- Diff: python/pyspark/mllib/tests.py --- @@ -863,6 +876,107 @@ def test_model_transform(self): eprod.transform(sparsevec), SparseVector

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-113052658 I have added tests to verify it. from pyspark.mllib.regression import LabeledPoint lb = LabeledPoint(2, [0.1, 1.2, 3.4]) rdd
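For context, parsing a string back into a labeled point can be sketched in plain Python. Everything here is an assumption for illustration: `parse_labeled_point` is a hypothetical helper, not the function added in the PR, and the input is assumed to have the dense form `(label,[f1,f2,...])`; sparse vectors are not handled.

```python
import ast


def parse_labeled_point(text):
    """Parse '(label,[f1,f2,...])' into (label, features). Hypothetical helper."""
    # literal_eval safely evaluates the tuple/list literal without running code.
    label, features = ast.literal_eval(text.strip())
    return float(label), [float(x) for x in features]


parsed = parse_labeled_point("(2.0,[0.1,1.2,3.4])")
```

Here `parsed` is a plain `(label, features)` tuple rather than a `LabeledPoint`, which keeps the sketch free of any Spark dependency.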

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-18 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-113053477 MLUtils.LabeledPoints is just a wrapper around the Scala code, right? Were you suggesting that this should also be a wrapper around the parse method in Scala

[GitHub] spark pull request: [SPARK-5673] [MLlib] Implement Streaming wrapp...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/4456#issuecomment-112907685 I'm not an expert but it seems that the PR topic is out of place.

[GitHub] spark pull request: [SPARK-7605] [MLlib] [PySpark] Python API for ...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6346#issuecomment-112931617 ping @davies Would you be able to have a look at this?

[GitHub] spark pull request: [SPARK-5673] [MLlib] Implement Streaming wrapp...

2015-06-17 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/4456#discussion_r32658912 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLassoWithSGD.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7605] [MLlib] [PySpark] Python API for ...

2015-06-17 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6346#discussion_r32670198 --- Diff: python/pyspark/mllib/feature.py --- @@ -525,6 +526,41 @@ def fit(self, data): return Word2VecModel(jmodel) +class

[GitHub] spark pull request: [SPARK-7605] [MLlib] [PySpark] Python API for ...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6346#issuecomment-112940733 @davies I addressed your comments and replaced `vector` with `scalingVector` for consistency.

[GitHub] spark pull request: [SPARK-7605] [MLlib] [PySpark] Python API for ...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6346#issuecomment-113040170 @davies Thanks. Do you want me to look at any particular PR if it needs reviews?

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-113044527 Hmm. The currently supported format is consistent with the one used in Scala.

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-113040286 @davies Could you have a look at this also? :P

[GitHub] spark pull request: [MLlib] [SPARK-7667] MLlib Python API consiste...

2015-06-17 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6856#discussion_r32618639 --- Diff: python/pyspark/mllib/clustering.py --- @@ -96,6 +96,9 @@ def k(self): def predict(self, x): Find the cluster to which

[GitHub] spark pull request: [MLlib] [SPARK-7667] MLlib Python API consiste...

2015-06-17 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6856#discussion_r32618827 --- Diff: python/pyspark/mllib/tree.py --- @@ -90,9 +92,11 @@ def predict(self, x): else: return self.call(predict

[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6354#issuecomment-112803758 ping @davies anything left?

[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6354#issuecomment-112857730 jenkins retest this please

[GitHub] spark pull request: [SPARK-3258] [MLlib] [PySpark] Python bindings...

2015-06-16 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6849 [SPARK-3258] [MLlib] [PySpark] Python bindings for StreamingLogisticRegressionwithSGD You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-16 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-112663978 cc @freeman-lab :P

[GitHub] spark pull request: [SPARK-7633] [MLlib] [PySpark] Python bindings...

2015-06-16 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6849#issuecomment-112648942 jenkins retest this please

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-12 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-111675920 Thanks, this is also closely related to https://github.com/apache/spark/pull/6715 , https://github.com/apache/spark/pull/6744 and https://github.com/apache/spark

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-12 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-111507833 @freeman-lab I think this is ready for a first pass. Please review :)

[GitHub] spark pull request: [SPARK-4127] Python bindings for StreamingLine...

2015-06-10 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6744 [SPARK-4127] Python bindings for StreamingLinearRegressionWithSGD You can merge this pull request into a Git repository by running: $ git pull https://github.com/MechCoder/spark spark-4127

[GitHub] spark pull request: [SPARK-4127] [MLlib] [PySpark] Python bindings...

2015-06-10 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6744#issuecomment-110743132 cc: @freeman-lab Will remove the LinearDataGenerator stuff (or rebase over master) as soon as https://github.com/apache/spark/pull/6715 is merged.

[GitHub] spark pull request: [SPARK-8291] [MLlib] [PySpark] Add parse funct...

2015-06-10 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6746#issuecomment-110764267 cc: @srowen @brkyvz Can you please have a look at this?

[GitHub] spark pull request: [SPARK-8291] Add parse functionality to Labele...

2015-06-10 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6746 [SPARK-8291] Add parse functionality to LabeledPoint in PySpark It is useful to have functionality that can parse a string into a LabeledPoint while loading files, etc You can merge this pull

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-10 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-110616969 jenkins retest this please

[GitHub] spark pull request: [SPARK-8140] [MLlib] Remove construct to get w...

2015-06-09 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6720#issuecomment-110301758 ping @srowen

[GitHub] spark pull request: [SPARK-8140] [MLlib] Remove construct to get w...

2015-06-09 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6720 [SPARK-8140] [MLlib] Remove construct to get weights in StreamingLinearAlgorithm You can merge this pull request into a Git repository by running: $ git pull https://github.com/MechCoder

[GitHub] spark pull request: [SPARK-8265] [MLlib] [PySpark] Add LinearDataG...

2015-06-09 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6715#issuecomment-110268404 cc: @davies Please have a look.

[GitHub] spark pull request: [SPARK-8265] [MLlib] [PySpark] Add LinearDataG...

2015-06-09 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6715 [SPARK-8265] [MLlib] [PySpark] Add LinearDataGenerator to pyspark.mllib.utils You can merge this pull request into a Git repository by running: $ git pull https://github.com/MechCoder

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-09 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r32049209 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala --- @@ -63,11 +63,49 @@ class VectorsSuite extends FunSuite

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-09 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6112#issuecomment-110463701 @GeorgeDittmar People are busy with the Spark Summit so it might take some time for core developers to have a last look ;)

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5748#issuecomment-110228137 @jkbradley ping?

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-08 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/5748#issuecomment-110228122 jenkins retest this please

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r31921583 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala --- @@ -717,6 +719,53 @@ class SparseVector( new SparseVector(size

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r31923436 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala --- @@ -717,6 +719,53 @@ class SparseVector( new SparseVector(size

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r31923358 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala --- @@ -717,6 +719,53 @@ class SparseVector( new SparseVector(size

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6112#issuecomment-110029570 LGTM once the minor comments are addressed. cc @mengxr

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r31921723 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala --- @@ -717,6 +719,53 @@ class SparseVector( new SparseVector(size

[GitHub] spark pull request: [Spark-7422][MLLIB] Add argmax to Vector, Spar...

2015-06-08 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6112#discussion_r31923541 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala --- @@ -63,11 +63,56 @@ class VectorsSuite extends FunSuite

[GitHub] spark pull request: [SPARK-8140] [MLlib] Minor internal improvemen...

2015-06-06 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6684 [SPARK-8140] [MLlib] Minor internal improvements in Streaming MLlib Algorithms 1. Prevent creating a map of data to find numFeatures 2. If model is empty, then initialize with a zero vector

[GitHub] spark pull request: [SPARK-8140] [Minor] [MLlib] Minor internal im...

2015-06-06 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6684#discussion_r31868166 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala --- @@ -195,11 +195,11 @@ abstract class

[GitHub] spark pull request: [SPARK-8140] [Minor] [MLlib] Minor internal im...

2015-06-06 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6684#issuecomment-109606777 @srowen I removed the None check and restored the model.isEmpty check.

[GitHub] spark pull request: [SPARK-8140] [Minor] [MLlib] Minor internal im...

2015-06-06 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6684#discussion_r31869050 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala --- @@ -179,7 +179,7 @@ abstract class

[GitHub] spark pull request: [SPARK-8140] [Minor] [MLlib] Minor internal im...

2015-06-06 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6684#discussion_r31869181 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala --- @@ -179,7 +179,7 @@ abstract class

[GitHub] spark pull request: [SPARK-8140] [Minor] [MLlib] Minor internal im...

2015-06-06 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6684#discussion_r31868258 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala --- @@ -79,9 +79,6 @@ abstract class

[GitHub] spark pull request: [SPARK-7639] [PySpark] [MLlib] Python API for ...

2015-06-06 Thread MechCoder
Github user MechCoder closed the pull request at: https://github.com/apache/spark/pull/6387

[GitHub] spark pull request: [SPARK-8140] [MLlib] Remove empty model check ...

2015-06-06 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6684#issuecomment-109700780 Also if `input.first().features.size` is better than `input.map(_.features.size).first()`

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-05 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-109431092 @freeman-lab Finally I've made the tests pass. I would like to know if this is on the right track, since I'm going to add support for Linear / Logistic Regression

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-04 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-109156672 jenkins retest this please

[GitHub] spark pull request: [SPARK-8032] Make version checking for NumPy i...

2015-06-02 Thread MechCoder
GitHub user MechCoder opened a pull request: https://github.com/apache/spark/pull/6579 [SPARK-8032] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4, however `1.x
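
The pitfall described in this pull request can be reproduced without NumPy installed. The snippet below is only a minimal sketch, not the change made in the PR (the diff is not shown here); the helper names `parse_version` and `numpy_is_supported` are hypothetical, and the 1.4 minimum is taken from the "MLlib requires NumPy 1.4+" comment quoted later in the thread. It compares the major and minor components as integers instead of comparing version strings.

```python
def parse_version(version):
    """Return (major, minor) as integers, e.g. '1.10.4' -> (1, 10).

    Hypothetical helper: only the leading digits of each component are
    used, so a suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in version.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad so that a bare '2' is treated as (2, 0).
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def numpy_is_supported(version, minimum=(1, 4)):
    """Compare numerically, so that 1.10 counts as newer than 1.4."""
    return parse_version(version) >= minimum


# Plain string comparison goes character by character, so '1.10' sorts
# before '1.4' and a newer NumPy would be rejected.
naive_rejects_1_10 = "1.10" < "1.4"
```

With real NumPy the argument would be `numpy.__version__`; integer tuples compare element by element, which is what makes `(1, 10) >= (1, 4)` hold.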

[GitHub] spark pull request: [SPARK-8032] [PySpark] Make version checking f...

2015-06-02 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6579#issuecomment-107833080 ping @mengxr @jkbradley

[GitHub] spark pull request: [SPARK-8032] [PySpark] Make version checking f...

2015-06-02 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6579#issuecomment-108061216 I see, I misunderstood the comment 'MLlib requires NumPy 1.4+'

[GitHub] spark pull request: [SPARK-8032] [PySpark] Make version checking f...

2015-06-02 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6579#issuecomment-108065006 lol. sorry about the silly mistake.

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-02 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6499#discussion_r31555361 --- Diff: python/pyspark/mllib/tests.py --- @@ -818,6 +830,78 @@ def test_model_transform(self): self.assertEqual(model.transform([1.0, 2.0

[GitHub] spark pull request: [SPARK-4118] [MLlib] [PySpark] Python bindings...

2015-06-02 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6499#issuecomment-107970067 @freeman-lab @mengxr Please give a first pass. I've updated the PR with a couple of tests. Also, Jenkins is showing me this error in Python 3 for the doctests

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-02 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/5748#discussion_r31533274 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -400,17 +400,13 @@ class Word2Vec extends Serializable with Logging

[GitHub] spark pull request: [SPARK-8032] [PySpark] Make version checking f...

2015-06-02 Thread MechCoder
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6579#issuecomment-107990627 jenkins test this

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-02 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/5748#discussion_r31537727 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -508,7 +507,7 @@ class Word2VecModel private[mllib

[GitHub] spark pull request: [SPARK-7045] [MLlib] Avoid intermediate repres...

2015-06-02 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/5748#discussion_r31537671 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -426,38 +422,40 @@ class Word2Vec extends Serializable with Logging
