[GitHub] spark pull request: [SPARK-10266][Documentation, ML] Fixed @Since ...

2015-11-24 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/9338#issuecomment-159428812
  
Hi @jkbradley - the version numbers look OK - but I've been unable to get 
the Scala API docs building in my environment (not just this branch - I get the 
same problems on master) - so I can't fully verify it. 

If you can review it locally - that would probably be quicker in this case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-11-08 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-154882353
  
@yu-iskw Thanks for making the changes - LGTM





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-10-25 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-150954098
  
@yu-iskw Almost there - you're just missing the version numbers on 
`GBTClassifier` class + methods (they should all be set to `1.4.0`), everything 
else looks good though.
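
For illustration, a version annotation on a class and one of its methods might look like the sketch below. The `since` decorator here is a simplified stand-in written for this example (it is an assumption that the real helper behaves similarly), and `ExampleClassifier` with its `getMaxIter` method is hypothetical, not the actual `GBTClassifier` code.

```python
import re


def since(version):
    """Simplified stand-in: append a ``versionadded`` directive to a docstring."""
    indent_pattern = re.compile(r"\n( +)")

    def decorator(func):
        doc = func.__doc__ or ""
        # Reuse the smallest indentation found in the docstring so the
        # appended directive lines up with the existing text.
        indents = indent_pattern.findall(doc)
        indent = " " * (min(len(m) for m in indents) if indents else 0)
        func.__doc__ = doc.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return func
    return decorator


class ExampleClassifier(object):
    """
    Hypothetical estimator used only to illustrate the annotations.

    .. versionadded:: 1.4.0
    """

    def __init__(self, maxIter=20):
        self._maxIter = maxIter

    @since("1.4.0")
    def getMaxIter(self):
        """
        Gets the value of maxIter.
        """
        return self._maxIter
```

The class carries the directive directly in its docstring, while each public method gets it through the decorator.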





[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...

2015-10-25 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8627#issuecomment-150949269
  
Test this please





[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...

2015-10-21 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8627#issuecomment-150025192
  
@yu-iskw Thanks for testing!
@mengxr I've rebased now - the changes were very minor. Should be good to 
merge once the tests complete.





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-10-21 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-150023630
  
@yu-iskw Thanks for removing those methods - but did you lose some of the 
version numbers when rebasing? E.g. ```DecisionTreeClassifier.getImpurity()``` 
has a ```@since```, but ```RandomForestClassifier.getImpurity()``` doesn't now.
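
A quick way to spot this kind of gap is to list the public methods whose docstrings lack a `.. versionadded::` directive. The sketch below is only an illustration: `methods_missing_version` and the two small classes are hypothetical names, not code from the PR.

```python
import inspect


def methods_missing_version(cls):
    """Return names of public callables on cls with no ``versionadded`` directive."""
    missing = []
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # skip private and special members
        doc = inspect.getdoc(member) or ""
        if ".. versionadded::" not in doc:
            missing.append(name)
    return missing


class AnnotatedTree(object):
    """Hypothetical class whose method carries a version number."""

    def getImpurity(self):
        """
        Gets the value of impurity or its default value.

        .. versionadded:: 1.4.0
        """
        return "gini"


class UnannotatedForest(object):
    """Hypothetical class whose method lost its version number."""

    def getImpurity(self):
        """
        Gets the value of impurity or its default value.
        """
        return "gini"
```

Calling `methods_missing_version(UnannotatedForest)` reports `getImpurity`, while the annotated class reports nothing.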





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-10-21 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8684#issuecomment-150018838
  
I think we should keep the comments - but just tweak them so that they're 
valid [reStructuredText](http://sphinx-doc.org/rest.html). 

In this case all that really needs to be done is to remove the square 
brackets from URLs and format the indented blocks in the same way as the 
formula in the ```RidgeRegressionWithSGD.train()``` method (i.e. a 4-space 
indent plus blank lines before and after).
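
As a rough sketch of that layout (the function name, formula and URL below are placeholders for illustration, not the actual MLlib docstring):

```python
def train_example(data, step=1.0):
    """
    Train a model on the given data (illustrative docstring only).

    The objective below is set off as an indented block, with a blank
    line before and after it:

        f(weights) = 1/2n ||A weights - y||^2

    See http://example.org/least-squares for background (a bare URL,
    with no square brackets around it).

    :param data: The training data.
    :param step: The step size (default: 1.0).
    """
    return data, step
```

The blank lines are what let reStructuredText treat the indented formula as its own block rather than as a continuation of the paragraph above it.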





[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...

2015-10-21 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8627#issuecomment-149997391
  
Jenkins, retest this please





[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...

2015-09-21 Thread noel-smith
Github user noel-smith closed the pull request at:

https://github.com/apache/spark/pull/8855





[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...

2015-09-21 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8855

[Doc][PySpark][MLLib] Added newlines to docstrings to fix parameter 
formatting (1.5 backport)

Backport of #8851 for 1.5 branch.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark docstring-missing-newline-fix-1-5-backport

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8855.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8855


commit 6d779f6bb2f3e7d08b8a8a5bd25e5e3f90b64010
Author: noelsmith 
Date:   2015-09-21T21:24:19Z

Added newlines to docstrings to fix parameter formatting - backport for 1.5 
branch.







[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...

2015-09-21 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8851#issuecomment-142115598
  
Sure - should be straightforward, I'll take a look.





[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...

2015-09-21 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8851

[Doc][PySpark][MLLib] Added newlines to docstrings to fix parameter 
formatting

Added newlines before `:param ...:` and `:return:` markup. Without these, 
parameter lists aren't formatted correctly in the API docs. I.e:

![screen shot 2015-09-21 at 21 49 26](https://cloud.githubusercontent.com/assets/11915197/10004686/de3c41d4-60aa-11e5-9c50-a46dcb51243f.png)

.. looks like this once newline is added:

![screen shot 2015-09-21 at 21 50 14](https://cloud.githubusercontent.com/assets/11915197/10004706/f86bfb08-60aa-11e5-8524-ae4436713502.png)
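
To make the difference concrete, the sketch below contrasts a docstring whose field list starts directly after the description with one that has the blank line. The helper `first_field_needs_blank_line` and both sample docstrings are hypothetical and were written for this illustration.

```python
import re

# Matches the start of a ``:param ...:``, ``:return:`` or ``:returns:`` field.
FIELD = re.compile(r"^\s*:(param|returns?)\b")


def first_field_needs_blank_line(docstring):
    """Return True if the first field follows a non-blank line."""
    lines = docstring.splitlines()
    for i, line in enumerate(lines):
        if FIELD.match(line):
            return i > 0 and lines[i - 1].strip() != ""
    return False


BROKEN = """Train a model.
    :param data: The training data.
    :return: The trained model.
    """

FIXED = """Train a model.

    :param data: The training data.
    :return: The trained model.
    """
```

Here `first_field_needs_blank_line(BROKEN)` is true and `first_field_needs_blank_line(FIXED)` is false; only the first field needs the separating blank line, since later fields continue the same list.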


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark docstring-missing-newline-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8851.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8851


commit fb706bb85e65024c5955f9c2dacda0c64dae102b
Author: noelsmith 
Date:   2015-09-21T20:35:49Z

Added newlines to docstring to fix parameter formatting







[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...

2015-09-19 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8571#issuecomment-141709019
  
Tweaked JavaScript to make it more robust to Sphinx changes. Set minimal 
version of Sphinx to `1.2`.
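
One common way to state a minimum Sphinx version is the `needs_sphinx` option in `conf.py`. The fragment below is a sketch of that approach and an assumption about how it could be done; the comment above does not say which mechanism the PR actually uses.

```python
# Sphinx conf.py fragment (a sketch; the surrounding settings are omitted).
# `needs_sphinx` is a standard Sphinx option: the build stops with an error
# if the installed Sphinx is older than this version.
needs_sphinx = '1.2'
```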





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-09-17 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-141213397
  
The comments on `MultilayerPerceptronClassifier` + 
`MultilayerPerceptronClassifierModel` are good. Just need to remove the four 
functions noted above (these were deleted recently in master).





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-09-17 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8690#discussion_r39791016
  
--- Diff: python/pyspark/ml/classification.py ---
@@ -116,6 +120,37 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
     def _create_model(self, java_model):
         return LogisticRegressionModel(java_model)
 
+    @since("1.4.0")
+    def setElasticNetParam(self, value):
+        """
+        Sets the value of :py:attr:`elasticNetParam`.
+        """
+        self._paramMap[self.elasticNetParam] = value
+        return self
+
+    @since("1.4.0")
+    def getElasticNetParam(self):
+        """
+        Gets the value of elasticNetParam or its default value.
+        """
+        return self.getOrDefault(self.elasticNetParam)
+
+    @since("1.4.0")
+    def setFitIntercept(self, value):
+        """
+        Sets the value of :py:attr:`fitIntercept`.
+        """
+        self._paramMap[self.fitIntercept] = value
+        return self
+
+    @since("1.4.0")
+    def getFitIntercept(self):
+        """
+        Gets the value of fitIntercept or its default value.
+        """
+        return self.getOrDefault(self.fitIntercept)
+
--- End diff --

Should remove the four functions above - they were deleted by a recent 
commit to master.





[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8627#issuecomment-140899724
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-10285][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8695#discussion_r39686456
  
--- Diff: python/pyspark/ml/util.py ---
@@ -36,6 +39,8 @@ def wrapper(*args, **kwargs):
 class Identifiable(object):
     """
     Object with a unique ID.
+
+    .. addedversion:: 1.3.0
--- End diff --

Can remove this, as it doesn't appear in the generated API docs.





[GitHub] spark pull request: [SPARK-10284][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8694#issuecomment-140896835
  
Just need to replace `.. addedversion::` with `.. versionadded::` and
add a version number to `ParamGridBuilder.build()`.
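
Sphinx provides a `versionadded` directive but no `addedversion` one, so the misspelled form is not rendered as a version note. A minimal sketch of the substitution follows; `fix_directive` is a hypothetical helper written for illustration, and the sample text is not taken from the PR.

```python
import re

MISSPELLED = re.compile(r"\.\. addedversion::")


def fix_directive(source):
    """Replace the unrecognised ``addedversion`` directive with ``versionadded``."""
    return MISSPELLED.sub(".. versionadded::", source)


SAMPLE = '''class Example(object):
    """
    Hypothetical class docstring.

    .. addedversion:: 1.4.0
    """
'''

FIXED_SAMPLE = fix_directive(SAMPLE)
```

After the call, `FIXED_SAMPLE` contains `.. versionadded:: 1.4.0` with the original indentation left untouched.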







[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8693#discussion_r39684621
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -147,6 +159,8 @@ class TreeRegressorParams(object):
 class RandomForestParams(object):
     """
     Private class to track supported random forest parameters.
+
+    .. addedversion:: 1.4.0
--- End diff --

Can also remove this





[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8693#discussion_r39684568
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -154,6 +168,8 @@ class RandomForestParams(object):
 class GBTParams(object):
     """
     Private class to track supported GBT params.
+
--- End diff --

This is non-public too





[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8693#discussion_r39684460
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -140,6 +150,8 @@ def intercept(self):
 class TreeRegressorParams(object):
     """
     Private class to track supported impurity measures.
+
+    .. addedversion:: 1.4.0
--- End diff --

Can probably remove this as it isn't a public class





[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8693#issuecomment-140892197
  
Replace `.. addedversion::` with `.. versionadded::`.

Merge from master to remove `setElasticNetParam` and `getElasticNetParam` 
functions.






[GitHub] spark pull request: [SPARK-10282][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8692#issuecomment-140889476
  
Need to replace `.. addedversion::` with `.. versionadded::` in the class 
docstrings.





[GitHub] spark pull request: [SPARK-10281][ML][PySpark][Docs] Add @since an...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8691#issuecomment-140870734
  
No problems, LGTM





[GitHub] spark pull request: [SPARK-10269] Add @since annotation to pyspark...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8626#issuecomment-140868760
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-140856366
  
Some functions have been removed since this PR was created + there are some 
new classes (`MultilayerPerceptronClassifier`) without version numbers.

@mengxr How do you usually handle this? Would you expect the PR to be 
rebased/merged with the latest version of the file?





[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8690#issuecomment-140853730
  
Should replace `.. addedversion::` with `.. versionadded::` in all the 
class docstrings.





[GitHub] spark pull request: [SPARK-10279][MLlib][PySpark][Docs] Add @since...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8689#issuecomment-140836995
  
That works too, thanks for resolving - LGTM





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8685#issuecomment-140835549
  
Thanks for the changes - LGTM!





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-16 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8684#issuecomment-140833937
  
That's great - LGTM!





[GitHub] spark pull request: [SPARK-10279][MLlib][PySpark][Docs] Add @since...

2015-09-15 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8689#issuecomment-140536184
  
These changes are already included in #8685. Can probably drop this PR and 
merge SPARK-10279 with SPARK-10278.





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8685#issuecomment-140533249
  
Just minor fixes needed - change `addedversion` to `versionadded` + one 
version number alteration.





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39562007
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -235,6 +248,8 @@ def save(self, sc, path):
 class Loader(object):
     """
     Mixin for classes which can load saved models from files.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39562037
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -280,15 +297,21 @@ def _load_java(cls, sc, path):
         return java_obj.load(sc._jsc.sc(), path)
 
     @classmethod
+    @since("1.3.0")
     def load(cls, sc, path):
+        """Load a model from the given path."""
         java_model = cls._load_java(sc, path)
         return cls(java_model)
 
 
 class LinearDataGenerator(object):
-    """Utils for generating linear data"""
+    """Utils for generating linear data.
+
+    .. addedversion:: 1.5.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39562027
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -256,6 +271,8 @@ class JavaLoader(Loader):
     """
     Mixin for classes which can load saved models using its Scala
     implementation.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561998
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -222,9 +231,13 @@ class JavaSaveable(Saveable):
     """
     Mixin for models that provide save() through their Scala
     implementation.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561981
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -197,6 +204,8 @@ def loadVectors(sc, path):
 class Saveable(object):
 """
 Mixin for models and transformers which may be saved as files.
+
+.. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561958
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -32,6 +32,8 @@ class MLUtils(object):
 
 """
 Helper methods to load, save and pre-process data used in MLlib.
+
+.. addedversion:: 1.0.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561631
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -431,6 +461,8 @@ class GradientBoostedTrees(object):
 
 Learning algorithm for a gradient boosted trees model for
 classification or regression.
+
+.. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561561
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -418,6 +446,8 @@ class GradientBoostedTreesModel(TreeEnsembleModel, JavaLoader):
 .. note:: Experimental
 
 Represents a gradient-boosted tree model.
+
+.. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561531
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -252,6 +276,8 @@ class RandomForest(object):
 
 Learning algorithm for a random forest model for classification or
 regression.
+
+.. addedversion:: 1.2.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561422
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -30,6 +30,11 @@
 
 
 class TreeEnsembleModel(JavaModelWrapper, JavaSaveable):
+"""TreeEnsembleModel
+
+.. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561444
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -72,7 +80,10 @@ class DecisionTreeModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 .. note:: Experimental
 
 A decision tree model for classification or regression.
+
+.. addedversion:: 1.1.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561473
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -115,6 +133,8 @@ class DecisionTree(object):
 
 Learning algorithm for a decision tree model for classification or
 regression.
+
+.. addedversion:: 1.1.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39561501
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -239,6 +261,8 @@ class RandomForestModel(TreeEnsembleModel, JavaLoader):
 .. note:: Experimental
 
 Represents a random forest model.
+
+.. addedversion:: 1.2.0
--- End diff --

Change to `.. versionadded::`





[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8685#discussion_r39560663
  
--- Diff: python/pyspark/mllib/tree.py ---
@@ -90,16 +101,23 @@ def predict(self, x):
 else:
 return self.call("predict", _convert_to_vector(x))
 
+@since("1.1.0")
 def numNodes(self):
+"""Get number of nodes in tree, including leaf nodes."""
 return self._java_model.numNodes()
 
+@since("1.1.0")
 def depth(self):
+"""Get depth of tree.
+E.g.: Depth 0 means 1 leaf node.  Depth 1 means 1 internal node and 2 leaf nodes.
+"""
 return self._java_model.depth()
 
 def __repr__(self):
 """ summary of model. """
 return self._java_model.toString()
 
+@since("1.3.0")
--- End diff --

This is from 1.2.0
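
To illustrate what the annotation does, here is a hedged sketch of a `since`-style decorator. It is a hypothetical stand-in written for this example, assuming the decorator only appends a `.. versionadded::` line to the docstring; the helper actually used in the PR may differ. The decorated function is a placeholder, not the method under review:

```python
import re


def since(version):
    """Hypothetical stand-in: append a versionadded note to a docstring."""
    indent_p = re.compile(r"\n( +)")

    def deco(f):
        doc = f.__doc__ or ""
        # Reuse the smallest indentation found in the docstring, if any.
        indents = indent_p.findall(doc)
        indent = " " * (min(len(i) for i in indents) if indents else 0)
        f.__doc__ = doc.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return f

    return deco


@since("1.2.0")
def example():
    """Placeholder method."""
    return None
```

With this sketch, the version string passed to the decorator is exactly what ends up in the generated docs, which is why the number needs to match the release that introduced the method.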





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8684#issuecomment-140524867
  
Looks good - just need to replace `addedversion` with `versionadded` in the 
class docstrings.
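
If it helps, the replacement can be done mechanically. A minimal sketch, assuming the misspelling always appears as `.. addedversion::`; the function name `fix_directive` is made up for this example:

```python
import re

# Matches the misspelled directive exactly as it appears in the diffs.
BAD_DIRECTIVE = re.compile(r"\.\. addedversion::")


def fix_directive(source):
    """Return `source` with every `.. addedversion::` replaced by `.. versionadded::`."""
    return BAD_DIRECTIVE.sub(".. versionadded::", source)
```

Usage would be to read each module's text, pass it through `fix_directive`, and write it back, then review the resulting diff by hand.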





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558433
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -640,6 +736,8 @@ class StreamingLinearRegressionWithSGD(StreamingLinearAlgorithm):
 :param: numIterations Total number of iterations run.
 :param: miniBatchFraction Fraction of data on which SGD is run for each
   iteration.
+
+.. addedversion:: 1.5.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558391
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -571,8 +641,29 @@ def load(cls, sc, path):
 
 
 class IsotonicRegression(object):
+"""
+Isotonic regression.
+Currently implemented using parallelized pool adjacent violators algorithm.
+Only univariate (single feature) algorithm supported.
+
+Sequential PAV implementation based on:
+Tibshirani, Ryan J., Holger Hoefling, and Robert Tibshirani.
+  "Nearly-isotonic regression." Technometrics 53.1 (2011): 54-61.
+  Available from [[http://www.stat.cmu.edu/~ryantibs/papers/neariso.pdf]]
+
+Sequential PAV parallelization based on:
+Kearsley, Anthony J., Richard A. Tapia, and Michael W. Trosset.
+  "An approach to parallelizing isotonic regression."
+  Applied Mathematics and Parallel Computing. Physica-Verlag HD, 1996. 141-147.
+  Available from [[http://softlib.rice.edu/pub/CRPC-TRs/reports/CRPC-TR96640.pdf]]
+
+@see [[http://en.wikipedia.org/wiki/Isotonic_regression Isotonic regression (Wikipedia)]]
+
+.. addedversion:: 1.4.0
--- End diff --

Switch to `.. versionadded`
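
As background for the docstring above, a rough sketch of the sequential pool adjacent violators (PAV) idea it mentions. This is a plain-Python illustration with unit weights and a non-decreasing fit, not the parallelized implementation in Spark; the function name `pav` is chosen for this example only:

```python
def pav(values):
    """Fit a non-decreasing sequence to `values` by pooling adjacent violators."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        # Every point in a pooled block gets the block mean.
        fitted.extend([s / c] * c)
    return fitted
```

For the input `[1.0, 3.0, 2.0, 4.0]` the middle pair violates monotonicity, so it is pooled to its mean and the fit becomes `[1.0, 2.5, 2.5, 4.0]`.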





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558375
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -523,6 +586,8 @@ class IsotonicRegressionModel(Saveable, Loader):
 ... rmtree(path)
 ... except OSError:
 ... pass
+
+.. addedversion:: 1.4.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558409
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -590,10 +681,13 @@ class StreamingLinearAlgorithm(object):
 Base class that has to be inherited by any StreamingLinearAlgorithm.
 
 Prevents reimplementation of methods predictOn and predictOnValues.
+
+.. addedversion:: 1.5.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558359
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -445,8 +497,19 @@ def load(cls, sc, path):
 
 
 class RidgeRegressionWithSGD(object):
+"""
+Train a regression model with L2-regularization using Stochastic Gradient Descent.
+This solves the l2-regularized least squares regression formulation
+ f(weights) = 1/2n ||A weights-y||^2^  + regParam/2 ||weights||^2^
+Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
+its corresponding right hand side label y.
+See also the documentation for the precise formulation.
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`
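
For readers of the docstring above, a small sketch that evaluates the stated objective directly. It assumes `1/2n` means `1/(2n)` and uses plain Python lists in place of an RDD; `ridge_objective` and its parameter names are hypothetical and not part of the PySpark API:

```python
def ridge_objective(rows, labels, weights, reg_param):
    """Evaluate 1/(2n) * ||A w - y||^2 + reg_param/2 * ||w||^2."""
    n = len(rows)
    # Squared residuals summed over all rows of A.
    residual_sq = sum(
        (sum(a * w for a, w in zip(row, weights)) - y) ** 2
        for row, y in zip(rows, labels)
    )
    # Squared L2 norm of the weight vector.
    penalty = sum(w * w for w in weights)
    return residual_sq / (2.0 * n) + reg_param / 2.0 * penalty
```

When the weights fit the data exactly, only the penalty term remains: with `rows=[[1.0], [2.0]]`, `labels=[1.0, 2.0]`, `weights=[1.0]` and `reg_param=0.5`, the value is `0.25`.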





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558311
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -326,8 +361,19 @@ def load(cls, sc, path):
 
 
 class LassoWithSGD(object):
+"""
+Train a regression model with L1-regularization using Stochastic Gradient Descent.
+This solves the l1-regularized least squares regression formulation
+ f(weights) = 1/2n ||A weights-y||^2^  + regParam ||weights||_1
+Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
+its corresponding right hand side label y.
+See also the documentation for the precise formulation.
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558335
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -428,14 +474,20 @@ class RidgeRegressionModel(LinearRegressionModelBase):
 True
 >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
 True
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558268
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -198,8 +215,20 @@ def _regression_train_wrapper(train_func, modelClass, data, initial_weights):
 
 
 class LinearRegressionWithSGD(object):
+"""
+Train a linear regression model with no regularization using Stochastic Gradient Descent.
+This solves the least squares regression formulation
+ f(weights) = 1/n ||A weights-y||^2^
+(which is the mean squared error).
+Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
+its corresponding right hand side label y.
+See also the documentation for the precise formulation.
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558296
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -309,14 +338,20 @@ class LassoModel(LinearRegressionModelBase):
 True
 >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
 True
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39558247
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -162,14 +173,20 @@ class LinearRegressionModel(LinearRegressionModelBase):
 True
 >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
 True
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39552511
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -93,8 +101,11 @@ class LinearRegressionModelBase(LinearModel):
 True
 >>> abs(lrmb.predict(SparseVector(2, {0: -1.03, 1: 7.777})) - 14.624) < 1e-6
 True
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39552470
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -65,6 +67,8 @@ class LinearModel(object):
 
 :param weights: Weights computed for every feature.
 :param intercept: Intercept computed for this model.
+
+.. addedversion:: 0.9.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8684#discussion_r39552087
  
--- Diff: python/pyspark/mllib/regression.py ---
@@ -42,6 +42,8 @@ class LabeledPoint(object):
 column matrix)
 
 Note: 'label' and 'features' are accessible as class attributes.
+
+.. addedversion:: 1.0.0
--- End diff --

Switch to `.. versionadded`





[GitHub] spark pull request: [PySpark][MLlib][Docs] Replaced addversion wit...

2015-09-15 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8773

[PySpark][MLlib][Docs] Replaced addversion with versionadded in mllib.random

Missed this when reviewing `pyspark.mllib.random` for SPARK-10275.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark mllib-random-versionadded-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8773.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8773


commit a21e0ca909d87243a4fbe9508c36eec0ce710386
Author: noelsmith 
Date:   2015-09-15T18:48:36Z

Replaced addversion with versionadded in mllib.random







[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8677#discussion_r39547963
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -36,6 +36,8 @@ class Rating(namedtuple("Rating", ["user", "product", "rating"])):
 (1, 2, 5.0)
 >>> (r[0], r[1], r[2])
 (1, 2, 5.0)
+
+.. addedversion:: 1.2.0
--- End diff --

Should be `.. versionadded::`





[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8677#discussion_r39548087
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -157,17 +167,25 @@ def recommendProducts(self, user, num):
 return list(self.call("recommendProducts", user, num))
 
 @property
+@since("1.3.1")
 def rank(self):
+"""Rank for the features in this model"""
 return self.call("rank")
 
 @classmethod
+@since("1.3.1")
 def load(cls, sc, path):
+"""Load a model from the given path"""
 model = cls._load_java(sc, path)
 wrapper = sc._jvm.MatrixFactorizationModelWrapper(model)
 return MatrixFactorizationModel(wrapper)
 
 
 class ALS(object):
+"""Alternating Least Squares matrix factorization
+
+.. addedversion:: 0.9.0
--- End diff --

Same again - should be `.. versionadded::`





[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...

2015-09-15 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8677#discussion_r39547983
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -111,13 +113,17 @@ class MatrixFactorizationModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 ... rmtree(path)
 ... except OSError:
 ... pass
+
+.. addedversion:: 0.9.0
--- End diff --

Should be `.. versionadded::`





[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...

2015-09-14 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8677#issuecomment-140230088
  
LGTM apart from the one minor issue above.





[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8627#discussion_r39455034
  
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -129,20 +135,32 @@ def computeCost(self, rdd):
  [_convert_to_vector(c) for c in self.centers])
 return cost
 
+@since(1.4)
 def save(self, sc, path):
+"""
+Save this model to the given path.
+"""
 java_centers = _py2java(sc, [_convert_to_vector(c) for c in self.centers])
 java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel(java_centers)
 java_model.save(sc._jsc.sc(), path)
 
 @classmethod
+@since('1.4.0')
 def load(cls, sc, path):
+"""
+Load a model from the given path.
+"""
 java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel.load(sc._jsc.sc(), path)
 return KMeansModel(_java2py(sc, java_model.clusterCenters()))
 
 
 class KMeans(object):
+"""
+.. versionadded:: 0.9.1
--- End diff --

Good point - I'll update to 0.9.0 to match public releases.





[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8627#discussion_r39453363
  
--- Diff: python/pyspark/mllib/clustering.py ---
@@ -129,20 +135,32 @@ def computeCost(self, rdd):
  [_convert_to_vector(c) for c in self.centers])
 return cost
 
+@since(1.4)
 def save(self, sc, path):
+"""
+Save this model to the given path.
+"""
 java_centers = _py2java(sc, [_convert_to_vector(c) for c in 
self.centers])
 java_model = 
sc._jvm.org.apache.spark.mllib.clustering.KMeansModel(java_centers)
 java_model.save(sc._jsc.sc(), path)
 
 @classmethod
+@since('1.4.0')
 def load(cls, sc, path):
+"""
+Load a model from the given path.
+"""
 java_model = 
sc._jvm.org.apache.spark.mllib.clustering.KMeansModel.load(sc._jsc.sc(), path)
 return KMeansModel(_java2py(sc, java_model.clusterCenters()))
 
 
 class KMeans(object):
+"""
+.. versionadded:: 0.9.1
--- End diff --

@davies Was `0.9.0-incubating` a public release? If so I'll update this 
from `0.9.1` to `0.9.0`.





[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8677#discussion_r39452996
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -157,17 +167,25 @@ def recommendProducts(self, user, num):
 return list(self.call("recommendProducts", user, num))
 
 @property
+@since("1.3.1")
--- End diff --

Think this was only added in 1.4.0





[GitHub] spark pull request: [SPARK-10275][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8666#issuecomment-140215395
  
All LGTM.





[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8665#issuecomment-140199152
  
Sounds good - thanks for confirming - I'll reinstate the three-part version 
numbers in my PRs.





[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8665#issuecomment-140185860
  
@mengxr @davies Just to confirm, before @yu-iskw and I make the changes - 
we want to stick with the two-part version numbers (`@since(1.4)`) used in 
`pyspark.sql` instead of the full 3-part numbers (`1.4.0`) used in the other 
APIs - correct?





[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8665#discussion_r39436349
  
--- Diff: python/pyspark/mllib/fpm.py ---
@@ -58,6 +61,7 @@ class FPGrowth(object):
 """
 
 @classmethod
+@since("1.4.0")
--- End diff --

Same here.





[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8665#discussion_r39436251
  
--- Diff: python/pyspark/mllib/fpm.py ---
@@ -41,8 +41,11 @@ class FPGrowthModel(JavaModelWrapper):
 >>> model = FPGrowth.train(rdd, 0.6, 2)
 >>> sorted(model.freqItemsets().collect())
 [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), 
...
+
+.. versionadded:: 1.4.0
 """
 
+@since("1.4.0")
--- End diff --

Maybe use float here: `@since(1.4)` - based on suggestion: 
https://github.com/apache/spark/pull/8633#discussion_r39088552





[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...

2015-09-14 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8665#discussion_r39435855
  
--- Diff: python/pyspark/mllib/fpm.py ---
@@ -41,8 +41,11 @@ class FPGrowthModel(JavaModelWrapper):
 >>> model = FPGrowth.train(rdd, 0.6, 2)
 >>> sorted(model.freqItemsets().collect())
 [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), 
...
+
+.. versionadded:: 1.4.0
--- End diff --

To be consistent with the `pyspark.sql` module we ought to use 2-part 
version numbers (i.e. `1.4`).





[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...

2015-09-14 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8633#issuecomment-140179397
  
@mengxr @yu-iskw - Sounds like a plan - I'll take a look.





[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...

2015-09-10 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8571#issuecomment-139411836
  
Nice - thanks for finding that! 

It probably indicates I need to make the JS logic a bit more robust though. 
I'll take a look at the Sphinx history and find out how it's changed.





[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...

2015-09-10 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8571#issuecomment-139404962
  
@davies Could you let me know your browser/platform + whether the example 
[link](https://dl.dropboxusercontent.com/u/20821334/pyspark-api-nav-enhance/pyspark.mllib.html)
 works for you? I'll try and reproduce locally.





[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...

2015-09-10 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8571#issuecomment-139387988
  
Hmm that's odd - that's exactly the way I'm building too. Did your 
generated html pages include the links to the `pyspark.js` and `pyspark.css` 
files?





[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...

2015-09-09 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8633#discussion_r39096533
  
--- Diff: python/pyspark/mllib/feature.py ---
@@ -84,11 +84,14 @@ class Normalizer(VectorTransformer):
 >>> nor2 = Normalizer(float("inf"))
 >>> nor2.transform(v)
 DenseVector([0.0, 0.5, 1.0])
+
+.. versionadded:: 1.2.0
--- End diff --

I think matching the overall project versioning scheme makes it clearer - 
but I'm happy to implement it either way. 

One thing to watch for with using floats is that you can't differentiate 
between 1.1 and 1.10 (but it looks like that's unlikely to be a problem from 
the versioning history).
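
A quick sketch of the float issue, for illustration only. `parse_version` is a hypothetical helper made up for this example and is not part of PySpark:

```python
# A float literal cannot tell release 1.10 apart from release 1.1.
assert 1.1 == 1.10
assert str(1.10) == "1.1"


def parse_version(text):
    """Hypothetical helper: split a dotted version string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Version strings keep the two releases distinct and order them correctly.
assert parse_version("1.1") != parse_version("1.10")
assert parse_version("1.1.0") < parse_version("1.10.0")
```

Passing the version as a string (e.g. `@since("1.4.0")`) avoids the ambiguity, whichever number of parts is chosen.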





[GitHub] spark pull request: [SPARK-10094] Pyspark ML Feature transformers ...

2015-09-09 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8623#issuecomment-138812209
  
I also added a related PR https://github.com/apache/spark/pull/8571 about 
highlighting experimental features in the API docs. Would be useful to get a 
second opinion on it.





[GitHub] spark pull request: [SPARK-10373] [PYSPARK] move @since into pyspa...

2015-09-08 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8657#discussion_r38992400
  
--- Diff: python/pyspark/__init__.py ---
@@ -48,6 +48,22 @@
 from pyspark.status import *
 from pyspark.profiler import Profiler, BasicProfiler
 
+
+def since(version):
+"""
+A decorator that annotates a function to append the version of Spark 
the function was added.
+"""
+import re
+indent_p = re.compile(r'\n( +)')
+
+def deco(f):
--- End diff --

You're right, that makes more sense. Ignore the comment above.





[GitHub] spark pull request: [SPARK-10373] [PYSPARK] move @since into pyspa...

2015-09-08 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8657#discussion_r38990876
  
--- Diff: python/pyspark/__init__.py ---
@@ -48,6 +48,22 @@
 from pyspark.status import *
 from pyspark.profiler import Profiler, BasicProfiler
 
+
+def since(version):
+"""
+A decorator that annotates a function to append the version of Spark 
the function was added.
+"""
+import re
+indent_p = re.compile(r'\n( +)')
+
+def deco(f):
--- End diff --

Would it be OK to add a clause to handle situations where f.__doc__ is 
None? There are a handful of methods without docstrings. Thanks!
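
Roughly what such a clause could look like, as a minimal sketch. Only the first lines of `since` appear in the diff above; the body of `deco` below is an assumption written for illustration, not the code under review:

```python
import re


def since(version):
    """Append a Sphinx ``versionadded`` note to the decorated function's docstring."""
    indent_p = re.compile(r'\n( +)')

    def deco(f):
        # Assumption: treat a missing docstring as an empty one.
        doc = f.__doc__ or ""
        indents = indent_p.findall(doc)
        indent = ' ' * (min(len(m) for m in indents) if indents else 0)
        note = "%s.. versionadded:: %s" % (indent, version)
        # With no existing text, the note becomes the whole docstring.
        f.__doc__ = (doc.rstrip() + "\n\n" + note) if doc.strip() else note
        return f
    return deco


@since("1.4.0")
def save(path):
    """
    Save this model to the given path.
    """


@since("1.4.0")
def load(path):  # no docstring on purpose
    pass
```

Here `load.__doc__` ends up as just the `versionadded` line instead of raising a `TypeError` in the regex search.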





[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...

2015-09-08 Thread noel-smith
Github user noel-smith commented on a diff in the pull request:

https://github.com/apache/spark/pull/8633#discussion_r38990506
  
--- Diff: python/pyspark/__init__.py ---
@@ -51,6 +51,26 @@
 # for back compatibility
 from pyspark.sql import SQLContext, HiveContext, SchemaRDD, Row
 
+
+def since(version):
--- End diff --

OK - no problem - I'll update this (+ SPARK-10269/10271/10272) once 
SPARK-10373 is complete.





[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...

2015-09-06 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8633

[SPARK-10273] Add @since annotation to pyspark.mllib.feature

Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked 
to handle functions without docstrings).

Added @since to methods + "versionadded::" to classes (derived from the git 
file history in pyspark).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10273-since-mllib-feature

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8633


commit 3dadc0368d0ccbba1967da7e3e70fa462b15befc
Author: noelsmith 
Date:   2015-09-06T21:18:25Z

Added @since to mllib.feature







[GitHub] spark pull request: [SPARK-10272] Added @since tags to pyspark.mll...

2015-09-06 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8628

[SPARK-10272] Added @since tags to pyspark.mllib.evaluation

Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked 
to handle functions without docstrings).

Added @since to public methods + "versionadded::" to classes (derived from 
the git file history in pyspark).

Note - I also added the tags to MultilabelMetrics even though it isn't 
declared as public in the __all__ statement... if that's incorrect - I'll 
remove them.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10272-since-mllib-evalutation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8628.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8628


commit 43a9a999016c08d8e044d1864e792ca1e7fb67a2
Author: noelsmith 
Date:   2015-09-06T19:06:24Z

Added @since tags to mllib.evaluation







[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...

2015-09-06 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8627

[SPARK-10271] Added @since tags to pyspark.mllib.clustering

Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked 
to handle functions without docstrings).

Added @since to methods + "versionadded::" to classes (derived from the git 
file history in pyspark).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10271-since-mllib-clustering

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8627.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8627


commit 23a075f4af8b6b77755b062a7864c775e68b383b
Author: noelsmith 
Date:   2015-09-06T18:40:48Z

Added @since tags to mllib.clustering







[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...

2015-09-06 Thread noel-smith
GitHub user noel-smith reopened a pull request:

https://github.com/apache/spark/pull/8626

Add @since annotation to pyspark.mllib.classification

Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked 
to handle functions without docstrings).

Added @since to methods + "versionadded::" to classes derived from the file 
history.

Note - some methods are inherited from the regression module (e.g. 
LinearModel.intercept) so these won't have version numbers in the API docs 
until that module is updated.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10269-since-mlib-classification

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8626.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8626


commit 0c8a844e3e6aeacf01e4efa0904b2b2cf9b1fd1d
Author: noelsmith 
Date:   2015-09-06T16:23:33Z

Added placeholder since decorator + version numbers







[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...

2015-09-06 Thread noel-smith
Github user noel-smith closed the pull request at:

https://github.com/apache/spark/pull/8626





[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...

2015-09-06 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8626

Add @since annotation to pyspark.mllib.classification

Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked 
to handle functions without docstrings).

Added @since to methods + "versionadded::" to classes derived from the file 
history.

Note - some methods are inherited from the regression module (e.g. 
LinearModel.intercept) so these won't have version numbers in the API docs 
until that module is updated.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10269-since-mlib-classification

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8626.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8626


commit 0c8a844e3e6aeacf01e4efa0904b2b2cf9b1fd1d
Author: noelsmith 
Date:   2015-09-06T16:23:33Z

Added placeholder since decorator + version numbers







[GitHub] spark pull request: [SPARK-10094] Pyspark ML Feature transformers ...

2015-09-06 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8623

[SPARK-10094] Pyspark ML Feature transformers marked as experimental

Modified class-level docstrings to mark all feature transformers in 
pyspark.ml as experimental.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark 
SPARK-10094-mark-pyspark-ml-trans-exp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8623.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8623


commit 53eec78b38c77fac94575503e86b3bc51da1f6a4
Author: noelsmith 
Date:   2015-09-06T09:07:14Z

Pyspark tranformers marked as experimental







[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...

2015-09-02 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8571

[SPARK-10415][PySpark] Enhance Navigation Sidebar in PySpark API

These are CSS/JavaScript changes to add classes/functions + a few other 
tweaks to make navigation in the PySpark API a bit simpler.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark pyspark-api-nav-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8571.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8571


commit 06e3ddc40a18d5d8cecb1dccf84eac6bc2401b08
Author: noelsmith 
Date:   2015-08-31T21:44:34Z

Added class + function list to TOC

commit beaea590bf4f9d1852d885ef8f751088561d0aa2
Author: noelsmith 
Date:   2015-09-02T07:34:05Z

Simplified JavaScript + CSS tweaks







[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...

2015-08-27 Thread noel-smith
Github user noel-smith commented on the pull request:

https://github.com/apache/spark/pull/8399#issuecomment-135655500
  
That would be great - I've just messaged him. If there are any other 
changes you need to get this into 1.5 I'll get them in ASAP today.





[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...

2015-08-24 Thread noel-smith
GitHub user noel-smith opened a pull request:

https://github.com/apache/spark/pull/8399

[SPARK-10188] [Pyspark] Pyspark CrossValidator with RMSE selects incorrect 
model

* Added isLargerBetter() method to Pyspark Evaluator to match the Scala 
version.
* JavaEvaluator delegates isLargerBetter() to underlying Scala object.
* Added check for isLargerBetter() in CrossValidator to determine whether 
to use argmin or argmax.
* Added test cases for where smaller is better (RMSE) and larger is better 
(R-Squared).
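
The argmin/argmax choice in the list above can be sketched in plain Python. This is an illustration, not the actual CrossValidator code; `best_index` and its parameters are hypothetical names, and `is_larger_better` stands in for whatever `isLargerBetter()` returns:

```python
def best_index(metrics, is_larger_better):
    """Return the index of the best score in `metrics`.

    `metrics` holds one averaged cross-validation score per parameter
    setting; `is_larger_better` says whether a higher score is preferable.
    """
    if not metrics:
        raise ValueError("metrics must not be empty")
    indices = range(len(metrics))
    if is_larger_better:
        # argmax, e.g. for R-squared
        return max(indices, key=lambda i: metrics[i])
    # argmin, e.g. for RMSE
    return min(indices, key=lambda i: metrics[i])


rmse = [0.9, 0.4, 0.7]
r2 = [0.2, 0.8, 0.5]
assert best_index(rmse, is_larger_better=False) == 1
assert best_index(r2, is_larger_better=True) == 1
```

Always taking the argmax would pick index 0 for the RMSE scores above, i.e. the worst model, which is the bug described in the title.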

(This contribution is my original work and I license the work to the 
project under Spark's open source license)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noel-smith/spark pyspark-rmse-xval-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8399.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8399


commit d00357e40cef090c20ed4089d1cc23ebdaba2918
Author: noelsmith 
Date:   2015-08-24T17:02:13Z

Added test for cross validation

commit 6cd4ed12e4c37e80a3f88f93cb4255a1c011f5af
Author: noelsmith 
Date:   2015-08-24T18:03:18Z

Added/fixed tests for cross validation

commit 63b3835b3676d8c1c19f756d4e9dba5575ef9d3f
Author: noelsmith 
Date:   2015-08-24T18:24:48Z

Removed print statements

commit 7794cf73e10f2b5c57cfff1a2ea4a175e282c33c
Author: noelsmith 
Date:   2015-08-24T18:25:33Z

Added checks for isLargerBetter()



