[GitHub] spark pull request: [SPARK-10266][Documentation, ML] Fixed @Since ...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/9338#issuecomment-159428812 Hi @jkbradley - the version numbers look OK - but I've been unable to get the Scala API docs building in my environment (not just this branch - I get the same problems on master) - so I can't fully verify it. If you can review it locally - that would probably be quicker in this case. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-154882353 @yu-iskw Thanks for making the changes - LGTM
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-150954098 @yu-iskw Almost there - you're just missing the version numbers on `GBTClassifier` class + methods (they should all be set to `1.4.0`), everything else looks good though.
[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8627#issuecomment-150949269 Test this please
[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8627#issuecomment-150025192 @yu-iskw Thanks for testing! @mengxr I've rebased now - the changes were very minor. Should be good to merge once the tests complete.
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-150023630 @yu-iskw Thanks for removing those methods - but did you lose some of the version numbers when rebasing? E.g. ```DecisionTreeClassifier.getImpurity()``` has a ```@since```, but ```RandomForestClassifier.getImpurity()``` doesn't now.
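For readers unfamiliar with the `@since` annotation discussed in this thread, the following is a minimal sketch of how such a decorator could work: it appends a `.. versionadded::` directive to the docstring of the decorated method. It is an illustration under stated assumptions, not necessarily the exact PySpark implementation; `ExampleClassifier` and its return value are hypothetical.

```python
import re


def since(version):
    """Sketch of a decorator that records the version a function was added in."""
    # Find the indentation used inside the docstring so the directive lines up.
    indent_pattern = re.compile(r"\n( +)")

    def deco(f):
        doc = f.__doc__ or ""
        indents = indent_pattern.findall(doc)
        indent = " " * (min(len(m) for m in indents) if indents else 0)
        # Append the Sphinx directive after a blank line.
        f.__doc__ = doc.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return f

    return deco


class ExampleClassifier(object):
    """Hypothetical class used only to demonstrate the decorator."""

    @since("1.4.0")
    def getImpurity(self):
        """
        Gets the value of impurity or its default value.
        """
        return "gini"  # placeholder value for the sketch
```

A method that lost its decorator during a rebase would simply have no `.. versionadded::` line in its `__doc__`, which is one way to spot the omission.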
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8684#issuecomment-150018838 I think we should keep the comments - but just tweak them so that they're valid [reStructuredText](http://sphinx-doc.org/rest.html). In this case all that really needs to be done is to remove the square brackets from URLs and format the indented blocks in the same way as the formula in the ```RidgeRegressionWithSGD.train()``` method (i.e. 4 lines indent + newlines before + after).
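As a rough illustration of the formatting described above, the sketch below shows a docstring with a bare URL (no square brackets) and an indented block separated from the surrounding text by blank lines. Without those blank lines, reStructuredText may parse the indented text as part of a definition list instead of a standalone block. The function name, URL, and formula are placeholders, not taken from the PR; the helper is a hypothetical check, not part of Spark.

```python
import inspect


def train_example():
    """
    Train a model using the objective below.

    See http://example.org/background for details.

        objective = loss(weights) + regParam * penalty(weights)

    The indented line above is rendered as its own block.
    """


def indented_blocks_are_separated(docstring):
    """Return True if every indented line has blank or indented neighbours."""
    lines = inspect.cleandoc(docstring).splitlines()
    for i, line in enumerate(lines):
        if not line.startswith(" "):
            continue
        before = lines[i - 1] if i > 0 else ""
        after = lines[i + 1] if i + 1 < len(lines) else ""
        # A neighbouring line must be blank or belong to the same indented block.
        if before.strip() and not before.startswith(" "):
            return False
        if after.strip() and not after.startswith(" "):
            return False
    return True
```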
[GitHub] spark pull request: [SPARK-10271][Pyspark][MLLib] Added @since tag...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8627#issuecomment-149997391 Jenkins, retest this please
[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...
Github user noel-smith closed the pull request at: https://github.com/apache/spark/pull/8855
[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...
GitHub user noel-smith opened a pull request:

    https://github.com/apache/spark/pull/8855

    [Doc][PySpark][MLLib] Added newlines to docstrings to fix parameter formatting (1.5 backport)

    Backport of #8851 for 1.5 branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/noel-smith/spark docstring-missing-newline-fix-1-5-backport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8855.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #8855

commit 6d779f6bb2f3e7d08b8a8a5bd25e5e3f90b64010
Author: noelsmith
Date: 2015-09-21T21:24:19Z

    Added newlines to docstrings to fix parameter formatting - backport for 1.5 branch.
[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8851#issuecomment-142115598 Sure - should be straightforward, I'll take a look.
[GitHub] spark pull request: [Doc][PySpark][MLLib] Added newlines to docstr...
GitHub user noel-smith opened a pull request:

    https://github.com/apache/spark/pull/8851

    [Doc][PySpark][MLLib] Added newlines to docstrings to fix parameter formatting

    Added newlines before `:param ...:` and `:return:` markup. Without these, parameter lists aren't formatted correctly in the API docs. I.e.:

    ![screen shot 2015-09-21 at 21 49 26](https://cloud.githubusercontent.com/assets/11915197/10004686/de3c41d4-60aa-11e5-9c50-a46dcb51243f.png)

    .. looks like this once newline is added:

    ![screen shot 2015-09-21 at 21 50 14](https://cloud.githubusercontent.com/assets/11915197/10004706/f86bfb08-60aa-11e5-8524-ae4436713502.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/noel-smith/spark docstring-missing-newline-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8851.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #8851

commit fb706bb85e65024c5955f9c2dacda0c64dae102b
Author: noelsmith
Date: 2015-09-21T20:35:49Z

    Added newlines to docstring to fix parameter formatting
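The newline fix described in this pull request can be sketched as follows. The two functions are hypothetical examples (not from the Spark code base): one docstring has the field list directly under the summary text, the other has the blank line that lets Sphinx render the parameters as a list. The helper is an assumed, simplified check and not part of the PR.

```python
def train_without_blank_line(data, iterations=100):
    """Train a model on the given data.
    :param data: The training data.
    :param iterations: Number of iterations.
    :return: A trained model.
    """


def train_with_blank_line(data, iterations=100):
    """Train a model on the given data.

    :param data: The training data.
    :param iterations: Number of iterations.
    :return: A trained model.
    """


def has_blank_line_before_fields(docstring):
    """Return True if the first field line (starting with ':') follows a blank line."""
    lines = docstring.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith(":"):
            return i > 0 and lines[i - 1].strip() == ""
    return True  # no field list present, nothing to check
```

A check like this could be run over a module's docstrings to find remaining cases before building the API docs.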
[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8571#issuecomment-141709019 Tweaked JavaScript to make it more robust to Sphinx changes. Set minimal version of Sphinx to `1.2`.
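One common way to declare a minimum Sphinx version is the `needs_sphinx` setting in the project's `conf.py`. The fragment below is a sketch of that approach; whether this PR used exactly this setting is an assumption.

```python
# Fragment of a Sphinx conf.py (sketch; assumed, not copied from the PR).
# Sphinx refuses to build the docs if the installed version is older than this.
needs_sphinx = "1.2"
```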
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-141213397 The comments on `MultilayerPerceptronClassifier` + `MultilayerPerceptronClassifierModel` are good. Just need to remove the four functions noted above (these were deleted recently in master).
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8690#discussion_r39791016

--- Diff: python/pyspark/ml/classification.py ---
@@ -116,6 +120,37 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
     def _create_model(self, java_model):
         return LogisticRegressionModel(java_model)
+    @since("1.4.0")
+    def setElasticNetParam(self, value):
+        """
+        Sets the value of :py:attr:`elasticNetParam`.
+        """
+        self._paramMap[self.elasticNetParam] = value
+        return self
+
+    @since("1.4.0")
+    def getElasticNetParam(self):
+        """
+        Gets the value of elasticNetParam or its default value.
+        """
+        return self.getOrDefault(self.elasticNetParam)
+
+    @since("1.4.0")
+    def setFitIntercept(self, value):
+        """
+        Sets the value of :py:attr:`fitIntercept`.
+        """
+        self._paramMap[self.fitIntercept] = value
+        return self
+
+    @since("1.4.0")
+    def getFitIntercept(self):
+        """
+        Gets the value of fitIntercept or its default value.
+        """
+        return self.getOrDefault(self.fitIntercept)
+
--- End diff --

Should remove the four functions above - they were deleted by a recent commit to master.
[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8627#issuecomment-140899724 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-10285][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8695#discussion_r39686456

--- Diff: python/pyspark/ml/util.py ---
@@ -36,6 +39,8 @@ def wrapper(*args, **kwargs):
 class Identifiable(object):
     """
     Object with a unique ID.
+
+    .. addedversion:: 1.3.0
--- End diff --

Can remove this, as it doesn't appear in the generated API docs.
[GitHub] spark pull request: [SPARK-10284][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8694#issuecomment-140896835 Just need to replace `.. addedversion::` with `.. versionadded::` and add a version number to `ParamGridBuilder.build()`.
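The recurring `addedversion` / `versionadded` mix-up in these reviews matters because `versionadded` is the directive Sphinx defines, while `addedversion` is not one. A small, hypothetical helper such as the one below could flag the misspelling in docstrings before review; the set of known directives is an assumed subset chosen for this sketch, and the example docstrings are modeled on the ones quoted in this thread.

```python
import re

# Version-related directives assumed for this sketch (a subset of what Sphinx offers).
KNOWN_VERSION_DIRECTIVES = {"versionadded", "versionchanged", "deprecated"}

# Matches reStructuredText directive names such as ".. versionadded::".
DIRECTIVE = re.compile(r"^\s*\.\. (\w+)::", re.MULTILINE)


def suspicious_version_directives(docstring):
    """Return directive names that mention 'version' but are not known ones."""
    names = DIRECTIVE.findall(docstring or "")
    return [n for n in names
            if "version" in n and n not in KNOWN_VERSION_DIRECTIVES]


wrong = """
    Mixin for classes which can load saved models from files.

    .. addedversion:: 1.3.0
    """

fixed = """
    Mixin for classes which can load saved models from files.

    .. versionadded:: 1.3.0
    """

print(suspicious_version_directives(wrong))
print(suspicious_version_directives(fixed))
```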
[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8693#discussion_r39684621

--- Diff: python/pyspark/ml/regression.py ---
@@ -147,6 +159,8 @@ class TreeRegressorParams(object):
 class RandomForestParams(object):
     """
     Private class to track supported random forest parameters.
+
+    .. addedversion:: 1.4.0
--- End diff --

Can also remove this
[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8693#discussion_r39684568

--- Diff: python/pyspark/ml/regression.py ---
@@ -154,6 +168,8 @@ class RandomForestParams(object):
 class GBTParams(object):
     """
     Private class to track supported GBT params.
+
--- End diff --

This is non-public too
[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8693#discussion_r39684460

--- Diff: python/pyspark/ml/regression.py ---
@@ -140,6 +150,8 @@ def intercept(self):
 class TreeRegressorParams(object):
     """
     Private class to track supported impurity measures.
+
+    .. addedversion:: 1.4.0
--- End diff --

Can probably remove this as it isn't a public class
[GitHub] spark pull request: [SPARK-10283][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8693#issuecomment-140892197 Replace `.. addedversion::` with `.. versionadded::`. Merge from master to remove `setElasticNetParam` and `getElasticNetParam` functions.
[GitHub] spark pull request: [SPARK-10282][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8692#issuecomment-140889476 Need to replace `.. addedversion::` with `.. versionadded::` in the class docstrings.
[GitHub] spark pull request: [SPARK-10281][ML][PySpark][Docs] Add @since an...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8691#issuecomment-140870734 No problems, LGTM
[GitHub] spark pull request: [SPARK-10269] Add @since annotation to pyspark...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8626#issuecomment-140868760 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-140856366 Some functions have been removed since this PR was created + there are some new classes (`MultilayerPerceptronClassifier`) without version numbers. @mengxr How do you usually handle this? Would you expect the PR to be rebased/merged with the latest version of the file?
[GitHub] spark pull request: [SPARK-10280][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8690#issuecomment-140853730 Should replace `.. addedversion::` with `.. versionadded::` in all the class docstrings.
[GitHub] spark pull request: [SPARK-10279][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8689#issuecomment-140836995 That works too, thanks for resolving - LGTM
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8685#issuecomment-140835549 Thanks for the changes - LGTM!
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8684#issuecomment-140833937 That's great - LGTM!
[GitHub] spark pull request: [SPARK-10279][MLlib][PySpark][Docs] Add @since...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8689#issuecomment-140536184 These changes are already included in #8685. Can probably drop this PR and merge SPARK-10279 with SPARK-10278.
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8685#issuecomment-140533249 Just minor fixes needed - change `addedversion` to `versionadded` + one version number alteration.
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39562007

--- Diff: python/pyspark/mllib/util.py ---
@@ -235,6 +248,8 @@ def save(self, sc, path):
 class Loader(object):
     """
     Mixin for classes which can load saved models from files.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39562037

--- Diff: python/pyspark/mllib/util.py ---
@@ -280,15 +297,21 @@ def _load_java(cls, sc, path):
         return java_obj.load(sc._jsc.sc(), path)
     @classmethod
+    @since("1.3.0")
     def load(cls, sc, path):
+        """Load a model from the given path."""
         java_model = cls._load_java(sc, path)
         return cls(java_model)
 class LinearDataGenerator(object):
-    """Utils for generating linear data"""
+    """Utils for generating linear data.
+
+    .. addedversion:: 1.5.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39562027

--- Diff: python/pyspark/mllib/util.py ---
@@ -256,6 +271,8 @@ class JavaLoader(Loader):
     """
     Mixin for classes which can load saved models using its Scala implementation.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561998

--- Diff: python/pyspark/mllib/util.py ---
@@ -222,9 +231,13 @@ class JavaSaveable(Saveable):
     """
     Mixin for models that provide save() through their Scala implementation.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561981

--- Diff: python/pyspark/mllib/util.py ---
@@ -197,6 +204,8 @@ def loadVectors(sc, path):
 class Saveable(object):
     """
     Mixin for models and transformers which may be saved as files.
+
+    .. addedversion:: 1.3.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561958

--- Diff: python/pyspark/mllib/util.py ---
@@ -32,6 +32,8 @@ class MLUtils(object):
     """
     Helper methods to load, save and pre-process data used in MLlib.
+
+    .. addedversion:: 1.0.0
--- End diff --

Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561631 --- Diff: python/pyspark/mllib/tree.py --- @@ -431,6 +461,8 @@ class GradientBoostedTrees(object): Learning algorithm for a gradient boosted trees model for classification or regression. + +.. addedversion:: 1.3.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561561 --- Diff: python/pyspark/mllib/tree.py --- @@ -418,6 +446,8 @@ class GradientBoostedTreesModel(TreeEnsembleModel, JavaLoader): .. note:: Experimental Represents a gradient-boosted tree model. + +.. addedversion:: 1.3.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561531 --- Diff: python/pyspark/mllib/tree.py --- @@ -252,6 +276,8 @@ class RandomForest(object): Learning algorithm for a random forest model for classification or regression. + +.. addedversion:: 1.2.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561422 --- Diff: python/pyspark/mllib/tree.py --- @@ -30,6 +30,11 @@ class TreeEnsembleModel(JavaModelWrapper, JavaSaveable): +"""TreeEnsembleModel + +.. addedversion:: 1.3.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561444 --- Diff: python/pyspark/mllib/tree.py --- @@ -72,7 +80,10 @@ class DecisionTreeModel(JavaModelWrapper, JavaSaveable, JavaLoader): .. note:: Experimental A decision tree model for classification or regression. + +.. addedversion:: 1.1.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561473 --- Diff: python/pyspark/mllib/tree.py --- @@ -115,6 +133,8 @@ class DecisionTree(object): Learning algorithm for a decision tree model for classification or regression. + +.. addedversion:: 1.1.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39561501 --- Diff: python/pyspark/mllib/tree.py --- @@ -239,6 +261,8 @@ class RandomForestModel(TreeEnsembleModel, JavaLoader): .. note:: Experimental Represents a random forest model. + +.. addedversion:: 1.2.0 --- End diff -- Change to `.. versionadded::`
[GitHub] spark pull request: [SPARK-10278][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8685#discussion_r39560663 --- Diff: python/pyspark/mllib/tree.py --- @@ -90,16 +101,23 @@ def predict(self, x): else: return self.call("predict", _convert_to_vector(x)) +@since("1.1.0") def numNodes(self): +"""Get number of nodes in tree, including leaf nodes.""" return self._java_model.numNodes() +@since("1.1.0") def depth(self): +"""Get depth of tree. +E.g.: Depth 0 means 1 leaf node. Depth 1 means 1 internal node and 2 leaf nodes. +""" return self._java_model.depth() def __repr__(self): """ summary of model. """ return self._java_model.toString() +@since("1.3.0") --- End diff -- This is from 1.2.0
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8684#issuecomment-140524867 Looks good - just need to replace `addedversion` with `versionadded` in the class docstrings.
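The directive name matters because Sphinx defines `versionadded` but has no directive called `addedversion`. A minimal sketch of how the misspelling could be found and corrected in a source string follows; the helper names `find_bad_directives` and `fix_directives` are hypothetical and are not part of Spark or Sphinx, and the sample docstring is adapted from the `Saveable` diff quoted in this thread.

```python
import re

# Matches the misspelled directive flagged in the review comments.
BAD_DIRECTIVE = re.compile(r"^\s*\.\. addedversion::")


def find_bad_directives(source):
    """Return the 1-based line numbers that contain '.. addedversion::'."""
    return [number
            for number, line in enumerate(source.splitlines(), start=1)
            if BAD_DIRECTIVE.match(line)]


def fix_directives(source):
    """Return the source with the directive name corrected."""
    return source.replace(".. addedversion::", ".. versionadded::")


# Sample input adapted from the Saveable docstring in the diff.
sample = '''class Saveable(object):
    """
    Mixin for models and transformers which may be saved as files.

    .. addedversion:: 1.3.0
    """
'''

print(find_bad_directives(sample))
print(fix_directives(sample))
```

A check like this is only a convenience for reviewers; the fix requested in the thread is simply the one-word rename in each class docstring.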
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558433 --- Diff: python/pyspark/mllib/regression.py --- @@ -640,6 +736,8 @@ class StreamingLinearRegressionWithSGD(StreamingLinearAlgorithm): :param: numIterations Total number of iterations run. :param: miniBatchFraction Fraction of data on which SGD is run for each iteration. + +.. addedversion:: 1.5.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558391 --- Diff: python/pyspark/mllib/regression.py --- @@ -571,8 +641,29 @@ def load(cls, sc, path): class IsotonicRegression(object): +""" +Isotonic regression. +Currently implemented using parallelized pool adjacent violators algorithm. +Only univariate (single feature) algorithm supported. + +Sequential PAV implementation based on: +Tibshirani, Ryan J., Holger Hoefling, and Robert Tibshirani. + "Nearly-isotonic regression." Technometrics 53.1 (2011): 54-61. + Available from [[http://www.stat.cmu.edu/~ryantibs/papers/neariso.pdf]] + +Sequential PAV parallelization based on: +Kearsley, Anthony J., Richard A. Tapia, and Michael W. Trosset. + "An approach to parallelizing isotonic regression." + Applied Mathematics and Parallel Computing. Physica-Verlag HD, 1996. 141-147. + Available from [[http://softlib.rice.edu/pub/CRPC-TRs/reports/CRPC-TR96640.pdf]] + +@see [[http://en.wikipedia.org/wiki/Isotonic_regression Isotonic regression (Wikipedia)]] + +.. addedversion:: 1.4.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558375 --- Diff: python/pyspark/mllib/regression.py --- @@ -523,6 +586,8 @@ class IsotonicRegressionModel(Saveable, Loader): ... rmtree(path) ... except OSError: ... pass + +.. addedversion:: 1.4.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558409 --- Diff: python/pyspark/mllib/regression.py --- @@ -590,10 +681,13 @@ class StreamingLinearAlgorithm(object): Base class that has to be inherited by any StreamingLinearAlgorithm. Prevents reimplementation of methods predictOn and predictOnValues. + +.. addedversion:: 1.5.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558359 --- Diff: python/pyspark/mllib/regression.py --- @@ -445,8 +497,19 @@ def load(cls, sc, path): class RidgeRegressionWithSGD(object): +""" +Train a regression model with L2-regularization using Stochastic Gradient Descent. +This solves the l2-regularized least squares regression formulation + f(weights) = 1/2n ||A weights-y||^2^ + regParam/2 ||weights||^2^ +Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with +its corresponding right hand side label y. +See also the documentation for the precise formulation. + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558311 --- Diff: python/pyspark/mllib/regression.py --- @@ -326,8 +361,19 @@ def load(cls, sc, path): class LassoWithSGD(object): +""" +Train a regression model with L1-regularization using Stochastic Gradient Descent. +This solves the l1-regularized least squares regression formulation + f(weights) = 1/2n ||A weights-y||^2^ + regParam ||weights||_1 +Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with +its corresponding right hand side label y. +See also the documentation for the precise formulation. + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558335 --- Diff: python/pyspark/mllib/regression.py --- @@ -428,14 +474,20 @@ class RidgeRegressionModel(LinearRegressionModelBase): True >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5 True + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558268 --- Diff: python/pyspark/mllib/regression.py --- @@ -198,8 +215,20 @@ def _regression_train_wrapper(train_func, modelClass, data, initial_weights): class LinearRegressionWithSGD(object): +""" +Train a linear regression model with no regularization using Stochastic Gradient Descent. +This solves the least squares regression formulation + f(weights) = 1/n ||A weights-y||^2^ +(which is the mean squared error). +Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with +its corresponding right hand side label y. +See also the documentation for the precise formulation. + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558296 --- Diff: python/pyspark/mllib/regression.py --- @@ -309,14 +338,20 @@ class LassoModel(LinearRegressionModelBase): True >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5 True + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39558247 --- Diff: python/pyspark/mllib/regression.py --- @@ -162,14 +173,20 @@ class LinearRegressionModel(LinearRegressionModelBase): True >>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5 True + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39552511 --- Diff: python/pyspark/mllib/regression.py --- @@ -93,8 +101,11 @@ class LinearRegressionModelBase(LinearModel): True >>> abs(lrmb.predict(SparseVector(2, {0: -1.03, 1: 7.777})) - 14.624) < 1e-6 True + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39552470 --- Diff: python/pyspark/mllib/regression.py --- @@ -65,6 +67,8 @@ class LinearModel(object): :param weights: Weights computed for every feature. :param intercept: Intercept computed for this model. + +.. addedversion:: 0.9.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [SPARK-10277][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8684#discussion_r39552087 --- Diff: python/pyspark/mllib/regression.py --- @@ -42,6 +42,8 @@ class LabeledPoint(object): column matrix) Note: 'label' and 'features' are accessible as class attributes. + +.. addedversion:: 1.0.0 --- End diff -- Switch to `.. versionadded`
[GitHub] spark pull request: [PySpark][MLlib][Docs] Replaced addversion wit...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8773 [PySpark][MLlib][Docs] Replaced addversion with versionadded in mllib.random Missed this when reviewing `pyspark.mllib.random` for SPARK-10275. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark mllib-random-versionadded-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8773.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8773 commit a21e0ca909d87243a4fbe9508c36eec0ce710386 Author: noelsmith Date: 2015-09-15T18:48:36Z Replaced addversion with versionadded in mllib.random
[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8677#discussion_r39547963 --- Diff: python/pyspark/mllib/recommendation.py --- @@ -36,6 +36,8 @@ class Rating(namedtuple("Rating", ["user", "product", "rating"])): (1, 2, 5.0) >>> (r[0], r[1], r[2]) (1, 2, 5.0) + +.. addedversion:: 1.2.0 --- End diff -- Should be `.. versionadded::`
[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8677#discussion_r39548087 --- Diff: python/pyspark/mllib/recommendation.py --- @@ -157,17 +167,25 @@ def recommendProducts(self, user, num): return list(self.call("recommendProducts", user, num)) @property +@since("1.3.1") def rank(self): +"""Rank for the features in this model""" return self.call("rank") @classmethod +@since("1.3.1") def load(cls, sc, path): +"""Load a model from the given path""" model = cls._load_java(sc, path) wrapper = sc._jvm.MatrixFactorizationModelWrapper(model) return MatrixFactorizationModel(wrapper) class ALS(object): +"""Alternating Least Squares matrix factorization + +.. addedversion:: 0.9.0 --- End diff -- Same again - should be `.. versionadded::`
[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8677#discussion_r39547983 --- Diff: python/pyspark/mllib/recommendation.py --- @@ -111,13 +113,17 @@ class MatrixFactorizationModel(JavaModelWrapper, JavaSaveable, JavaLoader): ... rmtree(path) ... except OSError: ... pass + +.. addedversion:: 0.9.0 --- End diff -- Should be `.. versionadded::`
[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8677#issuecomment-140230088 LGTM apart from the one minor issue above.
[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8627#discussion_r39455034 --- Diff: python/pyspark/mllib/clustering.py --- @@ -129,20 +135,32 @@ def computeCost(self, rdd): [_convert_to_vector(c) for c in self.centers]) return cost +@since(1.4) def save(self, sc, path): +""" +Save this model to the given path. +""" java_centers = _py2java(sc, [_convert_to_vector(c) for c in self.centers]) java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel(java_centers) java_model.save(sc._jsc.sc(), path) @classmethod +@since('1.4.0') def load(cls, sc, path): +""" +Load a model from the given path. +""" java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel.load(sc._jsc.sc(), path) return KMeansModel(_java2py(sc, java_model.clusterCenters())) class KMeans(object): +""" +.. versionadded:: 0.9.1 --- End diff -- Good point - I'll update to 0.9.0 to match public releases.
[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8627#discussion_r39453363 --- Diff: python/pyspark/mllib/clustering.py --- @@ -129,20 +135,32 @@ def computeCost(self, rdd): [_convert_to_vector(c) for c in self.centers]) return cost +@since(1.4) def save(self, sc, path): +""" +Save this model to the given path. +""" java_centers = _py2java(sc, [_convert_to_vector(c) for c in self.centers]) java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel(java_centers) java_model.save(sc._jsc.sc(), path) @classmethod +@since('1.4.0') def load(cls, sc, path): +""" +Load a model from the given path. +""" java_model = sc._jvm.org.apache.spark.mllib.clustering.KMeansModel.load(sc._jsc.sc(), path) return KMeansModel(_java2py(sc, java_model.clusterCenters())) class KMeans(object): +""" +.. versionadded:: 0.9.1 --- End diff -- @davies Was `0.9.0-incubating` a public release? If so I'll update this from `0.9.1` to `0.9.0`
[GitHub] spark pull request: [SPARK-10276][MLlib][PySpark] Add @since annot...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8677#discussion_r39452996 --- Diff: python/pyspark/mllib/recommendation.py --- @@ -157,17 +167,25 @@ def recommendProducts(self, user, num): return list(self.call("recommendProducts", user, num)) @property +@since("1.3.1") --- End diff -- Think this was only added in 1.4.0
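In the diff above, `@since` sits beneath `@property`, so it annotates the plain function before `property` wraps it. The sketch below illustrates why that ordering lets the version note reach the property's docstring. It assumes a simplified stand-in for the `since` decorator rather than the PySpark implementation, and the `Model` class with its return value of 10 is purely illustrative.

```python
def since(version):
    """Simplified stand-in for a `since` decorator (an assumption, not the PySpark source)."""
    def deco(f):
        # Append a Sphinx version note to whatever docstring the function has.
        f.__doc__ = (f.__doc__ or "").rstrip() + "\n\n.. versionadded:: %s" % version
        return f
    return deco


class Model(object):
    """Hypothetical model class used only for this illustration."""

    @property
    @since("1.4.0")  # runs first; property() then copies the updated docstring
    def rank(self):
        """Rank for the features in this model"""
        return 10  # placeholder value


print(Model.rank.__doc__)
```

Because `property` copies the getter's docstring when it is constructed, the note added by the inner decorator is what documentation tools see on `Model.rank`.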
[GitHub] spark pull request: [SPARK-10275][MLlib] Add @since annotation to ...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8666#issuecomment-140215395 All LGTM.
[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8665#issuecomment-140199152 Sounds good - thanks for confirming - I'll reinstate the three-part version numbers in my PRs.
[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8665#issuecomment-140185860 @mengxr @davies Just to confirm, before @yu-iskw and I make the changes - we want to stick with the two-part version numbers (`@since(1.4)`) used in `pyspark.sql` instead of the full 3-part numbers (`1.4.0`) used in the other APIs - correct?
[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8665#discussion_r39436349 --- Diff: python/pyspark/mllib/fpm.py --- @@ -58,6 +61,7 @@ class FPGrowth(object): """ @classmethod +@since("1.4.0") --- End diff -- Same here.
[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8665#discussion_r39436251 --- Diff: python/pyspark/mllib/fpm.py --- @@ -41,8 +41,11 @@ class FPGrowthModel(JavaModelWrapper): >>> model = FPGrowth.train(rdd, 0.6, 2) >>> sorted(model.freqItemsets().collect()) [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), ... + +.. addedversion:: 1.4.0 """ +@since("1.4.0") --- End diff -- Maybe use float here: `@since(1.4)` - based on suggestion: https://github.com/apache/spark/pull/8633#discussion_r39088552
[GitHub] spark pull request: [SPARK-10274][MLlib] Add @since annotation to ...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8665#discussion_r39435855 --- Diff: python/pyspark/mllib/fpm.py --- @@ -41,8 +41,11 @@ class FPGrowthModel(JavaModelWrapper): >>> model = FPGrowth.train(rdd, 0.6, 2) >>> sorted(model.freqItemsets().collect()) [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), ... + +.. addedversion:: 1.4.0 --- End diff -- To be consistent with the `pyspark.sql` module we ought to use 2-part version numbers (i.e. `1.4`).
[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8633#issuecomment-140179397 @mengxr @yu-iskw - Sounds like a plan - I'll take a look.
[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8571#issuecomment-139411836 Nice - thanks for finding that! It probably indicates I need to make the JS logic a bit more robust though. I'll take a look at the Sphinx history and find out how it's changed.
[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8571#issuecomment-139404962 @davies Could you let me know your browser/platform + whether the example [link](https://dl.dropboxusercontent.com/u/20821334/pyspark-api-nav-enhance/pyspark.mllib.html) works for you? I'll try and reproduce locally.
[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8571#issuecomment-139387988 Hmm that's odd - that's exactly the way I'm building too. Did your generated html pages include the links to the `pyspark.js` and `pyspark.css` files?
[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8633#discussion_r39096533 --- Diff: python/pyspark/mllib/feature.py --- @@ -84,11 +84,14 @@ class Normalizer(VectorTransformer): >>> nor2 = Normalizer(float("inf")) >>> nor2.transform(v) DenseVector([0.0, 0.5, 1.0]) + +.. versionadded:: 1.2.0 --- End diff -- I think matching the overall project versioning scheme makes it clearer - but I'm happy to implement it either way. One thing to watch for when using floats is that you can't differentiate between 1.1 and 1.10 (but from the versioning history that looks unlikely to be a problem).
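The float caveat in the comment above can be illustrated with a small sketch. It is not taken from the PR; `version_key` is a hypothetical helper introduced here only to show how dotted version strings keep the information that floats lose.

```python
# Illustrative sketch only (not from the PR): float versions vs. dotted strings.

# As floats, 1.1 and 1.10 are the same number, so the two releases collide.
floats_equal = (1.1 == 1.10)

# As strings, the two versions stay distinct.
strings_equal = ("1.1.0" == "1.10.0")


def version_key(version):
    """Hypothetical helper: split a dotted version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Sorting by the integer tuple orders 1.10.0 after 1.2.0, as intended.
ordered = sorted(["1.10.0", "1.2.0", "1.1.0"], key=version_key)
```

Comparing the tuples rather than the raw strings avoids the lexicographic trap where "1.10.0" would sort before "1.2.0".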
[GitHub] spark pull request: [SPARK-10094] Pyspark ML Feature transformers ...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8623#issuecomment-138812209 I also added a related PR https://github.com/apache/spark/pull/8571 about highlighting experimental features in the API docs. It would be useful to get a second opinion on it.
[GitHub] spark pull request: [SPARK-10373] [PYSPARK] move @since into pyspa...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8657#discussion_r38992400 --- Diff: python/pyspark/__init__.py --- @@ -48,6 +48,22 @@ from pyspark.status import * from pyspark.profiler import Profiler, BasicProfiler + +def since(version): +""" +A decorator that annotates a function to append the version of Spark the function was added. +""" +import re +indent_p = re.compile(r'\n( +)') + +def deco(f): --- End diff -- You're right, that makes more sense. Ignore the comment above.
[GitHub] spark pull request: [SPARK-10373] [PYSPARK] move @since into pyspa...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8657#discussion_r38990876 --- Diff: python/pyspark/__init__.py --- @@ -48,6 +48,22 @@ from pyspark.status import * from pyspark.profiler import Profiler, BasicProfiler + +def since(version): +""" +A decorator that annotates a function to append the version of Spark the function was added. +""" +import re +indent_p = re.compile(r'\n( +)') + +def deco(f): --- End diff -- Would it be OK to add a clause to handle situations where f.__doc__ is None? There are a handful of methods without docstrings. Thanks!
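For context, the kind of guard being asked about might look like the sketch below. Only the `since(version)` signature, the `indent_p` pattern and the `deco(f)` name come from the quoted diff; the body of `deco`, including the `f.__doc__ or ""` fallback, is an assumption made for illustration and is not necessarily what was merged.

```python
import re


def since(version):
    """Sketch of a decorator that appends a ``versionadded`` note to a docstring."""
    indent_p = re.compile(r'\n( +)')

    def deco(f):
        # Assumed guard: treat a missing docstring as an empty string.
        doc = f.__doc__ or ""
        # Reuse the smallest indentation found in the docstring, if any.
        indents = indent_p.findall(doc)
        indent = ' ' * (min(len(m) for m in indents) if indents else 0)
        f.__doc__ = doc.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return f
    return deco


@since("1.4.0")
def documented():
    """Has a docstring.

    More detail.
    """


@since("1.4.0")
def undocumented():
    pass
```

With the fallback in place, decorating `undocumented` no longer fails on `None`; the function simply ends up with a docstring that contains only the version note.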
[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...
Github user noel-smith commented on a diff in the pull request: https://github.com/apache/spark/pull/8633#discussion_r38990506 --- Diff: python/pyspark/__init__.py --- @@ -51,6 +51,26 @@ # for back compatibility from pyspark.sql import SQLContext, HiveContext, SchemaRDD, Row + +def since(version): --- End diff -- OK - no problem - I'll update this (+ SPARK-10269/10271/10272) once SPARK-10373 is complete.
[GitHub] spark pull request: [SPARK-10273] Add @since annotation to pyspark...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8633 [SPARK-10273] Add @since annotation to pyspark.mllib.feature Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added @since to methods + "versionadded::" to classes (derived from the git file history in pyspark). You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10273-since-mllib-feature Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8633.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8633 commit 3dadc0368d0ccbba1967da7e3e70fa462b15befc Author: noelsmith Date: 2015-09-06T21:18:25Z Added @since to mllib.feature
[GitHub] spark pull request: [SPARK-10272] Added @since tags to pyspark.mll...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8628 [SPARK-10272] Added @since tags to pyspark.mllib.evaluation Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added @since to public methods + "versionadded::" to classes (derived from the git file history in pyspark). Note - I also added the tags to MultilabelMetrics even though it isn't declared as public in the __all__ statement... if that's incorrect, I'll remove them. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10272-since-mllib-evalutation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8628.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8628 commit 43a9a999016c08d8e044d1864e792ca1e7fb67a2 Author: noelsmith Date: 2015-09-06T19:06:24Z Added @since tags to mllib.evaluation
[GitHub] spark pull request: [SPARK-10271] Added @since tags to pyspark.mll...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8627 [SPARK-10271] Added @since tags to pyspark.mllib.clustering Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added @since to methods + "versionadded::" to classes (derived from the git file history in pyspark). You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10271-since-mllib-clustering Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8627.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8627 commit 23a075f4af8b6b77755b062a7864c775e68b383b Author: noelsmith Date: 2015-09-06T18:40:48Z Added @since tags to mllib.clustering
[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...
GitHub user noel-smith reopened a pull request: https://github.com/apache/spark/pull/8626 Add @since annotation to pyspark.mllib.classification Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added @since to methods + "versionadded::" to classes derived from the file history. Note - some methods are inherited from the regression module (i.e. LinearModel.intercept) so these won't have version numbers in the API docs until that model is updated. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10269-since-mlib-classification Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8626.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8626 commit 0c8a844e3e6aeacf01e4efa0904b2b2cf9b1fd1d Author: noelsmith Date: 2015-09-06T16:23:33Z Added placeholder since decorator + version numbers
[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...
Github user noel-smith closed the pull request at: https://github.com/apache/spark/pull/8626
[GitHub] spark pull request: Add @since annotation to pyspark.mllib.classif...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8626 Add @since annotation to pyspark.mllib.classification Duplicated the @since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added @since to methods + "versionadded::" to classes derived from the file history. Note - some methods are inherited from the regression module (i.e. LinearModel.intercept) so these won't have version numbers in the API docs until that model is updated. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10269-since-mlib-classification Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8626.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8626 commit 0c8a844e3e6aeacf01e4efa0904b2b2cf9b1fd1d Author: noelsmith Date: 2015-09-06T16:23:33Z Added placeholder since decorator + version numbers
[GitHub] spark pull request: [SPARK-10094] Pyspark ML Feature transformers ...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8623 [SPARK-10094] Pyspark ML Feature transformers marked as experimental Modified class-level docstrings to mark all feature transformers in pyspark.ml as experimental. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark SPARK-10094-mark-pyspark-ml-trans-exp Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8623.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8623 commit 53eec78b38c77fac94575503e86b3bc51da1f6a4 Author: noelsmith Date: 2015-09-06T09:07:14Z Pyspark tranformers marked as experimental
[GitHub] spark pull request: [SPARK-10415][PySpark] Enhance Navigation Side...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8571 [SPARK-10415][PySpark] Enhance Navigation Sidebar in PySpark API These are CSS/JavaScript changes to add classes/functions + a few other tweaks to make navigation in the PySpark API a bit simpler. You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark pyspark-api-nav-enhance Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8571.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8571 commit 06e3ddc40a18d5d8cecb1dccf84eac6bc2401b08 Author: noelsmith Date: 2015-08-31T21:44:34Z Added class + function list to TOC commit beaea590bf4f9d1852d885ef8f751088561d0aa2 Author: noelsmith Date: 2015-09-02T07:34:05Z Simplified JavaScript + CSS tweaks
[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...
Github user noel-smith commented on the pull request: https://github.com/apache/spark/pull/8399#issuecomment-135655500 That would be great - I've just messaged him. If there are any other changes you need to get this into 1.5 I'll get them in ASAP today.
[GitHub] spark pull request: [SPARK-10188] [Pyspark] Pyspark CrossValidator...
GitHub user noel-smith opened a pull request: https://github.com/apache/spark/pull/8399 [SPARK-10188] [Pyspark] Pyspark CrossValidator with RMSE selects incorrect model * Added isLargerBetter() method to Pyspark Evaluator to match the Scala version. * JavaEvaluator delegates isLargerBetter() to underlying Scala object. * Added check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax. * Added test cases for where smaller is better (RMSE) and larger is better (R-Squared). (This contribution is my original work and I license the work to the project under Spark's open source license.) You can merge this pull request into a Git repository by running: $ git pull https://github.com/noel-smith/spark pyspark-rmse-xval-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8399.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8399 commit d00357e40cef090c20ed4089d1cc23ebdaba2918 Author: noelsmith Date: 2015-08-24T17:02:13Z Added test for cross validation commit 6cd4ed12e4c37e80a3f88f93cb4255a1c011f5af Author: noelsmith Date: 2015-08-24T18:03:18Z Added/fixed tests for cross validation commit 63b3835b3676d8c1c19f756d4e9dba5575ef9d3f Author: noelsmith Date: 2015-08-24T18:24:48Z Removed print statements commit 7794cf73e10f2b5c57cfff1a2ea4a175e282c33c Author: noelsmith Date: 2015-08-24T18:25:33Z Added checks for isLargerBetter()
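The argmin/argmax choice described in the PR summary above can be sketched in plain Python, without Spark. The function `best_index` and the sample metric lists are hypothetical and exist only for illustration; the actual change lives in PySpark's CrossValidator and Evaluator classes and may be structured differently.

```python
def best_index(metrics, is_larger_better):
    """Hypothetical helper: index of the best metric, given the metric's direction."""
    indices = range(len(metrics))
    if is_larger_better:
        # e.g. R-squared: pick the largest value (argmax).
        return max(indices, key=metrics.__getitem__)
    # e.g. RMSE: pick the smallest value (argmin).
    return min(indices, key=metrics.__getitem__)


# Made-up average metrics for three candidate parameter settings.
rmse = [0.9, 0.4, 0.7]
r_squared = [0.2, 0.8, 0.5]

best_by_rmse = best_index(rmse, is_larger_better=False)
best_by_r2 = best_index(r_squared, is_larger_better=True)

# Always taking the maximum would pick the worst model under RMSE.
wrong_by_rmse = best_index(rmse, is_larger_better=True)
```

Here the two correctly directed selections agree on the same candidate, while an unconditional argmax on RMSE picks the setting with the largest error, which is the behaviour the PR title describes.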