[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14653

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r78224937

    --- Diff: python/pyspark/ml/wrapper.py ---
    @@ -19,8 +19,8 @@
     from pyspark import SparkContext
     from pyspark.sql import DataFrame
    -from pyspark.ml import Estimator, Transformer, Model
     from pyspark.ml.param import Params
    +from pyspark.ml import Estimator, Transformer, Model
    --- End diff --

    I think the order before moving this was better

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r78224369

    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -336,6 +336,11 @@ def hasParam(self, paramName):
                 return isinstance(p, Param)
             else:
                 raise TypeError("hasParam(): paramName must be a string")
    +        try:
    --- End diff --

    I don't think this code is reachable. Is it necessary?
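To illustrate the reachability concern, here is a minimal sketch, not the PR's code: when both branches of an `if`/`else` leave the function, a statement placed after them at the same indentation level can never run. The class `Holder`, the method `has_attr`, and the attribute `flag` are hypothetical names chosen for the illustration, and the check is simplified to a plain attribute test.

```python
class Holder:
    """Hypothetical stand-in for a Params-like object (not pyspark's class)."""

    flag = "some value"

    def has_attr(self, name):
        if isinstance(name, str):
            # Branch 1 leaves the function by returning a value.
            return getattr(self, name, None) is not None
        else:
            # Branch 2 leaves the function by raising an exception.
            raise TypeError("has_attr(): name must be a string")
        # Unreachable: both branches above have already exited.
        return "never returned"


holder = Holder()
found = holder.has_attr("flag")
missing = holder.has_attr("absent")
```

Python accepts the trailing statement without complaint, which is why this kind of dead code tends to be caught in review rather than at runtime.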
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user MechCoder commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r75230698

    --- Diff: python/pyspark/ml/wrapper.py ---
    @@ -243,7 +240,7 @@ def __init__(self, java_model=None):
             """
             Initialize this instance with a Java model object.
             Subclasses should call this constructor, initialize params,
    -        and then call _transfer_params_from_java.
    +        and then call _transformer_params.
    --- End diff --

    Not sure you intended this change.
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user MechCoder commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r75228035

    --- Diff: python/pyspark/ml/classification.py ---
    @@ -59,6 +59,16 @@ class LogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredicti
         ...     Row(label=0.0, weight=2.0, features=Vectors.sparse(1, [], []))]).toDF()
         >>> lr = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight")
         >>> model = lr.fit(df)
    +    >>> emap = lr.extractParamMap()
    +    >>> mmap = model.extractParamMap()
    +    >>> all([emap[getattr(lr, param.name)] == value for (param, value) in mmap.items()])
    --- End diff --

    style: Also `(param, value)` -> `param, value` (the parentheses are redundant)
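The style point can be checked with a small sketch. Plain dictionaries with made-up values stand in for the two param maps here (an assumption for illustration; the real maps are keyed by Param objects), and the names `estimator_map` and `model_map` are hypothetical.

```python
# Plain dicts stand in for the two param maps (hypothetical values).
estimator_map = {"maxIter": 5, "regParam": 0.01}
model_map = {"maxIter": 5, "regParam": 0.01}

# Parenthesized tuple target, as written in the doctest under review.
with_parens = all(
    [estimator_map[name] == value for (name, value) in model_map.items()]
)

# Same unpacking without the parentheses; a generator expression also
# avoids building the intermediate list.
without_parens = all(
    estimator_map[name] == value for name, value in model_map.items()
)
```

Both forms unpack each `(key, value)` pair identically, so the choice is purely stylistic.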
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user MechCoder commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r75227783

    --- Diff: python/pyspark/ml/classification.py ---
    @@ -59,6 +59,16 @@ class LogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredicti
         ...     Row(label=0.0, weight=2.0, features=Vectors.sparse(1, [], []))]).toDF()
         >>> lr = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight")
         >>> model = lr.fit(df)
    +    >>> emap = lr.extractParamMap()
    +    >>> mmap = model.extractParamMap()
    +    >>> all([emap[getattr(lr, param.name)] == value for (param, value) in mmap.items()])
    --- End diff --

    `emap[getattr(lr, param.name)]` is the same as `emap[param]`, no?
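Whether the two lookups are interchangeable depends on how Param objects are compared and hashed. The sketch below assumes that equality and hashing are keyed on the parent's UID plus the param name, and that the model shares the estimator's UID, as the PR description states. `FakeParam` and the UID strings are hypothetical stand-ins, not pyspark's implementation.

```python
class FakeParam:
    """Hypothetical stand-in for pyspark.ml.param.Param (illustration only)."""

    def __init__(self, parent_uid, name):
        self.parent_uid = parent_uid
        self.name = name

    def __eq__(self, other):
        # Assumption: two params are equal when parent UID and name match.
        return (isinstance(other, FakeParam)
                and (self.parent_uid, self.name) == (other.parent_uid, other.name))

    def __hash__(self):
        # Hash must agree with __eq__ so dict lookups behave consistently.
        return hash((self.parent_uid, self.name))


uid = "LogisticRegression_hypothetical_uid"

# The estimator and the model each hold their own Param object.
estimator_param = FakeParam(uid, "maxIter")
model_param = FakeParam(uid, "maxIter")

emap = {estimator_param: 5}

# With a shared UID the two objects compare equal, so either works as a key.
same_key = model_param == estimator_param
value_via_model_param = emap[model_param]

# With a different parent UID the direct lookup would not find the entry.
other_param = FakeParam("some_other_uid", "maxIter")
found_with_other_uid = other_param in emap
```

Under these assumptions `emap[param]` is sufficient once the UIDs match; the `getattr(lr, param.name)` indirection only matters if the model's params carry a different parent UID.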
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
Github user MechCoder commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14653#discussion_r75225024

    --- Diff: python/pyspark/ml/classification.py ---
    @@ -59,6 +59,16 @@ class LogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredicti
         ...     Row(label=0.0, weight=2.0, features=Vectors.sparse(1, [], []))]).toDF()
         >>> lr = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight")
         >>> model = lr.fit(df)
    +    >>> emap = lr.extractParamMap()
    --- End diff --

    style: `emap` -> `estimator_paramMap`, `mmap` -> `model_paramMap`?
[GitHub] spark pull request #14653: [SPARK-10931][PYSPARK][ML] PySpark ML Models shou...
GitHub user evanyc15 opened a pull request:

    https://github.com/apache/spark/pull/14653

    [SPARK-10931][PYSPARK][ML] PySpark ML Models should contain Param values

    ## What changes were proposed in this pull request?

    Changed PySpark models to include the Param values. Refer to the closed PR 10270 for additional information.

    ## How was this patch tested?

    Tested using Python doctests

    ## Changesets:

    Estimator UID is being copied correctly to the Transformer model objects and params now, working on Doctests
    Changed the way parameters are copied from the Estimator to Transformer
    Checkpoint, switching back to inheritance method
    Working on DocTests
    Implemented Doctests for Recommendation, Clustering, Classification (except RandomForestClassifier), Evaluation, Tuning, Regression (except RandomRegression)
    Ready for Code Review
    Code Review changeset #1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/evanyc15/spark SPARK-10931-pyspark-mllib

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14653.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14653

commit 2f9417ca3419afb421f4e86d082325fb5b10bbbf
Author: Evan Chen
Date:   2015-11-19T03:54:57Z

    Copied parameters over from Estimator to Transformer

    Estimator UID is being copied correctly to the Transformer model objects and params now, working on Doctests
    Changed the way parameters are copied from the Estimator to Transformer
    Checkpoint, switching back to inheritance method
    Working on DocTests
    Implemented Doctests for Recommendation, Clustering, Classification (except RandomForestClassifier), Evaluation, Tuning, Regression (except RandomRegression)
    Ready for Code Review
    Code Review changeset #1
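The behaviour described in the pull request, a fitted model that carries the estimator's UID and Param values, can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `SketchParams`, `SketchEstimator`, `SketchModel`, their methods, and the UID string are all hypothetical, and no real fitting takes place.

```python
class SketchParams:
    """Hypothetical minimal param holder; not pyspark's Params class."""

    def __init__(self, uid):
        self.uid = uid
        self._param_map = {}

    def set(self, name, value):
        self._param_map[name] = value
        return self

    def extract_param_map(self):
        # Return a copy so callers cannot mutate internal state.
        return dict(self._param_map)


class SketchModel(SketchParams):
    """Hypothetical fitted model that only carries params."""


class SketchEstimator(SketchParams):
    def fit(self):
        # Reuse the estimator's UID and copy its param values onto the model.
        model = SketchModel(self.uid)
        for name, value in self._param_map.items():
            model.set(name, value)
        return model


estimator = SketchEstimator("lr_hypothetical_uid").set("maxIter", 5).set("regParam", 0.01)
model = estimator.fit()
params_match = estimator.extract_param_map() == model.extract_param_map()
```

The final comparison mirrors the idea behind the doctest discussed in the review comments: every param value visible on the model should equal the value set on the estimator.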