[
https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527760#comment-14527760
]
Joseph K. Bradley commented on SPARK-4766:
------------------------------------------
I started working on this JIRA, but I'm starting to think it's more trouble
than it's worth. Here are pros/cons of splitting Estimator & Model Params:
Input/output columns:
* Pro: It's odd to be able to set lrModel.labelCol.
* Con: To evaluate a model, we need to know the label column. After the
split, a model would know its default evaluator but not the label column.
Other parameters:
* Pro: It's odd to set lrModel.maxIter. It's awkward that CrossValidator could
mistakenly iterate over lrModel.maxIter values (if lrModel were in a Pipeline).
* Con: A lot of parameters are arguably part of the model. (regParam, etc.)
General:
* Pro: The separation is more technically correct.
* Con: The separation adds a little boilerplate.
* Con: You also have to be careful about which validateAndTransformSchema you
call, because the classes inherit from multiple traits.
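To make the trade-off concrete, here is a rough sketch of option (b) from the
issue description, where the Estimator params extend the Model params. It is
plain Scala, not the real spark.ml classes: ModelParams, EstimatorParams,
SketchModel and SketchEstimator are hypothetical names, and schemas are
simplified to lists of column names.

```scala
// Hypothetical stand-ins for spark.ml's Params traits; not the real API.
trait ModelParams {
  var featuresCol: String = "features"
  var predictionCol: String = "prediction"

  // Transform-time check: only the features column has to exist.
  def validateAndTransformSchema(columns: Seq[String]): Seq[String] = {
    require(columns.contains(featuresCol), s"missing column $featuresCol")
    columns :+ predictionCol
  }
}

// Option (b): the Estimator params extend the Model params.
trait EstimatorParams extends ModelParams {
  var labelCol: String = "label"
  var maxIter: Int = 100

  // Fit-time check: the label column is required as well, so there are now
  // two versions of validateAndTransformSchema to keep straight.
  override def validateAndTransformSchema(columns: Seq[String]): Seq[String] = {
    require(columns.contains(labelCol), s"missing column $labelCol")
    super.validateAndTransformSchema(columns)
  }
}

class SketchModel(val weights: Array[Double]) extends ModelParams

class SketchEstimator extends EstimatorParams {
  def fit(columns: Seq[String]): SketchModel = {
    validateAndTransformSchema(columns)
    // Training is omitted; the weights are a placeholder.
    val model = new SketchModel(Array.empty[Double])
    // Copy only the params the model shares with the estimator.
    model.featuresCol = featuresCol
    model.predictionCol = predictionCol
    model
  }
}
```

With this layout, model.maxIter and model.labelCol no longer compile, which is
the "Pro" above. The "Con" shows up too: the fitted model has no way to report
which label column it was trained against.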
Given everything, I'm going to close this JIRA as not a problem. But please
post if you disagree.
CC: [~mengxr]
> ML Estimator Params should be distinct from Transformer Params
> --------------------------------------------------------------
>
> Key: SPARK-4766
> URL: https://issues.apache.org/jira/browse/SPARK-4766
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 1.2.0
> Reporter: Joseph K. Bradley
>
> Currently, in spark.ml, both Transformers and Estimators extend the same
> Params classes. There should be one Params class for the Transformer and one
> for the Estimator. These could sometimes be the same, but for other models,
> we may need either (a) to make them distinct or (b) to have the Estimator
> params class extend the Transformer one.
> E.g., it is weird to be able to do:
> {code}
> val model: LogisticRegressionModel = ...
> model.getMaxIter
> {code}
> It's also weird to be able to:
> * Wrap LogisticRegressionModel (a Transformer) with CrossValidator
> * Pass a set of ParamMaps to CrossValidator which includes parameter
> LogisticRegressionModel.maxIter
> * (CrossValidator would try to set that parameter.)
> * I'm not sure if this would cause a failure or just be a noop.