[ https://issues.apache.org/jira/browse/SPARK-42825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702391#comment-17702391 ]
Hyukjin Kwon commented on SPARK-42825:
--------------------------------------

Should probably fix the docs?

> setParams() only sets explicitly named params. Is this intentional or a bug?
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-42825
>                 URL: https://issues.apache.org/jira/browse/SPARK-42825
>             Project: Spark
>          Issue Type: Question
>          Components: ML, PySpark
>    Affects Versions: 3.3.2
>            Reporter: Lucas Partridge
>            Priority: Minor
>
> The Python signature/docstring of the setParams() method for the estimators
> and transformers under pyspark.ml imply that if you don't set any of the
> named params then they will be reset to their default values.
> Example from
> [https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.clustering.GaussianMixture.html#pyspark.ml.clustering.GaussianMixture.setParams]
> :
> {code:java}
> setParams(self, *, featuresCol="features", predictionCol="prediction", k=2,
> probabilityCol="probability", tol=0.01, maxIter=100, seed=None,
> aggregationDepth=2, weightCol=None){code}
> In the extreme this would imply that if you called setParams() with no args
> then _all_ the params would be reset to their default values.
> But what actually happens is that _only_ the params passed in the call get
> changed; the values of any other params aren't affected. So if you call
> setParams() with no args then _no_ params get changed!
> So is this behavior by design? I guess it is from the name of the method. But
> it is counter-intuitive from its docstring. So if this behavior is
> intentional then perhaps the default docstring should make this explicit by
> saying something like:
> "Sets the named params. The values of other params are not affected."
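
The behavior described in the quoted report can be illustrated without a Spark session. The sketch below is a simplified stand-in, not PySpark's actual code: `ToyEstimator`, its `_params` dict, and the local `keyword_only` decorator are hypothetical names, and the sketch assumes that setParams() applies only the keyword arguments captured at call time, which is consistent with what the report observes.

```python
import functools


def keyword_only(func):
    """Hypothetical decorator: record only the kwargs the caller passed."""
    @functools.wraps(func)
    def wrapper(self, **kwargs):
        # Defaults from the signature never appear here, only explicit args.
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper


class ToyEstimator:
    """Hypothetical stand-in for a pyspark.ml estimator."""

    def __init__(self):
        self._params = {"k": 2, "tol": 0.01, "maxIter": 100}

    @keyword_only
    def setParams(self, *, k=2, tol=0.01, maxIter=100):
        # The signature defaults are documentation only in this sketch;
        # just the explicitly passed params are updated.
        self._params.update(self._input_kwargs)
        return self


est = ToyEstimator()
est.setParams(k=5, maxIter=10)  # changes k and maxIter, leaves tol alone
est.setParams()                 # no args: nothing is reset to defaults
print(est._params)              # {'k': 5, 'tol': 0.01, 'maxIter': 10}
```

Under this pattern the defaults shown in the signature are never read by the method body, which is why a docstring along the lines suggested in the report ("Sets the named params. The values of other params are not affected.") would describe the behavior more accurately.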