[ https://issues.apache.org/jira/browse/SPARK-29428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949968#comment-16949968 ]
Borys Biletskyy commented on SPARK-29428:
-----------------------------------------

Thanks for your inputs. Maybe it makes sense to mention this in the Python docs, where I really missed the conventions regarding None params. For me it was a way to work around https://issues.apache.org/jira/browse/SPARK-29414.

From the following Params method I got the impression that None params are acceptable:

{code:python}
def _set(self, **kwargs):
    """
    Sets user-supplied params.
    """
    for param, value in kwargs.items():
        p = getattr(self, param)
        if value is not None:
            try:
                value = p.typeConverter(value)
            except TypeError as e:
                raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
        self._paramMap[p] = value
    return self
{code}

Which is not the case here, where None params are not acceptable:

{code:python}
def set(self, param, value):
    """
    Sets a parameter in the embedded param map.
    """
    self._shouldOwn(param)
    try:
        value = param.typeConverter(value)
    except ValueError as e:
        raise ValueError('Invalid param value given for param "%s". %s' % (param.name, e))
    self._paramMap[param] = value
{code}

> Can't persist/set None-valued param
> -----------------------------------
>
>                 Key: SPARK-29428
>                 URL: https://issues.apache.org/jira/browse/SPARK-29428
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, PySpark
>    Affects Versions: 2.3.2
>            Reporter: Borys Biletskyy
>            Priority: Major
>
> {code:python}
> import pytest
> from pyspark import keyword_only
> from pyspark.ml import Model
> from pyspark.sql import DataFrame
> from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
> from pyspark.ml.param.shared import HasInputCol
> from pyspark.sql.functions import *
>
>
> class NoneParamTester(Model,
>                       HasInputCol,
>                       DefaultParamsReadable,
>                       DefaultParamsWritable
>                       ):
>     @keyword_only
>     def __init__(self, inputCol: str = None):
>         super(NoneParamTester, self).__init__()
>         kwargs = self._input_kwargs
>         self.setParams(**kwargs)
>
>     @keyword_only
>     def setParams(self, inputCol: str = None):
>         kwargs = self._input_kwargs
>         self._set(**kwargs)
>         return self
>
>     def _transform(self, data: DataFrame) -> DataFrame:
>         return data
>
>
> class TestNoneParam(object):
>     def test_persist_none(self, spark, temp_dir):
>         path = temp_dir + '/test_model'
>         model = NoneParamTester(inputCol=None)
>         assert model.isDefined(model.inputCol)
>         assert model.isSet(model.inputCol)
>         assert model.getInputCol() is None
>         model.write().overwrite().save(path)
>         NoneParamTester.load(path)  # TypeError: Could not convert <class 'NoneType'> to string type
>
>     def test_set_none(self, spark):
>         model = NoneParamTester(inputCol=None)
>         assert model.isDefined(model.inputCol)
>         assert model.isSet(model.inputCol)
>         assert model.getInputCol() is None
>         model.set(model.inputCol, None)  # TypeError: Could not convert <class 'NoneType'> to string type
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
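The asymmetry the comment points out can be shown without Spark at all. The sketch below is a minimal, self-contained stand-in (the `Param`/`Params` classes and the `toString` converter here are simplified hypotheticals mirroring the PySpark names, not the real implementation): `_set` guards the converter with an `if value is not None`, so `None` is silently stored, while `set` runs the converter unconditionally and rejects the same value.

```python
class Param:
    """Simplified stand-in for pyspark.ml.param.Param."""
    def __init__(self, name, typeConverter):
        self.name = name
        self.typeConverter = typeConverter


def toString(value):
    """Simplified stand-in for a string type converter."""
    if isinstance(value, str):
        return value
    raise TypeError("Could not convert %s to string type" % type(value))


class Params:
    """Simplified stand-in reproducing the two code paths quoted above."""
    def __init__(self):
        self._paramMap = {}
        self.inputCol = Param("inputCol", toString)

    def _set(self, **kwargs):
        # None bypasses the converter, so it is stored in the param map as-is.
        for name, value in kwargs.items():
            p = getattr(self, name)
            if value is not None:
                value = p.typeConverter(value)
            self._paramMap[p] = value
        return self

    def set(self, param, value):
        # No None guard: the converter always runs, so None raises TypeError.
        value = param.typeConverter(value)
        self._paramMap[param] = value


m = Params()
m._set(inputCol=None)           # accepted: None ends up in the param map
print(m._paramMap[m.inputCol])  # None

try:
    m.set(m.inputCol, None)     # rejected: converter raises TypeError
except TypeError as e:
    print(e)
```

This is exactly the trap hit when persisting: saving goes through the lenient path and writes the `None`, while loading (or a later `set`) goes through the strict path and fails on it.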