[ https://issues.apache.org/jira/browse/SPARK-29428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949968#comment-16949968 ]

Borys Biletskyy commented on SPARK-29428:
-----------------------------------------

Thanks for your input. Maybe it makes sense to mention this in the Python docs; I
really missed conventions regarding None params there. For me, setting None params
was a way to work around https://issues.apache.org/jira/browse/SPARK-29414.

From the following Params method I got the impression that None params are
acceptable.
{code:python}
def _set(self, **kwargs):
    """
    Sets user-supplied params.
    """
    for param, value in kwargs.items():
        p = getattr(self, param)
        if value is not None:
            try:
                value = p.typeConverter(value)
            except TypeError as e:
                raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
        self._paramMap[p] = value
    return self
{code}
That is not the case in the following method, where None params are rejected by
the type converter.
{code:python}
def set(self, param, value):
    """
    Sets a parameter in the embedded param map.
    """
    self._shouldOwn(param)
    try:
        value = param.typeConverter(value)
    except ValueError as e:
        raise ValueError('Invalid param value given for param "%s". %s' % (param.name, e))
    self._paramMap[param] = value
{code}
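
To illustrate the asymmetry, here is a minimal sketch (assuming the Spark 2.x
internals quoted above; Demo is a hypothetical class, not part of Spark). Note
also that the typeConverter raises TypeError while set() only catches
ValueError, so the error propagates without the param name attached:
{code:python}
from pyspark.ml import Model
from pyspark.ml.param.shared import HasInputCol


class Demo(Model, HasInputCol):
    """Hypothetical minimal model, used only to demonstrate the asymmetry."""

    def _transform(self, data):
        return data


m = Demo()
m._set(inputCol=None)    # accepted: _set skips the typeConverter when value is None
print(m.getInputCol())   # prints: None
m.set(m.inputCol, None)  # TypeError: Could not convert <class 'NoneType'> to string type
{code}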
> Can't persist/set None-valued param 
> ------------------------------------
>
>                 Key: SPARK-29428
>                 URL: https://issues.apache.org/jira/browse/SPARK-29428
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, PySpark
>    Affects Versions: 2.3.2
>            Reporter: Borys Biletskyy
>            Priority: Major
>
> {code:python}
> import pytest
> from pyspark import keyword_only
> from pyspark.ml import Model
> from pyspark.sql import DataFrame
> from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
> from pyspark.ml.param.shared import HasInputCol
> from pyspark.sql.functions import *
> class NoneParamTester(Model,
>                       HasInputCol,
>                       DefaultParamsReadable,
>                       DefaultParamsWritable
>                       ):
>     @keyword_only
>     def __init__(self, inputCol: str = None):
>         super(NoneParamTester, self).__init__()
>         kwargs = self._input_kwargs
>         self.setParams(**kwargs)
>     @keyword_only
>     def setParams(self, inputCol: str = None):
>         kwargs = self._input_kwargs
>         self._set(**kwargs)
>         return self
>     def _transform(self, data: DataFrame) -> DataFrame:
>         return data
> class TestNoneParam(object):
>     def test_persist_none(self, spark, temp_dir):
>         path = temp_dir + '/test_model'
>         model = NoneParamTester(inputCol=None)
>         assert model.isDefined(model.inputCol)
>         assert model.isSet(model.inputCol)
>         assert model.getInputCol() is None
>         model.write().overwrite().save(path)
>         NoneParamTester.load(path)  # TypeError: Could not convert <class 'NoneType'> to string type
>     def test_set_none(self, spark):
>         model = NoneParamTester(inputCol=None)
>         assert model.isDefined(model.inputCol)
>         assert model.isSet(model.inputCol)
>         assert model.getInputCol() is None
>         model.set(model.inputCol, None)  # TypeError: Could not convert <class 'NoneType'> to string type
> {code}


