[ 
https://issues.apache.org/jira/browse/SPARK-22198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akos Tomasits updated SPARK-22198:
----------------------------------
    Description: 
It is not possible to create a proper custom Transformer in Java by extending 
UnaryTransformer.

It seems that the method 'uid()' is called during object creation, before the 
provided 'uid' constructor parameter can be set.

This leads to the following error:

{quote}
 java.lang.IllegalArgumentException: requirement failed: Param 
<prefix>_1563950936fa__inputCol does not belong to <prefix>_d4105b75c4aa.
{quote}

If you extend UnaryTransformer and use the subclass through, for example, 
CrossValidator, you must explicitly provide a constructor that takes a String 
parameter. Judging from the source of the built-in transformers, this parameter 
is the 'uid', which should be stored in the object. However, it cannot be 
stored in time, because the uid() method is invoked (and its result might be 
used) before this constructor finishes.
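
The initialization-order effect described above can be reproduced without 
Spark. The following is a minimal sketch under stated assumptions: 'Base', 
'Cleaner', 'ownerSeenAtConstruction' and 'owns()' are hypothetical names 
invented for illustration and are not Spark APIs. 'Base' only stands in for a 
superclass that reads uid() while it is being constructed. It may not match 
exactly what UnaryTransformer does internally.

```java
import java.util.Objects;

public class InitOrderDemo {

    // Hypothetical stand-in for a superclass that reads uid() while its own
    // constructor is still running. This is NOT a Spark class.
    static abstract class Base {
        // Captured during Base construction, similar in spirit to a parameter
        // object remembering the uid of its owner.
        final String ownerSeenAtConstruction;

        Base() {
            // Virtual call: dispatches to the subclass override before the
            // subclass constructor body has executed.
            ownerSeenAtConstruction = uid();
        }

        abstract String uid();

        // Illustrative ownership check: compares the uid captured during
        // construction with the uid reported now.
        boolean owns() {
            return Objects.equals(ownerSeenAtConstruction, uid());
        }
    }

    static final class Cleaner extends Base {
        private final String sparkUid;

        Cleaner(String uid) {
            // The implicit super() call runs first; sparkUid is still null
            // while Base() executes.
            this.sparkUid = uid;
        }

        @Override
        String uid() {
            return sparkUid;
        }
    }

    public static void main(String[] args) {
        Cleaner c = new Cleaner("TextCleaner_abc");
        System.out.println("uid seen during construction: " + c.ownerSeenAtConstruction);
        System.out.println("uid after construction: " + c.uid());
        System.out.println("ownership check passes: " + c.owns());
    }
}
```

In this sketch the superclass constructor observes null, because Java runs the 
superclass constructor before the subclass assigns its fields, so the later 
ownership check fails even though the field is declared final.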

Sample class:

{quote}
public class TextCleaner extends UnaryTransformer<String, String, TextCleaner>
        implements Serializable, DefaultParamsWritable, DefaultParamsReadable<TextCleaner> {

    private static final long serialVersionUID = 2658543236303100458L;

    private static final String sparkUidPrefix = "TextCleaner";

    private final String sparkUid;

    public TextCleaner() {
        sparkUid = org.apache.spark.ml.util.Identifiable$.MODULE$.randomUID(sparkUidPrefix);
    }

    public TextCleaner(String uid) {
        sparkUid = uid;
    }

    @Override
    public String uid() {
        return sparkUid;
    }

    ...
{quote}


> Java incompatibility when extending UnaryTransformer
> ----------------------------------------------------
>
>                 Key: SPARK-22198
>                 URL: https://issues.apache.org/jira/browse/SPARK-22198
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Akos Tomasits
>


