[ https://issues.apache.org/jira/browse/SPARK-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179016#comment-15179016 ]

Seth Hendrickson commented on SPARK-13068:
------------------------------------------

I think these are good and valid points. I will give it some more thought. 

My concern is that the {{expectedType}} approach does not play nicely with 
lists/numpy arrays/vectors. If {{expectedType=list}}, then we can cast a numpy 
array to a list, but if the numpy array's dtype is float and Scala expects ints 
in the array, there will still be a Py4J ClassCastException. Likewise, if 
someone passes {{[1, 2, 3]}} to an {{Array[Double]}} param in Scala, we will 
get an exception. To me, it is a bit unsatisfying to provide a solution that 
works for one very common case but still fails in other common cases.
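
For concreteness, here is a minimal sketch of the failure mode 
({{cast_to_expected}} is a hypothetical helper standing in for the 
{{expectedType}} logic, not actual Spark code):

{code:python}
import numpy as np

def cast_to_expected(value, expectedType):
    # Hypothetical sketch of the expectedType approach: cast the
    # container type, but leave the element types untouched.
    if expectedType == list and isinstance(value, np.ndarray):
        return value.tolist()
    return value

# The container cast succeeds, but the elements are still Python floats.
# Py4J maps Python float -> java.lang.Double, so a Scala Array[Int]
# param would still see Doubles and fail with a ClassCastException.
arr = np.array([1.0, 2.0, 3.0])        # dtype=float64
print(cast_to_expected(arr, list))     # [1.0, 2.0, 3.0] -- floats, not ints

# Symmetrically, passing [1, 2, 3] sends java.lang.Integers, which an
# Array[Double] param on the Scala side rejects.
{code}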

I'm fine with not adding subclasses of {{Param}} for each type, but I think 
Param validator functions would provide a more comprehensive solution to the 
issues we're seeing. There is another 
[Jira|https://issues.apache.org/jira/browse/SPARK-10009] open about pyspark 
params working with lists/numpy arrays/vectors, so I think addressing this 
issue in a robust way is important. I would love to hear feedback, others' 
thoughts, and alternative approaches. Thanks!
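
As a rough illustration of what I have in mind, here is a hypothetical sketch 
of per-param converter/validator functions (the {{typeConverter}} argument and 
the {{toListFloat}} helper are assumptions for illustration, not an existing 
API):

{code:python}
import numpy as np

def toListFloat(value):
    # Hypothetical converter/validator: coerce any numeric list-like
    # (list, tuple, numpy array) to a list of Python floats, so Py4J
    # sends java.lang.Doubles that a Scala Array[Double] param accepts.
    if isinstance(value, np.ndarray):
        value = value.tolist()
    if isinstance(value, (list, tuple)) and \
            all(isinstance(v, (int, float)) for v in value):
        return [float(v) for v in value]
    raise TypeError("Cannot convert %r to a list of floats" % (value,))

class Param(object):
    # Sketch only: a single Param class that applies its converter when
    # a value is set, instead of one Param subclass per type.
    def __init__(self, name, doc, typeConverter=None):
        self.name = name
        self.doc = doc
        self.typeConverter = typeConverter or (lambda v: v)

p = Param("thresholds", "doc", typeConverter=toListFloat)
print(p.typeConverter(np.array([1, 2, 3])))  # [1.0, 2.0, 3.0]
print(p.typeConverter([1, 2, 3]))            # [1.0, 2.0, 3.0]
{code}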

> Extend pyspark ml paramtype conversion to support lists
> -------------------------------------------------------
>
>                 Key: SPARK-13068
>                 URL: https://issues.apache.org/jira/browse/SPARK-13068
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>            Reporter: holdenk
>            Priority: Trivial
>
> In SPARK-7675 we added type conversion for PySpark ML params. We should 
> follow up and support param type conversion for lists and nested structures 
> as required. This blocks having type information on all PySpark ML params.



