[
https://issues.apache.org/jira/browse/SPARK-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179016#comment-15179016
]
Seth Hendrickson commented on SPARK-13068:
------------------------------------------
I think these are good and valid points. I will give it some more thought.
My concern is that the {{expectedType}} approach does not play nicely with
lists/numpy arrays/vectors. If {{expectedType=list}}, then we can cast a numpy
array to a list, but if the numpy array's dtype is float and Scala expects ints
in the array, there will still be a Py4J ClassCastException. Likewise, if
someone passes {{[1, 2, 3]}} to an {{Array[Double]}} param in Scala, we will
get an exception. To me, it is unsatisfying to provide a solution that works
for one very common case but still fails in other common cases.
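To make the first failure mode concrete, here is a minimal sketch (not Spark code, just plain Python/numpy) showing why casting a numpy array to a list does not fix the element types:

```python
import numpy as np

# A float-dtype numpy array, as a user might pass for an int-typed param.
arr = np.array([1, 2, 3], dtype=np.float64)

# Casting to a list succeeds, but every element is still a Python float,
# so the JVM side would still see doubles where it expects ints and
# Py4J would raise a ClassCastException.
as_list = arr.tolist()

assert as_list == [1.0, 2.0, 3.0]
assert all(isinstance(x, float) for x in as_list)
```

The symmetric case is {{[1, 2, 3]}} against {{Array[Double]}}: the container type matches, but the elements are ints, not floats.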
I'm fine with not adding subclasses of Param for each type, but I think
per-Param validator functions would provide a comprehensive solution to some of
the issues we're seeing. There is another
[Jira|https://issues.apache.org/jira/browse/SPARK-10009] open about pyspark
params working with lists/numpy arrays/vectors, so I think addressing this
issue in a robust way is important. I would love to hear feedback and others'
thoughts and alternative approaches. Thanks!
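For the validator-function idea, a rough sketch of what I have in mind (all names here are invented for illustration, not an existing Spark API): each Param would carry a converter that both validates the value and coerces every element to the type the JVM side expects, handling lists, tuples, and numpy arrays uniformly.

```python
def toListFloat(value):
    """Hypothetical converter: coerce value to a list of Python floats.

    Works for any iterable of numeric elements (list, tuple, numpy array,
    Vector), so both a float numpy array bound to an int param and
    [1, 2, 3] bound to an Array[Double] param get normalized before
    they ever reach Py4J. Raises TypeError if coercion is impossible.
    """
    try:
        return [float(v) for v in value]
    except (TypeError, ValueError):
        raise TypeError("Could not convert %r to a list of floats" % (value,))

# Example: ints are coerced to floats, so Array[Double] on the Scala
# side would receive the element type it expects.
coerced = toListFloat([1, 2, 3])
assert coerced == [1.0, 2.0, 3.0]
```

A converter like this per target type (list of ints, list of floats, vector, etc.) would cover the mixed-element-type cases that a single {{expectedType}} cast cannot.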
> Extend pyspark ml paramtype conversion to support lists
> -------------------------------------------------------
>
> Key: SPARK-13068
> URL: https://issues.apache.org/jira/browse/SPARK-13068
> Project: Spark
> Issue Type: Improvement
> Components: ML, PySpark
> Reporter: holdenk
> Priority: Trivial
>
> In SPARK-7675 we added type conversion for PySpark ML params. We should
> follow up and support param type conversion for lists and nested structures
> as required. This blocks adding type information to all PySpark ML params.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]