[ https://issues.apache.org/jira/browse/SPARK-17161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Cutler updated SPARK-17161:
---------------------------------
    Summary: Add PySpark-ML JavaWrapper convenience function to create py4j JavaArrays  (was: Add PySpark-ML JavaWrapper convienience function to create py4j JavaArrays)

> Add PySpark-ML JavaWrapper convenience function to create py4j JavaArrays
> --------------------------------------------------------------------------
>
>                 Key: SPARK-17161
>                 URL: https://issues.apache.org/jira/browse/SPARK-17161
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>            Reporter: Bryan Cutler
>            Priority: Minor
>
> Spark ML has a number of classes whose constructors take a Scala {{Array}}. To expose the same API in Python, a Java-friendly alternate constructor currently has to be added on the Scala side, because PySpark's {{_py2java}} converts a Python list to a {{java.util.ArrayList}}, which py4j cannot match against an {{Array}} constructor parameter, as this error message shows:
> {noformat}
> Py4JError: An error occurred while calling None.org.apache.spark.ml.feature.CountVectorizerModel. Trace:
> py4j.Py4JException: Constructor org.apache.spark.ml.feature.CountVectorizerModel([class java.util.ArrayList]) does not exist
> 	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
> 	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
> 	at py4j.Gateway.invoke(Gateway.java:235)
> {noformat}
> The alternate constructor can be avoided by creating a py4j JavaArray with {{new_array}}; that type is compatible with the Scala {{Array}} parameters used by classes such as {{CountVectorizerModel}} and {{StringIndexerModel}}. Most of the boilerplate Python code to do this can go in a convenience function inside {{ml.JavaWrapper}}, giving a clean way to construct ML objects without adding special constructors, as sketched below.
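> A minimal sketch of the mechanism, assuming an active {{SparkContext}} (note that {{sc._gateway}} is an internal attribute): py4j's {{gateway.new_array(java_class, length)}} returns a {{JavaArray}}, which py4j hands to the JVM as a real Java array ({{String[]}} here), so it matches a Scala {{Array[String]}} parameter where a {{java.util.ArrayList}} does not.
> {code:python}
> from pyspark import SparkContext
>
> sc = SparkContext.getOrCreate()
> gateway = sc._gateway  # internal py4j gateway
>
> vocab = ["a", "b", "c"]
> # Allocate a java.lang.String[] of the right length, then fill it.
> java_array = gateway.new_array(gateway.jvm.java.lang.String, len(vocab))
> for i, term in enumerate(vocab):
>     java_array[i] = term
> {code}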
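> One possible shape for the convenience function; the name {{_new_java_array}} and its placement on {{JavaWrapper}} are illustrative only, not a committed API:
> {code:python}
> from pyspark import SparkContext
>
> def _new_java_array(pylist, java_class):
>     """Build a py4j JavaArray of java_class from a Python list."""
>     sc = SparkContext._active_spark_context
>     java_array = sc._gateway.new_array(java_class, len(pylist))
>     for i, elem in enumerate(pylist):
>         java_array[i] = elem
>     return java_array
>
> # Usage: construct CountVectorizerModel from a vocabulary directly,
> # with no Java-friendly alternate constructor on the Scala side.
> sc = SparkContext.getOrCreate()
> jvocab = _new_java_array(["a", "b", "c"], sc._gateway.jvm.java.lang.String)
> jmodel = sc._jvm.org.apache.spark.ml.feature.CountVectorizerModel(jvocab)
> {code}
> Here {{jmodel}} is the raw Java object; in practice the wrapper machinery in {{ml.JavaWrapper}} would wrap it into the corresponding Python model class.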