[ https://issues.apache.org/jira/browse/SPARK-17161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
holdenk resolved SPARK-17161.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

> Add PySpark-ML JavaWrapper convenience function to create py4j JavaArrays
> -------------------------------------------------------------------------
>
>                 Key: SPARK-17161
>                 URL: https://issues.apache.org/jira/browse/SPARK-17161
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>            Reporter: Bryan Cutler
>            Priority: Minor
>             Fix For: 2.2.0
>
>
> Often in Spark ML, classes take a Scala Array in their constructors. To expose
> the same API in Python, a Java-friendly alternate constructor would normally be
> needed for compatibility with py4j when converting from a Python list, because
> the current PySpark conversion in _py2java produces a java.util.ArrayList, as
> shown in this error message:
> {noformat}
> Py4JError: An error occurred while calling
> None.org.apache.spark.ml.feature.CountVectorizerModel. Trace:
> py4j.Py4JException: Constructor
> org.apache.spark.ml.feature.CountVectorizerModel([class java.util.ArrayList])
> does not exist
>         at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
>         at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
>         at py4j.Gateway.invoke(Gateway.java:235)
> {noformat}
> Adding such alternate constructors can be avoided by creating a py4j JavaArray
> with {{new_array}}. That type is compatible with the Scala Array currently used
> in classes like {{CountVectorizerModel}} and {{StringIndexerModel}}. Most of the
> boilerplate Python code to do this can be put in a convenience function inside
> ml.JavaWrapper, giving a clean way to construct ML objects without adding
> special constructors.
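For illustration, a minimal sketch of the kind of helper described above is shown below. It is not the patch merged for this ticket; the helper name _new_java_array, its placement as a module-level function, and the example vocabulary are assumptions. It relies on py4j's gateway.new_array, which produces a JavaArray that is accepted where a JVM constructor expects a Scala Array.

{code:python}
# Illustrative sketch only (not the merged SPARK-17161 code). Assumes an
# active SparkContext; the helper name `_new_java_array` is hypothetical.
from pyspark import SparkContext
from pyspark.ml.wrapper import JavaWrapper


def _new_java_array(pylist, java_class):
    """Build a py4j JavaArray from a Python list.

    Unlike the java.util.ArrayList produced by _py2java, a JavaArray is
    accepted by JVM constructors that expect a Scala Array.
    """
    sc = SparkContext._active_spark_context
    java_array = sc._gateway.new_array(java_class, len(pylist))
    for i, element in enumerate(pylist):
        java_array[i] = element
    return java_array


sc = SparkContext.getOrCreate()
# Build an Array[String] vocabulary and construct the JVM model directly,
# without adding a Java-friendly alternate constructor on the Scala side.
jvocab = _new_java_array(["a", "b", "c"], sc._gateway.jvm.java.lang.String)
java_model = JavaWrapper._new_java_obj(
    "org.apache.spark.ml.feature.CountVectorizerModel", jvocab)
{code}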