[ https://issues.apache.org/jira/browse/SPARK-33661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-33661.
----------------------------------
    Resolution: Not A Problem

> Unable to load RandomForestClassificationModel trained in Spark 2.x
> -------------------------------------------------------------------
>
>                 Key: SPARK-33661
>                 URL: https://issues.apache.org/jira/browse/SPARK-33661
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 3.0.1
>            Reporter: Marcus Levine
>            Priority: Major
>
> When attempting to use Spark 3.x to load a RandomForestClassificationModel
> that was trained in Spark 2.x, an exception is raised:
> {code:python}
> ...
>     RandomForestClassificationModel.load('/path/to/my/model')
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 330, in load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 291, in load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 280, in load
>   File "/usr/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 134, in deco
>   File "<string>", line 3, in raise_from
> pyspark.sql.utils.AnalysisException: No such struct field rawCount in id, prediction, impurity, impurityStats, gain, leftChild, rightChild, split;
> {code}
> There seems to be a schema incompatibility between the trained model data
> saved by Spark 2.x and the data expected for a model trained in Spark 3.x.
> If this issue is not resolved, users will be forced to retrain any existing
> random forest models they trained in Spark 2.x using Spark 3.x before they
> can upgrade.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
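
The schema mismatch described in the report can be read directly off the exception text: the loader asks for a field that the saved node data does not contain. The following is a minimal sketch, in plain Python with no Spark dependency, that splits the message from the traceback into the field the loader expected and the fields actually present. The helper name `parse_missing_field` and the regular expression are illustrative assumptions; the message wording is taken only from this one traceback and is not a documented or stable format.

```python
import re

# Message copied verbatim from the traceback in the report.
MESSAGE = (
    "No such struct field rawCount in id, prediction, impurity, "
    "impurityStats, gain, leftChild, rightChild, split;"
)

# Assumed message shape: "No such struct field <name> in <a>, <b>, ...;"
_PATTERN = re.compile(
    r"No such struct field (?P<missing>\w+) in (?P<present>[^;]+)"
)


def parse_missing_field(message):
    """Return (missing_field, present_fields), or None if the text does not match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    present = [name.strip() for name in match.group("present").split(",")]
    return match.group("missing"), present


if __name__ == "__main__":
    result = parse_missing_field(MESSAGE)
    if result is not None:
        missing, present = result
        print("field expected by the loader:", missing)
        print("fields found in the saved model:", ", ".join(present))
```

Applied to the message above, this separates `rawCount` (expected by the Spark 3.x loader) from the eight fields listed as present in the data written by Spark 2.x.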