[ https://issues.apache.org/jira/browse/SPARK-25941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-25941.
-------------------------------
    Resolution: Not A Problem

This isn't a bug. The implementation has changed a lot in Spark and MLlib over time. I would not expect exactly the same answer, and this is quite close.

> Random forest score decreased due to updating spark version
> -----------------------------------------------------------
>
>                 Key: SPARK-25941
>                 URL: https://issues.apache.org/jira/browse/SPARK-25941
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Input/Output, ML
>    Affects Versions: 2.3.2
>            Reporter: jack li
>            Priority: Major
>              Labels: ML, forest, random
>
> h3. Problem description
> I used different versions of Spark to compare random forest scores.
> * spark-core_2.10, version 2.0.0
> ** RandomForestsKaggle Score = 0.8978765219058574
> * spark-core_2.11, version 2.4.0
> ** RandomForestsKaggle Score = 0.8886987035251259
> Source: [https://github.com/smartscity/Kaggle_Titanic_spark]
> [Example github source and readme|https://github.com/smartscity/Kaggle_Titanic_spark/blob/master/README.md]
>
> h3. Introduction
> This case is the Titanic competition on Kaggle.
> [https://www.kaggle.com/c/titanic]
> h3. Conclusion
> After upgrading Spark to {{version 2.4.0}}, the random forest score dropped by about {{0.01}}.
> h3. Expectation
> The random forest score should not drop when the Spark version is upgraded.
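
As a rough check on the "quite close" remark in the resolution, the sketch below compares the reported gap between the two scores with the sampling noise of a single score. It rests on assumptions that are not in the report: that the score is an accuracy-like proportion, and that it was measured on an evaluation set of some size n (the sizes tried here are hypothetical, since the report does not state one). The helper name `accuracy_std_error` is illustrative only and is not part of Spark.

```python
import math


def accuracy_std_error(p: float, n: int) -> float:
    """Binomial standard error of a proportion p estimated from n examples."""
    return math.sqrt(p * (1.0 - p) / n)


# Scores as reported in the issue.
score_2_0_0 = 0.8978765219058574  # Spark 2.0.0
score_2_4_0 = 0.8886987035251259  # Spark 2.4.0
gap = score_2_0_0 - score_2_4_0

# ASSUMPTION: hypothetical evaluation-set sizes; the report gives none.
for n in (100, 300, 900):
    se = accuracy_std_error(score_2_4_0, n)
    print(f"n={n}: gap={gap:.4f}, std_error={se:.4f}, gap/std_error={gap / se:.2f}")
```

Under these assumptions the gap stays below one standard error even for the largest size tried, so a difference of this magnitude would be hard to distinguish from ordinary run-to-run variation, independent of any implementation change between versions.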