[ https://issues.apache.org/jira/browse/SPARK-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212428#comment-15212428 ]
Joseph K. Bradley commented on SPARK-13783:
-------------------------------------------

I'd prefer what [~GayathriMurali] mentioned; that's what is done in spark.mllib. That should be more efficient (taking more advantage of columnar storage).

I do want us to save Params for each tree since that will be more robust to future code changes (rather than re-creating them based on the GBT params). However, that may require some code refactoring so that the GBT can get a set of {{jsonParams}} for each tree. Given that, the GBT could store that JSON in another DataFrame. How does that sound?

It may make sense to implement export/import for one ensemble model before the other since both might require changes to the single-tree save/load. Would you mind helping to review each other's work? Who would prefer to go first?

Thanks!

> Model export/import for spark.ml: GBTs
> --------------------------------------
>
>                 Key: SPARK-13783
>                 URL: https://issues.apache.org/jira/browse/SPARK-13783
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> This JIRA is for both GBTClassifier and GBTRegressor. The implementation
> should reuse the one for DecisionTree*.
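
For illustration, the layout proposed in the comment (all tree nodes in one table, per-tree Params JSON in a second one) might look roughly like the sketch below. It is not Spark code: plain Python lists of dicts stand in for the two DataFrames, and every name except jsonParams (the Node class, the treeID/nodeID/leftChild/rightChild columns, the weight column, and the convention that each root has nodeID 0) is a hypothetical choice made for this example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a decision tree node; feature == -1 marks a leaf."""
    node_id: int
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    feature: int = -1
    threshold: float = 0.0


def flatten_tree(tree_id, root):
    """Emit one row per node, tagged with tree_id, in pre-order."""
    rows, stack = [], [root]
    while stack:
        node = stack.pop()
        rows.append({
            "treeID": tree_id,
            "nodeID": node.node_id,
            "prediction": node.prediction,
            "feature": node.feature,
            "threshold": node.threshold,
            "leftChild": node.left.node_id if node.left else -1,
            "rightChild": node.right.node_id if node.right else -1,
        })
        # Push right first so the left child is visited first.
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)
    return rows


def export_ensemble(trees, tree_params, tree_weights):
    """Return (node_rows, meta_rows): the two tables that would be saved."""
    node_rows, meta_rows = [], []
    for tree_id, (root, params, weight) in enumerate(
            zip(trees, tree_params, tree_weights)):
        node_rows.extend(flatten_tree(tree_id, root))
        # Each tree keeps its own Params JSON instead of inheriting the GBT's.
        meta_rows.append({
            "treeID": tree_id,
            "jsonParams": json.dumps(params, sort_keys=True),
            "weight": weight,
        })
    return node_rows, meta_rows


def rebuild_tree(rows):
    """Rebuild one tree from its rows; assumes the root has nodeID 0."""
    by_id = {r["nodeID"]: r for r in rows}

    def build(node_id):
        if node_id == -1:
            return None
        r = by_id[node_id]
        return Node(node_id=r["nodeID"], prediction=r["prediction"],
                    left=build(r["leftChild"]), right=build(r["rightChild"]),
                    feature=r["feature"], threshold=r["threshold"])

    return build(0)


def import_ensemble(node_rows, meta_rows):
    """Group node rows by treeID and pair each tree with its own params."""
    grouped = {}
    for r in node_rows:
        grouped.setdefault(r["treeID"], []).append(r)
    return [
        (rebuild_tree(grouped[m["treeID"]]), json.loads(m["jsonParams"]), m["weight"])
        for m in sorted(meta_rows, key=lambda m: m["treeID"])
    ]
```

Calling export_ensemble on a list of trees and then import_ensemble on the two resulting tables should return the same trees, each paired with the params it was saved with, so loading does not need to re-derive per-tree params from the GBT params.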