[ 
https://issues.apache.org/jira/browse/SPARK-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212428#comment-15212428
 ] 

Joseph K. Bradley commented on SPARK-13783:
-------------------------------------------

I'd prefer the approach [~GayathriMurali] mentioned; that's what is done in
spark.mllib.  It should be more efficient, since it takes better advantage of
columnar storage.

I do want us to save Params for each tree since that will be more robust to 
future code changes (rather than re-creating them based on the GBT params).  
However, that may require some code refactoring so that the GBT can get a set 
of {{jsonParams}} for each tree.  Given that, the GBT could store that JSON in 
another DataFrame.

How does that sound?
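
As a rough illustration of the layout proposed above (a sketch only, not the
spark.ml implementation), the snippet below flattens an ensemble into two sets
of rows keyed by a tree ID: one row per node, and one row per tree holding that
tree's params as a JSON string.  Plain Python lists of dicts stand in for the
two DataFrames.  The column names ({{treeID}}, {{jsonParams}} as a column), the
node fields, and the helpers {{to_tables}}/{{from_tables}} are all hypothetical.

```python
import json

# Hypothetical ensemble: each tree carries its own params and a list of nodes.
trees = [
    {"params": {"maxDepth": 2, "minInstancesPerNode": 1},
     "nodes": [{"id": 0, "prediction": 0.5, "left": 1, "right": 2},
               {"id": 1, "prediction": 0.1, "left": -1, "right": -1},
               {"id": 2, "prediction": 0.9, "left": -1, "right": -1}]},
    {"params": {"maxDepth": 1, "minInstancesPerNode": 2},
     "nodes": [{"id": 0, "prediction": 0.3, "left": -1, "right": -1}]},
]

def to_tables(trees):
    """Flatten the ensemble into node rows and per-tree param rows."""
    node_rows, param_rows = [], []
    for tree_id, tree in enumerate(trees):
        # One row per tree: the tree's own params, serialized as JSON.
        param_rows.append({"treeID": tree_id,
                           "jsonParams": json.dumps(tree["params"], sort_keys=True)})
        # One row per node, tagged with the tree it belongs to.
        for node in tree["nodes"]:
            node_rows.append({"treeID": tree_id, **node})
    return node_rows, param_rows

def from_tables(node_rows, param_rows):
    """Rebuild the ensemble by grouping node rows on treeID."""
    params = {r["treeID"]: json.loads(r["jsonParams"]) for r in param_rows}
    nodes = {}
    for r in node_rows:
        node = {k: v for k, v in r.items() if k != "treeID"}
        nodes.setdefault(r["treeID"], []).append(node)
    return [{"params": params[i], "nodes": nodes.get(i, [])} for i in sorted(params)]

node_rows, param_rows = to_tables(trees)
restored = from_tables(node_rows, param_rows)
```

Because each tree's params are stored explicitly, loading does not have to
re-derive them from the ensemble-level params.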

It may make sense to implement export/import for one ensemble model before the
other, since both might require changes to the single-tree save/load.  Would you
mind helping to review each other's work?  Who would prefer to go first?
Thanks!

> Model export/import for spark.ml: GBTs
> --------------------------------------
>
>                 Key: SPARK-13783
>                 URL: https://issues.apache.org/jira/browse/SPARK-13783
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> This JIRA is for both GBTClassifier and GBTRegressor.  The implementation 
> should reuse the one for DecisionTree*.


