Github user fmcquillan99 commented on the issue:

    https://github.com/apache/madlib/pull/289
  
    ```
    The model table produced by the training function contains the following columns:
    
    gid INTEGER. Group id that uniquely identifies a set of grouping column values.
    sample_id   INTEGER. The id of the bootstrap sample that this tree is a part of.
    tree        BYTEA8. Trained tree model stored in binary format (not human readable).
    impurity_var_importance     DOUBLE PRECISION[]. The gini impurity importance score for the tree.
    ```
    
    I don't think we need `impurity_var_importance` for each tree in the forest, since the final (averaged) value is already in the grouping table.
    Including it here is also inconsistent, because we don't put `oob_var_importance` in this table.
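
To illustrate the redundancy argument, here is a minimal sketch, not MADlib code. It assumes the value in the grouping table is a plain per-feature mean of the per-tree scores; the data and the `forest_importance` helper are hypothetical.

```python
from statistics import fmean

# Hypothetical per-tree impurity importance vectors (one list per tree,
# one entry per feature). These numbers are made up for illustration.
per_tree_importance = [
    [0.50, 0.30, 0.20],
    [0.40, 0.40, 0.20],
    [0.60, 0.20, 0.20],
]

def forest_importance(per_tree):
    """Average each feature's score across trees (assumed aggregation)."""
    return [fmean(feature_scores) for feature_scores in zip(*per_tree)]

# One vector per group, which is what the grouping table would hold.
forest_level = forest_importance(per_tree_importance)
print(forest_level)
```

If the aggregation really is a simple average like this, most users only need the forest-level vector, and the per-tree column mainly adds storage.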


