GitHub user iyerr3 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/246#discussion_r175924018
  
    --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
    @@ -127,7 +132,11 @@ tree_train(
     
       <DT>weights (optional)</DT>
       <DD>TEXT. Column name containing numerical weights for each observation.
    +  Can be any value greater than 0 (does not need to be
    +  an integer).  
       This can be used to handle the case of unbalanced data sets.
    +  For classification the row's vote is multiplied by the weight, 
    --- End diff --
    
    I suggest rephrasing this as:
    
    > The `weights` column is used to compute a weighted average in the output leaf
    > node. For classification, the contribution of a row towards the vote of its
    > corresponding level is multiplied by the weight (weighted mode). For
    > regression, the output value of the row is multiplied by the weight (weighted
    > mean).
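
To make the weighted mode / weighted mean distinction concrete, here is a minimal Python sketch of how a single leaf could combine its rows. It is illustrative only and is not MADlib's implementation; the function names `weighted_mode` and `weighted_mean` are hypothetical, and it assumes every weight is greater than 0, as the proposed doc text requires.

```python
from collections import defaultdict


def weighted_mode(labels, weights):
    """Classification: each row adds its weight to the vote for its class."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    # The predicted class is the one with the largest total weight.
    return max(totals, key=totals.get)


def weighted_mean(values, weights):
    """Regression: each row's value is scaled by its weight, then normalized."""
    total_weight = sum(weights)  # assumed > 0 (all weights are positive)
    return sum(v * w for v, w in zip(values, weights)) / total_weight
```

For example, with labels `["a", "a", "b"]` and weights `[1.0, 1.0, 2.5]`, class `"b"` wins the vote even though it appears in only one row, which is how non-integer weights can compensate for an unbalanced data set.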

