[GitHub] spark pull request #19715: [SPARK-22397][ML]add multiple columns support to ...

MLnick Wed, 29 Nov 2017 05:37:43 -0800

Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19715#discussion_r153775090
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala ---
    @@ -86,6 +104,10 @@ private[feature] trait QuantileDiscretizerBase extends Params
      * categorical features. The number of bins can be set using the `numBuckets` parameter. It is
      * possible that the number of buckets used will be smaller than this value, for example, if there
      * are too few distinct values of the input to create enough distinct quantiles.
    + * Since 2.3.0,
    --- End diff --
    
    Let's match the Bucketizer comment. So something like:
    
    ```
    ...
    Since 2.3.0, `QuantileDiscretizer` can map multiple columns at once by setting the `inputCols`
    parameter. Note that when both the `inputCol` and `inputCols` parameters are set, a log warning
    will be printed and only `inputCol` will take effect, while `inputCols` will be ignored. To
    specify the number of buckets for each column, the `numBucketsArray` parameter can be set, or if
    the number of buckets should be the same across columns, `numBuckets` can be set as a convenience.
    ```
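
    For reference, a minimal sketch of how the multi-column usage described in that
    comment might look from the caller's side. It assumes the `setInputCols`,
    `setOutputCols`, and `setNumBucketsArray` setters are available as proposed in
    this PR; the object name, sample data, and column names are made up for
    illustration.

    ```scala
    import org.apache.spark.ml.feature.QuantileDiscretizer
    import org.apache.spark.sql.{DataFrame, SparkSession}

    // Hypothetical wrapper object, not part of the PR.
    object MultiColumnDiscretizerSketch {

      // Bins two numeric columns in one pass, with a different bucket count per column.
      def discretize(spark: SparkSession): DataFrame = {
        import spark.implicits._

        // Made-up sample data; the column names are illustrative only.
        val df = Seq(
          (18.0, 1.0), (19.0, 2.0), (8.0, 3.0), (5.0, 4.0), (2.2, 5.0), (30.0, 6.0)
        ).toDF("hour", "clicks")

        val discretizer = new QuantileDiscretizer()
          .setInputCols(Array("hour", "clicks"))
          .setOutputCols(Array("hourBin", "clicksBin"))
          // One entry per input column; use setNumBuckets instead when all
          // columns should share the same number of buckets.
          .setNumBucketsArray(Array(3, 2))

        // fit() computes the quantile splits; transform() appends the binned columns.
        discretizer.fit(df).transform(df)
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("MultiColumnDiscretizerSketch")
          .getOrCreate()
        try discretize(spark).show() finally spark.stop()
      }
    }
    ```

    Only `inputCols` is set here, so the `inputCol` precedence rule and its log
    warning do not come into play.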



