[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

viirya Wed, 13 Sep 2017 20:06:06 -0700

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18902
  
    @MLnick Thanks for pinging me.
    
    I went through this quickly. The basic idea is the same: perform the 
operations on multiple input columns in a single Dataset/DataFrame operation.
    
    Unlike `Bucketizer`, `Imputer` has no compatibility concern because it 
already supports multiple input columns (`HasInputCols`). In `Bucketizer`, we 
don't want to break the current API, which makes things a bit more complicated.
    
    Actually, I noticed that `ImputerModel` also applies `withColumn` 
sequentially on each input column. I'd like to address this part with the 
`withColumns` API proposed in #17819. What do you think, @MLnick?
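
To illustrate the difference between sequential `withColumn` calls and a single projection, here is a rough sketch. It is not the actual `ImputerModel` code: the object `ImputeSketch`, the helper `filled`, and the `surrogates` map are hypothetical names, the columns are assumed to be of double type, and the values are overwritten in place instead of being written to separate output columns. It uses a plain `select` rather than the proposed `withColumns`.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, lit, when}

// Hypothetical sketch, not the actual ImputerModel implementation.
object ImputeSketch {
  // Assumed helper: replace null/NaN values in column `name` with `surrogate`.
  private def filled(name: String, surrogate: Double): Column =
    when(col(name).isNull || col(name).isNaN, lit(surrogate))
      .otherwise(col(name))

  // Sequential variant: one withColumn call (one extra projection in the plan)
  // per input column.
  def sequential(df: DataFrame, surrogates: Map[String, Double]): DataFrame =
    surrogates.foldLeft(df) { case (acc, (name, value)) =>
      acc.withColumn(name, filled(name, value))
    }

  // Single-operation variant: build every column expression first, then
  // apply them all in one select.
  def onePass(df: DataFrame, surrogates: Map[String, Double]): DataFrame = {
    val projected = df.columns.map { name =>
      surrogates.get(name)
        .map(value => filled(name, value).as(name))
        .getOrElse(col(name))
    }
    df.select(projected: _*)
  }
}
```

Both variants should produce the same result; the second one just avoids growing the logical plan by one projection per input column.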




---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
