GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/19229

    [SPARK-22001][ML][SQL] ImputerModel can do withColumn for all input columns at one pass

    ## What changes were proposed in this pull request?
    
    SPARK-21690 made `Imputer` one-pass by parallelizing the computation over
    all input columns. However, when we transform a dataset with `ImputerModel`,
    we still call `withColumn` on each input column sequentially. We can handle
    all input columns at once here as well by adding a `withColumns` API to `Dataset`.
    
    For now, the new `withColumns` API is for internal use only.
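
To illustrate the difference between the two approaches, here is a toy Python model. It is not Spark code: `ToyFrame`, `with_column`, `with_columns`, and `impute` are hypothetical names, and the sketch assumes that each `withColumn` call contributes one projection step while a single `withColumns` call contributes one step for all new columns together.

```python
import math
from typing import Callable, Dict, List

Row = Dict[str, float]
Expr = Callable[[Row], float]


class ToyFrame:
    """Hypothetical stand-in for a Dataset: rows plus a projection counter."""

    def __init__(self, rows: List[Row], projections: int = 0) -> None:
        self.rows = rows
        self.projections = projections  # projection steps used to build this frame

    def with_column(self, name: str, expr: Expr) -> "ToyFrame":
        # Adding a single column costs one projection step.
        return self.with_columns({name: expr})

    def with_columns(self, exprs: Dict[str, Expr]) -> "ToyFrame":
        # All new columns are computed from the input row in one projection step.
        new_rows = [
            {**row, **{name: expr(row) for name, expr in exprs.items()}}
            for row in self.rows
        ]
        return ToyFrame(new_rows, self.projections + 1)


def impute(col: str, surrogate: float) -> Expr:
    """Build an expression that replaces a missing (NaN) value with the surrogate."""
    def expr(row: Row) -> float:
        value = row[col]
        return surrogate if math.isnan(value) else value
    return expr


rows = [{"a": 1.0, "b": float("nan")}, {"a": float("nan"), "b": 4.0}]
surrogates = {"a": 1.0, "b": 4.0}  # assumed to have been computed when fitting
frame = ToyFrame(rows)

# Sequential approach: one with_column call per input column.
sequential = frame
for col, surrogate in surrogates.items():
    sequential = sequential.with_column(col + "_out", impute(col, surrogate))

# One-pass approach: every output column is added by a single call.
one_pass = frame.with_columns(
    {col + "_out": impute(col, s) for col, s in surrogates.items()}
)
```

In this model the sequential loop builds one projection per input column, while the batched call builds one in total. Both produce the same imputed values.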
    
    ## How was this patch tested?
    
    The `ImputerModel` change is covered by existing tests. Added new tests for
    the `withColumns` API.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-22001

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19229
    
----
commit 4efb64374b7c93bae3e9b0d2fc0ebc4f5ad1e1d5
Author: Liang-Chi Hsieh <vii...@gmail.com>
Date:   2017-09-14T03:49:16Z

    Do withColumn on all input columns at once.

----

