[GitHub] spark pull request #20708: [SPARK-21209][MLLLIB] Implement Incremental PCA a...

sandecho Thu, 01 Mar 2018 11:47:21 -0800

GitHub user sandecho opened a pull request:

    https://github.com/apache/spark/pull/20708


    [SPARK-21209][MLLLIB] Implement Incremental PCA algorithm

    ## What changes were proposed in this pull request?
    
    This PR proposes a new feature, the Incremental Principal Component 
Analysis (IPCA) algorithm. It divides the incoming data into batches of a 
given size and computes the PCA of each batch to generate the principal 
components of the entire data set.
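
As a rough illustration of the batch-wise idea (not the code in this pull request, which targets Spark MLlib), the sketch below processes data one batch at a time and recovers the leading principal component of the whole data set. It makes several assumptions. It merges each batch's mean and scatter matrix into running totals rather than combining per-batch PCA results, it uses plain power iteration to extract only the top component, and it is written in pure Python. The names `IncrementalCovariance`, `partial_fit` and `top_component` are hypothetical and chosen for this example only.

```python
import math


class IncrementalCovariance:
    """Running mean and scatter matrix, updated one batch at a time.

    Hypothetical helper for illustration; not part of Spark.
    """

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        # Sum of outer products of the centered rows seen so far.
        self.scatter = [[0.0] * dim for _ in range(dim)]

    def partial_fit(self, batch):
        """Merge one batch (a list of equal-length rows) into the totals."""
        m = len(batch)
        if m == 0:
            return
        d = len(self.mean)
        b_mean = [sum(row[j] for row in batch) / m for j in range(d)]
        total = self.n + m
        delta = [b_mean[j] - self.mean[j] for j in range(d)]
        # Correction term for the shift between the old and batch means.
        weight = self.n * m / total
        for i in range(d):
            for j in range(d):
                b_scatter = sum(
                    (row[i] - b_mean[i]) * (row[j] - b_mean[j]) for row in batch
                )
                self.scatter[i][j] += b_scatter + weight * delta[i] * delta[j]
        self.mean = [self.mean[j] + delta[j] * m / total for j in range(d)]
        self.n = total

    def covariance(self):
        """Sample covariance of all rows seen so far (needs n >= 2)."""
        return [[v / (self.n - 1) for v in row] for row in self.scatter]


def top_component(cov, iters=200):
    """Leading eigenvector of a symmetric matrix by power iteration.

    Assumes the uniform start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Toy data on the line y = 2x, fed in batches of 4 rows.
data = [[float(x), 2.0 * x] for x in range(10)]
acc = IncrementalCovariance(2)
for start in range(0, len(data), 4):
    acc.partial_fit(data[start:start + 4])
pc = top_component(acc.covariance())
```

For this toy data every point lies on one line, so `pc` points along the direction (1, 2) normalized to unit length. Only the running mean and a d-by-d scatter matrix are kept in memory, which is the property that makes a batch-wise approach attractive when the full data set does not fit at once.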
    ## How was this patch tested?
    
    Unit tests.
    [IPCA.zip](https://github.com/apache/spark/files/1772562/IPCA.zip)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sandecho/spark IPCA

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20708
    
----
commit 7900d21138de542fd89763a68417d74792725afd
Author: Sandeep Kumar Choudhary <tssandeepkumarchoudhary@...>
Date:   2018-03-01T13:35:20Z

    Implemented Incremental PCA

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
