GitHub user martinjaggi opened a pull request:
https://github.com/apache/incubator-spark/pull/563
new MLlib documentation for optimization, regression and classification
new documentation with TeX formulas, hopefully improving the usability and
reproducibility of the offered MLlib methods.
also made some minor changes in the code for consistency; the Scala tests pass.
for easier merging, we could maybe rebase these changes (only commits after
Feb 7 are relevant) once
https://github.com/apache/incubator-spark/pull/552
is merged?
JIRA:
https://spark-project.atlassian.net/browse/MLLIB-19
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-spark polishing-opt-MLlib
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-spark/pull/563.patch
----
commit d73948db0d9bc36296054e79fec5b1a657b4eab4
Author: Martin Jaggi <[email protected]>
Date: 2014-02-06T15:57:23Z
minor update on how to compile the documentation
commit d1c5212b93c67436543c2d8ddbbf610fdf0a26eb
Author: Martin Jaggi <[email protected]>
Date: 2014-02-06T15:59:43Z
enable MathJax formulas in the .md documentation files
code by @shivaram
commit bbafafd2b497a5acaa03a140bb9de1fbb7d67ffa
Author: Martin Jaggi <[email protected]>
Date: 2014-02-06T16:31:29Z
split MLlib documentation by technique
and linked from the main mllib-guide.md page
commit dcd2142c164b2f602bf472bb152ad55bae82d31a
Author: Martin Jaggi <[email protected]>
Date: 2014-02-06T17:04:26Z
enabling inline LaTeX formulas with $.$
same MathJax configuration as used on math.stackexchange.com
sample usage in the linear algebra (SVD) documentation
commit 0364bfabbfc347f917216057a20c39b631842481
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T02:19:38Z
minor polishing, as suggested by @pwendell
commit 93d74988c33a9e4ef0d15e39c8b8fc9e6c36bb28
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T16:33:24Z
renaming LeastSquaresGradient
so it is not confused with a squared regularizer or a squared gradient. added
some more comments on what the loss functions are good for
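As a rough illustration of what the renamed LeastSquaresGradient computes, here is a self-contained sketch (my own object and method names, not MLlib's actual Gradient API): the least-squares loss for one example $(x, y)$ and its gradient with respect to the weights $w$.

```scala
// Hypothetical sketch, not MLlib's actual API: least-squares loss and gradient
// for a single example (x, y) with weight vector w.
object LeastSquaresSketch {
  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (u, v) => u * v }.sum

  // loss L(w; x, y) = (1/2) * (w . x - y)^2
  def loss(w: Array[Double], x: Array[Double], y: Double): Double = {
    val err = dot(w, x) - y
    0.5 * err * err
  }

  // gradient dL/dw = (w . x - y) * x
  def gradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
    val err = dot(w, x) - y
    x.map(_ * err)
  }

  def main(args: Array[String]): Unit = {
    val w = Array(1.0, -2.0)
    val x = Array(3.0, 0.5)
    val y = 1.0
    // w . x = 2.0, so err = 1.0, loss = 0.5, gradient = (3.0, 0.5)
    println(loss(w, x, y))                       // 0.5
    println(gradient(w, x, y).mkString(","))     // 3.0,0.5
  }
}
```

The 1/2 factor is what makes the gradient come out as plain `err * x`, which is why the scaling matters for consistency with the docs.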
commit e4cbe99bbcf7f53ebb8f1a0d2e0b869a4922bca4
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T16:34:45Z
use d for the number of features
trying to be consistent: n is the number of data examples in the RDD,
and each of them has d entries (also in the documentation)
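The convention this commit adopts can be summarized in one formula (my own sketch of the setup, not copied from the new docs): with $n$ training examples $x_i \in \mathbb{R}^d$ in the RDD, labels $y_i$, and weights $w \in \mathbb{R}^d$, the regularized training objective reads

```latex
% n = number of data examples in the RDD, d = number of features
% x_i \in \mathbb{R}^d, labels y_i, weights w \in \mathbb{R}^d
f(w) \;=\; \frac{1}{n} \sum_{i=1}^{n} L(w;\, x_i, y_i) \;+\; \lambda\, R(w)
```

so $n$ always indexes examples and $d$ always counts features.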
commit 79768fd3429df5c6d56f05ac93bdd8cf4355d946
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T17:13:17Z
correct scaling for MSE loss
to be consistent with the documentation
commit 1e228062b01ac806c4bd032eb0975a8b92431fd9
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T17:15:44Z
new classification and regression documentation
with complete mathematical formulations, written generally so that future
ML methods can be added as well. includes a table of all subgradients
used, for reference.
this change also required a small addition to the MathJax
configuration to allow equation numbers.
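To give a flavor of one entry in such a subgradient table, here is a hedged sketch (my own names, not MLlib's API) of the hinge loss used for linear SVMs: $L(w; x, y) = \max(0,\, 1 - y\, w \cdot x)$ with labels $y \in \{-1, +1\}$. A valid subgradient is $-y\,x$ when the margin is violated and $0$ otherwise.

```scala
// Hypothetical sketch of one subgradient-table entry: the hinge loss.
// Labels y are assumed to be -1 or +1.
object HingeSketch {
  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (u, v) => u * v }.sum

  def loss(w: Array[Double], x: Array[Double], y: Double): Double =
    math.max(0.0, 1.0 - y * dot(w, x))

  // subgradient: -y * x if the margin y * (w . x) < 1, else the zero vector
  def subgradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] =
    if (y * dot(w, x) < 1.0) x.map(_ * -y) else x.map(_ => 0.0)

  def main(args: Array[String]): Unit = {
    val w = Array(0.5, 0.0)
    val x = Array(1.0, 2.0)
    // margin y * (w . x) = 0.5 < 1, so loss = 0.5 and subgradient = -x
    println(loss(w, x, 1.0))                       // 0.5
    println(subgradient(w, x, 1.0).mkString(","))  // -1.0,-2.0
  }
}
```

The case split is exactly why these are *sub*gradients: the hinge is not differentiable at margin 1.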
commit 89e472f4121debb175b625ab0c138e24c4e60de8
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T17:16:51Z
new optimization documentation
explaining gradient descent (GD) and stochastic gradient descent (SGD),
and the distributed versions that MLlib implements.
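A minimal local sketch of the SGD scheme that documentation describes (a toy illustration under my own assumptions, not MLlib's distributed implementation, which instead averages gradients over a sampled fraction of the RDD per iteration): pick one example, step against its gradient, repeat.

```scala
// Hypothetical local SGD sketch on 1-d least squares: examples (x, y)
// generated from y = 2 * x, so the weight should converge to w = 2.0.
object SgdSketch {
  def run(): Double = {
    val data = Array((1.0, 2.0), (2.0, 4.0), (3.0, 6.0))
    var w = 0.0
    val stepSize = 0.05
    for (iter <- 0 until 200) {
      val (x, y) = data(iter % data.length)   // cycle through examples
      val grad = (w * x - y) * x              // gradient of (1/2)(w*x - y)^2
      w -= stepSize * grad                    // SGD step on one example
    }
    w
  }

  def main(args: Array[String]): Unit = {
    println(run())  // converges to (approximately) 2.0
  }
}
```

Cycling through examples deterministically stands in for random sampling here, just to keep the sketch reproducible.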
commit a33be78a47bad1745a03a6e0ee1a4ea1a7893805
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T17:38:57Z
better comments in SGD code for regression
commit 73f5e71e3d9a253ff378907fca202b8d6aae1268
Author: Martin Jaggi <[email protected]>
Date: 2014-02-07T22:41:42Z
lambda R() in documentation
commit eec58c9c860def9b3b7604c990ec1697812bcbbf
Author: Martin Jaggi <[email protected]>
Date: 2014-02-08T17:31:05Z
telling what the updater actually does
also use the proper scaling for the L2 regularization (a factor of 1/2,
as in the documentation)
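As a hedged illustration of what such an updater step amounts to (my own sketch, not MLlib's actual Updater API): with regularizer $R(w) = \frac{1}{2}\|w\|^2$, whose gradient is simply $w$ (the point of the 1/2 factor), the combined update is $w \leftarrow w - \eta\,(g + \lambda w)$ for loss gradient $g$.

```scala
// Hypothetical sketch of an L2-regularized update step, not MLlib's Updater.
// R(w) = (1/2) * ||w||^2 has gradient w, so the step adds lambda * w
// to the loss gradient before moving.
object L2UpdaterSketch {
  def update(w: Array[Double], grad: Array[Double],
             stepSize: Double, lambda: Double): Array[Double] =
    // w_new = w - stepSize * (grad + lambda * w)
    w.zip(grad).map { case (wi, gi) => wi - stepSize * (gi + lambda * wi) }

  def main(args: Array[String]): Unit = {
    val updated = update(Array(1.0, -1.0), Array(0.5, 0.5), 0.1, 0.1)
    // per coordinate: 1 - 0.1*(0.5 + 0.1) and -1 - 0.1*(0.5 - 0.1)
    println(updated.mkString(","))
  }
}
```

Without the 1/2 in $R$, the regularization gradient would be $2w$ and the effective regularization strength would silently double, which is the inconsistency this commit fixes.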
commit 2c1cf8d35145081a61865f55f4e48fcfbafddbbe
Author: Martin Jaggi <[email protected]>
Date: 2014-02-08T17:56:01Z
remove broken url
commit ecbac73a7450fc90ef1509d9a410c9b627617130
Author: Martin Jaggi <[email protected]>
Date: 2014-02-08T17:57:12Z
better description of GradientDescent
----