GitHub user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/422#discussion_r11866538
  
    --- Diff: docs/mllib-guide.md ---
    @@ -3,63 +3,120 @@ layout: global
     title: Machine Learning Library (MLlib)
     ---
     
    +MLlib is a Spark implementation of some common machine learning algorithms and utilities,
    +including classification, regression, clustering, collaborative
    +filtering, dimensionality reduction, as well as underlying optimization primitives:
     
    -MLlib is a Spark implementation of some common machine learning (ML)
    -functionality, as well associated tests and data generators.  MLlib
    -currently supports four common types of machine learning problem settings,
    -namely classification, regression, clustering and collaborative filtering,
    -as well as an underlying gradient descent optimization primitive and several
    -linear algebra methods.
    -
    -# Available Methods
    -The following links provide a detailed explanation of the methods and usage examples for each of them:
    -
    -* <a href="mllib-classification-regression.html">Classification and Regression</a>
    -  * Binary Classification
    -    * SVM (L1 and L2 regularized)
    -    * Logistic Regression (L1 and L2 regularized)
    -  * Linear Regression
    -    * Least Squares
    -    * Lasso
    -    * Ridge Regression
    -  * Decision Tree (for classification and regression)
    -* <a href="mllib-clustering.html">Clustering</a>
    -  * k-Means
    -* <a href="mllib-collaborative-filtering.html">Collaborative Filtering</a>
    -  * Matrix Factorization using Alternating Least Squares
    -* <a href="mllib-optimization.html">Optimization</a>
    -  * Gradient Descent and Stochastic Gradient Descent
    -* <a href="mllib-linear-algebra.html">Linear Algebra</a>
    -  * Singular Value Decomposition
    -  * Principal Component Analysis
    -
    -# Data Types
    -
    -Most MLlib algorithms operate on RDDs containing vectors. In Java and Scala, the
    -[Vector](api/mllib/index.html#org.apache.spark.mllib.linalg.Vector) class is used to
    -represent vectors. You can create either dense or sparse vectors using the
    -[Vectors](api/mllib/index.html#org.apache.spark.mllib.linalg.Vectors$) factory.
    -
    -In Python, MLlib can take the following vector types:
    -
    -* [NumPy](http://www.numpy.org) arrays
    -* Standard Python lists (e.g. `[1, 2, 3]`)
    -* The MLlib [SparseVector](api/pyspark/pyspark.mllib.linalg.SparseVector-class.html) class
    -* [SciPy sparse matrices](http://docs.scipy.org/doc/scipy/reference/sparse.html)
    -
    -For efficiency, we recommend using NumPy arrays over lists, and using the
    -[CSC format](http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html#scipy.sparse.csc_matrix)
    -for SciPy matrices, or MLlib's own SparseVector class.
    -
    -Several other simple data types are used throughout the library, e.g. the LabeledPoint
    -class ([Java/Scala](api/mllib/index.html#org.apache.spark.mllib.regression.LabeledPoint),
    -[Python](api/pyspark/pyspark.mllib.regression.LabeledPoint-class.html)) for labeled data.
    -
    -# Dependencies
    -MLlib uses the [jblas](https://github.com/mikiobraun/jblas) linear algebra library, which itself
    -depends on native Fortran routines. You may need to install the
    -[gfortran runtime library](https://github.com/mikiobraun/jblas/wiki/Missing-Libraries)
    -if it is not already present on your nodes. MLlib will throw a linking error if it cannot
    -detect these libraries automatically.
    +* [Basics](mllib-basics.html)
    +  * data types 
    +  * summary statistics
    +* Classification and regression
    +  * [linear support vector machine (SVM)](mllib-linear-methods.html#linear-support-vector-machine-svm)
    +  * [logistic regression](mllib-linear-methods.html#logistic-regression)
    +  * [linear least squares, Lasso, and ridge regression](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression)
    +  * [decision tree](mllib-decision-tree.html)
    +  * [naive Bayes](mllib-naive-bayes.html)
    +* [Collaborative filtering](mllib-collaborative-filtering.html)
    +  * alternating least squares (ALS)
    +* [Clustering](mllib-clustering.html)
    +  * k-means
    +* [Dimensionality reduction](mllib-dimensionality-reduction.html)
    +  * singular value decomposition (SVD)
    +  * principal component analysis (PCA)
    +* [Optimization](mllib-optimization.html)
    +  * stochastic gradient descent
    +  * limited-memory BFGS (L-BFGS)
    +
    +MLlib is currently a beta component under active development.
    +The APIs may be changed in the future releases, and we will provide migration guide between releases.
    +
    +## Dependencies
    +
    +MLlib uses linear algebra packages [Breeze](http://www.scalanlp.org/), which depends on
    +[netlib-java](https://github.com/fommil/netlib-java), and
    +[jblas](https://github.com/mikiobraun/jblas).  `jblas` depend on native Fortran routines. You need
    +to install the
    +[gfortran runtime library](https://github.com/mikiobraun/jblas/wiki/Missing-Libraries) if it is not
    +already present on your nodes. MLlib will throw a linking error if it cannot detect these libraries
    +automatically.  Due to license issues, we do not include `netlib-java`'s native libraries in MLlib's
    +dependency set. If no native library is available at runtime, you will see a warning message.  To
    +use native libraries from `netlib-java`, please include artifact
    +`com.github.fommil.netlib:all:1.1.2` as a dependency of your project or build your own (see
    --- End diff --
    
    Mentioned that both `netlib-java` and `jblas` need gfortran.
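
For readers wondering what "include artifact `com.github.fommil.netlib:all:1.1.2` as a dependency of your project" could look like in practice, here is a minimal sketch for an sbt build. It assumes the project is built with sbt, which the diff does not state. The `pomOnly()` modifier is the form suggested in the netlib-java README for this artifact and is not part of the quoted text, so treat the line as an illustration rather than the guide's wording.

```scala
// build.sbt fragment (assumption: the project uses sbt).
// The "all" artifact is a POM-only aggregate that pulls in netlib-java's
// native-library wrappers, hence pomOnly().
libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
```

As the quoted paragraph says, if no native library is found at runtime you will only see a warning message.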

