spectral-clustering.mdtext

squinn Fri, 02 May 2014 13:22:34 -0700

Author: squinn
Date: Fri May  2 20:21:23 2014
New Revision: 1592026

URL: http://svn.apache.org/r1592026
Log:
Fixed the latex syntax to match mathjax.


Modified:
    
mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext

Modified: 
mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext
URL: 
http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext?rev=1592026&r1=1592025&r2=1592026&view=diff
==============================================================================
--- 
mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext
 (original)
+++ 
mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext
 Fri May  2 20:21:23 2014
@@ -6,13 +6,13 @@ Spectral clustering, as its name implies
 
 At its simplest, spectral clustering relies on the following four steps:
 
- 1. Computing a similarity (or _affinity_) matrix $\mathbf{A}$ from the data. 
This involves determining a pairwise distance function $f$ that takes a pair of 
data points and returns a scalar.
+ 1. Computing a similarity (or _affinity_) matrix \(\mathbf{A}\) from the 
data. This involves determining a pairwise distance function \(f\) that takes a 
pair of data points and returns a scalar.
 
- 2. Computing a graph Laplacian $\mathbf{L}$ from the affinity matrix. There 
are several types of graph Laplacians; which is used will often depends on the 
situation.
+ 2. Computing a graph Laplacian \(\mathbf{L}\) from the affinity matrix. There 
are several types of graph Laplacians; which is used will often depends on the 
situation.
 
- 3. Computing the eigenvectors and eigenvalues of $\mathbf{L}$. The degree of 
this decomposition is often modulated by $k$, or the number of clusters. Put 
another way, $k$ eigenvectors and eigenvalues are computed.
+ 3. Computing the eigenvectors and eigenvalues of \(\mathbf{L}\). The degree 
of this decomposition is often modulated by \(k\), or the number of clusters. 
Put another way, \(k\) eigenvectors and eigenvalues are computed.
 
- 4. The $k$ eigenvectors are used as "proxy" data for the original dataset, 
and fed into k-means clustering. The resulting cluster assignments are 
transparently passed back to the original data.
+ 4. The \(k\) eigenvectors are used as "proxy" data for the original dataset, 
and fed into k-means clustering. The resulting cluster assignments are 
transparently passed back to the original data.
 
 For more theoretical background on spectral clustering, such as how affinity 
matrices are computed, the different types of graph Laplacians, and whether the 
top or bottom eigenvectors and eigenvalues are computed, please read [Ulrike 
von Luxburg's article in _Statistics and Computing_ from December 
2007](http://link.springer.com/article/10.1007/s11222-007-9033-z). It provides 
an excellent description of the linear algebra operations behind spectral 
clustering, and imbues a thorough understanding of the types of situations in 
which it can be used.
 
@@ -24,11 +24,11 @@ As of Mahout 0.3, spectral clustering ha
 
 ## Input
 
-The input format for the algorithm currently takes the form of a Hadoop-backed 
affinity matrix, in text form. Each line of the text file specifies a single 
element of the affinity matrix: the row index $i$, the column index $j$, and 
the value:
+The input format for the algorithm currently takes the form of a Hadoop-backed 
affinity matrix, in text form. Each line of the text file specifies a single 
element of the affinity matrix: the row index \(i\), the column index \(j\), 
and the value:
 
 `i, j, value`
 
-The affinity matrix is symmetric, and any unspecified $i, j$ pairs are assumed 
to be 0 for sparsity. The row and column indices are 0-indexed. Thus, only the 
non-zero entries of either the upper or lower triangular need be specified.
+The affinity matrix is symmetric, and any unspecified \(i, j\) pairs are 
assumed to be 0 for sparsity. The row and column indices are 0-indexed. Thus, 
only the non-zero entries of either the upper or lower triangular need be 
specified.
 
 **([MAHOUT-1539](https://issues.apache.org/jira/browse/MAHOUT-1539) will allow 
for the creation of the affinity matrix to occur as part of the core spectral 
clustering algorithm, as opposed to the current requirement that the user 
create this matrix themselves and provide it, rather than the original data, to 
the algorithm)**

svn commit: r1592026 - /mahout/site/mahout_cms/trunk/content/users/clustering/spectral-clustering.mdtext

Reply via email to