[39/61] [partial] spark-website git commit: Add docs for Spark 2.4.0 and update the latest link

lixiao Thu, 08 Nov 2018 07:15:03 -0800

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/showDF.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/showDF.html 
b/site/docs/2.4.0/api/R/showDF.html
new file mode 100644
index 0000000..ddb311e
--- /dev/null
+++ b/site/docs/2.4.0/api/R/showDF.html
@@ -0,0 +1,128 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: showDF</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for showDF {SparkR}"><tr><td>showDF 
{SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
+
+<h2>showDF</h2>
+
+<h3>Description</h3>
+
+<p>Print the first <code>numRows</code> rows of a SparkDataFrame.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+showDF(x, ...)
+
+## S4 method for signature 'SparkDataFrame'
+showDF(x, numRows = 20, truncate = TRUE,
+  vertical = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>a SparkDataFrame.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>further arguments to be passed to or from other methods.</p>
+</td></tr>
+<tr valign="top"><td><code>numRows</code></td>
+<td>
+<p>the number of rows to print. Defaults to 20.</p>
+</td></tr>
+<tr valign="top"><td><code>truncate</code></td>
+<td>
+<p>whether to truncate long strings. If <code>TRUE</code>, strings longer than
+20 characters are truncated. If set to a number greater than zero, strings
+longer than <code>truncate</code> characters are truncated and all cells
+are right-aligned.</p>
+</td></tr>
+<tr valign="top"><td><code>vertical</code></td>
+<td>
+<p>whether to print output rows vertically (one line per column value).</p>
+</td></tr>
+</table>
+
+
+<h3>Note</h3>
+
+<p>showDF since 1.4.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p>Other SparkDataFrame functions: <code><a 
href="SparkDataFrame.html">SparkDataFrame-class</a></code>,
+<code><a href="summarize.html">agg</a></code>, <code><a 
href="alias.html">alias</a></code>,
+<code><a href="arrange.html">arrange</a></code>, <code><a 
href="as.data.frame.html">as.data.frame</a></code>,
+<code><a href="attach.html">attach,SparkDataFrame-method</a></code>,
+<code><a href="broadcast.html">broadcast</a></code>, <code><a 
href="cache.html">cache</a></code>,
+<code><a href="checkpoint.html">checkpoint</a></code>, <code><a 
href="coalesce.html">coalesce</a></code>,
+<code><a href="collect.html">collect</a></code>, <code><a 
href="columns.html">colnames</a></code>,
+<code><a href="coltypes.html">coltypes</a></code>,
+<code><a 
href="createOrReplaceTempView.html">createOrReplaceTempView</a></code>,
+<code><a href="crossJoin.html">crossJoin</a></code>, <code><a 
href="cube.html">cube</a></code>,
+<code><a href="dapplyCollect.html">dapplyCollect</a></code>, <code><a 
href="dapply.html">dapply</a></code>,
+<code><a href="describe.html">describe</a></code>, <code><a 
href="dim.html">dim</a></code>,
+<code><a href="distinct.html">distinct</a></code>, <code><a 
href="dropDuplicates.html">dropDuplicates</a></code>,
+<code><a href="nafunctions.html">dropna</a></code>, <code><a 
href="drop.html">drop</a></code>,
+<code><a href="dtypes.html">dtypes</a></code>, <code><a 
href="exceptAll.html">exceptAll</a></code>,
+<code><a href="except.html">except</a></code>, <code><a 
href="explain.html">explain</a></code>,
+<code><a href="filter.html">filter</a></code>, <code><a 
href="first.html">first</a></code>,
+<code><a href="gapplyCollect.html">gapplyCollect</a></code>, <code><a 
href="gapply.html">gapply</a></code>,
+<code><a href="getNumPartitions.html">getNumPartitions</a></code>, <code><a 
href="groupBy.html">group_by</a></code>,
+<code><a href="head.html">head</a></code>, <code><a 
href="hint.html">hint</a></code>,
+<code><a href="histogram.html">histogram</a></code>, <code><a 
href="insertInto.html">insertInto</a></code>,
+<code><a href="intersectAll.html">intersectAll</a></code>, <code><a 
href="intersect.html">intersect</a></code>,
+<code><a href="isLocal.html">isLocal</a></code>, <code><a 
href="isStreaming.html">isStreaming</a></code>,
+<code><a href="join.html">join</a></code>, <code><a 
href="limit.html">limit</a></code>,
+<code><a href="localCheckpoint.html">localCheckpoint</a></code>, <code><a 
href="merge.html">merge</a></code>,
+<code><a href="mutate.html">mutate</a></code>, <code><a 
href="ncol.html">ncol</a></code>,
+<code><a href="nrow.html">nrow</a></code>, <code><a 
href="persist.html">persist</a></code>,
+<code><a href="printSchema.html">printSchema</a></code>, <code><a 
href="randomSplit.html">randomSplit</a></code>,
+<code><a href="rbind.html">rbind</a></code>, <code><a 
href="rename.html">rename</a></code>,
+<code><a href="repartitionByRange.html">repartitionByRange</a></code>,
+<code><a href="repartition.html">repartition</a></code>, <code><a 
href="rollup.html">rollup</a></code>,
+<code><a href="sample.html">sample</a></code>, <code><a 
href="saveAsTable.html">saveAsTable</a></code>,
+<code><a href="schema.html">schema</a></code>, <code><a 
href="selectExpr.html">selectExpr</a></code>,
+<code><a href="select.html">select</a></code>, <code><a 
href="show.html">show</a></code>,
+<code><a href="storageLevel.html">storageLevel</a></code>, <code><a 
href="str.html">str</a></code>,
+<code><a href="subset.html">subset</a></code>, <code><a 
href="summary.html">summary</a></code>,
+<code><a href="take.html">take</a></code>, <code><a 
href="toJSON.html">toJSON</a></code>,
+<code><a href="unionByName.html">unionByName</a></code>, <code><a 
href="union.html">union</a></code>,
+<code><a href="unpersist.html">unpersist</a></code>, <code><a 
href="withColumn.html">withColumn</a></code>,
+<code><a href="withWatermark.html">withWatermark</a></code>, <code><a 
href="with.html">with</a></code>,
+<code><a href="write.df.html">write.df</a></code>, <code><a 
href="write.jdbc.html">write.jdbc</a></code>,
+<code><a href="write.json.html">write.json</a></code>, <code><a 
href="write.orc.html">write.orc</a></code>,
+<code><a href="write.parquet.html">write.parquet</a></code>, <code><a 
href="write.stream.html">write.stream</a></code>,
+<code><a href="write.text.html">write.text</a></code>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D path &lt;- &quot;path/to/file.json&quot;
+##D df &lt;- read.json(path)
+##D showDF(df)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
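
The truncate rule described in the showDF arguments above can be modeled outside Spark. The following is a minimal Python sketch, not SparkR code: the helper name truncate_cell is hypothetical, and the trailing "..." marker (with a plain cut for widths under 4) is an assumption about how truncated cells are rendered; the page itself only states the length limits.

```python
def truncate_cell(value, truncate=True):
    """Model of the showDF `truncate` argument for a single cell (illustrative only)."""
    text = str(value)
    # TRUE maps to the documented default width of 20; FALSE or 0 disables truncation.
    if truncate is True:
        width = 20
    elif truncate is False:
        width = 0
    else:
        width = int(truncate)
    if width <= 0 or len(text) <= width:
        return text
    # Assumption: very small widths cut the string; otherwise "..." marks the cut.
    if width < 4:
        return text[:width]
    return text[: width - 3] + "..."


print(truncate_cell("a" * 25))               # 17 characters followed by "..."
print(truncate_cell("a" * 25, truncate=8))   # 5 characters followed by "..."
print(truncate_cell("short"))                # unchanged
```

In SparkR itself the corresponding calls would be showDF(df), showDF(df, truncate = 8) and showDF(df, truncate = FALSE).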


http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.addFile.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.addFile.html 
b/site/docs/2.4.0/api/R/spark.addFile.html
new file mode 100644
index 0000000..c880842
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.addFile.html
@@ -0,0 +1,69 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Add a file or directory to be downloaded with this Spark job...</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.addFile 
{SparkR}"><tr><td>spark.addFile {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>Add a file or directory to be downloaded with this Spark job on every 
node.</h2>
+
+<h3>Description</h3>
+
+<p>The path passed can be either a local file, a file in HDFS (or other Hadoop-supported
+filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs,
+use <code>spark.getSparkFiles(fileName)</code> to find its download location.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.addFile(path, recursive = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The path of the file to be added</p>
+</td></tr>
+<tr valign="top"><td><code>recursive</code></td>
+<td>
+<p>Whether to add files recursively from the path. Default is FALSE.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>A directory can be given if the recursive option is set to TRUE.
+Currently, directories are only supported for Hadoop-supported filesystems.
+For the list of Hadoop-supported filesystems, see <a href="https://wiki.apache.org/hadoop/HCFS">https://wiki.apache.org/hadoop/HCFS</a>.
+</p>
+<p>Note: A path can be added only once. Subsequent additions of the same path are ignored.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.addFile since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D spark.addFile(&quot;~/myfile&quot;)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.als.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.als.html 
b/site/docs/2.4.0/api/R/spark.als.html
new file mode 100644
index 0000000..f887262
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.als.html
@@ -0,0 +1,204 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Alternating Least Squares (ALS) for Collaborative Filtering</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.als {SparkR}"><tr><td>spark.als 
{SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
+
+<h2>Alternating Least Squares (ALS) for Collaborative Filtering</h2>
+
+<h3>Description</h3>
+
+<p><code>spark.als</code> learns latent factors in collaborative filtering via 
alternating least
+squares. Users can call <code>summary</code> to obtain fitted latent factors, 
<code>predict</code>
+to make predictions on new data, and 
<code>write.ml</code>/<code>read.ml</code> to save/load fitted models.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.als(data, ...)
+
+## S4 method for signature 'SparkDataFrame'
+spark.als(data, ratingCol = "rating",
+  userCol = "user", itemCol = "item", rank = 10, regParam = 0.1,
+  maxIter = 10, nonnegative = FALSE, implicitPrefs = FALSE,
+  alpha = 1, numUserBlocks = 10, numItemBlocks = 10,
+  checkpointInterval = 10, seed = 0)
+
+## S4 method for signature 'ALSModel'
+summary(object)
+
+## S4 method for signature 'ALSModel'
+predict(object, newData)
+
+## S4 method for signature 'ALSModel,character'
+write.ml(object, path, overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>ratingCol</code></td>
+<td>
+<p>column name for ratings.</p>
+</td></tr>
+<tr valign="top"><td><code>userCol</code></td>
+<td>
+<p>column name for user ids. Ids must be (or can be coerced into) integers.</p>
+</td></tr>
+<tr valign="top"><td><code>itemCol</code></td>
+<td>
+<p>column name for item ids. Ids must be (or can be coerced into) integers.</p>
+</td></tr>
+<tr valign="top"><td><code>rank</code></td>
+<td>
+<p>rank of the matrix factorization (&gt; 0).</p>
+</td></tr>
+<tr valign="top"><td><code>regParam</code></td>
+<td>
+<p>regularization parameter (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum number of iterations (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>nonnegative</code></td>
+<td>
+<p>logical value indicating whether to apply nonnegativity constraints.</p>
+</td></tr>
+<tr valign="top"><td><code>implicitPrefs</code></td>
+<td>
+<p>logical value indicating whether to use implicit preference.</p>
+</td></tr>
+<tr valign="top"><td><code>alpha</code></td>
+<td>
+<p>alpha parameter in the implicit preference formulation (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>numUserBlocks</code></td>
+<td>
+<p>number of user blocks used to parallelize computation (&gt; 0).</p>
+</td></tr>
+<tr valign="top"><td><code>numItemBlocks</code></td>
+<td>
+<p>number of item blocks used to parallelize computation (&gt; 0).</p>
+</td></tr>
+<tr valign="top"><td><code>checkpointInterval</code></td>
+<td>
+<p>checkpoint interval (&gt;= 1), or -1 to disable checkpointing.
+Note: this setting will be ignored if the checkpoint directory is not
+set.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>integer seed for random number generation.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted ALS model.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>logical value indicating whether to overwrite if the output path
+already exists. Default is FALSE, which means an exception is thrown
+if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>For more details, see
+<a href="http://spark.apache.org/docs/latest/ml-collaborative-filtering.html">MLlib:
+Collaborative Filtering</a>.
+</p>
+
+
+<h3>Value</h3>
+
+<p><code>spark.als</code> returns a fitted ALS model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which is a list.
+The list includes <code>user</code> (the name of the user column),
+<code>item</code> (the name of the item column), <code>rating</code> (the name of the rating column),
+<code>userFactors</code> (the estimated user factors), <code>itemFactors</code> (the estimated item factors),
+and <code>rank</code> (rank of the matrix factorization model).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted values.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.als since 2.1.0
+</p>
+<p>summary(ALSModel) since 2.1.0
+</p>
+<p>predict(ALSModel) since 2.1.0
+</p>
+<p>write.ml(ALSModel, character) since 2.1.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="read.ml.html">read.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D ratings &lt;- list(list(0, 0, 4.0), list(0, 1, 2.0), list(1, 1, 3.0), list(1, 2, 4.0),
+##D                 list(2, 1, 1.0), list(2, 2, 5.0))
+##D df &lt;- createDataFrame(ratings, c(&quot;user&quot;, &quot;item&quot;, &quot;rating&quot;))
+##D model &lt;- spark.als(df, &quot;rating&quot;, &quot;user&quot;, &quot;item&quot;)
+##D 
+##D # extract latent factors
+##D stats &lt;- summary(model)
+##D userFactors &lt;- stats$userFactors
+##D itemFactors &lt;- stats$itemFactors
+##D 
+##D # make predictions
+##D predicted &lt;- predict(model, df)
+##D showDF(predicted)
+##D 
+##D # save and load the model
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+##D 
+##D # set other arguments
+##D modelS &lt;- spark.als(df, &quot;rating&quot;, &quot;user&quot;, &quot;item&quot;, rank = 20,
+##D                     regParam = 0.1, nonnegative = TRUE)
+##D statsS &lt;- summary(modelS)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
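
As a rough illustration of what the userFactors and itemFactors returned by summary represent: in the standard ALS formulation, a predicted rating is the dot product of one user's and one item's latent factor vector, each of length rank. The Python sketch below uses made-up factor values and a hypothetical helper name (predict_rating); it is not SparkR code and does not reproduce the fitting step.

```python
def predict_rating(user_factors, item_factors):
    """Dot product of one user's and one item's latent factor vectors."""
    if len(user_factors) != len(item_factors):
        raise ValueError("factor vectors must share the same rank")
    return sum(u * v for u, v in zip(user_factors, item_factors))


# Made-up factors with rank = 2, for illustration only.
user = [1.0, 2.0]
item = [0.5, 1.5]
print(predict_rating(user, item))  # 1.0*0.5 + 2.0*1.5 = 3.5
```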

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.bisectingKmeans.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.bisectingKmeans.html 
b/site/docs/2.4.0/api/R/spark.bisectingKmeans.html
new file mode 100644
index 0000000..febb652
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.bisectingKmeans.html
@@ -0,0 +1,179 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Bisecting K-Means Clustering Model</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.bisectingKmeans 
{SparkR}"><tr><td>spark.bisectingKmeans {SparkR}</td><td style="text-align: 
right;">R Documentation</td></tr></table>
+
+<h2>Bisecting K-Means Clustering Model</h2>
+
+<h3>Description</h3>
+
+<p>Fits a bisecting k-means clustering model against a SparkDataFrame.
+Users can call <code>summary</code> to print a summary of the fitted model, 
<code>predict</code> to make
+predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to 
save/load fitted models.
+</p>
+<p><code>fitted</code> gets the fitted result from a bisecting k-means model.
+Note: A model loaded from a saved file does not support this method.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.bisectingKmeans(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.bisectingKmeans(data, formula,
+  k = 4, maxIter = 20, seed = NULL, minDivisibleClusterSize = 1)
+
+## S4 method for signature 'BisectingKMeansModel'
+summary(object)
+
+## S4 method for signature 'BisectingKMeansModel'
+predict(object, newData)
+
+## S4 method for signature 'BisectingKMeansModel'
+fitted(object, method = c("centers",
+  "classes"))
+
+## S4 method for signature 'BisectingKMeansModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', '.', ':', '+', and '-'.
+Note that the response variable of formula is empty in 
spark.bisectingKmeans.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>k</code></td>
+<td>
+<p>the desired number of leaf clusters. Must be &gt; 1.
+The actual number could be smaller if there are no divisible leaf clusters.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum iteration number.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>the random seed.</p>
+</td></tr>
+<tr valign="top"><td><code>minDivisibleClusterSize</code></td>
+<td>
+<p>The minimum number of points (if greater than or equal to 1.0)
+or the minimum proportion of points (if less than 1.0) of a
+divisible cluster. Note that it is an expert parameter. The
+default value should be good enough for most cases.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted bisecting k-means model.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>method</code></td>
+<td>
+<p>type of fitted results, <code>"centers"</code> for cluster centers
+or <code>"classes"</code> for assigned classes.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.bisectingKmeans</code> returns a fitted bisecting k-means model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list includes the model's <code>k</code> (number of cluster centers),
+<code>coefficients</code> (model cluster centers),
+<code>size</code> (number of data points in each cluster), <code>cluster</code>
+(cluster centers of the transformed data; cluster is NULL if is.loaded is 
TRUE),
+and <code>is.loaded</code> (whether the model is loaded from a saved file).
+</p>
+<p><code>predict</code> returns the predicted values based on a bisecting 
k-means model.
+</p>
+<p><code>fitted</code> returns a SparkDataFrame containing fitted values.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.bisectingKmeans since 2.2.0
+</p>
+<p>summary(BisectingKMeansModel) since 2.2.0
+</p>
+<p>predict(BisectingKMeansModel) since 2.2.0
+</p>
+<p>fitted since 2.2.0
+</p>
+<p>write.ml(BisectingKMeansModel, character) since 2.2.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="predict.html">predict</a>, <a href="read.ml.html">read.ml</a>, <a 
href="write.ml.html">write.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D t &lt;- as.data.frame(Titanic)
+##D df &lt;- createDataFrame(t)
+##D model &lt;- spark.bisectingKmeans(df, Class ~ Survived, k = 4)
+##D summary(model)
+##D 
+##D # get fitted result from a bisecting k-means model
+##D fitted.model &lt;- fitted(model, &quot;centers&quot;)
+##D showDF(fitted.model)
+##D 
+##D # fitted values on training data
+##D fitted &lt;- predict(model, df)
+##D head(select(fitted, &quot;Class&quot;, &quot;prediction&quot;))
+##D 
+##D # save fitted model to input path
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and print
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
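
The two readings of minDivisibleClusterSize described above (an absolute point count when the value is at least 1.0, a proportion of all points otherwise) can be sketched as follows. This is a hedged Python model rather than SparkR code: the function name min_cluster_points is hypothetical, and rounding the threshold up is an assumption not stated on this page.

```python
import math


def min_cluster_points(min_divisible_cluster_size, total_points):
    """Translate the parameter into a minimum point count for a divisible cluster."""
    if min_divisible_cluster_size <= 0:
        raise ValueError("minDivisibleClusterSize must be positive")
    if min_divisible_cluster_size >= 1.0:
        # Interpreted as an absolute number of points.
        return math.ceil(min_divisible_cluster_size)
    # Interpreted as a proportion of all points (rounding up is an assumption).
    return math.ceil(min_divisible_cluster_size * total_points)


print(min_cluster_points(1, 1000))     # 1
print(min_cluster_points(50, 1000))    # 50
print(min_cluster_points(0.25, 1000))  # 250
```

With the default of 1, any cluster holding at least one point counts as large enough, which is why the default is usually adequate.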

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.decisionTree.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.decisionTree.html 
b/site/docs/2.4.0/api/R/spark.decisionTree.html
new file mode 100644
index 0000000..a3821a6
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.decisionTree.html
@@ -0,0 +1,234 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Decision Tree Model for Regression and Classification</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.decisionTree 
{SparkR}"><tr><td>spark.decisionTree {SparkR}</td><td style="text-align: 
right;">R Documentation</td></tr></table>
+
+<h2>Decision Tree Model for Regression and Classification</h2>
+
+<h3>Description</h3>
+
+<p><code>spark.decisionTree</code> fits a Decision Tree Regression model or 
Classification model on
+a SparkDataFrame. Users can call <code>summary</code> to get a summary of the 
fitted Decision Tree
+model, <code>predict</code> to make predictions on new data, and 
<code>write.ml</code>/<code>read.ml</code> to
+save/load fitted models.
+For more details, see
+<a href="http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree-regression">
+Decision Tree Regression</a> and
+<a href="http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree-classifier">
+Decision Tree Classification</a>
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.decisionTree(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.decisionTree(data, formula,
+  type = c("regression", "classification"), maxDepth = 5,
+  maxBins = 32, impurity = NULL, seed = NULL,
+  minInstancesPerNode = 1, minInfoGain = 0, checkpointInterval = 10,
+  maxMemoryInMB = 256, cacheNodeIds = FALSE,
+  handleInvalid = c("error", "keep", "skip"))
+
+## S4 method for signature 'DecisionTreeRegressionModel'
+summary(object)
+
+## S3 method for class 'summary.DecisionTreeRegressionModel'
+print(x, ...)
+
+## S4 method for signature 'DecisionTreeClassificationModel'
+summary(object)
+
+## S3 method for class 'summary.DecisionTreeClassificationModel'
+print(x, ...)
+
+## S4 method for signature 'DecisionTreeRegressionModel'
+predict(object, newData)
+
+## S4 method for signature 'DecisionTreeClassificationModel'
+predict(object, newData)
+
+## S4 method for signature 'DecisionTreeRegressionModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+
+## S4 method for signature 'DecisionTreeClassificationModel,character'
+write.ml(object,
+  path, overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>type</code></td>
+<td>
+<p>type of model, one of &quot;regression&quot; or &quot;classification&quot;, 
to fit</p>
+</td></tr>
+<tr valign="top"><td><code>maxDepth</code></td>
+<td>
+<p>Maximum depth of the tree (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>maxBins</code></td>
+<td>
+<p>Maximum number of bins used for discretizing continuous features and for 
choosing
+how to split on features at each node. More bins give higher granularity. Must 
be
+&gt;= 2 and &gt;= number of categories in any categorical feature.</p>
+</td></tr>
+<tr valign="top"><td><code>impurity</code></td>
+<td>
+<p>Criterion used for information gain calculation.
+For regression, must be &quot;variance&quot;. For classification, must be one 
of
+&quot;entropy&quot; and &quot;gini&quot;, default is &quot;gini&quot;.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>integer seed for random number generation.</p>
+</td></tr>
+<tr valign="top"><td><code>minInstancesPerNode</code></td>
+<td>
+<p>Minimum number of instances each child must have after split.</p>
+</td></tr>
+<tr valign="top"><td><code>minInfoGain</code></td>
+<td>
+<p>Minimum information gain for a split to be considered at a tree node.</p>
+</td></tr>
+<tr valign="top"><td><code>checkpointInterval</code></td>
+<td>
+<p>Checkpoint interval (&gt;= 1), or -1 to disable checkpointing.
+Note: this setting will be ignored if the checkpoint directory is not
+set.</p>
+</td></tr>
+<tr valign="top"><td><code>maxMemoryInMB</code></td>
+<td>
+<p>Maximum memory in MB allocated to histogram aggregation.</p>
+</td></tr>
+<tr valign="top"><td><code>cacheNodeIds</code></td>
+<td>
+<p>If FALSE, the algorithm will pass trees to executors to match instances with
+nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching
+can speed up training of deeper trees. Users can set how often the cache
+should be checkpointed, or disable checkpointing, by setting <code>checkpointInterval</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>handleInvalid</code></td>
+<td>
+<p>How to handle invalid data (unseen labels or NULL values) in features and
+label column of string type in classification model.
+Supported options: &quot;skip&quot; (filter out rows with invalid data),
+&quot;error&quot; (throw an error), &quot;keep&quot; (put invalid data in
+a special additional bucket, at index numLabels). Default
+is &quot;error&quot;.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>A fitted Decision Tree regression model or classification model.</p>
+</td></tr>
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>summary object of Decision Tree regression model or classification model
+returned by <code>summary</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>Whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.decisionTree</code> returns a fitted Decision Tree model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list of components includes <code>formula</code> (formula),
+<code>numFeatures</code> (number of features), <code>features</code> (list of 
features),
+<code>featureImportances</code> (feature importances), and 
<code>maxDepth</code> (max depth of
+trees).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels in a column named
+&quot;prediction&quot;.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.decisionTree since 2.3.0
+</p>
+<p>summary(DecisionTreeRegressionModel) since 2.3.0
+</p>
+<p>print.summary.DecisionTreeRegressionModel since 2.3.0
+</p>
+<p>summary(DecisionTreeClassificationModel) since 2.3.0
+</p>
+<p>print.summary.DecisionTreeClassificationModel since 2.3.0
+</p>
+<p>predict(DecisionTreeRegressionModel) since 2.3.0
+</p>
+<p>predict(DecisionTreeClassificationModel) since 2.3.0
+</p>
+<p>write.ml(DecisionTreeRegressionModel, character) since 2.3.0
+</p>
+<p>write.ml(DecisionTreeClassificationModel, character) since 2.3.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D # fit a Decision Tree Regression Model
+##D df &lt;- createDataFrame(longley)
+##D model &lt;- spark.decisionTree(df, Employed ~ ., type = &quot;regression&quot;, maxDepth = 5, maxBins = 16)
+##D 
+##D # get the summary of the model
+##D summary(model)
+##D 
+##D # make predictions
+##D predictions &lt;- predict(model, df)
+##D 
+##D # save and load the model
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+##D 
+##D # fit a Decision Tree Classification Model
+##D t &lt;- as.data.frame(Titanic)
+##D df &lt;- createDataFrame(t)
+##D model &lt;- spark.decisionTree(df, Survived ~ Freq + Age, 
&quot;classification&quot;)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.fpGrowth.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.fpGrowth.html 
b/site/docs/2.4.0/api/R/spark.fpGrowth.html
new file mode 100644
index 0000000..682dbeb
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.fpGrowth.html
@@ -0,0 +1,182 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: FP-growth</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.fpGrowth 
{SparkR}"><tr><td>spark.fpGrowth {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>FP-growth</h2>
+
+<h3>Description</h3>
+
+<p>A parallel FP-growth algorithm to mine frequent itemsets.
+<code>spark.fpGrowth</code> fits an FP-growth model on a SparkDataFrame. Users 
can call
+<code>spark.freqItemsets</code> to get frequent itemsets, 
<code>spark.associationRules</code> to get
+association rules, <code>predict</code> to make predictions on new data based 
on generated association
+rules, and <code>write.ml</code>/<code>read.ml</code> to save/load fitted 
models.
+For more details, see
+<a 
href="https://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html#fp-growth">
+FP-growth</a>.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.fpGrowth(data, ...)
+
+spark.freqItemsets(object)
+
+spark.associationRules(object)
+
+## S4 method for signature 'SparkDataFrame'
+spark.fpGrowth(data, minSupport = 0.3,
+  minConfidence = 0.8, itemsCol = "items", numPartitions = NULL)
+
+## S4 method for signature 'FPGrowthModel'
+spark.freqItemsets(object)
+
+## S4 method for signature 'FPGrowthModel'
+spark.associationRules(object)
+
+## S4 method for signature 'FPGrowthModel'
+predict(object, newData)
+
+## S4 method for signature 'FPGrowthModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>A SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted FPGrowth model.</p>
+</td></tr>
+<tr valign="top"><td><code>minSupport</code></td>
+<td>
+<p>Minimal support level.</p>
+</td></tr>
+<tr valign="top"><td><code>minConfidence</code></td>
+<td>
+<p>Minimal confidence level.</p>
+</td></tr>
+<tr valign="top"><td><code>itemsCol</code></td>
+<td>
+<p>Features column name.</p>
+</td></tr>
+<tr valign="top"><td><code>numPartitions</code></td>
+<td>
+<p>Number of partitions used for fitting.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>logical value indicating whether to overwrite if the output path
+already exists. Default is FALSE, which means an exception is thrown
+if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.fpGrowth</code> returns a fitted FPGrowth model.
+</p>
+<p><code>spark.freqItemsets</code> returns a <code>SparkDataFrame</code> with frequent itemsets.
+The <code>SparkDataFrame</code> contains two columns:
+<code>items</code> (an array of the same type as the input column)
+and <code>freq</code> (frequency of the itemset).
+</p>
+<p><code>spark.associationRules</code> returns a <code>SparkDataFrame</code> with association rules.
+The <code>SparkDataFrame</code> contains four columns:
+<code>antecedent</code> (an array of the same type as the input column),
+<code>consequent</code> (an array of the same type as the input column),
+<code>confidence</code> (confidence for the rule)
+and <code>lift</code> (lift for the rule).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted values.
+</p>
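+<p>As a rough illustration of the confidence and lift values reported for
+association rules, the following base R sketch computes both for a single rule
+on a small in-memory example. The transactions and the helper
+<code>support</code> are hypothetical; the sketch only applies the usual
+definitions (confidence = support of antecedent and consequent together divided
+by support of the antecedent, lift = confidence divided by support of the
+consequent) and is not how <code>spark.associationRules</code> is implemented.
+</p>
+<pre><code class="r"># Hypothetical toy transactions (not from the SparkR examples)
+transactions &lt;- list(c(&quot;a&quot;, &quot;b&quot;), c(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;),
+                     c(&quot;a&quot;, &quot;c&quot;), c(&quot;b&quot;, &quot;c&quot;))
+
+# Fraction of transactions that contain every item in `items`
+support &lt;- function(items) {
+  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
+}
+
+# Rule {a} =&gt; {b}
+conf &lt;- support(c(&quot;a&quot;, &quot;b&quot;)) / support(&quot;a&quot;)
+lift &lt;- conf / support(&quot;b&quot;)
+</code></pre>
+<p>A lift below 1, as in this toy data, suggests that the antecedent makes the
+consequent less likely than its overall frequency.
+</p>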
+
+
+<h3>Note</h3>
+
+<p>spark.fpGrowth since 2.2.0
+</p>
+<p>spark.freqItemsets(FPGrowthModel) since 2.2.0
+</p>
+<p>spark.associationRules(FPGrowthModel) since 2.2.0
+</p>
+<p>predict(FPGrowthModel) since 2.2.0
+</p>
+<p>write.ml(FPGrowthModel, character) since 2.2.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="read.ml.html">read.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D raw_data &lt;- read.df(
+##D   &quot;data/mllib/sample_fpgrowth.txt&quot;,
+##D   source = &quot;csv&quot;,
+##D   schema = structType(structField(&quot;raw_items&quot;, 
&quot;string&quot;)))
+##D 
+##D data &lt;- selectExpr(raw_data, &quot;split(raw_items, &#39; &#39;) as 
items&quot;)
+##D model &lt;- spark.fpGrowth(data)
+##D 
+##D # Show frequent itemsets
+##D frequent_itemsets &lt;- spark.freqItemsets(model)
+##D showDF(frequent_itemsets)
+##D 
+##D # Show association rules
+##D association_rules &lt;- spark.associationRules(model)
+##D showDF(association_rules)
+##D 
+##D # Predict on new data
+##D new_itemsets &lt;- data.frame(items = c(&quot;t&quot;, &quot;t,s&quot;))
+##D new_data &lt;- selectExpr(createDataFrame(new_itemsets), 
&quot;split(items, &#39;,&#39;) as items&quot;)
+##D predict(model, new_data)
+##D 
+##D # Save and load model
+##D path &lt;- &quot;/path/to/model&quot;
+##D write.ml(model, path)
+##D read.ml(path)
+##D 
+##D # Optional arguments
+##D baskets_data &lt;- selectExpr(createDataFrame(new_itemsets), 
&quot;split(items, &#39;,&#39;) as baskets&quot;)
+##D another_model &lt;- spark.fpGrowth(baskets_data, minSupport = 0.1, minConfidence = 
0.5,
+##D                                 itemsCol = &quot;baskets&quot;, 
numPartitions = 10)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.gaussianMixture.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.gaussianMixture.html 
b/site/docs/2.4.0/api/R/spark.gaussianMixture.html
new file mode 100644
index 0000000..16760bf
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.gaussianMixture.html
@@ -0,0 +1,156 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Multivariate Gaussian 
Mixture Model (GMM)</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.gaussianMixture 
{SparkR}"><tr><td>spark.gaussianMixture {SparkR}</td><td style="text-align: 
right;">R Documentation</td></tr></table>
+
+<h2>Multivariate Gaussian Mixture Model (GMM)</h2>
+
+<h3>Description</h3>
+
+<p>Fits a multivariate Gaussian mixture model against a SparkDataFrame, 
similarly to R's
+mvnormalmixEM(). Users can call <code>summary</code> to print a summary of the 
fitted model,
+<code>predict</code> to make predictions on new data, and 
<code>write.ml</code>/<code>read.ml</code>
+to save/load fitted models.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.gaussianMixture(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.gaussianMixture(data, formula,
+  k = 2, maxIter = 100, tol = 0.01)
+
+## S4 method for signature 'GaussianMixtureModel'
+summary(object)
+
+## S4 method for signature 'GaussianMixtureModel'
+predict(object, newData)
+
+## S4 method for signature 'GaussianMixtureModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', '.', ':', '+', and '-'.
+Note that the response variable of formula is empty in 
spark.gaussianMixture.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>k</code></td>
+<td>
+<p>number of independent Gaussians in the mixture model.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum iteration number.</p>
+</td></tr>
+<tr valign="top"><td><code>tol</code></td>
+<td>
+<p>the convergence tolerance.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted gaussian mixture model.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.gaussianMixture</code> returns a fitted multivariate gaussian 
mixture model.
+</p>
+<p><code>summary</code> returns a summary of the fitted model, which is a list.
+The list includes the model's <code>lambda</code> (mixing weights),
+<code>mu</code> (component means), <code>sigma</code> (component covariance
+matrices), <code>loglik</code> (log-likelihood), and
+<code>posterior</code> (posterior probabilities).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels 
in a column named
+&quot;prediction&quot;.
+</p>
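+<p>To give a feel for how mixing weights and component parameters combine into
+posterior probabilities, here is a minimal base R sketch. It assumes a
+univariate two-component mixture for simplicity, whereas
+<code>spark.gaussianMixture</code> fits multivariate models; all values and the
+names <code>weights</code>, <code>means</code> and <code>sds</code> are
+hypothetical.
+</p>
+<pre><code class="r"># Hypothetical univariate mixture parameters
+weights &lt;- c(0.4, 0.6)  # mixing weights, sum to 1
+means &lt;- c(0, 3)        # component means
+sds &lt;- c(1, 1)          # component standard deviations
+
+# Posterior probability of each component for one observation
+x &lt;- 1.5
+weighted &lt;- weights * dnorm(x, mean = means, sd = sds)
+posterior &lt;- weighted / sum(weighted)
+</code></pre>
+<p>Because <code>x</code> is equally far from both means and the standard
+deviations match, the posterior here reduces to the mixing weights.
+</p>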
+
+
+<h3>Note</h3>
+
+<p>spark.gaussianMixture since 2.1.0
+</p>
+<p>summary(GaussianMixtureModel) since 2.1.0
+</p>
+<p>predict(GaussianMixtureModel) since 2.1.0
+</p>
+<p>write.ml(GaussianMixtureModel, character) since 2.1.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p>mixtools: <a 
href="https://cran.r-project.org/package=mixtools">https://cran.r-project.org/package=mixtools</a>
+</p>
+<p><a href="predict.html">predict</a>, <a href="read.ml.html">read.ml</a>, <a 
href="write.ml.html">write.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D library(mvtnorm)
+##D set.seed(100)
+##D a &lt;- rmvnorm(4, c(0, 0))
+##D b &lt;- rmvnorm(6, c(3, 4))
+##D data &lt;- rbind(a, b)
+##D df &lt;- createDataFrame(as.data.frame(data))
+##D model &lt;- spark.gaussianMixture(df, ~ V1 + V2, k = 2)
+##D summary(model)
+##D 
+##D # fitted values on training data
+##D fitted &lt;- predict(model, df)
+##D head(select(fitted, &quot;V1&quot;, &quot;prediction&quot;))
+##D 
+##D # save fitted model to input path
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and print
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.gbt.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.gbt.html 
b/site/docs/2.4.0/api/R/spark.gbt.html
new file mode 100644
index 0000000..3eeff52
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.gbt.html
@@ -0,0 +1,257 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Gradient Boosted Tree 
Model for Regression and Classification</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.gbt {SparkR}"><tr><td>spark.gbt 
{SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
+
+<h2>Gradient Boosted Tree Model for Regression and Classification</h2>
+
+<h3>Description</h3>
+
+<p><code>spark.gbt</code> fits a Gradient Boosted Tree Regression model or 
Classification model on a
+SparkDataFrame. Users can call <code>summary</code> to get a summary of the 
fitted
+Gradient Boosted Tree model, <code>predict</code> to make predictions on new 
data, and
+<code>write.ml</code>/<code>read.ml</code> to save/load fitted models.
+For more details, see
+<a 
href="http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-regression">
+GBT Regression</a> and
+<a 
href="http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-classifier">
+GBT Classification</a>
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.gbt(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.gbt(data, formula,
+  type = c("regression", "classification"), maxDepth = 5,
+  maxBins = 32, maxIter = 20, stepSize = 0.1, lossType = NULL,
+  seed = NULL, subsamplingRate = 1, minInstancesPerNode = 1,
+  minInfoGain = 0, checkpointInterval = 10, maxMemoryInMB = 256,
+  cacheNodeIds = FALSE, handleInvalid = c("error", "keep", "skip"))
+
+## S4 method for signature 'GBTRegressionModel'
+summary(object)
+
+## S3 method for class 'summary.GBTRegressionModel'
+print(x, ...)
+
+## S4 method for signature 'GBTClassificationModel'
+summary(object)
+
+## S3 method for class 'summary.GBTClassificationModel'
+print(x, ...)
+
+## S4 method for signature 'GBTRegressionModel'
+predict(object, newData)
+
+## S4 method for signature 'GBTClassificationModel'
+predict(object, newData)
+
+## S4 method for signature 'GBTRegressionModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+
+## S4 method for signature 'GBTClassificationModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>type</code></td>
+<td>
+<p>type of model, one of &quot;regression&quot; or &quot;classification&quot;, 
to fit</p>
+</td></tr>
+<tr valign="top"><td><code>maxDepth</code></td>
+<td>
+<p>Maximum depth of the tree (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>maxBins</code></td>
+<td>
+<p>Maximum number of bins used for discretizing continuous features and for 
choosing
+how to split on features at each node. More bins give higher granularity. Must 
be
+&gt;= 2 and &gt;= number of categories in any categorical feature.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>Param for maximum number of iterations (&gt;= 0).</p>
+</td></tr>
+<tr valign="top"><td><code>stepSize</code></td>
+<td>
+<p>Param for Step size to be used for each iteration of optimization.</p>
+</td></tr>
+<tr valign="top"><td><code>lossType</code></td>
+<td>
+<p>Loss function which GBT tries to minimize.
+For classification, must be &quot;logistic&quot;. For regression, must be one 
of
+&quot;squared&quot; (L2) and &quot;absolute&quot; (L1), default is 
&quot;squared&quot;.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>integer seed for random number generation.</p>
+</td></tr>
+<tr valign="top"><td><code>subsamplingRate</code></td>
+<td>
+<p>Fraction of the training data used for learning each decision tree, in
+range (0, 1].</p>
+</td></tr>
+<tr valign="top"><td><code>minInstancesPerNode</code></td>
+<td>
+<p>Minimum number of instances each child must have after split. If a
+split causes the left or right child to have fewer than
+minInstancesPerNode, the split will be discarded as invalid. Should be
+&gt;= 1.</p>
+</td></tr>
+<tr valign="top"><td><code>minInfoGain</code></td>
+<td>
+<p>Minimum information gain for a split to be considered at a tree node.</p>
+</td></tr>
+<tr valign="top"><td><code>checkpointInterval</code></td>
+<td>
+<p>Param for setting the checkpoint interval (&gt;= 1) or disabling checkpointing (-1).
+Note: this setting will be ignored if the checkpoint directory is not
+set.</p>
+</td></tr>
+<tr valign="top"><td><code>maxMemoryInMB</code></td>
+<td>
+<p>Maximum memory in MB allocated to histogram aggregation.</p>
+</td></tr>
+<tr valign="top"><td><code>cacheNodeIds</code></td>
+<td>
+<p>If FALSE, the algorithm will pass trees to executors to match instances with
+nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching
+can speed up training of deeper trees. Users can set how often the cache
+should be checkpointed, or disable checkpointing, by setting checkpointInterval.</p>
+</td></tr>
+<tr valign="top"><td><code>handleInvalid</code></td>
+<td>
+<p>How to handle invalid data (unseen labels or NULL values) in features and
+label column of string type in classification model.
+Supported options: &quot;skip&quot; (filter out rows with invalid data),
+&quot;error&quot; (throw an error), &quot;keep&quot; (put invalid data in
+a special additional bucket, at index numLabels). Default
+is &quot;error&quot;.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>A fitted Gradient Boosted Tree regression model or classification model.</p>
+</td></tr>
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>summary object of Gradient Boosted Tree regression model or classification 
model
+returned by <code>summary</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>Whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
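+
+<p>For the regression options of <code>lossType</code>, the following base R
+sketch contrasts the two losses. It assumes the usual definitions, squared error
+(L2) and absolute error (L1) averaged over instances; the vectors
+<code>y</code> and <code>pred</code> are hypothetical, and the sketch does not
+reproduce the GBT training procedure.
+</p>
+<pre><code class="r"># Hypothetical labels and predictions; the last prediction is far off
+y &lt;- c(1, 2, 3)
+pred &lt;- c(1.5, 2, 1)
+
+# Mean squared (L2) and mean absolute (L1) loss
+squared &lt;- mean((y - pred)^2)
+absolute &lt;- mean(abs(y - pred))
+</code></pre>
+<p>The squared loss penalizes the large residual more heavily than the absolute
+loss, which is one reason to consider &quot;absolute&quot; when the labels
+contain outliers.
+</p>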
+
+
+<h3>Value</h3>
+
+<p><code>spark.gbt</code> returns a fitted Gradient Boosted Tree model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list of components includes <code>formula</code> (formula),
+<code>numFeatures</code> (number of features), <code>features</code> (list of 
features),
+<code>featureImportances</code> (feature importances), <code>maxDepth</code> 
(max depth of trees),
+<code>numTrees</code> (number of trees), and <code>treeWeights</code> (tree 
weights).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels 
in a column named
+&quot;prediction&quot;.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.gbt since 2.1.0
+</p>
+<p>summary(GBTRegressionModel) since 2.1.0
+</p>
+<p>print.summary.GBTRegressionModel since 2.1.0
+</p>
+<p>summary(GBTClassificationModel) since 2.1.0
+</p>
+<p>print.summary.GBTClassificationModel since 2.1.0
+</p>
+<p>predict(GBTRegressionModel) since 2.1.0
+</p>
+<p>predict(GBTClassificationModel) since 2.1.0
+</p>
+<p>write.ml(GBTRegressionModel, character) since 2.1.0
+</p>
+<p>write.ml(GBTClassificationModel, character) since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D # fit a Gradient Boosted Tree Regression Model
+##D df &lt;- createDataFrame(longley)
+##D model &lt;- spark.gbt(df, Employed ~ ., type = &quot;regression&quot;, 
maxDepth = 5, maxBins = 16)
+##D 
+##D # get the summary of the model
+##D summary(model)
+##D 
+##D # make predictions
+##D predictions &lt;- predict(model, df)
+##D 
+##D # save and load the model
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+##D 
+##D # fit a Gradient Boosted Tree Classification Model
+##D # label must be binary - Only binary classification is supported for GBT.
+##D t &lt;- as.data.frame(Titanic)
+##D df &lt;- createDataFrame(t)
+##D model &lt;- spark.gbt(df, Survived ~ Age + Freq, 
&quot;classification&quot;)
+##D 
+##D # numeric label is also supported
+##D t2 &lt;- as.data.frame(Titanic)
+##D t2$NumericGender &lt;- ifelse(t2$Sex == &quot;Male&quot;, 0, 1)
+##D df &lt;- createDataFrame(t2)
+##D model &lt;- spark.gbt(df, NumericGender ~ ., type = 
&quot;classification&quot;)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.getSparkFiles.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.getSparkFiles.html 
b/site/docs/2.4.0/api/R/spark.getSparkFiles.html
new file mode 100644
index 0000000..4749368
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.getSparkFiles.html
@@ -0,0 +1,59 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Get the absolute path of a 
file added through spark.addFile.</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.getSparkFiles 
{SparkR}"><tr><td>spark.getSparkFiles {SparkR}</td><td style="text-align: 
right;">R Documentation</td></tr></table>
+
+<h2>Get the absolute path of a file added through spark.addFile.</h2>
+
+<h3>Description</h3>
+
+<p>Get the absolute path of a file added through spark.addFile.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.getSparkFiles(fileName)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>fileName</code></td>
+<td>
+<p>The name of the file added through spark.addFile</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p>the absolute path of a file added through spark.addFile.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.getSparkFiles since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D spark.getSparkFiles(&quot;myfile&quot;)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.getSparkFilesRootDirectory.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.getSparkFilesRootDirectory.html 
b/site/docs/2.4.0/api/R/spark.getSparkFilesRootDirectory.html
new file mode 100644
index 0000000..1e42bed
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.getSparkFilesRootDirectory.html
@@ -0,0 +1,49 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Get the root directory 
that contains files added through...</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.getSparkFilesRootDirectory 
{SparkR}"><tr><td>spark.getSparkFilesRootDirectory {SparkR}</td><td 
style="text-align: right;">R Documentation</td></tr></table>
+
+<h2>Get the root directory that contains files added through 
spark.addFile.</h2>
+
+<h3>Description</h3>
+
+<p>Get the root directory that contains files added through spark.addFile.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.getSparkFilesRootDirectory()
+</pre>
+
+
+<h3>Value</h3>
+
+<p>the root directory that contains files added through spark.addFile
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.getSparkFilesRootDirectory since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D spark.getSparkFilesRootDirectory()
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.glm.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.glm.html 
b/site/docs/2.4.0/api/R/spark.glm.html
new file mode 100644
index 0000000..32fcfd9
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.glm.html
@@ -0,0 +1,234 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Generalized Linear 
Models</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.glm {SparkR}"><tr><td>spark.glm 
{SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
+
+<h2>Generalized Linear Models</h2>
+
+<h3>Description</h3>
+
+<p>Fits a generalized linear model against a SparkDataFrame.
+Users can call <code>summary</code> to print a summary of the fitted model, 
<code>predict</code> to make
+predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to 
save/load fitted models.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.glm(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.glm(data, formula,
+  family = gaussian, tol = 1e-06, maxIter = 25, weightCol = NULL,
+  regParam = 0, var.power = 0, link.power = 1 - var.power,
+  stringIndexerOrderType = c("frequencyDesc", "frequencyAsc",
+  "alphabetDesc", "alphabetAsc"), offsetCol = NULL)
+
+## S4 method for signature 'GeneralizedLinearRegressionModel'
+summary(object)
+
+## S3 method for class 'summary.GeneralizedLinearRegressionModel'
+print(x, ...)
+
+## S4 method for signature 'GeneralizedLinearRegressionModel'
+predict(object, newData)
+
+## S4 method for signature 'GeneralizedLinearRegressionModel,character'
+write.ml(object,
+  path, overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', '.', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>family</code></td>
+<td>
+<p>a description of the error distribution and link function to be used in the 
model.
+This can be a character string naming a family function, a family function or
+the result of a call to a family function. Refer to R family at
+<a 
href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html">https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html</a>.
+Currently these families are supported: <code>binomial</code>, 
<code>gaussian</code>,
+<code>Gamma</code>, <code>poisson</code> and <code>tweedie</code>.
+</p>
+<p>Note that there are two ways to specify the tweedie family.
+</p>
+
+<ul>
+<li><p> Set <code>family = "tweedie"</code> and specify the var.power and 
link.power;
+</p>
+</li>
+<li><p> When package <code>statmod</code> is loaded, the tweedie family is 
specified
+using the family definition therein, i.e., <code>tweedie(var.power, 
link.power)</code>.
+</p>
+</li></ul>
+</td></tr>
+<tr valign="top"><td><code>tol</code></td>
+<td>
+<p>positive convergence tolerance of iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>integer giving the maximal number of IRLS iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>weightCol</code></td>
+<td>
+<p>the weight column name. If this is not set or <code>NULL</code>, we treat 
all instance
+weights as 1.0.</p>
+</td></tr>
+<tr valign="top"><td><code>regParam</code></td>
+<td>
+<p>regularization parameter for L2 regularization.</p>
+</td></tr>
+<tr valign="top"><td><code>var.power</code></td>
+<td>
+<p>the power in the variance function of the Tweedie distribution which 
provides
+the relationship between the variance and mean of the distribution. Only
+applicable to the Tweedie family.</p>
+</td></tr>
+<tr valign="top"><td><code>link.power</code></td>
+<td>
+<p>the index in the power link function. Only applicable to the Tweedie 
family.</p>
+</td></tr>
+<tr valign="top"><td><code>stringIndexerOrderType</code></td>
+<td>
+<p>how to order categories of a string feature column. This determines
+the base level of a string feature, because the last category
+after ordering is dropped when encoding strings. Supported options
+are &quot;frequencyDesc&quot;, &quot;frequencyAsc&quot;, 
&quot;alphabetDesc&quot;, and
+&quot;alphabetAsc&quot;. The default value is &quot;frequencyDesc&quot;. When 
the
+ordering is set to &quot;alphabetDesc&quot;, this drops the same category
+as R when encoding strings.</p>
+</td></tr>
+<tr valign="top"><td><code>offsetCol</code></td>
+<td>
+<p>the offset column name. If this is not set or empty, we treat all instance
+offsets as 0.0. The feature specified as offset has a constant coefficient of
+1.0.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted generalized linear model.</p>
+</td></tr>
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>summary object of fitted generalized linear model returned by 
<code>summary</code> function.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.glm</code> returns a fitted generalized linear model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list of components includes at least the <code>coefficients</code> 
(coefficients matrix,
+which includes coefficients, standard error of coefficients, t value and p 
value),
+<code>null.deviance</code> (null/residual degrees of freedom), 
<code>aic</code> (AIC)
+and <code>iter</code> (number of iterations IRLS takes). If there are 
collinear columns in
+the data, the coefficients matrix only provides coefficients.
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels 
in a column named
+&quot;prediction&quot;.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.glm since 2.0.0
+</p>
+<p>summary(GeneralizedLinearRegressionModel) since 2.0.0
+</p>
+<p>print.summary.GeneralizedLinearRegressionModel since 2.0.0
+</p>
+<p>predict(GeneralizedLinearRegressionModel) since 1.5.0
+</p>
+<p>write.ml(GeneralizedLinearRegressionModel, character) since 2.0.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="glm.html">glm</a>, <a href="read.ml.html">read.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D t &lt;- as.data.frame(Titanic, stringsAsFactors = FALSE)
+##D df &lt;- createDataFrame(t)
+##D model &lt;- spark.glm(df, Freq ~ Sex + Age, family = &quot;gaussian&quot;)
+##D summary(model)
+##D 
+##D # fitted values on training data
+##D fitted &lt;- predict(model, df)
+##D head(select(fitted, &quot;Freq&quot;, &quot;prediction&quot;))
+##D 
+##D # save fitted model to input path
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and print
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+##D 
+##D # note that the default string encoding is different from R&#39;s glm
+##D model2 &lt;- glm(Freq ~ Sex + Age, family = &quot;gaussian&quot;, data = t)
+##D summary(model2)
+##D # use stringIndexerOrderType = &quot;alphabetDesc&quot; to force string encoding
+##D # to be consistent with R
+##D model3 &lt;- spark.glm(df, Freq ~ Sex + Age, family = &quot;gaussian&quot;,
+##D                    stringIndexerOrderType = &quot;alphabetDesc&quot;)
+##D summary(model3)
+##D 
+##D # fit tweedie model
+##D model &lt;- spark.glm(df, Freq ~ Sex + Age, family = &quot;tweedie&quot;,
+##D                    var.power = 1.2, link.power = 0)
+##D summary(model)
+##D 
+##D # use the tweedie family from statmod
+##D library(statmod)
+##D model &lt;- spark.glm(df, Freq ~ Sex + Age, family = tweedie(1.2, 0))
+##D summary(model)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
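
As a rough illustration of the stringIndexerOrderType note in the spark.glm page above, the following base R sketch (no Spark session required) shows which categories R's own glm treats as base levels for the Titanic data. It assumes only base R and the built-in Titanic dataset. That spark.glm with "alphabetDesc" drops the same categories is what the page states; it is not checked here.

```r
# Base R only: inspect which levels glm() uses as the reference category.
t <- as.data.frame(Titanic, stringsAsFactors = FALSE)
model <- glm(Freq ~ Sex + Age, family = gaussian, data = t)

# Character columns are converted to factors with alphabetically sorted levels,
# and the first level of each factor ("Female", "Adult") is the base level.
coef_names <- names(coef(model))
print(coef_names)
```

The remaining coefficient names ("SexMale", "AgeChild") are the non-base categories, which is the encoding the alphabetDesc option is described as matching.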

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.isoreg.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.isoreg.html 
b/site/docs/2.4.0/api/R/spark.isoreg.html
new file mode 100644
index 0000000..faaab44
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.isoreg.html
@@ -0,0 +1,146 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Isotonic Regression 
Model</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.isoreg 
{SparkR}"><tr><td>spark.isoreg {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>Isotonic Regression Model</h2>
+
+<h3>Description</h3>
+
+<p>Fits an Isotonic Regression model against a SparkDataFrame, similarly to 
R's isoreg().
+Users can print, make predictions on the produced model and save the model to 
the input path.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.isoreg(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.isoreg(data, formula,
+  isotonic = TRUE, featureIndex = 0, weightCol = NULL)
+
+## S4 method for signature 'IsotonicRegressionModel'
+summary(object)
+
+## S4 method for signature 'IsotonicRegressionModel'
+predict(object, newData)
+
+## S4 method for signature 'IsotonicRegressionModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>A symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', '.', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>isotonic</code></td>
+<td>
+<p>Whether the output sequence should be isotonic/increasing (TRUE) or
+antitonic/decreasing (FALSE).</p>
+</td></tr>
+<tr valign="top"><td><code>featureIndex</code></td>
+<td>
+<p>The index of the feature if <code>featuresCol</code> is a vector column
+(default: 0), no effect otherwise.</p>
+</td></tr>
+<tr valign="top"><td><code>weightCol</code></td>
+<td>
+<p>The weight column name.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted IsotonicRegressionModel.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>Whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.isoreg</code> returns a fitted Isotonic Regression model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list includes model's <code>boundaries</code> (boundaries in increasing 
order)
+and <code>predictions</code> (predictions associated with the boundaries at 
the same index).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted values.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.isoreg since 2.1.0
+</p>
+<p>summary(IsotonicRegressionModel) since 2.1.0
+</p>
+<p>predict(IsotonicRegressionModel) since 2.1.0
+</p>
+<p>write.ml(IsotonicRegressionModel, character) since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D data &lt;- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+##D         list(5.0, 3.0), list(1.0, 4.0))
+##D df &lt;- createDataFrame(data, c(&quot;label&quot;, &quot;feature&quot;))
+##D model &lt;- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+##D # return model boundaries and prediction as lists
+##D result &lt;- summary(model)
+##D # prediction based on fitted model
+##D predict_data &lt;- list(list(-2.0), list(-1.0), list(0.5),
+##D                 list(0.75), list(1.0), list(2.0), list(9.0))
+##D predict_df &lt;- createDataFrame(predict_data, c(&quot;feature&quot;))
+##D # get prediction column
+##D predict_result &lt;- collect(select(predict(model, predict_df), &quot;prediction&quot;))
+##D 
+##D # save fitted model to input path
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and print
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
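
To make the isotonic = FALSE setting in the spark.isoreg example above more concrete, here is a minimal base R sketch on the same five (label, feature) pairs. It uses stats::isoreg, which only fits a non-decreasing function, so the decreasing fit is obtained by negating the labels. This is a local illustration only; whether spark.isoreg reports identical boundaries and predictions is not verified here.

```r
# Base R only: antitonic (decreasing) fit via isoreg() on the negated labels.
feature <- c(0, 1, 2, 3, 4)
label <- c(7, 5, 3, 5, 1)

# isoreg() fits a non-decreasing step function, so negate, fit, negate back.
fit <- isoreg(feature, -label)
antitonic <- -fit$yf
print(antitonic)
```

The two labels that violate the decreasing order (3 followed by 5) are pooled to their mean, 4, and the other values are left unchanged.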

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.kmeans.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.kmeans.html 
b/site/docs/2.4.0/api/R/spark.kmeans.html
new file mode 100644
index 0000000..ffd5b71
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.kmeans.html
@@ -0,0 +1,168 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: K-Means Clustering 
Model</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.kmeans 
{SparkR}"><tr><td>spark.kmeans {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>K-Means Clustering Model</h2>
+
+<h3>Description</h3>
+
+<p>Fits a k-means clustering model against a SparkDataFrame, similarly to R's 
kmeans().
+Users can call <code>summary</code> to print a summary of the fitted model, 
<code>predict</code> to make
+predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to 
save/load fitted models.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.kmeans(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.kmeans(data, formula, k = 2,
+  maxIter = 20, initMode = c("k-means||", "random"), seed = NULL,
+  initSteps = 2, tol = 1e-04)
+
+## S4 method for signature 'KMeansModel'
+summary(object)
+
+## S4 method for signature 'KMeansModel'
+predict(object, newData)
+
+## S4 method for signature 'KMeansModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few 
formula
+operators are supported, including '~', '.', ':', '+', and '-'.
+Note that the response variable of formula is empty in spark.kmeans.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>k</code></td>
+<td>
+<p>number of centers.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum iteration number.</p>
+</td></tr>
+<tr valign="top"><td><code>initMode</code></td>
+<td>
+<p>the initialization algorithm chosen to fit the model.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>the random seed for cluster initialization.</p>
+</td></tr>
+<tr valign="top"><td><code>initSteps</code></td>
+<td>
+<p>the number of steps for the k-means|| initialization mode.
+This is an advanced setting; the default of 2 is almost always enough.
+Must be &gt; 0.</p>
+</td></tr>
+<tr valign="top"><td><code>tol</code></td>
+<td>
+<p>convergence tolerance of iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a fitted k-means model.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>whether to overwrite if the output path already exists. Default is FALSE,
+which means an exception is thrown if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.kmeans</code> returns a fitted k-means model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which 
is a list.
+The list includes the model's <code>k</code> (the configured number of cluster 
centers),
+<code>coefficients</code> (model cluster centers),
+<code>size</code> (number of data points in each cluster), <code>cluster</code>
+(cluster centers of the transformed data), is.loaded (whether the model is 
loaded
+from a saved file), and <code>clusterSize</code>
+(the actual number of cluster centers. When using initMode = 
&quot;random&quot;,
+<code>clusterSize</code> may not be equal to <code>k</code>).
+</p>
+<p><code>predict</code> returns the predicted values based on a k-means model.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.kmeans since 2.0.0
+</p>
+<p>summary(KMeansModel) since 2.0.0
+</p>
+<p>predict(KMeansModel) since 2.0.0
+</p>
+<p>write.ml(KMeansModel, character) since 2.0.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="predict.html">predict</a>, <a href="read.ml.html">read.ml</a>, <a 
href="write.ml.html">write.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D t &lt;- as.data.frame(Titanic)
+##D df &lt;- createDataFrame(t)
+##D model &lt;- spark.kmeans(df, Class ~ Survived, k = 4, initMode = &quot;random&quot;)
+##D summary(model)
+##D 
+##D # fitted values on training data
+##D fitted &lt;- predict(model, df)
+##D head(select(fitted, &quot;Class&quot;, &quot;prediction&quot;))
+##D 
+##D # save fitted model to input path
+##D path &lt;- &quot;path/to/model&quot;
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and print
+##D savedModel &lt;- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
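
For readers who know R's kmeans(), which the spark.kmeans page above names as its counterpart, the following base R sketch fits a two-cluster model on synthetic points and prints the parts of the result that loosely correspond to the summary fields listed under Value. The data, seed, and variable names are made up for illustration, and the mapping to the SparkR fields is an informal comparison rather than a documented equivalence.

```r
# Base R only: a local k-means fit on synthetic 2-D points.
set.seed(42)
points <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
                matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
fit <- kmeans(points, centers = 2, iter.max = 20)

print(fit$centers)  # cluster centers (compare: coefficients)
print(fit$size)     # number of points in each cluster (compare: size)
print(nrow(fit$centers))  # actual number of centers (compare: clusterSize)
```

Note that, as in the spark.kmeans formula, no response variable is involved: only the feature columns are clustered.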

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.kstest.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.kstest.html 
b/site/docs/2.4.0/api/R/spark.kstest.html
new file mode 100644
index 0000000..262b4e7
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.kstest.html
@@ -0,0 +1,130 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: (One-Sample) 
Kolmogorov-Smirnov Test</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.kstest 
{SparkR}"><tr><td>spark.kstest {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>(One-Sample) Kolmogorov-Smirnov Test</h2>
+
+<h3>Description</h3>
+
+<p><code>spark.kstest</code> conducts the two-sided Kolmogorov-Smirnov (KS) 
test for data sampled from a
+continuous distribution.
+</p>
+<p>By comparing the largest difference between the empirical cumulative
+distribution of the sample data and the theoretical distribution, we can 
provide a test for
+the null hypothesis that the sample data comes from that theoretical 
distribution.
+</p>
+<p>Users can call <code>summary</code> to obtain a summary of the test, and 
<code>print.summary.KSTest</code>
+to print out a summary result.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.kstest(data, ...)
+
+## S4 method for signature 'SparkDataFrame'
+spark.kstest(data, testCol = "test",
+  nullHypothesis = c("norm"), distParams = c(0, 1))
+
+## S4 method for signature 'KSTest'
+summary(object)
+
+## S3 method for class 'summary.KSTest'
+print(x, ...)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a SparkDataFrame of user data.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>testCol</code></td>
+<td>
+<p>column name where the test data is from. It should be a column of double 
type.</p>
+</td></tr>
+<tr valign="top"><td><code>nullHypothesis</code></td>
+<td>
+<p>name of the theoretical distribution tested against. Currently only
+<code>"norm"</code> for normal distribution is supported.</p>
+</td></tr>
+<tr valign="top"><td><code>distParams</code></td>
+<td>
+<p>parameter(s) of the distribution. For <code>nullHypothesis = "norm"</code>,
+we can provide as a vector the mean and standard deviation of
+the distribution. If none is provided, then standard normal will be used.
+If only one is provided, then the standard deviation will be set to be one.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>test result object of KSTest by <code>spark.kstest</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>summary object of KSTest returned by <code>summary</code>.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.kstest</code> returns a test result object.
+</p>
+<p><code>summary</code> returns summary information of KSTest object, which is 
a list.
+The list includes the <code>p.value</code> (p-value), <code>statistic</code> 
(test statistic
+computed for the test), <code>nullHypothesis</code> (the null hypothesis with 
its
+parameters tested against) and <code>degreesOfFreedom</code> (degrees of 
freedom of the test).
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.kstest since 2.1.0
+</p>
+<p>summary(KSTest) since 2.1.0
+</p>
+<p>print.summary.KSTest since 2.1.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a 
href="http://spark.apache.org/docs/latest/mllib-statistics.html#hypothesis-testing">
+MLlib: Hypothesis Testing</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D data &lt;- data.frame(test = c(0.1, 0.15, 0.2, 0.3, 0.25))
+##D df &lt;- createDataFrame(data)
+##D test &lt;- spark.kstest(df, &quot;test&quot;, &quot;norm&quot;, c(0, 1))
+##D 
+##D # get a summary of the test result
+##D testSummary &lt;- summary(test)
+##D testSummary
+##D 
+##D # print out the summary in an organized way
+##D print.summary.KSTest(testSummary)
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/52917ac4/site/docs/2.4.0/api/R/spark.lapply.html
----------------------------------------------------------------------
diff --git a/site/docs/2.4.0/api/R/spark.lapply.html 
b/site/docs/2.4.0/api/R/spark.lapply.html
new file mode 100644
index 0000000..a3f70e0
--- /dev/null
+++ b/site/docs/2.4.0/api/R/spark.lapply.html
@@ -0,0 +1,95 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html 
xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Run a function over a list 
of elements, distributing the...</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<link rel="stylesheet" type="text/css" href="R.css" />
+
+<link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.lapply 
{SparkR}"><tr><td>spark.lapply {SparkR}</td><td style="text-align: right;">R 
Documentation</td></tr></table>
+
+<h2>Run a function over a list of elements, distributing the computations with 
Spark</h2>
+
+<h3>Description</h3>
+
+<p>Run a function over a list of elements, distributing the computations with 
Spark. Applies a
+function in a manner that is similar to doParallel or lapply to elements of a 
list.
+The computations are distributed using Spark. It is conceptually the same as 
the following code:
+lapply(list, func)
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.lapply(list, func)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>list</code></td>
+<td>
+<p>the list of elements</p>
+</td></tr>
+<tr valign="top"><td><code>func</code></td>
+<td>
+<p>a function that takes one argument.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>Known limitations:
+</p>
+
+<ul>
+<li><p> variable scoping and capture: compared to R's rich support for 
variable resolutions,
+the distributed nature of SparkR limits how variables are resolved at runtime. 
All the
+variables that are available through lexical scoping are embedded in the 
closure of the
+function and available as read-only variables within the function. The 
environment variables
+should be stored into temporary variables outside the function, and not 
directly accessed
+within the function.
+</p>
+</li>
+<li><p> loading external packages: In order to use a package, you need to load 
it inside the
+closure. For example, if you rely on the MASS package, here is how you would 
use it:
+</p>
+<pre>
+    train &lt;- function(hyperparam) {
+      library(MASS)
+      # keep the fitted model so that it can be returned
+      model &lt;- lm.ridge(y ~ x + z, data, lambda = hyperparam)
+      model
+    }
+  </pre>
+</li></ul>
+
+
+
+<h3>Value</h3>
+
+<p>a list of results (the exact type being determined by the function)
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.lapply since 2.0.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D doubled &lt;- spark.lapply(1:10, function(x){2 * x})
+## End(Not run)
+</code></pre>
+
+
+<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 
<a href="00Index.html">Index</a>]</div>
+</body></html>
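
The spark.lapply page above says the call is conceptually the same as lapply(list, func). A minimal local sketch of that equivalence, using plain lapply and no Spark session, is shown below. The variable name multiplier is made up for illustration, and the sketch does not exercise the distributed behaviour or the limitations listed under Details.

```r
# Base R only: the single-machine counterpart of the spark.lapply example.
doubled <- lapply(1:10, function(x) { 2 * x })
print(unlist(doubled))

# A variable captured through lexical scoping is readable inside the function.
multiplier <- 3
scaled <- lapply(1:3, function(x) { multiplier * x })
print(unlist(scaled))
```

In both cases the result is a list with one element per input element, matching the Value section of the page.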


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
