http://git-wip-us.apache.org/repos/asf/spark-website/blob/6bbac496/site/docs/2.1.2/api/python/pyspark.ml.html
----------------------------------------------------------------------
diff --git a/site/docs/2.1.2/api/python/pyspark.ml.html 
b/site/docs/2.1.2/api/python/pyspark.ml.html
index c7034f0..557d570 100644
--- a/site/docs/2.1.2/api/python/pyspark.ml.html
+++ b/site/docs/2.1.2/api/python/pyspark.ml.html
@@ -567,7 +567,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.Pipeline">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.</code><code 
class="descname">Pipeline</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/pipeline.html#Pipeline"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.Pipeline" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.</code><code 
class="descname">Pipeline</code><span 
class="sig-paren">(</span><em>stages=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/pipeline.html#Pipeline"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.Pipeline" title="Permalink to this definition">¶</a></dt>
 <dd><p>A simple pipeline, which acts as an estimator. A Pipeline consists
 of a sequence of stages, each of which is either an
 <a class="reference internal" href="#pyspark.ml.Estimator" 
title="pyspark.ml.Estimator"><code class="xref py py-class docutils 
literal"><span class="pre">Estimator</span></code></a> or a <a class="reference 
internal" href="#pyspark.ml.Transformer" title="pyspark.ml.Transformer"><code 
class="xref py py-class docutils literal"><span 
class="pre">Transformer</span></code></a>. When
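The signature above now documents `stages=None` explicitly instead of `*args, **kwargs`. As a quick reference for the fit/transform chaining the description outlines, here is a plain-Python sketch (illustrative only, not the pyspark implementation; the real Pipeline, for instance, skips the final transform when the last stage is an estimator):

```python
# Plain-Python sketch of Pipeline semantics: an estimator stage is fitted
# into a transformer, and each stage's output feeds the next stage.
class Transformer:
    def transform(self, data):
        raise NotImplementedError

class Estimator:
    def fit(self, data):
        raise NotImplementedError

class Pipeline(Estimator):
    def __init__(self, stages=None):
        self.stages = stages or []

    def fit(self, data):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(data)   # estimator -> fitted transformer
            data = stage.transform(data)  # output feeds the next stage
            fitted.append(stage)
        return PipelineModel(fitted)

class PipelineModel(Transformer):
    def __init__(self, stages):
        self.stages = stages

    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data
```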
@@ -1250,7 +1250,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 <span id="pyspark-ml-feature-module"></span><h2>pyspark.ml.feature module<a 
class="headerlink" href="#module-pyspark.ml.feature" title="Permalink to this 
headline">¶</a></h2>
 <dl class="class">
 <dt id="pyspark.ml.feature.Binarizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Binarizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Binarizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Binarizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Binarizer</code><span 
class="sig-paren">(</span><em>threshold=0.0</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#Binarizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Binarizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Binarize a column of continuous features given a threshold.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([(</span><span 
class="mf">0.5</span><span class="p">,)],</span> <span class="p">[</span><span 
class="s2">&quot;values&quot;</span><span class="p">])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">binarizer</span> <span 
class="o">=</span> <span class="n">Binarizer</span><span 
class="p">(</span><span class="n">threshold</span><span class="o">=</span><span 
class="mf">1.0</span><span class="p">,</span> <span 
class="n">inputCol</span><span class="o">=</span><span 
class="s2">&quot;values&quot;</span><span class="p">,</span> <span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;features&quot;</span><span class="p">)</span>
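The new signature surfaces Binarizer's real default (`threshold=0.0`). The documented behaviour reduces to this plain-Python sketch (illustrative only, not the pyspark implementation):

```python
def binarize(values, threshold=0.0):
    # Values strictly greater than the threshold map to 1.0, the rest to 0.0,
    # matching the doctest above (0.5 with threshold 1.0 binarizes to 0.0).
    return [1.0 if v > threshold else 0.0 for v in values]
```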
@@ -1520,7 +1520,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.Bucketizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Bucketizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Bucketizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Bucketizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Bucketizer</code><span 
class="sig-paren">(</span><em>splits=None</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em>, <em>handleInvalid='error'</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Bucketizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Bucketizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Maps a column of continuous features to a column of feature buckets.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">values</span> <span 
class="o">=</span> <span class="p">[(</span><span class="mf">0.1</span><span 
class="p">,),</span> <span class="p">(</span><span class="mf">0.4</span><span 
class="p">,),</span> <span class="p">(</span><span class="mf">1.2</span><span 
class="p">,),</span> <span class="p">(</span><span class="mf">1.5</span><span 
class="p">,),</span> <span class="p">(</span><span class="nb">float</span><span 
class="p">(</span><span class="s2">&quot;nan&quot;</span><span 
class="p">),),</span> <span class="p">(</span><span 
class="nb">float</span><span class="p">(</span><span 
class="s2">&quot;nan&quot;</span><span class="p">),)]</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">(</span><span 
class="n">values</span><span class="p">,</span> <span class="p">[</span><span 
class="s2">&quot;values&quot;</span><span class="p">])</span>
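The new Bucketizer signature exposes `splits`, `handleInvalid='error'`, and the column params. A plain-Python sketch of the documented bucketing rule (illustrative only, not the pyspark implementation): splits of length n+1 define n buckets, bucket i covers [splits[i], splits[i+1]), and the last bucket also includes its upper bound.

```python
import bisect
import math

def bucketize(value, splits, handle_invalid="error"):
    # NaNs are handled per handleInvalid: 'keep' routes them to one extra
    # bucket past the last, 'error' raises.
    if math.isnan(value):
        if handle_invalid == "keep":
            return float(len(splits) - 1)
        raise ValueError("NaN value with handleInvalid='error'")
    if value == splits[-1]:
        return float(len(splits) - 2)  # upper bound joins the last bucket
    idx = bisect.bisect_right(splits, value) - 1
    if idx < 0 or idx > len(splits) - 2:
        raise ValueError("value outside the provided splits")
    return float(idx)
```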
@@ -1824,7 +1824,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.ChiSqSelector">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">ChiSqSelector</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#ChiSqSelector"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.ChiSqSelector" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">ChiSqSelector</code><span 
class="sig-paren">(</span><em>numTopFeatures=50</em>, 
<em>featuresCol='features'</em>, <em>outputCol=None</em>, 
<em>labelCol='label'</em>, <em>selectorType='numTopFeatures'</em>, 
<em>percentile=0.1</em>, <em>fpr=0.05</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#ChiSqSelector"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.ChiSqSelector" title="Permalink to this 
definition">¶</a></dt>
 <dd><div class="admonition note">
 <p class="first admonition-title">Note</p>
 <p class="last">Experimental</p>
@@ -2399,7 +2399,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.CountVectorizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">CountVectorizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#CountVectorizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.CountVectorizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">CountVectorizer</code><span 
class="sig-paren">(</span><em>minTF=1.0</em>, <em>minDF=1.0</em>, 
<em>vocabSize=262144</em>, <em>binary=False</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#CountVectorizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.CountVectorizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Extracts a vocabulary from document collections and generates a <a 
class="reference internal" href="#pyspark.ml.feature.CountVectorizerModel" 
title="pyspark.ml.feature.CountVectorizerModel"><code class="xref py py-attr 
docutils literal"><span class="pre">CountVectorizerModel</span></code></a>.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">(</span>
 <span class="gp">... </span>   <span class="p">[(</span><span 
class="mi">0</span><span class="p">,</span> <span class="p">[</span><span 
class="s2">&quot;a&quot;</span><span class="p">,</span> <span 
class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;c&quot;</span><span class="p">]),</span> <span 
class="p">(</span><span class="mi">1</span><span class="p">,</span> <span 
class="p">[</span><span class="s2">&quot;a&quot;</span><span class="p">,</span> 
<span class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;c&quot;</span><span class="p">,</span> <span 
class="s2">&quot;a&quot;</span><span class="p">])],</span>
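The new CountVectorizer signature shows the defaults (`minTF=1.0`, `minDF=1.0`, `vocabSize=262144`). A plain-Python sketch of the two documented steps (illustrative only, not the pyspark implementation; Spark orders the vocabulary by corpus-wide term frequency, while document frequency is used here to keep the sketch short):

```python
from collections import Counter

def fit_vocabulary(docs, vocab_size=1 << 18, min_df=1):
    # Fit step: keep at most vocab_size terms, dropping terms seen in fewer
    # than min_df documents; ties are broken alphabetically for determinism.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    keep = (t for t, c in df.items() if c >= min_df)
    return sorted(keep, key=lambda t: (-df[t], t))[:vocab_size]

def count_vector(doc, vocab):
    # Transform step: a per-document term-count vector over the vocabulary.
    index = {t: i for i, t in enumerate(vocab)}
    vec = [0.0] * len(vocab)
    for term in doc:
        if term in index:
            vec[index[term]] += 1.0
    return vec
```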
@@ -2952,7 +2952,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.DCT">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">DCT</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#DCT"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.DCT" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">DCT</code><span 
class="sig-paren">(</span><em>inverse=False</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#DCT"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.DCT" title="Permalink to this definition">¶</a></dt>
 <dd><p>A feature transformer that takes the 1D discrete cosine transform
 of a real vector. No zero padding is performed on the input vector.
 It returns a real vector of the same length representing the DCT.
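A plain-Python sketch of the 1D transform described above (illustrative only: this is an unnormalized DCT-II, whereas the pyspark implementation additionally scales the result so the transform matrix is orthonormal):

```python
import math

def dct(xs):
    # DCT-II of a real vector; the output has the same length as the input
    # and no zero padding is applied, as the docstring states.
    n = len(xs)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(xs))
            for k in range(n)]
```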
@@ -3230,7 +3230,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.ElementwiseProduct">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">ElementwiseProduct</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#ElementwiseProduct"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.ElementwiseProduct" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">ElementwiseProduct</code><span 
class="sig-paren">(</span><em>scalingVec=None</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#ElementwiseProduct"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.ElementwiseProduct" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Outputs the Hadamard product (i.e., the element-wise product) of each 
input vector
 with a provided &#8220;weight&#8221; vector. In other words, it scales each 
column of the dataset
 by a scalar multiplier.</p>
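The Hadamard product the description names is simply component-wise multiplication; a plain-Python sketch (illustrative only, not the pyspark implementation):

```python
def elementwise_product(vec, scaling_vec):
    # Each input component is multiplied by the matching component of the
    # provided "weight" vector.
    assert len(vec) == len(scaling_vec)
    return [v * w for v, w in zip(vec, scaling_vec)]
```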
@@ -3501,7 +3501,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.HashingTF">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">HashingTF</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#HashingTF"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.HashingTF" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">HashingTF</code><span 
class="sig-paren">(</span><em>numFeatures=262144</em>, <em>binary=False</em>, 
<em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#HashingTF"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.HashingTF" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Maps a sequence of terms to their term frequencies using the hashing 
trick.
 Currently we use Austin Appleby&#8217;s MurmurHash 3 algorithm 
(MurmurHash3_x86_32)
 to calculate the hash code value for the term object.
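The new HashingTF signature shows `numFeatures=262144` and `binary=False`. A plain-Python sketch of the hashing trick described above (illustrative only: pyspark uses MurmurHash3_x86_32, while Python's built-in `hash()` is substituted here, so bucket indices will not match Spark's):

```python
def hashing_tf(terms, num_features=1 << 18, binary=False):
    # Each term hashes to a bucket; binary=True records presence only,
    # otherwise buckets accumulate term frequencies.
    vec = {}
    for term in terms:
        idx = hash(term) % num_features
        vec[idx] = 1.0 if binary else vec.get(idx, 0.0) + 1.0
    return vec
```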
@@ -3793,7 +3793,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.IDF">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">IDF</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#IDF"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.IDF" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">IDF</code><span 
class="sig-paren">(</span><em>minDocFreq=0</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#IDF"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.IDF" title="Permalink to this definition">¶</a></dt>
 <dd><p>Compute the Inverse Document Frequency (IDF) given a collection of 
documents.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span 
class="nn">pyspark.ml.linalg</span> <span class="k">import</span> <span 
class="n">DenseVector</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([(</span><span 
class="n">DenseVector</span><span class="p">([</span><span 
class="mf">1.0</span><span class="p">,</span> <span class="mf">2.0</span><span 
class="p">]),),</span>
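The new IDF signature shows `minDocFreq=0`. A plain-Python sketch of the fit step (illustrative only, not the pyspark implementation), using the standard smoothed weighting `log((m + 1) / (df + 1))` over m documents and zeroing features whose document frequency falls below `minDocFreq`:

```python
import math

def idf_weights(doc_vectors, min_doc_freq=0):
    # For each feature, count documents with a nonzero entry, then weight.
    m = len(doc_vectors)
    n = len(doc_vectors[0])
    dfs = [sum(1 for doc in doc_vectors if doc[j] != 0.0) for j in range(n)]
    return [math.log((m + 1) / (df + 1)) if df >= min_doc_freq else 0.0
            for df in dfs]
```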
@@ -4272,7 +4272,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.IndexToString">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">IndexToString</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#IndexToString"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.IndexToString" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">IndexToString</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>labels=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#IndexToString"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.IndexToString" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A <code class="xref py py-class docutils literal"><span 
class="pre">Transformer</span></code> that maps a column of indices back to a 
new column of
 corresponding string values.
 The index-string mapping is either from the ML attributes of the input column,
@@ -4530,7 +4530,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.MaxAbsScaler">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">MaxAbsScaler</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#MaxAbsScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.MaxAbsScaler" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">MaxAbsScaler</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#MaxAbsScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.MaxAbsScaler" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Rescale each feature individually to range [-1, 1] by dividing through 
the largest maximum
 absolute value in each feature. It does not shift/center the data, and thus 
does not destroy
 any sparsity.</p>
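A plain-Python sketch of the documented rescaling (illustrative only, not the pyspark implementation): divide each feature by its largest absolute value, so results land in [-1, 1] and zeros stay zero.

```python
def max_abs_scale(rows):
    # Per-feature division; an all-zero column is left unchanged
    # (the `or 1.0` guard avoids dividing by zero).
    n = len(rows[0])
    max_abs = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n)]
    return [[row[j] / max_abs[j] for j in range(n)] for row in rows]
```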
@@ -4988,7 +4988,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.MinMaxScaler">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">MinMaxScaler</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#MinMaxScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.MinMaxScaler" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">MinMaxScaler</code><span 
class="sig-paren">(</span><em>min=0.0</em>, <em>max=1.0</em>, 
<em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#MinMaxScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.MinMaxScaler" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Rescale each feature individually to a common range [min, max] linearly 
using column summary
 statistics, which is also known as min-max normalization or Rescaling. The 
rescaled value for
 feature E is calculated as,</p>
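The new MinMaxScaler signature shows the target range defaults (`min=0.0`, `max=1.0`). For a single feature column, the formula the description refers to works out to this plain-Python sketch (illustrative only, not the pyspark implementation):

```python
def min_max_scale(col, new_min=0.0, new_max=1.0):
    # Rescaled(e) = (e - E_min) / (E_max - E_min) * (max - min) + min,
    # with a constant column mapped to 0.5 * (max + min).
    lo, hi = min(col), max(col)
    if lo == hi:
        return [0.5 * (new_max + new_min)] * len(col)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in col]
```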
@@ -5514,7 +5514,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.NGram">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">NGram</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#NGram"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.NGram" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">NGram</code><span class="sig-paren">(</span><em>n=2</em>, 
<em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#NGram"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.NGram" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A feature transformer that converts the input array of strings into an 
array of n-grams. Null
 values in the input array are ignored.
 It returns an array of n-grams where each n-gram is represented by a 
space-separated string of
@@ -5525,15 +5525,15 @@ returned.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([</span><span 
class="n">Row</span><span class="p">(</span><span 
class="n">inputTokens</span><span class="o">=</span><span 
class="p">[</span><span class="s2">&quot;a&quot;</span><span class="p">,</span> 
<span class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;c&quot;</span><span class="p">,</span> <span 
class="s2">&quot;d&quot;</span><span class="p">,</span> <span 
class="s2">&quot;e&quot;</span><span class="p">])])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span> <span 
class="o">=</span> <span class="n">NGram</span><span class="p">(</span><span 
class="n">n</span><span class="o">=</span><span class="mi">2</span><span 
class="p">,</span> <span class="n">inputCol</span><span class="o">=</span><span 
class="s2">&quot;inputTokens&quot;</span><span class="p">,</span> <span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;nGrams&quot;</span><span class="p">)</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(inputTokens=[u&#39;a&#39;, u&#39;b&#39;, u&#39;c&#39;, 
u&#39;d&#39;, u&#39;e&#39;], nGrams=[u&#39;a b&#39;, u&#39;b c&#39;, u&#39;c 
d&#39;, u&#39;d e&#39;])</span>
+<span class="go">Row(inputTokens=[&#39;a&#39;, &#39;b&#39;, &#39;c&#39;, 
&#39;d&#39;, &#39;e&#39;], nGrams=[&#39;a b&#39;, &#39;b c&#39;, &#39;c d&#39;, 
&#39;d e&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Change n-gram 
length</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="n">n</span><span class="o">=</span><span class="mi">4</span><span 
class="p">)</span><span class="o">.</span><span class="n">transform</span><span 
class="p">(</span><span class="n">df</span><span class="p">)</span><span 
class="o">.</span><span class="n">head</span><span class="p">()</span>
-<span class="go">Row(inputTokens=[u&#39;a&#39;, u&#39;b&#39;, u&#39;c&#39;, 
u&#39;d&#39;, u&#39;e&#39;], nGrams=[u&#39;a b c d&#39;, u&#39;b c d 
e&#39;])</span>
+<span class="go">Row(inputTokens=[&#39;a&#39;, &#39;b&#39;, &#39;c&#39;, 
&#39;d&#39;, &#39;e&#39;], nGrams=[&#39;a b c d&#39;, &#39;b c d e&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Temporarily modify 
output column.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">,</span> <span class="p">{</span><span 
class="n">ngram</span><span class="o">.</span><span 
class="n">outputCol</span><span class="p">:</span> <span 
class="s2">&quot;output&quot;</span><span class="p">})</span><span 
class="o">.</span><span class="n">head</span><span class="p">()</span>
-<span class="go">Row(inputTokens=[u&#39;a&#39;, u&#39;b&#39;, u&#39;c&#39;, 
u&#39;d&#39;, u&#39;e&#39;], output=[u&#39;a b c d&#39;, u&#39;b c d 
e&#39;])</span>
+<span class="go">Row(inputTokens=[&#39;a&#39;, &#39;b&#39;, &#39;c&#39;, 
&#39;d&#39;, &#39;e&#39;], output=[&#39;a b c d&#39;, &#39;b c d e&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(inputTokens=[u&#39;a&#39;, u&#39;b&#39;, u&#39;c&#39;, 
u&#39;d&#39;, u&#39;e&#39;], nGrams=[u&#39;a b c d&#39;, u&#39;b c d 
e&#39;])</span>
+<span class="go">Row(inputTokens=[&#39;a&#39;, &#39;b&#39;, &#39;c&#39;, 
&#39;d&#39;, &#39;e&#39;], nGrams=[&#39;a b c d&#39;, &#39;b c d e&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Must use keyword 
arguments to specify params.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">ngram</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="s2">&quot;text&quot;</span><span class="p">)</span>
 <span class="gt">Traceback (most recent call last):</span>
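The new NGram signature shows `n=2` as the default. The transformation in the doctests above reduces to this plain-Python sketch (illustrative only, not the pyspark implementation):

```python
def ngrams(tokens, n=2):
    # Consecutive n-grams joined by spaces; fewer than n tokens yields
    # an empty list rather than an error.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```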
@@ -5798,7 +5798,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.Normalizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Normalizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Normalizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Normalizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Normalizer</code><span 
class="sig-paren">(</span><em>p=2.0</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#Normalizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Normalizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><blockquote>
 <div>Normalize a vector to have unit norm using the given 
p-norm.</div></blockquote>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span 
class="nn">pyspark.ml.linalg</span> <span class="k">import</span> <span 
class="n">Vectors</span>
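The new Normalizer signature shows `p=2.0` as the default norm. A plain-Python sketch of the unit p-norm scaling (illustrative only, not the pyspark implementation):

```python
def normalize(vec, p=2.0):
    # Divide every component by the vector's p-norm; a zero vector is
    # returned unchanged to avoid division by zero.
    norm = sum(abs(v) ** p for v in vec) ** (1.0 / p)
    return vec if norm == 0.0 else [v / norm for v in vec]
```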
@@ -6071,7 +6071,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.OneHotEncoder">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">OneHotEncoder</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#OneHotEncoder"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.OneHotEncoder" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">OneHotEncoder</code><span 
class="sig-paren">(</span><em>dropLast=True</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#OneHotEncoder"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.OneHotEncoder" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A one-hot encoder that maps a column of category indices to a
 column of binary vectors, with at most a single one-value per row
 that indicates the input category index.
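The new OneHotEncoder signature shows `dropLast=True`. For a single category index, the documented encoding reduces to this plain-Python sketch (illustrative only, not the pyspark implementation):

```python
def one_hot(index, size, drop_last=True):
    # A binary vector with a single one at the category index; with
    # dropLast=True the final category becomes the all-zeros vector,
    # so the output has size - 1 slots.
    width = size - 1 if drop_last else size
    vec = [0.0] * width
    if index < width:
        vec[int(index)] = 1.0
    return vec
```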
@@ -6361,7 +6361,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.PCA">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">PCA</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#PCA"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.PCA" title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">PCA</code><span class="sig-paren">(</span><em>k=None</em>, 
<em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#PCA"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.PCA" title="Permalink to this definition">¶</a></dt>
 <dd><p>PCA trains a model to project vectors to a lower dimensional space of 
the
 top <a class="reference internal" href="#pyspark.ml.feature.PCA.k" 
title="pyspark.ml.feature.PCA.k"><code class="xref py py-attr docutils 
literal"><span class="pre">k</span></code></a> principal components.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span 
class="nn">pyspark.ml.linalg</span> <span class="k">import</span> <span 
class="n">Vectors</span>
@@ -6851,7 +6851,7 @@ Each column is one principal component.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.feature.PolynomialExpansion">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">PolynomialExpansion</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#PolynomialExpansion"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.PolynomialExpansion" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">PolynomialExpansion</code><span 
class="sig-paren">(</span><em>degree=2</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#PolynomialExpansion"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.PolynomialExpansion" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Perform feature expansion in a polynomial space. As the <a 
class="reference external" 
href="http://en.wikipedia.org/wiki/Polynomial_expansion">Wikipedia article on 
polynomial expansion</a> puts it, &#8220;In mathematics, an
 expansion of a product of sums expresses it as a sum of products by using the 
fact that
 multiplication distributes over addition&#8221;. Take a 2-variable feature 
vector as an example:
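The expansion just described can be sketched in plain Python. This is an illustrative sketch only: the helper name `poly_expand` and the ordering of the output terms are assumptions here, not Spark's documented ordering.

```python
from itertools import combinations_with_replacement

def poly_expand(features, degree=2):
    """Expand a feature vector with every monomial of the input features
    up to `degree`, excluding the constant term. The term ordering is
    illustrative and not guaranteed to match Spark's internal ordering."""
    expanded = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(features, d):
            term = 1.0
            for v in combo:
                term *= v
            expanded.append(term)
    return expanded

# 2-variable vector (x, y) = (2.0, 1.0) at degree 2:
# degree-1 terms x, y, then degree-2 terms x*x, x*y, y*y
print(poly_expand([2.0, 1.0]))  # [2.0, 1.0, 4.0, 2.0, 1.0]
```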
@@ -7122,7 +7122,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.QuantileDiscretizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">QuantileDiscretizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#QuantileDiscretizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.QuantileDiscretizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">QuantileDiscretizer</code><span 
class="sig-paren">(</span><em>numBuckets=2</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em>, <em>relativeError=0.001</em>, 
<em>handleInvalid='error'</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#QuantileDiscretizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.QuantileDiscretizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><div class="admonition note">
 <p class="first admonition-title">Note</p>
 <p class="last">Experimental</p>
@@ -7460,7 +7460,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.RegexTokenizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">RegexTokenizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#RegexTokenizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.RegexTokenizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">RegexTokenizer</code><span 
class="sig-paren">(</span><em>minTokenLength=1</em>, <em>gaps=True</em>, 
<em>pattern='\s+'</em>, <em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>toLowercase=True</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#RegexTokenizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.RegexTokenizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A regex-based tokenizer that extracts tokens either by using the
 provided regex pattern (in Java dialect) to split the text
 (the default) or by repeatedly matching the regex (if gaps is false).
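The split-versus-match behavior can be mimicked in plain Python with the standard `re` module. A minimal sketch, assuming Python regex semantics (Spark uses the Java dialect, which is close but not identical); the function name and parameter names mirror the params above but are this sketch's own:

```python
import re

def regex_tokenize(text, pattern=r"\s+", gaps=True,
                   min_token_length=1, to_lowercase=True):
    # gaps=True: the pattern describes the gaps, so split on it;
    # gaps=False: the pattern describes the tokens, so match it repeatedly.
    if to_lowercase:
        text = text.lower()
    tokens = re.split(pattern, text) if gaps else re.findall(pattern, text)
    # Drop tokens shorter than the minimum length (empties included).
    return [t for t in tokens if len(t) >= min_token_length]

print(regex_tokenize("A B  c"))                      # ['a', 'b', 'c']
print(regex_tokenize("A B  c", r"\w+", gaps=False))  # ['a', 'b', 'c']
```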
@@ -7470,15 +7470,15 @@ It returns an array of strings that can be empty.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([(</span><span 
class="s2">&quot;A B  c&quot;</span><span class="p">,)],</span> <span 
class="p">[</span><span class="s2">&quot;text&quot;</span><span 
class="p">])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span> <span 
class="o">=</span> <span class="n">RegexTokenizer</span><span 
class="p">(</span><span class="n">inputCol</span><span class="o">=</span><span 
class="s2">&quot;text&quot;</span><span class="p">,</span> <span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;words&quot;</span><span class="p">)</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;A B  c&#39;, words=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;A B  c&#39;, words=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Change a 
parameter.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;tokens&quot;</span><span class="p">)</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;A B  c&#39;, tokens=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;A B  c&#39;, tokens=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Temporarily modify a 
parameter.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">,</span> <span class="p">{</span><span 
class="n">reTokenizer</span><span class="o">.</span><span 
class="n">outputCol</span><span class="p">:</span> <span 
class="s2">&quot;words&quot;</span><span class="p">})</span><span 
class="o">.</span><span class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;A B  c&#39;, words=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;A B  c&#39;, words=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;A B  c&#39;, tokens=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;A B  c&#39;, tokens=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Must use keyword 
arguments to specify params.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">reTokenizer</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="s2">&quot;text&quot;</span><span class="p">)</span>
 <span class="gt">Traceback (most recent call last):</span>
@@ -7814,7 +7814,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.RFormula">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">RFormula</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#RFormula"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.RFormula" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">RFormula</code><span 
class="sig-paren">(</span><em>formula=None</em>, 
<em>featuresCol='features'</em>, <em>labelCol='label'</em>, 
<em>forceIndexLabel=False</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#RFormula"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.RFormula" title="Permalink to this 
definition">¶</a></dt>
 <dd><div class="admonition note">
 <p class="first admonition-title">Note</p>
 <p class="last">Experimental</p>
@@ -8346,7 +8346,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.SQLTransformer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">SQLTransformer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#SQLTransformer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.SQLTransformer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">SQLTransformer</code><span 
class="sig-paren">(</span><em>statement=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#SQLTransformer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.SQLTransformer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Implements the transformations defined by a SQL statement.
 Currently only SQL syntax like &#8216;SELECT ... FROM 
__THIS__&#8217; is supported,
 where &#8216;__THIS__&#8217; represents the underlying table of the input 
dataset.</p>
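The <code>__THIS__</code> substitution can be illustrated with an in-memory SQLite table standing in for the input dataset. A sketch under that assumption (the helper and the internal table name <code>__input__</code> are invented for the example, and SQLite's SQL dialect differs from Spark SQL):

```python
import sqlite3

def sql_transform(statement, rows, columns):
    # Register the input rows as a table, then run the statement with
    # __THIS__ replaced by that table's name.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE __input__ (%s)" % ", ".join(columns))
    conn.executemany(
        "INSERT INTO __input__ VALUES (%s)" % ", ".join("?" * len(columns)),
        rows)
    return conn.execute(statement.replace("__THIS__", "__input__")).fetchall()

rows = [(0, 1.0, 3.0), (2, 2.0, 5.0)]
result = sql_transform("SELECT *, (v1 + v2) AS v3 FROM __THIS__",
                       rows, ["id", "v1", "v2"])
print(result)  # [(0, 1.0, 3.0, 4.0), (2, 2.0, 5.0, 7.0)]
```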
@@ -8580,7 +8580,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.StandardScaler">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StandardScaler</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#StandardScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StandardScaler" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StandardScaler</code><span 
class="sig-paren">(</span><em>withMean=False</em>, <em>withStd=True</em>, 
<em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#StandardScaler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StandardScaler" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Standardizes features by removing the mean and scaling to unit variance 
using column summary
 statistics on the samples in the training set.</p>
 <p>The &#8220;unit std&#8221; is computed using the <a class="reference 
external" 
href="https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation">corrected
 sample standard deviation</a>,
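The corrected sample standard deviation (denominator n&#160;&#8722;&#160;1) is exactly what Python's <code>statistics.stdev</code> computes, so the per-column scaling can be sketched without Spark. The helper below is illustrative only, not the StandardScaler implementation:

```python
import statistics

def standard_scale(values, with_mean=False, with_std=True):
    # "unit std" uses the corrected sample standard deviation, i.e. the
    # sum of squared deviations is divided by n - 1, not n.
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # corrected: divides by n - 1
    out = []
    for v in values:
        if with_mean:
            v -= mean
        if with_std:
            v /= std
        out.append(v)
    return out

# For [0, 2]: mean 1, corrected std = sqrt(2)
print(standard_scale([0.0, 2.0]))  # [0.0, 1.4142135623730951]
```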
@@ -9094,7 +9094,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.StopWordsRemover">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StopWordsRemover</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#StopWordsRemover"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StopWordsRemover" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StopWordsRemover</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>stopWords=None</em>, <em>caseSensitive=False</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#StopWordsRemover"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StopWordsRemover" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A feature transformer that filters out stop words from input.</p>
 <div class="admonition note">
 <p class="first admonition-title">Note</p>
@@ -9399,7 +9399,7 @@ uses <code class="xref py py-func docutils literal"><span 
class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.StringIndexer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StringIndexer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#StringIndexer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StringIndexer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">StringIndexer</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>handleInvalid='error'</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#StringIndexer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.StringIndexer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A label indexer that maps a string column of labels to an ML column of 
label indices.
 If the input column is numeric, we cast it to string and index the string 
values.
 The indices are in [0, numLabels), ordered by label frequencies.
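The frequency-ordered index assignment can be sketched in plain Python. Assumptions in this sketch: the function name is invented, and ties are broken alphabetically here for determinism (the docs above specify only the frequency ordering):

```python
from collections import Counter

def string_indexer(labels):
    # Most frequent label gets index 0, next most frequent index 1, ...
    freq = Counter(labels)
    ordered = sorted(freq, key=lambda l: (-freq[l], l))
    mapping = {label: i for i, label in enumerate(ordered)}
    return [mapping[l] for l in labels], mapping

# 'a' appears 3x (index 0), 'c' 2x (index 1), 'b' once (index 2)
indices, mapping = string_indexer(["a", "b", "c", "a", "a", "c"])
print(indices)  # [0, 2, 1, 0, 0, 1]
```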
@@ -9877,21 +9877,21 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.Tokenizer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Tokenizer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Tokenizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Tokenizer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Tokenizer</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Tokenizer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Tokenizer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A tokenizer that converts the input string to lowercase and then
 splits it by white spaces.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([(</span><span 
class="s2">&quot;a b c&quot;</span><span class="p">,)],</span> <span 
class="p">[</span><span class="s2">&quot;text&quot;</span><span 
class="p">])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span> <span 
class="o">=</span> <span class="n">Tokenizer</span><span 
class="p">(</span><span class="n">inputCol</span><span class="o">=</span><span 
class="s2">&quot;text&quot;</span><span class="p">,</span> <span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;words&quot;</span><span class="p">)</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;a b c&#39;, words=[u&#39;a&#39;, u&#39;b&#39;, 
u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;a b c&#39;, words=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Change a 
parameter.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;tokens&quot;</span><span class="p">)</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;a b c&#39;, tokens=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;a b c&#39;, tokens=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Temporarily modify a 
parameter.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">,</span> <span class="p">{</span><span 
class="n">tokenizer</span><span class="o">.</span><span 
class="n">outputCol</span><span class="p">:</span> <span 
class="s2">&quot;words&quot;</span><span class="p">})</span><span 
class="o">.</span><span class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;a b c&#39;, words=[u&#39;a&#39;, u&#39;b&#39;, 
u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;a b c&#39;, words=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span><span 
class="o">.</span><span class="n">transform</span><span class="p">(</span><span 
class="n">df</span><span class="p">)</span><span class="o">.</span><span 
class="n">head</span><span class="p">()</span>
-<span class="go">Row(text=u&#39;a b c&#39;, tokens=[u&#39;a&#39;, 
u&#39;b&#39;, u&#39;c&#39;])</span>
+<span class="go">Row(text=&#39;a b c&#39;, tokens=[&#39;a&#39;, &#39;b&#39;, 
&#39;c&#39;])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="c1"># Must use keyword 
arguments to specify params.</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">tokenizer</span><span 
class="o">.</span><span class="n">setParams</span><span class="p">(</span><span 
class="s2">&quot;text&quot;</span><span class="p">)</span>
 <span class="gt">Traceback (most recent call last):</span>
@@ -10133,7 +10133,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.VectorAssembler">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorAssembler</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorAssembler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorAssembler" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorAssembler</code><span 
class="sig-paren">(</span><em>inputCols=None</em>, <em>outputCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorAssembler"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorAssembler" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A feature transformer that merges multiple columns into a vector 
column.</p>
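The merge can be sketched per row in plain Python: named columns are concatenated, in order, into one flat feature list (Spark produces a <code>Vector</code>, and vector-valued columns are flattened in place). The helper name is this sketch's own:

```python
def assemble(row, input_cols):
    # Concatenate the selected columns of one row into a flat list;
    # list/tuple values (vector columns) are spliced in, scalars appended.
    features = []
    for col in input_cols:
        v = row[col]
        features.extend(v if isinstance(v, (list, tuple)) else [v])
    return features

print(assemble({"a": 1, "b": 0, "c": 3}, ["a", "b", "c"]))  # [1, 0, 3]
```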
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span 
class="o">=</span> <span class="n">spark</span><span class="o">.</span><span 
class="n">createDataFrame</span><span class="p">([(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span 
class="p">,</span> <span class="mi">3</span><span class="p">)],</span> <span 
class="p">[</span><span class="s2">&quot;a&quot;</span><span class="p">,</span> 
<span class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;c&quot;</span><span class="p">])</span>
 <span class="gp">&gt;&gt;&gt; </span><span class="n">vecAssembler</span> <span 
class="o">=</span> <span class="n">VectorAssembler</span><span 
class="p">(</span><span class="n">inputCols</span><span class="o">=</span><span 
class="p">[</span><span class="s2">&quot;a&quot;</span><span class="p">,</span> 
<span class="s2">&quot;b&quot;</span><span class="p">,</span> <span 
class="s2">&quot;c&quot;</span><span class="p">],</span> <span 
class="n">outputCol</span><span class="o">=</span><span 
class="s2">&quot;features&quot;</span><span class="p">)</span>
@@ -10380,15 +10380,15 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.VectorIndexer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorIndexer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorIndexer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorIndexer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorIndexer</code><span 
class="sig-paren">(</span><em>maxCategories=20</em>, <em>inputCol=None</em>, 
<em>outputCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/feature.html#VectorIndexer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorIndexer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Class for indexing categorical feature columns in a dataset of 
<cite>Vector</cite>.</p>
 <dl class="docutils">
 <dt>This has 2 usage modes:</dt>
 <dd><blockquote class="first">
-<div><ul>
+<div><ul class="simple">
 <li><dl class="first docutils">
 <dt>Automatically identify categorical features (default behavior)</dt>
-<dd><ul class="first last simple">
+<dd><ul class="first last">
 <li>This helps process a dataset of unknown vectors into a dataset with some 
continuous
 features and some categorical features. The choice between continuous and 
categorical
 is based upon a maxCategories parameter.</li>
@@ -10403,7 +10403,7 @@ and feature 1 will be declared continuous.</li>
 </li>
 <li><dl class="first docutils">
 <dt>Index all features, if all features are categorical</dt>
-<dd><ul class="first last simple">
+<dd><ul class="first last">
 <li>If maxCategories is set to be very large, then this will build an index of 
unique
 values for all features.</li>
 <li>Warning: This can cause problems if features are continuous since this 
will collect ALL
@@ -10945,7 +10945,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.feature.VectorSlicer">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorSlicer</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorSlicer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorSlicer" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">VectorSlicer</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>indices=None</em>, <em>names=None</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorSlicer"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorSlicer" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>This class takes a feature vector and outputs a new feature vector with 
a subarray
 of the original features.</p>
 <p>The subset of features can be specified with either indices 
(<cite>setIndices()</cite>)
@@ -11204,7 +11204,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="method">
 <dt id="pyspark.ml.feature.VectorSlicer.setParams">
-<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorSlicer.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorSlicer.setParams" title="Permalink to this 
definition">¶</a></dt>
+<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>indices=None</em>, <em>names=None</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/feature.html#VectorSlicer.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.VectorSlicer.setParams" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>setParams(self, inputCol=None, outputCol=None, indices=None, 
names=None):
 Sets params for this VectorSlicer.</p>
 <div class="versionadded">
@@ -11246,7 +11246,7 @@ Sets params for this VectorSlicer.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.feature.Word2Vec">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Word2Vec</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Word2Vec"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Word2Vec" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.feature.</code><code 
class="descname">Word2Vec</code><span 
class="sig-paren">(</span><em>vectorSize=100</em>, <em>minCount=5</em>, 
<em>numPartitions=1</em>, <em>stepSize=0.025</em>, <em>maxIter=1</em>, 
<em>seed=None</em>, <em>inputCol=None</em>, <em>outputCol=None</em>, 
<em>windowSize=5</em>, <em>maxSentenceLength=1000</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/feature.html#Word2Vec"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.feature.Word2Vec" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Word2Vec trains a model of <cite>Map(String, Vector)</cite>, i.e. it 
transforms each word into a code for use in further
 natural language processing or machine learning tasks.</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="n">sent</span> <span 
class="o">=</span> <span class="p">(</span><span class="s2">&quot;a b 
&quot;</span> <span class="o">*</span> <span class="mi">100</span> <span 
class="o">+</span> <span class="s2">&quot;a c &quot;</span> <span 
class="o">*</span> <span class="mi">10</span><span class="p">)</span><span 
class="o">.</span><span class="n">split</span><span class="p">(</span><span 
class="s2">&quot; &quot;</span><span class="p">)</span>
@@ -11901,7 +11901,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 <span 
id="pyspark-ml-classification-module"></span><h2>pyspark.ml.classification 
module<a class="headerlink" href="#module-pyspark.ml.classification" 
title="Permalink to this headline">¶</a></h2>
 <dl class="class">
 <dt id="pyspark.ml.classification.LogisticRegression">
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">LogisticRegression</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#LogisticRegression"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.LogisticRegression" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">LogisticRegression</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>maxIter=100</em>, <em>regParam=0.0</em>, <em>elasticNetParam=0.0</em>, 
<em>tol=1e-06</em>, <em>fitIntercept=True</em>, <em>threshold=0.5</em>, 
<em>thresholds=None</em>, <em>probabilityCol='probability'</em>, 
<em>rawPredictionCol='rawPrediction'</em>, <em>standardization=True</em>, 
<em>weightCol=None</em>, <em>aggregationDepth=2</em>, 
<em>family='auto'</em><span class="sig-paren">)</span><a class="reference 
internal" 
href="_modules/pyspark/ml/classification.html#LogisticRegression"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.LogisticRegression" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Logistic regression.
 This class supports multinomial logistic (softmax) and binomial logistic 
regression.</p>
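The two link functions named above can be sketched directly: the sigmoid for the binomial case and the softmax for the multinomial case. This shows only the probability mapping, not Spark's fitting procedure:

```python
import math

def sigmoid(z):
    # Binomial case: maps a raw score to P(y = 1 | x).
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # Multinomial case: maps raw per-class scores to class probabilities.
    m = max(zs)                       # subtract max to stabilize exp()
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

print(sigmoid(0.0))         # 0.5
print(softmax([1.0, 1.0]))  # [0.5, 0.5]
```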
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span 
class="nn">pyspark.sql</span> <span class="k">import</span> <span 
class="n">Row</span>
@@ -13192,7 +13192,7 @@ versions.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.classification.DecisionTreeClassifier">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">DecisionTreeClassifier</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#DecisionTreeClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.DecisionTreeClassifier" title="Permalink to 
this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">DecisionTreeClassifier</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>probabilityCol='probability'</em>, 
<em>rawPredictionCol='rawPrediction'</em>, <em>maxDepth=5</em>, 
<em>maxBins=32</em>, <em>minInstancesPerNode=1</em>, <em>minInfoGain=0.0</em>, 
<em>maxMemoryInMB=256</em>, <em>cacheNodeIds=False</em>, 
<em>checkpointInterval=10</em>, <em>impurity='gini'</em>, 
<em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#DecisionTreeClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.DecisionTreeClassifier" title="Permalink to 
this definition">¶</a></dt>
 <dd><p><a class="reference external" 
href="http://en.wikipedia.org/wiki/Decision_tree_learning";>Decision tree</a>
 learning algorithm for classification.
 It supports both binary and multiclass labels, as well as both continuous and 
categorical
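The expanded signature shows `impurity='gini'` as the default split criterion. A minimal pure-Python sketch of what that criterion computes (illustrative only, not Spark's implementation); the `minInfoGain` parameter is compared against this kind of impurity reduction:

```python
def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a node's label list."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def info_gain(parent, left, right):
    """Weighted impurity reduction of a candidate binary split."""
    n = len(parent)
    return (gini(parent)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))
```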
@@ -13950,7 +13950,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.classification.GBTClassifier">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">GBTClassifier</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#GBTClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.GBTClassifier" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">GBTClassifier</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>maxDepth=5</em>, <em>maxBins=32</em>, <em>minInstancesPerNode=1</em>, 
<em>minInfoGain=0.0</em>, <em>maxMemoryInMB=256</em>, 
<em>cacheNodeIds=False</em>, <em>checkpointInterval=10</em>, 
<em>lossType='logistic'</em>, <em>maxIter=20</em>, <em>stepSize=0.1</em>, 
<em>seed=None</em>, <em>subsamplingRate=1.0</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#GBTClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.GBTClassifier" title="Permalink to this 
definition">¶</a></dt>
 <dd><p><a class="reference external" 
href="http://en.wikipedia.org/wiki/Gradient_boosting";>Gradient-Boosted Trees 
(GBTs)</a>
 learning algorithm for classification.
 It supports binary labels, as well as both continuous and categorical 
features.</p>
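The explicit defaults above include `lossType='logistic'`, `maxIter=20`, and `stepSize=0.1`. Conceptually, a fitted GBT model scores by summing stepSize-scaled tree outputs and, for the logistic loss, squashing that margin into a probability. A hedged sketch with stub trees (plain callables, not Spark trees; the exact margin-to-probability convention may differ from Spark's):

```python
import math

def gbt_margin(x, trees, step_size=0.1):
    """Additive ensemble: each boosting stage contributes a scaled tree output."""
    return sum(step_size * tree(x) for tree in trees)

def gbt_probability(x, trees, step_size=0.1):
    """Map the ensemble margin to P(y = 1) via a sigmoid (logistic loss)."""
    return 1.0 / (1.0 + math.exp(-gbt_margin(x, trees, step_size)))
```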
@@ -14735,7 +14735,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.classification.RandomForestClassifier">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">RandomForestClassifier</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#RandomForestClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.RandomForestClassifier" title="Permalink to 
this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">RandomForestClassifier</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>probabilityCol='probability'</em>, 
<em>rawPredictionCol='rawPrediction'</em>, <em>maxDepth=5</em>, 
<em>maxBins=32</em>, <em>minInstancesPerNode=1</em>, <em>minInfoGain=0.0</em>, 
<em>maxMemoryInMB=256</em>, <em>cacheNodeIds=False</em>, 
<em>checkpointInterval=10</em>, <em>impurity='gini'</em>, <em>numTrees=20</em>, 
<em>featureSubsetStrategy='auto'</em>, <em>seed=None</em>, 
<em>subsamplingRate=1.0</em><span class="sig-paren">)</span><a class="reference 
internal" 
href="_modules/pyspark/ml/classification.html#RandomForestClassifier"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.RandomForestClassifier" title="Permalink to 
this definition">¶</a></dt>
 <dd><p><a class="reference external" 
href="http://en.wikipedia.org/wiki/Random_forest";>Random Forest</a>
 learning algorithm for classification.
 It supports both binary and multiclass labels, as well as both continuous and 
categorical
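With the defaults now spelled out (`numTrees=20`, `featureSubsetStrategy='auto'`, `subsamplingRate=1.0`), the aggregation idea can be sketched in plain Python: each tree votes and the majority wins, and each split considers only a random subset of features. Treating `'auto'` as the sqrt rule for classification is an assumption here:

```python
import math
import random
from collections import Counter

def forest_predict(x, trees):
    """Majority vote over independently trained trees (stubbed as callables)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

def feature_subset(n_features, strategy="auto", rng=random):
    """featureSubsetStrategy sketch: 'auto' ~ sqrt(n) features per split."""
    k = max(1, int(math.sqrt(n_features))) if strategy == "auto" else n_features
    return rng.sample(range(n_features), k)
```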
@@ -15569,7 +15569,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.classification.NaiveBayes">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">NaiveBayes</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#NaiveBayes"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.NaiveBayes" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">NaiveBayes</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>probabilityCol='probability'</em>, 
<em>rawPredictionCol='rawPrediction'</em>, <em>smoothing=1.0</em>, 
<em>modelType='multinomial'</em>, <em>thresholds=None</em>, 
<em>weightCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" href="_modules/pyspark/ml/classification.html#NaiveBayes"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.NaiveBayes" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Naive Bayes Classifiers.
 It supports both Multinomial and Bernoulli NB. <a class="reference external" 
href="http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html";>Multinomial
 NB</a>
 can handle finitely supported discrete data. For example, by converting 
documents into
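Given the now-visible defaults `smoothing=1.0` and `modelType='multinomial'`, the scoring rule is easy to illustrate: each class gets its log prior plus count-weighted, smoothed log conditionals. A pure-Python sketch, not pyspark's code:

```python
import math

def smoothed_log_probs(feature_counts, smoothing=1.0):
    """Laplace-smoothed per-feature log probabilities for one class."""
    total = sum(feature_counts) + smoothing * len(feature_counts)
    return [math.log((c + smoothing) / total) for c in feature_counts]

def multinomial_score(counts, class_log_prior, feature_log_probs):
    """Log-score of one class for a term-count vector; argmax over classes predicts."""
    return class_log_prior + sum(c * lp for c, lp in zip(counts, feature_log_probs))
```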
@@ -16204,7 +16204,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.classification.MultilayerPerceptronClassifier">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">MultilayerPerceptronClassifier</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#MultilayerPerceptronClassifier"><span
 class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.MultilayerPerceptronClassifier" 
title="Permalink to this definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">MultilayerPerceptronClassifier</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>maxIter=100</em>, <em>tol=1e-06</em>, <em>seed=None</em>, 
<em>layers=None</em>, <em>blockSize=128</em>, <em>stepSize=0.03</em>, 
<em>solver='l-bfgs'</em>, <em>initialWeights=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#MultilayerPerceptronClassifier"><span
 class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.MultilayerPerceptronClassifier" 
title="Permalink to this definition">¶</a></dt>
 <dd><p>Classifier trainer based on the Multilayer Perceptron.
 Each layer has sigmoid activation function, output layer has softmax.
 Number of inputs has to be equal to the size of feature vectors.
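The description above (sigmoid hidden layers, softmax output) amounts to a forward pass whose layer sizes correspond to the `layers` parameter. A toy sketch with made-up weights, purely for illustration:

```python
import math

def dense(x, weights, biases):
    """One fully connected layer; weights is a list of per-output rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(x, layers):
    """Sigmoid on every hidden layer, softmax on the final (output) layer."""
    for i, (w, b) in enumerate(layers):
        z = dense(x, w, b)
        if i == len(layers) - 1:
            m = max(z)                     # stable softmax
            e = [math.exp(v - m) for v in z]
            x = [v / sum(e) for v in e]
        else:
            x = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return x
```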
@@ -16881,7 +16881,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.classification.OneVsRest">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">OneVsRest</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#OneVsRest"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.OneVsRest" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.classification.</code><code 
class="descname">OneVsRest</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>classifier=None</em>, <em>weightCol=None</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#OneVsRest"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.OneVsRest" title="Permalink to this 
definition">¶</a></dt>
 <dd><div class="admonition note">
 <p class="first admonition-title">Note</p>
 <p class="last">Experimental</p>
@@ -17180,7 +17180,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="method">
 <dt id="pyspark.ml.classification.OneVsRest.setParams">
-<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/classification.html#OneVsRest.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.OneVsRest.setParams" title="Permalink to this 
definition">¶</a></dt>
+<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>featuresCol=None</em>, <em>labelCol=None</em>, 
<em>predictionCol=None</em>, <em>classifier=None</em>, 
<em>weightCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" 
href="_modules/pyspark/ml/classification.html#OneVsRest.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.classification.OneVsRest.setParams" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>setParams(self, featuresCol=None, labelCol=None, predictionCol=None,    
               classifier=None, weightCol=None):
 Sets params for OneVsRest.</p>
 <div class="versionadded">
@@ -17521,7 +17521,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 <span id="pyspark-ml-clustering-module"></span><h2>pyspark.ml.clustering 
module<a class="headerlink" href="#module-pyspark.ml.clustering" 
title="Permalink to this headline">¶</a></h2>
 <dl class="class">
 <dt id="pyspark.ml.clustering.BisectingKMeans">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">BisectingKMeans</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#BisectingKMeans"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.BisectingKMeans" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">BisectingKMeans</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>predictionCol='prediction'</em>, <em>maxIter=20</em>, <em>seed=None</em>, 
<em>k=4</em>, <em>minDivisibleClusterSize=1.0</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#BisectingKMeans"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.BisectingKMeans" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>A bisecting k-means algorithm based on the paper &#8220;A comparison of 
document clustering
 techniques&#8221; by Steinbach, Karypis, and Kumar, with modification to fit 
Spark.
 The algorithm starts from a single cluster that contains all points.
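The algorithm description (start from one cluster, keep bisecting) can be sketched for 1-D points; `k` and `minDivisibleClusterSize` from the signature govern when splitting stops. This is a toy illustration, not the distributed implementation:

```python
def two_means_1d(points, iters=5):
    """Crude 1-D 2-means used as the bisection step (illustration only)."""
    c1, c2 = min(points), max(points)
    a, b = points, []
    for _ in range(iters):
        a = [p for p in points if abs(p - c1) <= abs(p - c2)]
        b = [p for p in points if abs(p - c1) > abs(p - c2)]
        if a: c1 = sum(a) / len(a)
        if b: c2 = sum(b) / len(b)
    return a, b

def bisecting_kmeans_1d(points, k=4):
    """Start from one cluster holding all points; repeatedly bisect the largest."""
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)
        target = clusters.pop(0)
        a, b = two_means_1d(target)
        if not a or not b:            # cluster is not divisible; stop splitting
            clusters.append(target)
            break
        clusters.extend([a, b])
    return clusters
```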
@@ -18174,7 +18174,7 @@ training set. An exception is thrown if no summary 
exists.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.clustering.KMeans">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">KMeans</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#KMeans"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.KMeans" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">KMeans</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>predictionCol='prediction'</em>, <em>k=2</em>, 
<em>initMode='k-means||'</em>, <em>initSteps=2</em>, <em>tol=0.0001</em>, 
<em>maxIter=20</em>, <em>seed=None</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/clustering.html#KMeans"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.KMeans" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>K-means clustering with a k-means++ like initialization mode
 (the k-means|| algorithm by Bahmani et al).</p>
 <div class="highlight-default"><div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span 
class="nn">pyspark.ml.linalg</span> <span class="k">import</span> <span 
class="n">Vectors</span>
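After k-means|| picks initial centers (the `initMode`/`initSteps` parameters), training is plain Lloyd iteration, repeated up to `maxIter` times or until center movement falls below `tol`. One assignment-and-recenter step for 1-D points, as a sketch:

```python
def lloyd_step(points, centers):
    """Assign each point to its nearest center, then move centers to the mean."""
    buckets = [[] for _ in centers]
    for p in points:
        i = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
        buckets[i].append(p)
    return [sum(b) / len(b) if b else centers[i]   # leave empty clusters in place
            for i, b in enumerate(buckets)]
```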
@@ -18794,7 +18794,7 @@ training set. An exception is thrown if no summary 
exists.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.clustering.GaussianMixture">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">GaussianMixture</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#GaussianMixture"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.GaussianMixture" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">GaussianMixture</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>predictionCol='prediction'</em>, <em>k=2</em>, 
<em>probabilityCol='probability'</em>, <em>tol=0.01</em>, <em>maxIter=100</em>, 
<em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#GaussianMixture"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.GaussianMixture" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>GaussianMixture clustering.
 This class performs expectation maximization for multivariate Gaussian
 Mixture Models (GMMs).  A GMM represents a composite distribution of
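The expectation step of the EM procedure mentioned above computes, for each point, the posterior probability that each weighted Gaussian generated it. A 1-D sketch (the real model is multivariate):

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """E-step: posterior probability that each mixture component generated x."""
    likes = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(likes)
    return [l / total for l in likes]
```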
@@ -19510,7 +19510,7 @@ where weights[i] is the weight for Gaussian i, and 
weights sum to 1.</p>
 
 <dl class="class">
 <dt id="pyspark.ml.clustering.LDA">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">LDA</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#LDA"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.LDA" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.clustering.</code><code 
class="descname">LDA</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, <em>maxIter=20</em>, 
<em>seed=None</em>, <em>checkpointInterval=10</em>, <em>k=10</em>, 
<em>optimizer='online'</em>, <em>learningOffset=1024.0</em>, 
<em>learningDecay=0.51</em>, <em>subsamplingRate=0.05</em>, 
<em>optimizeDocConcentration=True</em>, <em>docConcentration=None</em>, 
<em>topicConcentration=None</em>, 
<em>topicDistributionCol='topicDistribution'</em>, 
<em>keepLastCheckpoint=True</em><span class="sig-paren">)</span><a 
class="reference internal" href="_modules/pyspark/ml/clustering.html#LDA"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.LDA" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Latent Dirichlet Allocation (LDA), a topic model designed for text 
documents.</p>
 <p>Terminology:</p>
 <blockquote>
@@ -20031,7 +20031,7 @@ Currenlty only support &#8216;em&#8217; and 
&#8216;online&#8217;.</p>
 
 <dl class="method">
 <dt id="pyspark.ml.clustering.LDA.setParams">
-<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/clustering.html#LDA.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.LDA.setParams" title="Permalink to this 
definition">¶</a></dt>
+<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, <em>maxIter=20</em>, 
<em>seed=None</em>, <em>checkpointInterval=10</em>, <em>k=10</em>, 
<em>optimizer='online'</em>, <em>learningOffset=1024.0</em>, 
<em>learningDecay=0.51</em>, <em>subsamplingRate=0.05</em>, 
<em>optimizeDocConcentration=True</em>, <em>docConcentration=None</em>, 
<em>topicConcentration=None</em>, 
<em>topicDistributionCol='topicDistribution'</em>, 
<em>keepLastCheckpoint=True</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/clustering.html#LDA.setParams"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.clustering.LDA.setParams" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>setParams(self, featuresCol=&#8221;features&#8221;, maxIter=20, 
seed=None, checkpointInterval=10,                  k=10, 
optimizer=&#8221;online&#8221;, learningOffset=1024.0, learningDecay=0.51,      
            subsamplingRate=0.05, optimizeDocConcentration=True,                
  docConcentration=None, topicConcentration=None,                  
topicDistributionCol=&#8221;topicDistribution&#8221;, 
keepLastCheckpoint=True):</p>
 <p>Sets params for LDA.</p>
 <div class="versionadded">
@@ -21356,7 +21356,7 @@ or array.array.</p>
 <span 
id="pyspark-ml-recommendation-module"></span><h2>pyspark.ml.recommendation 
module<a class="headerlink" href="#module-pyspark.ml.recommendation" 
title="Permalink to this headline">¶</a></h2>
 <dl class="class">
 <dt id="pyspark.ml.recommendation.ALS">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.recommendation.</code><code 
class="descname">ALS</code><span class="sig-paren">(</span><em>*args</em>, 
<em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/recommendation.html#ALS"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.recommendation.ALS" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.recommendation.</code><code 
class="descname">ALS</code><span class="sig-paren">(</span><em>rank=10</em>, 
<em>maxIter=10</em>, <em>regParam=0.1</em>, <em>numUserBlocks=10</em>, 
<em>numItemBlocks=10</em>, <em>implicitPrefs=False</em>, <em>alpha=1.0</em>, 
<em>userCol='user'</em>, <em>itemCol='item'</em>, <em>seed=None</em>, 
<em>ratingCol='rating'</em>, <em>nonnegative=False</em>, 
<em>checkpointInterval=10</em>, 
<em>intermediateStorageLevel='MEMORY_AND_DISK'</em>, 
<em>finalStorageLevel='MEMORY_AND_DISK'</em><span class="sig-paren">)</span><a 
class="reference internal" 
href="_modules/pyspark/ml/recommendation.html#ALS"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.recommendation.ALS" title="Permalink to this 
definition">¶</a></dt>
 <dd><p>Alternating Least Squares (ALS) matrix factorization.</p>
 <p>ALS attempts to estimate the ratings matrix <cite>R</cite> as the product of
 two lower-rank matrices, <cite>X</cite> and <cite>Y</cite>, i.e. <cite>X * Yt 
= R</cite>. Typically
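Since X * Yt approximates R, a single predicted rating is just the dot product of a user's and an item's rank-length factor vectors, and with item factors held fixed each user factor solves a small regularized least-squares problem (shown here in closed form for rank=1). Illustrative only, not the blocked distributed solver:

```python
def predict_rating(user_factors, item_factors):
    """ALS models a rating as the dot product of rank-sized latent vectors."""
    return sum(u * i for u, i in zip(user_factors, item_factors))

def solve_user_rank1(ratings, item_factors, reg_param=0.1):
    """One alternating step at rank=1: closed-form ridge solution for a user factor."""
    num = sum(r * y for r, y in zip(ratings, item_factors))
    den = sum(y * y for y in item_factors) + reg_param
    return num / den
```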
@@ -22184,7 +22184,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 <span id="pyspark-ml-regression-module"></span><h2>pyspark.ml.regression 
module<a class="headerlink" href="#module-pyspark.ml.regression" 
title="Permalink to this headline">¶</a></h2>
 <dl class="class">
 <dt id="pyspark.ml.regression.AFTSurvivalRegression">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.regression.</code><code 
class="descname">AFTSurvivalRegression</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#AFTSurvivalRegression"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.AFTSurvivalRegression" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.regression.</code><code 
class="descname">AFTSurvivalRegression</code><span 
class="sig-paren">(</span><em>featuresCol='features', labelCol='label', 
predictionCol='prediction', fitIntercept=True, maxIter=100, tol=1e-06, 
censorCol='censor', quantileProbabilities=[0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 
0.9, 0.95, 0.99], quantilesCol=None, aggregationDepth=2</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#AFTSurvivalRegression"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.AFTSurvivalRegression" title="Permalink to this 
definition">¶</a></dt>
 <dd><div class="admonition note">
 <p class="first admonition-title">Note</p>
 <p class="last">Experimental</p>
@@ -22563,7 +22563,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="method">
 <dt id="pyspark.ml.regression.AFTSurvivalRegression.setParams">
-<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#AFTSurvivalRegression.setParams"><span
 class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.AFTSurvivalRegression.setParams" title="Permalink 
to this definition">¶</a></dt>
+<code class="descname">setParams</code><span 
class="sig-paren">(</span><em>featuresCol='features', labelCol='label', 
predictionCol='prediction', fitIntercept=True, maxIter=100, tol=1e-06, 
censorCol='censor', quantileProbabilities=[0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 
0.9, 0.95, 0.99], quantilesCol=None, aggregationDepth=2</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#AFTSurvivalRegression.setParams"><span
 class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.AFTSurvivalRegression.setParams" title="Permalink 
to this definition">¶</a></dt>
 <dd><p>setParams(self, featuresCol=&#8221;features&#8221;, 
labelCol=&#8221;label&#8221;, predictionCol=&#8221;prediction&#8221;,           
        fitIntercept=True, maxIter=100, tol=1E-6, 
censorCol=&#8221;censor&#8221;,                   quantileProbabilities=[0.01, 
0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99],                   
quantilesCol=None, aggregationDepth=2):</p>
 <div class="versionadded">
 <p><span class="versionmodified">New in version 1.6.0.</span></p>
@@ -22852,7 +22852,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.regression.DecisionTreeRegressor">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.regression.</code><code 
class="descname">DecisionTreeRegressor</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#DecisionTreeRegressor"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.DecisionTreeRegressor" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code 
class="descclassname">pyspark.ml.regression.</code><code 
class="descname">DecisionTreeRegressor</code><span 
class="sig-paren">(</span><em>featuresCol='features'</em>, 
<em>labelCol='label'</em>, <em>predictionCol='prediction'</em>, 
<em>maxDepth=5</em>, <em>maxBins=32</em>, <em>minInstancesPerNode=1</em>, 
<em>minInfoGain=0.0</em>, <em>maxMemoryInMB=256</em>, 
<em>cacheNodeIds=False</em>, <em>checkpointInterval=10</em>, 
<em>impurity='variance'</em>, <em>seed=None</em>, 
<em>varianceCol=None</em><span class="sig-paren">)</span><a class="reference 
internal" 
href="_modules/pyspark/ml/regression.html#DecisionTreeRegressor"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.DecisionTreeRegressor" title="Permalink to this 
definition">¶</a></dt>
 <dd><p><a class="reference external" 
href="http://en.wikipedia.org/wiki/Decision_tree_learning";>Decision tree</a>
 learning algorithm for regression.
 It supports both continuous and categorical features.</p>
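The regressor's default `impurity='variance'`, visible in the new signature, means splits are chosen to reduce label variance within nodes. A minimal sketch of that criterion:

```python
def variance_impurity(targets):
    """Variance of a node's numeric labels; regression splits minimize this."""
    n = len(targets)
    if n == 0:
        return 0.0
    mean = sum(targets) / n
    return sum((t - mean) ** 2 for t in targets) / n
```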
@@ -23572,7 +23572,7 @@ uses <code class="xref py py-func docutils 
literal"><span class="pre">dir()</spa
 
 <dl class="class">
 <dt id="pyspark.ml.regression.GBTRegressor">
-<em class="property">class </em><code 
class="descclassname">pyspark.ml.regression.</code><code 
class="descname">GBTRegressor</code><span 
class="sig-paren">(</span><em>*args</em>, <em>**kwargs</em><span 
class="sig-paren">)</span><a class="reference internal" 
href="_modules/pyspark/ml/regression.html#GBTRegressor"><span 
class="viewcode-link">[source]</span></a><a class="headerlink" 
href="#pyspark.ml.regression.GBTRegressor" title="Permalink to this 
definition">¶</a></dt>
+<em class="property">class </em><code class="descclassname">pyspark.m

<TRUNCATED>
