http://git-wip-us.apache.org/repos/asf/spark-website/blob/da71a5c1/site/docs/2.1.3/api/python/pyspark.mllib.html ---------------------------------------------------------------------- diff --git a/site/docs/2.1.3/api/python/pyspark.mllib.html b/site/docs/2.1.3/api/python/pyspark.mllib.html index 705c126..fcb1b09 100644 --- a/site/docs/2.1.3/api/python/pyspark.mllib.html +++ b/site/docs/2.1.3/api/python/pyspark.mllib.html @@ -2624,7 +2624,7 @@ Compositionality.</p> <p>Querying for synonyms of a word will not return that word:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'b', u'c']</span> +<span class="go">['b', 'c']</span> </pre></div> </div> <p>But querying for synonyms of a vector may return the word whose @@ -2632,7 +2632,7 @@ representation is that vector:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">vec</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="s2">"a"</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="n">vec</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'a', u'b']</span> +<span class="go">['a', 'b']</span> </pre></div> </div> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">os</span><span class="o">,</span> <span class="nn">tempfile</span> @@ -2643,7 +2643,7 @@ representation is that vector:</p> <span class="go">True</span> <span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">sameModel</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'b', u'c']</span> +<span class="go">['b', 'c']</span> <span class="gp">>>> </span><span class="kn">from</span> <span class="nn">shutil</span> <span class="k">import</span> <span class="n">rmtree</span> <span class="gp">>>> </span><span class="k">try</span><span class="p">:</span> <span class="gp">... </span> <span class="n">rmtree</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> @@ -3034,7 +3034,7 @@ using the Parallel FP-Growth algorithm.</p> <span class="gp">>>> </span><span class="n">rdd</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">model</span> <span class="o">=</span> <span class="n">FPGrowth</span><span class="o">.</span><span class="n">train</span><span class="p">(</span><span class="n">rdd</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">freqItemsets</span><span class="p">()</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), ...</span> +<span class="go">[FreqItemset(items=['a'], freq=4), FreqItemset(items=['c'], freq=3), ...</span> <span class="gp">>>> </span><span class="n">model_path</span> <span class="o">=</span> <span class="n">temp_path</span> <span class="o">+</span> <span class="s2">"/fpm"</span> <span class="gp">>>> </span><span class="n">model</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">sc</span><span class="p">,</span> <span class="n">model_path</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">sameModel</span> <span class="o">=</span> <span class="n">FPGrowthModel</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">sc</span><span class="p">,</span> <span class="n">model_path</span><span class="p">)</span> @@ -3132,7 +3132,7 @@ another iteration of distributed prefix growth is run. <span class="gp">>>> </span><span class="n">rdd</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">model</span> <span class="o">=</span> <span class="n">PrefixSpan</span><span class="o">.</span><span class="n">train</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">freqSequences</span><span class="p">()</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[FreqSequence(sequence=[[u'a']], freq=3), FreqSequence(sequence=[[u'a'], [u'a']], freq=1), ...</span> +<span class="go">[FreqSequence(sequence=[['a']], freq=3), FreqSequence(sequence=[['a'], ['a']], freq=1), ...</span> </pre></div> </div> <div class="versionadded"> @@ -4884,7 +4884,7 @@ distribution with the input mean.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.exponentialVectorRDD"> -<em class="property">static </em><code class="descname">exponentialVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.exponentialVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.exponentialVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">exponentialVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.exponentialVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.exponentialVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the Exponential distribution with the input mean.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -4970,7 +4970,7 @@ distribution with the input shape and scale.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.gammaVectorRDD"> -<em class="property">static </em><code class="descname">gammaVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.gammaVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.gammaVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">gammaVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>shape</em>, <em>scale</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.gammaVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.gammaVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the Gamma distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5060,7 +5060,7 @@ distribution with the input mean and standard distribution.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.logNormalVectorRDD"> -<em class="property">static </em><code class="descname">logNormalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.logNormalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.logNormalVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">logNormalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>std</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.logNormalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.logNormalVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the log normal distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5146,7 +5146,7 @@ to some other normal N(mean, sigma^2), use <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.normalVectorRDD"> -<em class="property">static </em><code class="descname">normalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.normalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.normalVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">normalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.normalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.normalVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the standard normal distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5224,7 +5224,7 @@ distribution with the input mean.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.poissonVectorRDD"> -<em class="property">static </em><code class="descname">poissonVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.poissonVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.poissonVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">poissonVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.poissonVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.poissonVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5308,7 +5308,7 @@ to U(a, b), use <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.uniformVectorRDD"> -<em class="property">static </em><code class="descname">uniformVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.uniformVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.uniformVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">uniformVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.uniformVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.uniformVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).</p> <table class="docutils field-list" frame="void" rules="none"> @@ -6579,9 +6579,9 @@ of freedom, p-value, the method used, and the null hypothesis.</p> <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">pearson</span><span class="o">.</span><span class="n">pValue</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span> <span class="go">0.8187</span> <span class="gp">>>> </span><span class="n">pearson</span><span class="o">.</span><span class="n">method</span> -<span class="go">u'pearson'</span> +<span class="go">'pearson'</span> <span class="gp">>>> </span><span class="n">pearson</span><span class="o">.</span><span class="n">nullHypothesis</span> -<span class="go">u'observed follows the same distribution as expected.'</span> +<span class="go">'observed follows the same distribution as expected.'</span> </pre></div> </div> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">observed</span> <span class="o">=</span> <span class="n">Vectors</span><span class="o">.</span><span class="n">dense</span><span class="p">([</span><span class="mi">21</span><span class="p">,</span> <span class="mi">38</span><span class="p">,</span> <span class="mi">43</span><span class="p">,</span> <span class="mi">80</span><span class="p">])</span> @@ -6761,7 +6761,7 @@ the method used, and the null hypothesis.</p> <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">ksmodel</span><span class="o">.</span><span class="n">statistic</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span> <span class="go">0.175</span> <span class="gp">>>> </span><span class="n">ksmodel</span><span class="o">.</span><span class="n">nullHypothesis</span> -<span class="go">u'Sample follows theoretical distribution'</span> +<span class="go">'Sample follows theoretical distribution'</span> </pre></div> </div> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">data</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="mf">2.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">])</span>
--------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org