http://git-wip-us.apache.org/repos/asf/spark-website/blob/6bbac496/site/docs/2.1.2/api/python/_modules/pyspark/streaming/dstream.html ---------------------------------------------------------------------- diff --git a/site/docs/2.1.2/api/python/_modules/pyspark/streaming/dstream.html b/site/docs/2.1.2/api/python/_modules/pyspark/streaming/dstream.html index 48eb5cc..853b230 100644 --- a/site/docs/2.1.2/api/python/_modules/pyspark/streaming/dstream.html +++ b/site/docs/2.1.2/api/python/_modules/pyspark/streaming/dstream.html @@ -202,7 +202,7 @@ <span class="sd">"""</span> <span class="sd"> Apply a function to each RDD in this DStream.</span> <span class="sd"> """</span> - <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> + <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> <span class="n">old_func</span> <span class="o">=</span> <span class="n">func</span> <span class="n">func</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">,</span> <span class="n">rdd</span><span class="p">:</span> <span class="n">old_func</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> <span class="n">jfunc</span> <span class="o">=</span> <span class="n">TransformFunction</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_sc</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_jrdd_deserializer</span><span class="p">)</span> @@ -338,10 +338,10 @@ <span class="sd"> `func` can have one 
argument of `rdd`, or have two arguments of</span> <span class="sd"> (`time`, `rdd`)</span> <span class="sd"> """</span> - <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> + <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> <span class="n">oldfunc</span> <span class="o">=</span> <span class="n">func</span> <span class="n">func</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">,</span> <span class="n">rdd</span><span class="p">:</span> <span class="n">oldfunc</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> - <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"func should take one or two arguments"</span> + <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"func should take one or two arguments"</span> <span class="k">return</span> <span class="n">TransformedDStream</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span></div> <div class="viewcode-block" id="DStream.transformWith"><a class="viewcode-back" href="../../../pyspark.streaming.html#pyspark.streaming.DStream.transformWith">[docs]</a> <span 
class="k">def</span> <span class="nf">transformWith</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">keepSerializer</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span> @@ -352,10 +352,10 @@ <span class="sd"> `func` can have two arguments of (`rdd_a`, `rdd_b`) or have three</span> <span class="sd"> arguments of (`time`, `rdd_a`, `rdd_b`)</span> <span class="sd"> """</span> - <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span> + <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span> <span class="n">oldfunc</span> <span class="o">=</span> <span class="n">func</span> <span class="n">func</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">oldfunc</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> - <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">"func should take two or three arguments"</span> + <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span 
class="n">co_argcount</span> <span class="o">==</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">"func should take two or three arguments"</span> <span class="n">jfunc</span> <span class="o">=</span> <span class="n">TransformFunction</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_sc</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_jrdd_deserializer</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">_jrdd_deserializer</span><span class="p">)</span> <span class="n">dstream</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_sc</span><span class="o">.</span><span class="n">_jvm</span><span class="o">.</span><span class="n">PythonTransformed2DStream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_jdstream</span><span class="o">.</span><span class="n">dstream</span><span class="p">(),</span> <span class="n">other</span><span class="o">.</span><span class="n">_jdstream</span><span class="o">.</span><span class="n">dstream</span><span class="p">(),</span> <span class="n">jfunc</span><span class="p">)</span>
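The dstream.html hunks above only change Pygments markup (the new `vm` class tags `__code__` as a magic variable), but the highlighted logic itself is worth seeing in one place: pyspark's `transform` inspects `func.__code__.co_argcount` to let callers pass either a one-argument `rdd` function or a two-argument `(time, rdd)` function. A minimal standalone sketch of that adapter pattern (not the actual pyspark source, which goes on to build a `TransformedDStream`):

```python
def adapt_transform_func(func):
    """Normalize a callable taking (rdd) or (time, rdd) into a
    two-argument (time, rdd) callable, mirroring the diffed logic."""
    if func.__code__.co_argcount == 1:
        # Wrap a one-argument function so it ignores the batch time.
        old_func = func
        func = lambda t, rdd: old_func(rdd)
    # Same guard as in the hunk above.
    assert func.__code__.co_argcount == 2, "func should take one or two arguments"
    return func

# A caller-supplied one-argument function gets wrapped transparently:
one_arg = lambda rdd: [x * 2 for x in rdd]
normalized = adapt_transform_func(one_arg)
```

Note that `co_argcount` counts only positional parameters, which is why this trick works for both plain functions and lambdas but would miscount keyword-only arguments.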
http://git-wip-us.apache.org/repos/asf/spark-website/blob/6bbac496/site/docs/2.1.2/api/python/_modules/pyspark/streaming/kafka.html ---------------------------------------------------------------------- diff --git a/site/docs/2.1.2/api/python/_modules/pyspark/streaming/kafka.html b/site/docs/2.1.2/api/python/_modules/pyspark/streaming/kafka.html index 52f3960..c7e8fbf 100644 --- a/site/docs/2.1.2/api/python/_modules/pyspark/streaming/kafka.html +++ b/site/docs/2.1.2/api/python/_modules/pyspark/streaming/kafka.html @@ -288,7 +288,7 @@ <span class="bp">self</span><span class="o">.</span><span class="n">untilOffset</span> <span class="o">=</span> <span class="n">untilOffset</span> <span class="k">def</span> <span class="nf">__eq__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span> - <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">other</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">__class__</span><span class="p">):</span> + <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">other</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__class__</span><span class="p">):</span> <span class="k">return</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">topic</span> <span class="o">==</span> <span class="n">other</span><span class="o">.</span><span class="n">topic</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">partition</span> <span class="o">==</span> <span class="n">other</span><span class="o">.</span><span class="n">partition</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">fromOffset</span> <span class="o">==</span> <span 
class="n">other</span><span class="o">.</span><span class="n">fromOffset</span> @@ -297,7 +297,7 @@ <span class="k">return</span> <span class="kc">False</span> <span class="k">def</span> <span class="nf">__ne__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span> - <span class="k">return</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">__eq__</span><span class="p">(</span><span class="n">other</span><span class="p">)</span> + <span class="k">return</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="fm">__eq__</span><span class="p">(</span><span class="n">other</span><span class="p">)</span> <span class="k">def</span> <span class="nf">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="s2">"OffsetRange(topic: </span><span class="si">%s</span><span class="s2">, partition: </span><span class="si">%d</span><span class="s2">, range: [</span><span class="si">%d</span><span class="s2"> -> </span><span class="si">%d</span><span class="s2">]"</span> \ @@ -326,17 +326,17 @@ <span class="k">return</span> <span class="n">helper</span><span class="o">.</span><span class="n">createTopicAndPartition</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_topic</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_partition</span><span class="p">)</span> <span class="k">def</span> <span class="nf">__eq__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span> - <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">other</span><span class="p">,</span> <span class="bp">self</span><span 
class="o">.</span><span class="n">__class__</span><span class="p">):</span> + <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">other</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__class__</span><span class="p">):</span> <span class="k">return</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_topic</span> <span class="o">==</span> <span class="n">other</span><span class="o">.</span><span class="n">_topic</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">_partition</span> <span class="o">==</span> <span class="n">other</span><span class="o">.</span><span class="n">_partition</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="k">return</span> <span class="kc">False</span> <span class="k">def</span> <span class="nf">__ne__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span> - <span class="k">return</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">__eq__</span><span class="p">(</span><span class="n">other</span><span class="p">)</span> + <span class="k">return</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="fm">__eq__</span><span class="p">(</span><span class="n">other</span><span class="p">)</span> <span class="k">def</span> <span class="nf">__hash__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> - <span class="k">return</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_topic</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_partition</span><span class="p">)</span><span class="o">.</span><span 
class="n">__hash__</span><span class="p">()</span></div> + <span class="k">return</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_topic</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_partition</span><span class="p">)</span><span class="o">.</span><span class="fm">__hash__</span><span class="p">()</span></div> <div class="viewcode-block" id="Broker"><a class="viewcode-back" href="../../../pyspark.streaming.html#pyspark.streaming.kafka.Broker">[docs]</a><span class="k">class</span> <span class="nc">Broker</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> @@ -363,7 +363,7 @@ <span class="sd"> """</span> <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jrdd</span><span class="p">,</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">):</span> - <span class="n">RDD</span><span class="o">.</span><span class="n">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jrdd</span><span class="p">,</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">)</span> + <span class="n">RDD</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jrdd</span><span class="p">,</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">)</span> <span class="k">def</span> <span class="nf">offsetRanges</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="sd">"""</span> @@ -383,13 +383,13 @@ <span class="sd"> """</span> <span class="k">def</span> <span class="nf">__init__</span><span 
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jdstream</span><span class="p">,</span> <span class="n">ssc</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">):</span> - <span class="n">DStream</span><span class="o">.</span><span class="n">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jdstream</span><span class="p">,</span> <span class="n">ssc</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">)</span> + <span class="n">DStream</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">jdstream</span><span class="p">,</span> <span class="n">ssc</span><span class="p">,</span> <span class="n">jrdd_deserializer</span><span class="p">)</span> <span class="k">def</span> <span class="nf">foreachRDD</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">func</span><span class="p">):</span> <span class="sd">"""</span> <span class="sd"> Apply a function to each RDD in this DStream.</span> <span class="sd"> """</span> - <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> + <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> <span class="n">old_func</span> <span class="o">=</span> <span class="n">func</span> <span class="n">func</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">r</span><span class="p">,</span> <span class="n">rdd</span><span class="p">:</span> 
<span class="n">old_func</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> <span class="n">jfunc</span> <span class="o">=</span> <span class="n">TransformFunction</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_sc</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_jrdd_deserializer</span><span class="p">)</span> \ @@ -405,10 +405,10 @@ <span class="sd"> `func` can have one argument of `rdd`, or have two arguments of</span> <span class="sd"> (`time`, `rdd`)</span> <span class="sd"> """</span> - <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> + <span class="k">if</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span> <span class="n">oldfunc</span> <span class="o">=</span> <span class="n">func</span> <span class="n">func</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">,</span> <span class="n">rdd</span><span class="p">:</span> <span class="n">oldfunc</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> - <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="n">__code__</span><span class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"func should take one or two arguments"</span> + <span class="k">assert</span> <span class="n">func</span><span class="o">.</span><span class="vm">__code__</span><span 
class="o">.</span><span class="n">co_argcount</span> <span class="o">==</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"func should take one or two arguments"</span> <span class="k">return</span> <span class="n">KafkaTransformedDStream</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> @@ -419,7 +419,7 @@ <span class="sd"> """</span> <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">prev</span><span class="p">,</span> <span class="n">func</span><span class="p">):</span> - <span class="n">TransformedDStream</span><span class="o">.</span><span class="n">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">prev</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> + <span class="n">TransformedDStream</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">prev</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> <span class="nd">@property</span> <span class="k">def</span> <span class="nf">_jdstream</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> @@ -462,7 +462,7 @@ <span class="o">%</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">topic</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">partition</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">offset</span><span class="p">)</span> <span class="k">def</span> <span class="nf">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> - <span class="k">return</span> <span 
class="bp">self</span><span class="o">.</span><span class="n">__str__</span><span class="p">()</span> + <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="fm">__str__</span><span class="p">()</span> <span class="k">def</span> <span class="nf">__reduce__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="p">(</span><span class="n">KafkaMessageAndMetadata</span><span class="p">,</span> http://git-wip-us.apache.org/repos/asf/spark-website/blob/6bbac496/site/docs/2.1.2/api/python/_static/pygments.css ---------------------------------------------------------------------- diff --git a/site/docs/2.1.2/api/python/_static/pygments.css b/site/docs/2.1.2/api/python/_static/pygments.css index 8213e90..20c4814 100644 --- a/site/docs/2.1.2/api/python/_static/pygments.css +++ b/site/docs/2.1.2/api/python/_static/pygments.css @@ -47,8 +47,10 @@ .highlight .mh { color: #208050 } /* Literal.Number.Hex */ .highlight .mi { color: #208050 } /* Literal.Number.Integer */ .highlight .mo { color: #208050 } /* Literal.Number.Oct */ +.highlight .sa { color: #4070a0 } /* Literal.String.Affix */ .highlight .sb { color: #4070a0 } /* Literal.String.Backtick */ .highlight .sc { color: #4070a0 } /* Literal.String.Char */ +.highlight .dl { color: #4070a0 } /* Literal.String.Delimiter */ .highlight .sd { color: #4070a0; font-style: italic } /* Literal.String.Doc */ .highlight .s2 { color: #4070a0 } /* Literal.String.Double */ .highlight .se { color: #4070a0; font-weight: bold } /* Literal.String.Escape */ @@ -59,7 +61,9 @@ .highlight .s1 { color: #4070a0 } /* Literal.String.Single */ .highlight .ss { color: #517918 } /* Literal.String.Symbol */ .highlight .bp { color: #007020 } /* Name.Builtin.Pseudo */ +.highlight .fm { color: #06287e } /* Name.Function.Magic */ .highlight .vc { color: #bb60d5 } /* Name.Variable.Class */ .highlight .vg { color: #bb60d5 } /* Name.Variable.Global */ 
.highlight .vi { color: #bb60d5 } /* Name.Variable.Instance */ +.highlight .vm { color: #bb60d5 } /* Name.Variable.Magic */ .highlight .il { color: #208050 } /* Literal.Number.Integer.Long */ \ No newline at end of file http://git-wip-us.apache.org/repos/asf/spark-website/blob/6bbac496/site/docs/2.1.2/api/python/pyspark.html ---------------------------------------------------------------------- diff --git a/site/docs/2.1.2/api/python/pyspark.html b/site/docs/2.1.2/api/python/pyspark.html index cce60da..2f32994 100644 --- a/site/docs/2.1.2/api/python/pyspark.html +++ b/site/docs/2.1.2/api/python/pyspark.html @@ -69,47 +69,40 @@ <p>PySpark is the Python API for Spark.</p> <p>Public classes:</p> <blockquote> -<div><ul> +<div><ul class="simple"> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.SparkContext" title="pyspark.SparkContext"><code class="xref py py-class docutils literal"><span class="pre">SparkContext</span></code></a>:</dt> -<dd><p class="first last">Main entry point for Spark functionality.</p> -</dd> +<dd>Main entry point for Spark functionality.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.RDD" title="pyspark.RDD"><code class="xref py py-class docutils literal"><span class="pre">RDD</span></code></a>:</dt> -<dd><p class="first last">A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.</p> -</dd> +<dd>A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.Broadcast" title="pyspark.Broadcast"><code class="xref py py-class docutils literal"><span class="pre">Broadcast</span></code></a>:</dt> -<dd><p class="first last">A broadcast variable that gets reused across tasks.</p> -</dd> +<dd>A broadcast variable that gets reused across tasks.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" 
href="#pyspark.Accumulator" title="pyspark.Accumulator"><code class="xref py py-class docutils literal"><span class="pre">Accumulator</span></code></a>:</dt> -<dd><p class="first last">An “add-only” shared variable that tasks can only add values to.</p> -</dd> +<dd>An “add-only” shared variable that tasks can only add values to.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.SparkConf" title="pyspark.SparkConf"><code class="xref py py-class docutils literal"><span class="pre">SparkConf</span></code></a>:</dt> -<dd><p class="first last">For configuring Spark.</p> -</dd> +<dd>For configuring Spark.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.SparkFiles" title="pyspark.SparkFiles"><code class="xref py py-class docutils literal"><span class="pre">SparkFiles</span></code></a>:</dt> -<dd><p class="first last">Access files shipped with jobs.</p> -</dd> +<dd>Access files shipped with jobs.</dd> </dl> </li> <li><dl class="first docutils"> <dt><a class="reference internal" href="#pyspark.StorageLevel" title="pyspark.StorageLevel"><code class="xref py py-class docutils literal"><span class="pre">StorageLevel</span></code></a>:</dt> -<dd><p class="first last">Finer-grained cache persistence levels.</p> -</dd> +<dd>Finer-grained cache persistence levels.</dd> </dl> </li> </ul> @@ -277,7 +270,7 @@ Its format depends on the scheduler implementation.</p> <li>in case of YARN something like ‘application_1433865536131_34483’</li> </ul> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sc</span><span class="o">.</span><span class="n">applicationId</span> -<span class="go">u'local-...'</span> +<span class="go">'local-...'</span> </pre></div> </div> </dd></dl> @@ -756,7 +749,7 @@ Spark 1.2)</p> <span class="gp">... 
</span> <span class="n">_</span> <span class="o">=</span> <span class="n">testFile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"Hello world!"</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">textFile</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">textFile</span><span class="o">.</span><span class="n">collect</span><span class="p">()</span> -<span class="go">[u'Hello world!']</span> +<span class="go">['Hello world!']</span> </pre></div> </div> </dd></dl> @@ -779,10 +772,10 @@ serializer:</p> <span class="gp">... </span> <span class="n">_</span> <span class="o">=</span> <span class="n">testFile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"Hello"</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">textFile</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">textFile</span><span class="o">.</span><span class="n">collect</span><span class="p">()</span> -<span class="go">[u'Hello']</span> +<span class="go">['Hello']</span> <span class="gp">>>> </span><span class="n">parallelized</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="s2">"World!"</span><span class="p">])</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">sc</span><span class="o">.</span><span class="n">union</span><span class="p">([</span><span class="n">textFile</span><span class="p">,</span> <span class="n">parallelized</span><span 
class="p">])</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[u'Hello', 'World!']</span> +<span class="go">['Hello', 'World!']</span> </pre></div> </div> </dd></dl> @@ -832,7 +825,7 @@ fully in memory.</p> <span class="gp">... </span> <span class="n">_</span> <span class="o">=</span> <span class="n">file2</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"2"</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">textFiles</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">wholeTextFiles</span><span class="p">(</span><span class="n">dirPath</span><span class="p">)</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">textFiles</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[(u'.../1.txt', u'1'), (u'.../2.txt', u'2')]</span> +<span class="go">[('.../1.txt', '1'), ('.../2.txt', '2')]</span> </pre></div> </div> </dd></dl> @@ -1684,7 +1677,7 @@ If no storage level is specified defaults to (<code class="xref py py-class docu <code class="descname">pipe</code><span class="sig-paren">(</span><em>command</em>, <em>env=None</em>, <em>checkCode=False</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/rdd.html#RDD.pipe"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.RDD.pipe" title="Permalink to this definition">¶</a></dt> <dd><p>Return an RDD created by piping elements to a forked external process.</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="s1">'1'</span><span class="p">,</span> <span class="s1">'2'</span><span class="p">,</span> <span 
class="s1">''</span><span class="p">,</span> <span class="s1">'3'</span><span class="p">])</span><span class="o">.</span><span class="n">pipe</span><span class="p">(</span><span class="s1">'cat'</span><span class="p">)</span><span class="o">.</span><span class="n">collect</span><span class="p">()</span> -<span class="go">[u'1', u'2', u'', u'3']</span> +<span class="go">['1', '2', '', '3']</span> </pre></div> </div> <table class="docutils field-list" frame="void" rules="none"> @@ -1799,7 +1792,7 @@ using <cite>coalesce</cite>, which can avoid performing a shuffle.</p> <dl class="method"> <dt id="pyspark.RDD.repartitionAndSortWithinPartitions"> -<code class="descname">repartitionAndSortWithinPartitions</code><span class="sig-paren">(</span><em>numPartitions=None</em>, <em>partitionFunc=<function portable_hash></em>, <em>ascending=True</em>, <em>keyfunc=<function <lambda>></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/rdd.html#RDD.repartitionAndSortWithinPartitions"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.RDD.repartitionAndSortWithinPartitions" title="Permalink to this definition">¶</a></dt> +<code class="descname">repartitionAndSortWithinPartitions</code><span class="sig-paren">(</span><em>numPartitions=None</em>, <em>partitionFunc=<function portable_hash></em>, <em>ascending=True</em>, <em>keyfunc=<function RDD.<lambda>></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/rdd.html#RDD.repartitionAndSortWithinPartitions"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.RDD.repartitionAndSortWithinPartitions" title="Permalink to this definition">¶</a></dt> <dd><p>Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">rdd</span> 
<span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)])</span> @@ -2088,8 +2081,8 @@ RDD’s key and value types. The mechanism is as follows:</p> <span class="gp">>>> </span><span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="s1">'foo'</span><span class="p">,</span> <span class="s1">'bar'</span><span class="p">])</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="p">(</span><span class="n">tempFile3</span><span class="o">.</span><span class="n">name</span><span class="p">,</span> <span class="n">codec</span><span class="p">)</span> <span class="gp">>>> </span><span class="kn">from</span> <span class="nn">fileinput</span> <span class="k">import</span> <span class="nb">input</span><span class="p">,</span> <span class="n">hook_compressed</span> <span class="gp">>>> </span><span class="n">result</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="nb">input</span><span class="p">(</span><span class="n">glob</span><span class="p">(</span><span class="n">tempFile3</span><span class="o">.</span><span class="n">name</span> <span class="o">+</span> 
<span class="s2">"/part*.gz"</span><span class="p">),</span> <span class="n">openhook</span><span class="o">=</span><span class="n">hook_compressed</span><span class="p">))</span> -<span class="gp">>>> </span><span class="n">b</span><span class="s1">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">result</span><span class="p">)</span><span class="o">.</span><span class="n">decode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">)</span> -<span class="go">u'bar\nfoo\n'</span> +<span class="gp">>>> </span><span class="sa">b</span><span class="s1">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">result</span><span class="p">)</span><span class="o">.</span><span class="n">decode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">)</span> +<span class="go">'bar\nfoo\n'</span> </pre></div> </div> </dd></dl> @@ -2100,7 +2093,7 @@ RDD’s key and value types. The mechanism is as follows:</p> <dd><p>Assign a name to this RDD.</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">rdd1</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span> <span class="gp">>>> </span><span class="n">rdd1</span><span class="o">.</span><span class="n">setName</span><span class="p">(</span><span class="s1">'RDD1'</span><span class="p">)</span><span class="o">.</span><span class="n">name</span><span class="p">()</span> -<span class="go">u'RDD1'</span> +<span class="go">'RDD1'</span> </pre></div> </div> </dd></dl> @@ -2120,7 +2113,7 @@ RDD’s key and value types. 
The mechanism is as follows:</p> <dl class="method"> <dt id="pyspark.RDD.sortByKey"> -<code class="descname">sortByKey</code><span class="sig-paren">(</span><em>ascending=True</em>, <em>numPartitions=None</em>, <em>keyfunc=<function <lambda>></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/rdd.html#RDD.sortByKey"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.RDD.sortByKey" title="Permalink to this definition">¶</a></dt> +<code class="descname">sortByKey</code><span class="sig-paren">(</span><em>ascending=True</em>, <em>numPartitions=None</em>, <em>keyfunc=<function RDD.<lambda>></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/rdd.html#RDD.sortByKey"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.RDD.sortByKey" title="Permalink to this definition">¶</a></dt> <dd><p>Sorts this RDD, which is assumed to consist of (key, value) pairs. # noqa</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">tmp</span> <span class="o">=</span> <span class="p">[(</span><span class="s1">'a'</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s1">'b'</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="p">(</span><span class="s1">'1'</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="p">(</span><span class="s1">'d'</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> <span class="p">(</span><span class="s1">'2'</span><span class="p">,</span> <span class="mi">5</span><span class="p">)]</span> @@ -2659,7 +2652,7 @@ not be as fast as more specialized serializers.</p> <dl class="method"> <dt id="pyspark.PickleSerializer.loads"> -<code class="descname">loads</code><span 
class="sig-paren">(</span><em>obj</em>, <em>encoding=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/serializers.html#PickleSerializer.loads"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.PickleSerializer.loads" title="Permalink to this definition">¶</a></dt> +<code class="descname">loads</code><span class="sig-paren">(</span><em>obj</em>, <em>encoding='bytes'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/serializers.html#PickleSerializer.loads"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.PickleSerializer.loads" title="Permalink to this definition">¶</a></dt> <dd></dd></dl> </dd></dl> --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
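For context on the recurring `-u'...'` / `+'...'` hunks above: these doc pages were regenerated under Python 3, where every `str` is already unicode, so doctest output reprs no longer carry the Python 2 `u''` prefix. A minimal sketch of that repr difference, using plain Python with no Spark dependency (the example values are illustrative, not taken from the diff):

```python
# Under Python 3, str reprs have no u'' prefix, which is why the
# regenerated doctest outputs changed from [u'Hello', 'World!'] to
# ['Hello', 'World!'] throughout these pages.
values = ['Hello', 'World!']
print(repr(values))  # -> "['Hello', 'World!']"

# Embedded newlines still render as escaped \n inside a repr, matching
# the 'bar\nfoo\n' doctest output shown in the saveAsTextFile hunk.
print(repr('bar\nfoo\n'))
```

The same Python 2/3 migration explains the `encoding='bytes'` default now shown for `PickleSerializer.loads`: Python 3's unpickler needs that hint to read Python 2 byte strings as `bytes` rather than attempting a text decode.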