This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push: new 0d69f7f Publishing website 2021/02/20 06:03:00 at commit 70335bb 0d69f7f is described below commit 0d69f7f28ddfc8956f292d2ec02aba2d01b0c57c Author: jenkins <bui...@apache.org> AuthorDate: Sat Feb 20 06:03:00 2021 +0000 Publishing website 2021/02/20 06:03:00 at commit 70335bb --- website/generated-content/documentation/index.xml | 370 +-------------------- .../python/aggregation/combineglobally/index.html | 73 +--- .../python/aggregation/combineperkey/index.html | 94 +----- .../python/aggregation/combinevalues/index.html | 85 +---- website/generated-content/sitemap.xml | 2 +- 5 files changed, 10 insertions(+), 614 deletions(-) diff --git a/website/generated-content/documentation/index.xml b/website/generated-content/documentation/index.xml index 24f6b16..03c9c01 100644 --- a/website/generated-content/documentation/index.xml +++ b/website/generated-content/documentation/index.xml @@ -9084,118 +9084,7 @@ They are passed as additional positional arguments or keyword arguments to the f </td> </table> <p><br><br><br></p> -<h3 id="example-4-combining-with-side-inputs-as-singletons">Example 4: Combining with side inputs as singletons</h3> -<p>If the <code>PCollection</code> has a single value, such as the average from another computation, -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p> -<p>In this example, we pass a <code>PCollection</code> the value <code>'🥕'</code> as a singleton. 
-We then use that value to exclude specific items.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">single_exclude</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create single_exclude&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="s1">&#39;🥕&#39;</span><span class="p">])</span> -<span class="n">common_items_with_exceptions</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create produce&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">{</span><span class="s1">&#39;🍓&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍌&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🌶️&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍇&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🥝&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥔&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍉&#39;</span><span class="p">,</span> <span 
class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🍍&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🥑&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🌽&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥥&#39;</span><span class="p">},</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Get common items with exceptions&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombineGlobally</span><span class="p">(</span> -<span class="k">lambda</span> <span class="n">sets</span><span class="p">,</span> <span class="n">single_exclude</span><span class="p">:</span> \ -<span class="nb">set</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="n">sets</span> <span class="ow">or</span> <span class="p">[</span><span class="nb">set</span><span class="p">()]))</span> <span class="o">-</span> <span class="p">{</span><span class="n">single_exclude</span><span class="p">},</span> -<span class="n">single_exclude</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsSingleton</span><span class="p">(</span><span class="n">single_exclude</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">)</span> -<span class="p">)</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> 
-<pre><code>{&#39;🍅&#39;}</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-5-combining-with-side-inputs-as-iterators">Example 5: Combining with side inputs as iterators</h3> -<p>If the <code>PCollection</code> has multiple values, pass the <code>PCollection</code> as an <em>iterator</em>. -This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won&rsquo;t fit into memory.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">exclude</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create exclude&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="s1">&#39;🥕&#39;</span><span class="p">])</span> -<span class="n">common_items_with_exceptions</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create produce&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span 
class="p">([</span> -<span class="p">{</span><span class="s1">&#39;🍓&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍌&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🌶️&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍇&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🥝&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥔&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍉&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🍍&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🥑&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🌽&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥥&#39;</span><span class="p">},</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Get common items with exceptions&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombineGlobally</span><span class="p">(</span> -<span class="k">lambda</span> <span class="n">sets</span><span class="p">,</span> <span class="n">exclude</span><span class="p">:</span> \ -<span class="nb">set</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="n">sets</span> <span class="ow">or</span> <span class="p">[</span><span 
class="nb">set</span><span class="p">()]))</span> <span class="o">-</span> <span class="nb">set</span><span class="p">(</span><span class="n">exclude</span>< [...] -<span class="n">exclude</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsIter</span><span class="p">(</span><span class="n">exclude</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">)</span> -<span class="p">)</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>{&#39;🍅&#39;}</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<blockquote> -<p><strong>Note</strong>: You can pass the <code>PCollection</code> as a <em>list</em> with <code>beam.pvalue.AsList(pcollection)</code>, -but this requires that all the elements fit into memory.</p> -</blockquote> -<h3 id="example-6-combining-with-side-inputs-as-dictionaries">Example 6: Combining with side inputs as dictionaries</h3> -<p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. 
-If the <code>PCollection</code> won&rsquo;t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">def</span> <span class="nf">get_custom_common_items</span><span class="p">(</span><span class="n">sets</span><span class="p">,</span> <span class="n">options</span><span class="p">):</span> -<span class="n">sets</span> <span class="o">=</span> <span class="n">sets</span> <span class="ow">or</span> <span class="p">[</span><span class="nb">set</span><span class="p">()]</span> -<span class="n">common_items</span> <span class="o">=</span> <span class="nb">set</span><span class="o">.</span><span class="n">intersection</span><span class="p">(</span><span class="o">*</span><span class="n">sets</span><span class="p">)</span> -<span class="n">common_items</span> <span class="o">|=</span> <span class="n">options</span><span class="p">[</span><span class="s1">&#39;include&#39;</span><span class="p">]</span> <span class="c1"># union</span> -<span class="n">common_items</span> <span class="o">&amp;=</span> <span class="n">options</span><span class="p">[</span><span class="s1">&#39;exclude&#39;</span><span class="p">]</span> <span class="c1"># intersection</span> -<span class="k">return</span> <span class="n">common_items</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">options</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create options&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span 
class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;exclude&#39;</span><span class="p">,</span> <span class="p">{</span><span class="s1">&#39;🥕&#39;</span><span class="p">}),</span> -<span class="p">(</span><span class="s1">&#39;include&#39;</span><span class="p">,</span> <span class="p">{</span><span class="s1">&#39;🍇&#39;</span><span class="p">,</span> <span class="s1">&#39;🌽&#39;</span><span class="p">}),</span> -<span class="p">])</span> -<span class="n">custom_common_items</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create produce&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">{</span><span class="s1">&#39;🍓&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍌&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🌶️&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍇&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🥝&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥔&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🍉&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🍍&#39;</span><span class="p">},</span> -<span class="p">{</span><span class="s1">&#39;🥑&#39;</span><span class="p">,</span> <span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="s1">&#39;🌽&#39;</span><span 
class="p">,</span> <span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="s1">&#39;🥥&#39;</span><span class="p">},</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Get common items&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombineGlobally</span><span class="p">(</span> -<span class="n">get_custom_common_items</span><span class="p">,</span> <span class="n">options</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsDict</span><span class="p">(</span><span class="n">options</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>{&#39;🍅&#39;, &#39;🍇&#39;, &#39;🌽&#39;}</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-7-combining-with-a-combinefn">Example 7: Combining with a <code>CombineFn</code></h3> +<h3 id="example-4-combining-with-a-combinefn">Example 4: Combining with a <code>CombineFn</code></h3> <p>The more general way to combine elements, and the most flexible, is with a class that inherits from <code>CombineFn</code>.</p> <ul> <li> @@ -9461,138 +9350,7 @@ They are passed as additional positional arguments or keyword arguments to the f </td> </table> <p><br><br><br></p> -<h3 
id="example-5-combining-with-side-inputs-as-singletons">Example 5: Combining with side inputs as singletons</h3> -<p>If the <code>PCollection</code> has a single value, such as the average from another computation, -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p> -<p>In this example, we pass a <code>PCollection</code> the value <code>8</code> as a singleton. -We then use that value as the <code>max_value</code> for our saturated sum.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create max_value&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="mi">8</span><span class="p">])</span> -<span class="n">saturated_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span 
class="mi">1</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Saturated sum&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombinePerKey</span><span class="p">(</span> -<span class="k">lambda</span> <span class="n">values</span><span class="p">,</span> -<span class="n">max_value</span><span class="p">:</span> <span class="nb">min</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">),</span> <span class="n">max_value</span><span class="p">),</span> -<span class="n">max_value</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsSingleton</span><span class="p">(</span><span class="n">max_value</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 1) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" 
width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-6-combining-with-side-inputs-as-iterators">Example 6: Combining with side inputs as iterators</h3> -<p>If the <code>PCollection</code> has multiple values, pass the <code>PCollection</code> as an <em>iterator</em>. -This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won&rsquo;t fit into memory.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">def</span> <span class="nf">bounded_sum</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">data_range</span><span class="p">):</span> -<span class="n">min_value</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">data_range</span><span class="p">)</span> -<span class="n">result</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="n">min_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">min_value</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">data_range</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&gt;</span> <span class="n">max_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">max_value</span> -<span class="k">return</span> <span class="n">result</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span 
class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">data_range</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create data_range&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p" [...] -<span class="n">bounded_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Bounded sum&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombinePerKey</span><span class="p">(</span> -<span class="n">bounded_sum</span><span class="p">,</span> <span 
class="n">data_range</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsIter</span><span class="p">(</span><span class="n">data_range</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 2) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<blockquote> -<p><strong>Note</strong>: You can pass the <code>PCollection</code> as a <em>list</em> with <code>beam.pvalue.AsList(pcollection)</code>, -but this requires that all the elements fit into memory.</p> -</blockquote> -<h3 id="example-7-combining-with-side-inputs-as-dictionaries">Example 7: Combining with side inputs as dictionaries</h3> -<p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. 
-If the <code>PCollection</code> won&rsquo;t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">def</span> <span class="nf">bounded_sum</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">data_range</span><span class="p">):</span> -<span class="n">min_value</span> <span class="o">=</span> <span class="n">data_range</span><span class="p">[</span><span class="s1">&#39;min&#39;</span><span class="p">]</span> -<span class="n">result</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="n">min_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">min_value</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="n">data_range</span><span class="p">[</span><span class="s1">&#39;max&#39;</span><span class="p">]</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&gt;</span> <span class="n">max_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">max_value</span> -<span class="k">return</span> <span class="n">result</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">data_range</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create data_range&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span 
class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;min&#39;</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;max&#39;</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> -<span class="p">])</span> -<span class="n">bounded_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Bounded sum&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombinePerKey</span><span class="p">(</span> -<span class="n">bounded_sum</span><span class="p">,</span> <span class="n">data_range</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span 
class="o">.</span><span class="n">AsDict</span><span class="p">(</span><span class="n">data_range</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 2) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-8-combining-with-a-combinefn">Example 8: Combining with a <code>CombineFn</code></h3> +<h3 id="example-5-combining-with-a-combinefn">Example 5: Combining with a <code>CombineFn</code></h3> <p>The more general way to combine elements, and the most flexible, is with a class that inherits from <code>CombineFn</code>.</p> <ul> <li> @@ -9848,129 +9606,7 @@ They are passed as additional positional arguments or keyword arguments to the f </td> </table> <p><br><br><br></p> -<h3 id="example-5-combining-with-side-inputs-as-singletons">Example 5: Combining with side inputs as singletons</h3> -<p>If the <code>PCollection</code> has a single value, such as the average from another computation, -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p> -<p>In this example, we pass a <code>PCollection</code> the value <code>8</code> as a singleton. 
-We then use that value as the <code>max_value</code> for our saturated sum.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create max_value&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="mi">8</span><span class="p">])</span> -<span class="n">saturated_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">]),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Saturated sum&#39;</span> <span class="o">&gt;&gt;</span> 
<span class="n">beam</span><span class="o">.</span><span class="n">CombineValues</span><span class="p">(</span> -<span class="k">lambda</span> <span class="n">values</span><span class="p">,</span> -<span class="n">max_value</span><span class="p">:</span> <span class="nb">min</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">),</span> <span class="n">max_value</span><span class="p">),</span> -<span class="n">max_value</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsSingleton</span><span class="p">(</span><span class="n">max_value</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 1) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-6-combining-with-side-inputs-as-iterators">Example 6: Combining with side inputs as iterators</h3> -<p>If the <code>PCollection</code> has multiple values, pass the <code>PCollection</code> as an <em>iterator</em>. 
-This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won&rsquo;t fit into memory.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">def</span> <span class="nf">bounded_sum</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">data_range</span><span class="p">):</span> -<span class="n">min_value</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">data_range</span><span class="p">)</span> -<span class="n">result</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="n">min_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">min_value</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">data_range</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&gt;</span> <span class="n">max_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">max_value</span> -<span class="k">return</span> <span class="n">result</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">data_range</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create data_range&#39;</span> <span class="o">&gt;&gt;</span> <span 
class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span><span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p" [...] -<span class="n">bounded_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">]),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Bounded sum&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombineValues</span><span class="p">(</span> -<span class="n">bounded_sum</span><span class="p">,</span> <span class="n">data_range</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsIter</span><span class="p">(</span><span class="n">data_range</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div 
class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 2) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<blockquote> -<p><strong>Note</strong>: You can pass the <code>PCollection</code> as a <em>list</em> with <code>beam.pvalue.AsList(pcollection)</code>, -but this requires that all the elements fit into memory.</p> -</blockquote> -<h3 id="example-7-combining-with-side-inputs-as-dictionaries">Example 7: Combining with side inputs as dictionaries</h3> -<p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. 
-If the <code>PCollection</code> won&rsquo;t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p> -<div class=language-py> -<div class="highlight"><pre class="chroma"><code class="language-py" data-lang="py"><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span> -<span class="k">def</span> <span class="nf">bounded_sum</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">data_range</span><span class="p">):</span> -<span class="n">min_value</span> <span class="o">=</span> <span class="n">data_range</span><span class="p">[</span><span class="s1">&#39;min&#39;</span><span class="p">]</span> -<span class="n">result</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">values</span><span class="p">)</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="n">min_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">min_value</span> -<span class="n">max_value</span> <span class="o">=</span> <span class="n">data_range</span><span class="p">[</span><span class="s1">&#39;max&#39;</span><span class="p">]</span> -<span class="k">if</span> <span class="n">result</span> <span class="o">&gt;</span> <span class="n">max_value</span><span class="p">:</span> -<span class="k">return</span> <span class="n">max_value</span> -<span class="k">return</span> <span class="n">result</span> -<span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">()</span> <span class="k">as</span> <span class="n">pipeline</span><span class="p">:</span> -<span class="n">data_range</span> <span class="o">=</span> <span class="n">pipeline</span> <span class="o">|</span> <span class="s1">&#39;Create data_range&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span 
class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;min&#39;</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> -<span class="p">(</span><span class="s1">&#39;max&#39;</span><span class="p">,</span> <span class="mi">8</span><span class="p">),</span> -<span class="p">])</span> -<span class="n">bounded_total</span> <span class="o">=</span> <span class="p">(</span> -<span class="n">pipeline</span> -<span class="o">|</span> <span class="s1">&#39;Create plant counts&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">Create</span><span class="p">([</span> -<span class="p">(</span><span class="s1">&#39;🥕&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍆&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">]),</span> -<span class="p">(</span><span class="s1">&#39;🍅&#39;</span><span class="p">,</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">]),</span> -<span class="p">])</span> -<span class="o">|</span> <span class="s1">&#39;Bounded sum&#39;</span> <span class="o">&gt;&gt;</span> <span class="n">beam</span><span class="o">.</span><span class="n">CombineValues</span><span class="p">(</span> -<span class="n">bounded_sum</span><span class="p">,</span> <span class="n">data_range</span><span class="o">=</span><span class="n">beam</span><span class="o">.</span><span class="n">pvalue</span><span class="o">.</span><span class="n">AsDict</span><span class="p">(</span><span class="n">data_range</span><span class="p">))</span> -<span class="o">|</span> <span class="n">beam</span><span 
class="o">.</span><span class="n">Map</span><span class="p">(</span><span class="k">print</span><span class="p">))</span></code></pre></div> -</div> -<p class="notebook-skip">Output:</p> -<div class=notebook-skip> -<pre><code>(&#39;🥕&#39;, 5) -(&#39;🍆&#39;, 2) -(&#39;🍅&#39;, 8)</code></pre> -</div> -<table align="left" style="margin-right:1em" class=".language-py" > -<td> -<a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code" /> View source code</a> -</td> -</table> -<p><br><br><br></p> -<h3 id="example-8-combining-with-a-combinefn">Example 8: Combining with a <code>CombineFn</code></h3> +<h3 id="example-5-combining-with-a-combinefn">Example 5: Combining with a <code>CombineFn</code></h3> <p>The more general way to combine elements, and the most flexible, is with a class that inherits from <code>CombineFn</code>.</p> <ul> <li> diff --git a/website/generated-content/documentation/transforms/python/aggregation/combineglobally/index.html b/website/generated-content/documentation/transforms/python/aggregation/combineglobally/index.html index 99fc412..dc829a8 100644 --- a/website/generated-content/documentation/transforms/python/aggregation/combineglobally/index.html +++ b/website/generated-content/documentation/transforms/python/aggregation/combineglobally/index.html @@ -1,7 +1,7 @@ <!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><title>CombineGlobally</title><meta name=description content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns 
(EIPs) and Domain Specif [...] <span class=sr-only>Toggle navigation</span> <span class=icon-bar></span><span class=icon-bar></span><span class=icon-bar></span></button> -<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] +<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] Pydoc</a></td></table><p><br><br><br></p><p>Combines all elements in a collection.</p><p>See more information in the <a href=/documentation/programming-guide/#combine>Beam Programming Guide</a>.</p><h2 id=examples>Examples</h2><p>In the following examples, we create a pipeline with a <code>PCollection</code> of produce. Then, we apply <code>CombineGlobally</code> in multiple ways to combine all the elements in the <code>PCollection</code>.</p><p><code>CombineGlobally</code> accepts a function that takes an <code>iterable</code> of elements as an input, and combines them to return a single element.</p><h3 id=example-1-combining-with-a-function>Example 1: Combining with a function</h3><p>We define a function <code>get_common_items</code> which takes an <code>iterable</code> of sets as an input, and calcul [...] 
@@ -52,76 +52,7 @@ They are passed as additional positional arguments or keyword arguments to the f <span class=nb>set</span><span class=o>.</span><span class=n>intersection</span><span class=p>(</span><span class=o>*</span><span class=p>(</span><span class=n>sets</span> <span class=ow>or</span> <span class=p>[</span><span class=nb>set</span><span class=p>()]))</span> <span class=o>-</span> <span class=n>exclude</span><span class=p>,</span> <span class=n>exclude</span><span class=o>=</span><span class=p>{</span><span class=s1>'🥕'</span><span class=p>})</span> <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>)</span> - <span class=p>)</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>{'🍅'}</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View [...] -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p><p>In this example, we pass a <code>PCollection</code> the value <code>'🥕'</code> as a singleton. 
-We then use that value to exclude specific items.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>single_exclude</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create single_exclude'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=s1>'🥕'</span><span class=p>])</span> - - <span class=n>common_items_with_exceptions</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create produce'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>{</span><span class=s1>'🍓'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍌'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🌶️'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍇'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🥝'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥔'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍉'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍆'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🍍'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🥑'</span><span class=p>,</span> <span 
class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🌽'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥥'</span><span class=p>},</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Get common items with exceptions'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineGlobally</span><span class=p>(</span> - <span class=k>lambda</span> <span class=n>sets</span><span class=p>,</span> <span class=n>single_exclude</span><span class=p>:</span> \ - <span class=nb>set</span><span class=o>.</span><span class=n>intersection</span><span class=p>(</span><span class=o>*</span><span class=p>(</span><span class=n>sets</span> <span class=ow>or</span> <span class=p>[</span><span class=nb>set</span><span class=p>()]))</span> <span class=o>-</span> <span class=p>{</span><span class=n>single_exclude</span><span class=p>},</span> - <span class=n>single_exclude</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsSingleton</span><span class=p>(</span><span class=n>single_exclude</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>)</span> - <span class=p>)</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>{'🍅'}</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View [...] 
-This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won’t fit into memory.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>exclude</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create exclude'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=s1>'🥕'</span><span class=p>])</span> - - <span class=n>common_items_with_exceptions</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create produce'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>{</span><span class=s1>'🍓'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍌'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🌶️'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍇'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🥝'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥔'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍉'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍆'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🍍'</span><span class=p>},</span> - 
<span class=p>{</span><span class=s1>'🥑'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🌽'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥥'</span><span class=p>},</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Get common items with exceptions'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineGlobally</span><span class=p>(</span> - <span class=k>lambda</span> <span class=n>sets</span><span class=p>,</span> <span class=n>exclude</span><span class=p>:</span> \ - <span class=nb>set</span><span class=o>.</span><span class=n>intersection</span><span class=p>(</span><span class=o>*</span><span class=p>(</span><span class=n>sets</span> <span class=ow>or</span> <span class=p>[</span><span class=nb>set</span><span class=p>()]))</span> <span class=o>-</span> <span class=nb>set</span><span class=p>(</span><span class=n>exclude</span><span class=p>),</span> - <span class=n>exclude</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsIter</span><span class=p>(</span><span class=n>exclude</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>)</span> - <span class=p>)</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>{'🍅'}</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View [...] 
-but this requires that all the elements fit into memory.</p></blockquote><h3 id=example-6-combining-with-side-inputs-as-dictionaries>Example 6: Combining with side inputs as dictionaries</h3><p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. -If the <code>PCollection</code> won’t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>def</span> <span class=nf>get_custom_common_items</span><span class=p>(</span><span class=n>sets</span><span class=p>,</span> <span class=n>options</span><span class=p>):</span> - <span class=n>sets</span> <span class=o>=</span> <span class=n>sets</span> <span class=ow>or</span> <span class=p>[</span><span class=nb>set</span><span class=p>()]</span> - <span class=n>common_items</span> <span class=o>=</span> <span class=nb>set</span><span class=o>.</span><span class=n>intersection</span><span class=p>(</span><span class=o>*</span><span class=n>sets</span><span class=p>)</span> - <span class=n>common_items</span> <span class=o>|=</span> <span class=n>options</span><span class=p>[</span><span class=s1>'include'</span><span class=p>]</span> <span class=c1># union</span> - <span class=n>common_items</span> <span class=o>&=</span> <span class=n>options</span><span class=p>[</span><span class=s1>'exclude'</span><span class=p>]</span> <span class=c1># intersection</span> - <span class=k>return</span> <span class=n>common_items</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span 
class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>options</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create options'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'exclude'</span><span class=p>,</span> <span class=p>{</span><span class=s1>'🥕'</span><span class=p>}),</span> - <span class=p>(</span><span class=s1>'include'</span><span class=p>,</span> <span class=p>{</span><span class=s1>'🍇'</span><span class=p>,</span> <span class=s1>'🌽'</span><span class=p>}),</span> - <span class=p>])</span> - - <span class=n>custom_common_items</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create produce'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>{</span><span class=s1>'🍓'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍌'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🌶️'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍇'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🥝'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥔'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🍉'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🍆'</span><span class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🍍'</span><span class=p>},</span> - <span class=p>{</span><span class=s1>'🥑'</span><span class=p>,</span> <span class=s1>'🥕'</span><span class=p>,</span> <span class=s1>'🌽'</span><span 
class=p>,</span> <span class=s1>'🍅'</span><span class=p>,</span> <span class=s1>'🥥'</span><span class=p>},</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Get common items'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineGlobally</span><span class=p>(</span> - <span class=n>get_custom_common_items</span><span class=p>,</span> <span class=n>options</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsDict</span><span class=p>(</span><span class=n>options</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>{'🍅', '🍇', '🌽'}</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ex [...] + <span class=p>)</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>{'🍅'}</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineglobally.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View [...] This creates an empty accumulator. 
For example, an empty accumulator for a sum would be <code>0</code>, while an empty accumulator for a product (multiplication) would be <code>1</code>.</p></li><li><p><a href=https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn.add_input><code>CombineFn.add_input()</code></a>: Called <em>once per element</em>. diff --git a/website/generated-content/documentation/transforms/python/aggregation/combineperkey/index.html b/website/generated-content/documentation/transforms/python/aggregation/combineperkey/index.html index 6d3ab3f..e48ac1a 100644 --- a/website/generated-content/documentation/transforms/python/aggregation/combineperkey/index.html +++ b/website/generated-content/documentation/transforms/python/aggregation/combineperkey/index.html @@ -1,7 +1,7 @@ <!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><title>CombinePerKey</title><meta name=description content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific [...] <span class=sr-only>Toggle navigation</span> <span class=icon-bar></span><span class=icon-bar></span><span class=icon-bar></span></button> -<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] 
+<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] Pydoc</a></td></table><p><br><br><br></p><p>Combines all elements for each key in a collection.</p><p>See more information in the <a href=/documentation/programming-guide/#combine>Beam Programming Guide</a>.</p><h2 id=examples>Examples</h2><p>In the following examples, we create a pipeline with a <code>PCollection</code> of produce. Then, we apply <code>CombinePerKey</code> in multiple ways to combine all the elements in the <code>PCollection</code>.</p><p><code>CombinePerKey</code> accepts a function that takes a list of values as an input, and combines them for each key.</p><h3 id=example-1-combining-with-a-predefined-function>Example 1: Combining with a predefined function</h3><p>We use the function <a href=https://docs.python.org/3/library/functions.html#sum><code>sum</code></a> @@ -76,97 +76,7 @@ They are passed as additional positional arguments or keyword arguments to the f <span class=k>lambda</span> <span class=n>values</span><span class=p>,</span> <span class=n>max_value</span><span class=p>:</span> <span class=nb>min</span><span class=p>(</span><span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>),</span> <span class=n>max_value</span><span class=p>),</span> <span class=n>max_value</span><span class=o>=</span><span class=mi>8</span><span class=p>)</span> <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span 
class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) ('🍆', 1) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-5-combining-with-side-inputs-as-singletons>Example 5: C [...] -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p><p>In this example, we pass a <code>PCollection</code> the value <code>8</code> as a singleton. -We then use that value as the <code>max_value</code> for our saturated sum.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create max_value'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=mi>8</span><span class=p>])</span> - - <span class=n>saturated_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span 
class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=mi>2</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=mi>1</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>4</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>5</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Saturated sum'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombinePerKey</span><span class=p>(</span> - <span class=k>lambda</span> <span class=n>values</span><span class=p>,</span> - <span class=n>max_value</span><span class=p>:</span> <span class=nb>min</span><span class=p>(</span><span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>),</span> <span class=n>max_value</span><span class=p>),</span> - <span class=n>max_value</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsSingleton</span><span class=p>(</span><span class=n>max_value</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) -('🍆', 1) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank 
href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-6-combining-with-side-inputs-as-iterators>Example 6: Co [...] -This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won’t fit into memory.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>def</span> <span class=nf>bounded_sum</span><span class=p>(</span><span class=n>values</span><span class=p>,</span> <span class=n>data_range</span><span class=p>):</span> - <span class=n>min_value</span> <span class=o>=</span> <span class=nb>min</span><span class=p>(</span><span class=n>data_range</span><span class=p>)</span> - <span class=n>result</span> <span class=o>=</span> <span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o><</span> <span class=n>min_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>min_value</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=nb>max</span><span class=p>(</span><span class=n>data_range</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o>></span> <span class=n>max_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>max_value</span> - <span class=k>return</span> <span class=n>result</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span 
class=n>pipeline</span><span class=p>:</span> - <span class=n>data_range</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create data_range'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=mi>2</span><span class=p>,</span> <span class=mi>4</span><span class=p>,</span> <span class=mi>8</span><span class=p>])</span> - - <span class=n>bounded_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=mi>2</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=mi>1</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>4</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>5</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Bounded sum'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombinePerKey</span><span class=p>(</span> - <span class=n>bounded_sum</span><span class=p>,</span> <span class=n>data_range</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsIter</span><span class=p>(</span><span class=n>data_range</span><span 
class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) -('🍆', 2) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><blockquote><p><strong>Note</strong>: You can pass the <code>PCollecti [...] -but this requires that all the elements fit into memory.</p></blockquote><h3 id=example-7-combining-with-side-inputs-as-dictionaries>Example 7: Combining with side inputs as dictionaries</h3><p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. 
-If the <code>PCollection</code> won’t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>def</span> <span class=nf>bounded_sum</span><span class=p>(</span><span class=n>values</span><span class=p>,</span> <span class=n>data_range</span><span class=p>):</span> - <span class=n>min_value</span> <span class=o>=</span> <span class=n>data_range</span><span class=p>[</span><span class=s1>'min'</span><span class=p>]</span> - <span class=n>result</span> <span class=o>=</span> <span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o><</span> <span class=n>min_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>min_value</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=n>data_range</span><span class=p>[</span><span class=s1>'max'</span><span class=p>]</span> - <span class=k>if</span> <span class=n>result</span> <span class=o>></span> <span class=n>max_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>max_value</span> - <span class=k>return</span> <span class=n>result</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>data_range</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create data_range'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'min'</span><span class=p>,</span> <span 
class=mi>2</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'max'</span><span class=p>,</span> <span class=mi>8</span><span class=p>),</span> - <span class=p>])</span> - - <span class=n>bounded_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=mi>2</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=mi>1</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>4</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>5</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=mi>3</span><span class=p>),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Bounded sum'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombinePerKey</span><span class=p>(</span> - <span class=n>bounded_sum</span><span class=p>,</span> <span class=n>data_range</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsDict</span><span class=p>(</span><span class=n>data_range</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div 
class=notebook-skip><pre><code>('🥕', 5) -('🍆', 2) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-8-combining-with-a-combinefn>Example 8: Combining with [...] +('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combineperkey.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-5-combining-with-a-combinefn>Example 5: Combining with [...] This creates an empty accumulator. For example, an empty accumulator for a sum would be <code>0</code>, while an empty accumulator for a product (multiplication) would be <code>1</code>.</p></li><li><p><a href=https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn.add_input><code>CombineFn.add_input()</code></a>: Called <em>once per element</em>. 
diff --git a/website/generated-content/documentation/transforms/python/aggregation/combinevalues/index.html b/website/generated-content/documentation/transforms/python/aggregation/combinevalues/index.html index 6555a99..01640e8 100644 --- a/website/generated-content/documentation/transforms/python/aggregation/combinevalues/index.html +++ b/website/generated-content/documentation/transforms/python/aggregation/combinevalues/index.html @@ -1,7 +1,7 @@ <!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><title>CombineValues</title><meta name=description content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific [...] <span class=sr-only>Toggle navigation</span> <span class=icon-bar></span><span class=icon-bar></span><span class=icon-bar></span></button> -<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] 
+<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...] Pydoc</a></td></table><p><br><br><br></p><p>Combines an iterable of values in a keyed collection of elements.</p><p>See more information in the <a href=/documentation/programming-guide/#combine>Beam Programming Guide</a>.</p><h2 id=examples>Examples</h2><p>In the following examples, we create a pipeline with a <code>PCollection</code> of produce. Then, we apply <code>CombineValues</code> in multiple ways to combine the keyed values in the <code>PCollection</code>.</p><p><code>CombineValues</code> accepts a function that takes an <code>iterable</code> of elements as an input, and combines them to return a single element. 
<code>CombineValues</code> expects a keyed <code>PCollection</code> of elements, where the value is an iterable of elements to be combined.</p><h3 id=example-1-combining-with-a-predefined-function>Example 1: Combining with a predefined function</h3><p>We use the function @@ -66,88 +66,7 @@ They are passed as additional positional arguments or keyword arguments to the f <span class=k>lambda</span> <span class=n>values</span><span class=p>,</span> <span class=n>max_value</span><span class=p>:</span> <span class=nb>min</span><span class=p>(</span><span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>),</span> <span class=n>max_value</span><span class=p>),</span> <span class=n>max_value</span><span class=o>=</span><span class=mi>8</span><span class=p>)</span> <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) ('🍆', 1) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-5-combining-with-side-inputs-as-singletons>Example 5: C [...] -passing the <code>PCollection</code> as a <em>singleton</em> accesses that value.</p><p>In this example, we pass a <code>PCollection</code> the value <code>8</code> as a singleton. 
-We then use that value as the <code>max_value</code> for our saturated sum.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create max_value'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=mi>8</span><span class=p>])</span> - - <span class=n>saturated_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=p>[</span><span class=mi>3</span><span class=p>,</span> <span class=mi>2</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=p>[</span><span class=mi>1</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=p>[</span><span class=mi>4</span><span class=p>,</span> <span class=mi>5</span><span class=p>,</span> <span class=mi>3</span><span class=p>]),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Saturated sum'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineValues</span><span class=p>(</span> - <span class=k>lambda</span> <span class=n>values</span><span class=p>,</span> - <span 
class=n>max_value</span><span class=p>:</span> <span class=nb>min</span><span class=p>(</span><span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>),</span> <span class=n>max_value</span><span class=p>),</span> - <span class=n>max_value</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsSingleton</span><span class=p>(</span><span class=n>max_value</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) -('🍆', 1) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-6-combining-with-side-inputs-as-iterators>Example 6: Co [...] 
-This accesses elements lazily as they are needed, -so it is possible to iterate over large <code>PCollection</code>s that won’t fit into memory.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>def</span> <span class=nf>bounded_sum</span><span class=p>(</span><span class=n>values</span><span class=p>,</span> <span class=n>data_range</span><span class=p>):</span> - <span class=n>min_value</span> <span class=o>=</span> <span class=nb>min</span><span class=p>(</span><span class=n>data_range</span><span class=p>)</span> - <span class=n>result</span> <span class=o>=</span> <span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o><</span> <span class=n>min_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>min_value</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=nb>max</span><span class=p>(</span><span class=n>data_range</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o>></span> <span class=n>max_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>max_value</span> - <span class=k>return</span> <span class=n>result</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>data_range</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create data_range'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span><span class=mi>2</span><span class=p>,</span> <span 
class=mi>4</span><span class=p>,</span> <span class=mi>8</span><span class=p>])</span> - - <span class=n>bounded_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=p>[</span><span class=mi>3</span><span class=p>,</span> <span class=mi>2</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=p>[</span><span class=mi>1</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=p>[</span><span class=mi>4</span><span class=p>,</span> <span class=mi>5</span><span class=p>,</span> <span class=mi>3</span><span class=p>]),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Bounded sum'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineValues</span><span class=p>(</span> - <span class=n>bounded_sum</span><span class=p>,</span> <span class=n>data_range</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsIter</span><span class=p>(</span><span class=n>data_range</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) -('🍆', 2) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank 
href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><blockquote><p><strong>Note</strong>: You can pass the <code>PCollecti [...] -but this requires that all the elements fit into memory.</p></blockquote><h3 id=example-7-combining-with-side-inputs-as-dictionaries>Example 7: Combining with side inputs as dictionaries</h3><p>If a <code>PCollection</code> is small enough to fit into memory, then that <code>PCollection</code> can be passed as a <em>dictionary</em>. -Each element must be a <code>(key, value)</code> pair. -Note that all the elements of the <code>PCollection</code> must fit into memory for this. -If the <code>PCollection</code> won’t fit into memory, use <code>beam.pvalue.AsIter(pcollection)</code> instead.</p><div class=language-py><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=kn>import</span> <span class=nn>apache_beam</span> <span class=kn>as</span> <span class=nn>beam</span> - -<span class=k>def</span> <span class=nf>bounded_sum</span><span class=p>(</span><span class=n>values</span><span class=p>,</span> <span class=n>data_range</span><span class=p>):</span> - <span class=n>min_value</span> <span class=o>=</span> <span class=n>data_range</span><span class=p>[</span><span class=s1>'min'</span><span class=p>]</span> - <span class=n>result</span> <span class=o>=</span> <span class=nb>sum</span><span class=p>(</span><span class=n>values</span><span class=p>)</span> - <span class=k>if</span> <span class=n>result</span> <span class=o><</span> <span class=n>min_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>min_value</span> - <span class=n>max_value</span> <span class=o>=</span> <span class=n>data_range</span><span class=p>[</span><span 
class=s1>'max'</span><span class=p>]</span> - <span class=k>if</span> <span class=n>result</span> <span class=o>></span> <span class=n>max_value</span><span class=p>:</span> - <span class=k>return</span> <span class=n>max_value</span> - <span class=k>return</span> <span class=n>result</span> - -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>()</span> <span class=k>as</span> <span class=n>pipeline</span><span class=p>:</span> - <span class=n>data_range</span> <span class=o>=</span> <span class=n>pipeline</span> <span class=o>|</span> <span class=s1>'Create data_range'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'min'</span><span class=p>,</span> <span class=mi>2</span><span class=p>),</span> - <span class=p>(</span><span class=s1>'max'</span><span class=p>,</span> <span class=mi>8</span><span class=p>),</span> - <span class=p>])</span> - - <span class=n>bounded_total</span> <span class=o>=</span> <span class=p>(</span> - <span class=n>pipeline</span> - <span class=o>|</span> <span class=s1>'Create plant counts'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>Create</span><span class=p>([</span> - <span class=p>(</span><span class=s1>'🥕'</span><span class=p>,</span> <span class=p>[</span><span class=mi>3</span><span class=p>,</span> <span class=mi>2</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍆'</span><span class=p>,</span> <span class=p>[</span><span class=mi>1</span><span class=p>]),</span> - <span class=p>(</span><span class=s1>'🍅'</span><span class=p>,</span> <span class=p>[</span><span class=mi>4</span><span class=p>,</span> <span class=mi>5</span><span class=p>,</span> <span class=mi>3</span><span class=p>]),</span> - <span class=p>])</span> - <span class=o>|</span> <span class=s1>'Bounded sum'</span> 
<span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>CombineValues</span><span class=p>(</span> - <span class=n>bounded_sum</span><span class=p>,</span> <span class=n>data_range</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>pvalue</span><span class=o>.</span><span class=n>AsDict</span><span class=p>(</span><span class=n>data_range</span><span class=p>))</span> - <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>print</span><span class=p>))</span></code></pre></div></div><p class=notebook-skip>Output:</p><div class=notebook-skip><pre><code>('🥕', 5) -('🍆', 2) -('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-8-combining-with-a-combinefn>Example 8: Combining with [...] +('🍅', 8)</code></pre></div><table align=left style=margin-right:1em class=.language-py><td><a class=button target=_blank href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues.py><img src=https://www.tensorflow.org/images/GitHub-Mark-32px.png width=32px height=32px alt="View source code"> View source code</a></td></table><p><br><br><br></p><h3 id=example-5-combining-with-a-combinefn>Example 5: Combining with [...] This creates an empty accumulator. 
For example, an empty accumulator for a sum would be <code>0</code>, while an empty accumulator for a product (multiplication) would be <code>1</code>.</p></li><li><p><a href=https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.CombineFn.add_input><code>CombineFn.add_input()</code></a>: Called <em>once per element</em>. diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml index 99b16c5..08f6a68 100644 --- a/website/generated-content/sitemap.xml +++ b/website/generated-content/sitemap.xml @@ -1 +1 @@ -<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/blog/kafka-to-pubsub-example/</loc><lastmod>2021-01-20T19:53:05+03:00</lastmod></url><url> [...] \ No newline at end of file +<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-02-04T12:05:11-08:00</lastmod></url><url><loc>/blog/kafka-to-pubsub-example/</loc><lastmod>2021-01-20T19:53:05+03:00</lastmod></url><url> [...] \ No newline at end of file
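Note on the combinevalues hunk above: the `CombineFn` text it touches describes `create_accumulator()` as producing an empty accumulator (`0` for a sum, `1` for a product) and `add_input()` as being called once per element. The following is a minimal plain-Python sketch of that lifecycle, not Beam's implementation. `ProductFn` and `run_locally` are hypothetical names introduced only for illustration, and the sketch deliberately omits `merge_accumulators()` and `extract_output()`, which a real `beam.CombineFn` also defines.

```python
# Plain-Python sketch of the accumulator lifecycle; no Beam import is needed.
# ProductFn and run_locally are hypothetical names, not part of apache_beam.


class ProductFn:
    """Mimics the two CombineFn methods described in the hunk above."""

    def create_accumulator(self):
        # The empty accumulator for a product is 1 (for a sum it would be 0).
        return 1

    def add_input(self, accumulator, element):
        # Called once per element: fold the element into the accumulator.
        return accumulator * element


def run_locally(fn, elements):
    """Drives the lifecycle on one in-memory list (assumption: one bundle)."""
    accumulator = fn.create_accumulator()
    for element in elements:
        accumulator = fn.add_input(accumulator, element)
    return accumulator


product_of_counts = run_locally(ProductFn(), [4, 5, 3])
empty_product = run_locally(ProductFn(), [])
```

Starting from `1` rather than `0` is what keeps the product correct: an input with no elements returns the empty accumulator unchanged, and the first real element is not multiplied away.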