This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 6fb9ec387a1 Publishing website 2022/10/21 04:15:49 at commit 69fe1cc
6fb9ec387a1 is described below

commit 6fb9ec387a13467c965a01602d1bbf579e43b42c
Author: jenkins <bui...@apache.org>
AuthorDate: Fri Oct 21 04:15:49 2022 +0000

    Publishing website 2022/10/21 04:15:49 at commit 69fe1cc
---
 .../sdks/java-multi-language-pipelines/index.html  | 57 ++++++++++++++++------
 website/generated-content/sitemap.xml              |  2 +-
 2 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html b/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
index 8dee467d6de..92b81bb1db1 100644
--- a/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
+++ b/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
@@ -19,7 +19,7 @@
 function addPlaceholder(){$('input:text').attr('placeholder',"What are you 
looking for?");}
 function endSearch(){var 
search=document.querySelector(".searchBar");search.classList.add("disappear");var
 icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
 function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Languages</span></li><li><span 
class=section-nav-list-title>Java</span><ul class=section-nav-list><li><a 
href=/documentation/sdks/java/>Java SDK overvi [...]
+function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Languages</span></li><li><span 
class=section-nav-list-title>Java</span><ul class=section-nav-list><li><a 
href=/documentation/sdks/java/>Java SDK overvi [...]
 with the Apache Beam SDK for Java. For a more complete discussion of the topic,
 see
 <a 
href=/documentation/programming-guide/#multi-language-pipelines>Multi-language 
pipelines</a>.</p><p>A <em>multi-language pipeline</em> is a pipeline that’s 
built in one Beam SDK language
@@ -96,19 +96,20 @@ a function to <code>DataframeTransform</code>, see
 <a 
href=/documentation/dsls/dataframes/overview/#embedding-dataframes-in-a-pipeline>Embedding
 DataFrames in a pipeline</a>.</p><h2 id=run-the-java-pipeline>Run the Java 
pipeline</h2><p>If you want to customize the environment or use transforms not 
available in the
 default Beam SDK, you might need to run your own expansion service. In such
 cases, <a href=#advanced-start-an-expansion-service>start the expansion 
service</a>
-before running your pipeline.</p><p>Here we&rsquo;ve provided commands for 
running the example pipeline using
-Gradle on a <a href=https://github.com/apache/beam>Beam HEAD Git clone</a>.
-If you need a more stable environment, please
-<a href=/get-started/quickstart-java/>setup a Java project</a> that uses the 
latest
-released Beam version and include the necessary dependencies.</p><h3 
id=run-with-dataflow-runner>Run with Dataflow runner</h3><p>The following 
script runs the example multi-language pipeline on Dataflow, using
+before running your pipeline.</p><h3 
id=run-with-dataflow-runner-at-head-beam-2410-and-later>Run with Dataflow 
runner at HEAD (Beam 2.41.0 and 
later)</h3><blockquote><p><strong>Note:</strong> Due to <a 
href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a 
href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these
 instructions</a>)
+and using the additional pipeline option 
<code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><p>The following script runs the 
example multi-language pipeline on Dataflow, using
 example text from a Cloud Storage bucket. You’ll need to adapt the script to
-your environment.</p><pre><code>export OUTPUT_BUCKET=&lt;bucket&gt;
+your environment.</p><pre><code>export GCP_PROJECT=&lt;project&gt;
+export OUTPUT_BUCKET=&lt;bucket&gt;
 export GCP_REGION=&lt;region&gt;
 export TEMP_LOCATION=gs://$OUTPUT_BUCKET/tmp
-export PYTHON_VERSION=&lt;version&gt;
 
 ./gradlew :examples:multi-language:pythonDataframeWordCount --args=&quot; \
 --runner=DataflowRunner \
+--project=$GCP_PROJECT \
 --output=gs://${OUTPUT_BUCKET}/count \
 --region=${GCP_REGION}&quot;
 </code></pre><p>The pipeline outputs a file with the results to
@@ -120,9 +121,12 @@ Please see <a href=/get-started/quickstart-py/>here</a> for instructions.</li><l
 python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
 </code></pre><ol start=3><li><p>In a different shell, go to a <a 
href=https://github.com/apache/beam>Beam HEAD Git 
clone</a>.</p></li><li><p>Build the Beam Java SDK container for a local 
pipeline execution
 (this guide requires that your JAVA_HOME is set to Java 
11).</p></li></ol><pre><code>./gradlew :sdks:java:container:java11:docker
-</code></pre><ol start=5><li>Run the pipeline.</li></ol><pre><code>export 
JOB_SERVER_PORT=&lt;port&gt;  # Same port as before
+</code></pre><ol start=5><li>Run the 
pipeline.</li></ol><blockquote><p><strong>Note:</strong> Due to <a 
href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a 
href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these
 instructions</a>)
+and using the additional pipeline option 
<code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><pre><code>export 
JOB_SERVER_PORT=&lt;port&gt;  # Same port as before
 export OUTPUT_FILE=&lt;local relative path&gt;
-export PYTHON_VERSION=&lt;version&gt;
 
 ./gradlew :examples:multi-language:pythonDataframeWordCount --args=&quot; \
 --runner=PortableRunner \
@@ -141,13 +145,38 @@ starting up the expansion service. But if you want to customize the environment
 or use transforms not available in the default Beam SDK, you might need to run
 your own expansion service.</p><p>For example, to start the standard expansion 
service for a Python transform,
 <a 
href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service.py>ExpansionServiceServicer</a>,
-follow these steps:</p><ol><li><p>Activate a Python virtual environment and 
install Apache Beam, as described
-in the <a href=/get-started/quickstart-py/>Python quick 
start</a>.</p></li><li><p>In the <strong>beam/sdks/python</strong> directory of 
the Beam source code, run the
-following command:</p><pre><code>python apache_beam/runners/portability/expansion_service_main.py -p 18089 --fully_qualified_name_glob &quot;*&quot;
+follow these steps:</p><ol><li><p>Activate a new virtual environment following
+<a 
href=https://beam.apache.org/get-started/quickstart-py/#create-and-activate-a-virtual-environment>these
 instructions</a>.</p></li><li><p>Install Apache Beam with <code>gcp</code> and 
<code>dataframe</code> packages.</p></li></ol><pre><code>pip install 
apache-beam[gcp,dataframe]
+</code></pre><ol start=4><li><p>Run the following command</p><pre><code>python -m apache_beam.runners.portability.expansion_service_main -p &lt;PORT&gt; --fully_qualified_name_glob &quot;*&quot;
 </code></pre></li></ol><p>The command runs
 <a 
href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service_main.py>expansion_service_main.py</a>,
 which starts the standard expansion service. When you use
 Gradle to run your Java pipeline, you can specify the expansion service with 
the
-<code>expansionService</code> option. For example: 
<code>--expansionService=localhost:18089</code>.</p><h2 id=next-steps>Next 
steps</h2><p>To learn more about Beam support for cross-language pipelines, see
+<code>expansionService</code> option. For example: 
<code>--expansionService=localhost:&lt;PORT></code>.</p><h3 
id=run-with-dataflow-runner-using-a-beam-release-beam-2430-and-later>Run with 
Dataflow runner using a Beam release (Beam 2.43.0 and 
later)</h3><blockquote><p><strong>Note:</strong> Due to <a 
href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a 
href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these
 instructions</a>)
+and using the additional pipeline option 
<code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><ul><li>Check out the Beam 
examples Maven archetype for the relevant Beam 
version.</li></ul><pre><code>export BEAM_VERSION=&lt;Beam version&gt;
+
+mvn archetype:generate \
+    -DarchetypeGroupId=org.apache.beam \
+    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
+    -DarchetypeVersion=$BEAM_VERSION \
+    -DgroupId=org.example \
+    -DartifactId=multi-language-beam \
+    -Dversion=&quot;0.1&quot; \
+    -Dpackage=org.apache.beam.examples \
+    -DinteractiveMode=false
+</code></pre><ul><li>Run the pipeline.</li></ul><pre><code>export 
GCP_PROJECT=&lt;GCP project&gt;
+export GCP_BUCKET=&lt;GCP bucket&gt;
+export GCP_REGION=&lt;GCP region&gt;
+
+mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.multilanguage.PythonDataframeWordCount \
+    -Dexec.args=&quot;--runner=DataflowRunner --project=$GCP_PROJECT \
+                 --region=us-central1 \
+                 --gcpTempLocation=gs://$GCP_BUCKET/multi-language-beam/tmp \
+                 --output=gs://$GCP_BUCKET/multi-language-beam/output&quot; \
+    -Pdataflow-runner
+</code></pre><h2 id=next-steps>Next steps</h2><p>To learn more about Beam 
support for cross-language pipelines, see
 <a 
href=/documentation/programming-guide/#multi-language-pipelines>Multi-language 
pipelines</a>.
 To learn more about the Beam DataFrame API, see
 <a href=/documentation/dsls/dataframes/overview/>Beam DataFrames 
overview</a>.</p></div></div><footer class=footer><div 
class=footer__contained><div class=footer__cols><div class="footer__cols__col 
footer__cols__col__logos"><div class=footer__cols__col__logo><img 
src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div 
class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg 
class=footer__logo alt="Apache logo"></div></div><div class=footer-wrapper><div 
[...]
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index e4ea3823738..39e84cb2573 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.42.0/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/catego
 [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.42.0/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/catego
 [...]
\ No newline at end of file
