[ 
https://issues.apache.org/jira/browse/BEAM-14471?focusedWorklogId=774198&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774198
 ]

ASF GitHub Bot logged work on BEAM-14471:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/May/22 19:49
            Start Date: 24/May/22 19:49
    Worklog Time Spent: 10m 
      Work Description: ihji commented on code in PR #17674:
URL: https://github.com/apache/beam/pull/17674#discussion_r880779176


##########
examples/java/src/main/java/org/apache/beam/examples/multilang/PythonDataframeWordCount.java:
##########
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.examples.multilang;
+
+import org.apache.beam.examples.common.ExampleUtils;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Distribution;
+import org.apache.beam.sdk.metrics.Metrics;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.Validation.Required;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.MapElements;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SimpleFunction;
+import org.apache.beam.sdk.util.PythonCallableSource;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.Row;
+
+/**
+ * An example that counts words in Shakespeare and utilizes a Python external transform.
+ *
+ * <p>This class, {@link PythonDataframeWordCount}, uses Python DataframeTransform to count words
+ * from the input text file. The Python expansion service provided by --expansionService must allow
+ * the expansion of apache_beam.dataframe.transforms.DataframeTransform (which can be done by
+ * passing --fully_qualified_name_glob commandline option when launching the expansion service).
+ *
+ * <p>Note that, for using Dataflow Runner, you should specify the following two additional
+ * arguments:
+ *
+ * <pre>{@code
+ * --experiments=use_runner_v2
+ * --sdkHarnessContainerImageOverrides=.*python.*,gcr.io/apache-beam-testing/beam-sdk/beam_python3.8_sdk:latest

Review Comment:
   Changed to use the automatic expansion service from PythonExternalTransform.
However, the example will not work without the `--expansion_service` option until
we release a newer SDK version, because PythonExternalTransform launches an
expansion service by installing a released SDK.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 774198)
    Time Spent: 7h 20m  (was: 7h 10m)

> Adding testcases and examples for xlang Python DataframeTransform 
> ------------------------------------------------------------------
>
>                 Key: BEAM-14471
>                 URL: https://issues.apache.org/jira/browse/BEAM-14471
>             Project: Beam
>          Issue Type: Improvement
>          Components: cross-language, testing
>            Reporter: Heejong Lee
>            Assignee: Heejong Lee
>            Priority: P2
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Adding testcases and examples for xlang Python DataframeTransform 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
