[
https://issues.apache.org/jira/browse/BEAM-6146?focusedWorklogId=258502&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-258502
]
ASF GitHub Bot logged work on BEAM-6146:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Jun/19 08:21
Start Date: 12/Jun/19 08:21
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #7180: [BEAM-6146] Run
pre-commit wordcount in batch and streaming mode.
URL: https://github.com/apache/beam/pull/7180#discussion_r292796018
##########
File path: sdks/python/build.gradle
##########
@@ -268,32 +269,37 @@ task directRunnerIT(dependsOn: 'installGcpTest') {
 //
 //   ./gradlew :beam-sdks-python:portableWordCount -PjobEndpoint=localhost:8099
 //
-task portableWordCount(dependsOn: 'installGcpTest') {
-  doLast {
-    // TODO: Figure out GCS credentials and use real GCS input and output.
-    def options = [
-        "--input=/etc/profile",
-        "--output=/tmp/py-wordcount-direct",
-        "--runner=PortableRunner",
-        "--experiments=worker_threads=100",
-    ]
-    if (project.hasProperty("streaming"))
-      options += ["--streaming"]
-    else
+task portableWordCount {
+  dependsOn portableWordCountTask('portableWordCountExample', project.hasProperty("streaming"), envdir)
+}
+
+def portableWordCountTask(name, streaming, envdir) {
+  tasks.create(name) {
+    dependsOn = ['installGcpTest']
+    mustRunAfter = [':beam-runners-flink_2.11-job-server-container:docker', ':beam-sdks-python-container:docker']
+    doLast {
+      // TODO: Figure out GCS credentials and use real GCS input and output.
+      def options = [
+          "--input=/etc/profile",
+          "--output=/tmp/py-wordcount-direct",
+          "--runner=PortableRunner",
+          "--experiments=worker_threads=100",
+      ]
+      if (streaming)
+        options += ["--streaming"]
+      else
         // workaround for local file output in docker container
-      options += ["--environment_cache_millis=10000"]
-    if (project.hasProperty("jobEndpoint"))
-      options += ["--job_endpoint=${project.property('jobEndpoint')}"]
-    exec {
-      executable 'sh'
-      args '-c', ". ${envdir}/bin/activate && python -m apache_beam.examples.wordcount ${options.join(' ')}"
-      // TODO: Check that the output file is generated and runs.
+        options += ["--environment_cache_millis=10000"]
+      if (project.hasProperty("jobEndpoint"))
+        options += ["--job_endpoint=${project.property('jobEndpoint')}"]
+      exec {
+        executable 'sh'
+        args '-c', ". ${envdir}/bin/activate && python -m apache_beam.examples.wordcount ${options.join(' ')}"
Review comment:
I think it does make sense, because batch and streaming are two different
execution modes, both in Flink and in the Flink Runner. It doesn't matter
much that the data is bounded, since all of the execution code paths are
still exercised. That is not to say the tests could not be more elaborate.
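For illustration only (an editor's sketch, not part of the diff above: the task
names are hypothetical, and envdir is assumed to be defined in the surrounding
build.gradle, as it is for the existing task), the portableWordCountTask helper
could also register one task per execution mode, so a pre-commit job can run the
wordcount example in both batch and streaming mode:

    task portableWordCountBatch {
      // Batch variant: streaming flag set to false.
      dependsOn portableWordCountTask('portableWordCountBatchExample', false, envdir)
    }

    task portableWordCountStreaming {
      // Streaming variant: streaming flag set to true.
      dependsOn portableWordCountTask('portableWordCountStreamingExample', true, envdir)
    }

As written in the diff, the single portableWordCount task instead selects the mode
via the -Pstreaming project property, e.g.
./gradlew :beam-sdks-python:portableWordCount -PjobEndpoint=localhost:8099 -Pstreaming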
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 258502)
Time Spent: 4.5h (was: 4h 20m)
> Portable Flink End to end precommit test
> ----------------------------------------
>
> Key: BEAM-6146
> URL: https://issues.apache.org/jira/browse/BEAM-6146
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Ankur Goenka
> Assignee: Ankur Goenka
> Priority: Major
> Fix For: 2.10.0
>
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> Create an end-to-end, wordcount-based pipeline test that runs on pre-commit.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)