[jira] [Created] (BEAM-5194) Pipeline options with multi value are not deserialized correctly from map
Ankur Goenka created BEAM-5194:
----------------------------------

Summary: Pipeline options with multi value are not deserialized correctly from map
Key: BEAM-5194
URL: https://issues.apache.org/jira/browse/BEAM-5194
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Ankur Goenka
Assignee: Ahmet Altay

[https://github.com/apache/beam/blob/7c41e0a915083bd3b1fe52c2a417fa38a00e6463/sdks/python/apache_beam/options/pipeline_options.py#L171]

Options that hold multiple values are converted to strings and added to the flags, which causes them to be deserialized incorrectly.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
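To illustrate the kind of failure described in BEAM-5194, here is a minimal sketch that uses only the standard argparse module, not Beam's actual code. The helper names (build_parser, naive_flags, expanded_flags) and the option values are hypothetical, and expanded_flags is just one possible way to handle list values, not necessarily the fix that was applied.

```python
import argparse


def build_parser():
    """Parser with one multi-valued option and one single-valued option."""
    parser = argparse.ArgumentParser()
    # action='append' collects repeated flags into a list.
    parser.add_argument('--experiments', action='append', default=None)
    parser.add_argument('--job_name', default=None)
    return parser


def naive_flags(options):
    """Stringify every value, lists included (the problematic pattern)."""
    return ['--%s=%s' % (name, value) for name, value in sorted(options.items())]


def expanded_flags(options):
    """Emit one flag per element for list-valued options."""
    flags = []
    for name, value in sorted(options.items()):
        values = value if isinstance(value, list) else [value]
        flags.extend('--%s=%s' % (name, item) for item in values)
    return flags


# Hypothetical option map used only for this illustration.
options = {'experiments': ['beam_fn_api', 'worker_threads=100'],
           'job_name': 'demo'}

naive = build_parser().parse_args(naive_flags(options))
fixed = build_parser().parse_args(expanded_flags(options))

print(naive.experiments)  # a single string that looks like a list
print(fixed.experiments)  # the original two elements
```

In the naive case the parser receives one flag whose value is the printed form of the list, so the round trip yields a one-element list instead of the original values.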
[jira] [Assigned] (BEAM-5194) Pipeline options with multi value are not deserialized correctly from map
[ https://issues.apache.org/jira/browse/BEAM-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-5194: -- Assignee: Ankur Goenka (was: Ahmet Altay) > Pipeline options with multi value are not deserialized correctly from map > - > > Key: BEAM-5194 > URL: https://issues.apache.org/jira/browse/BEAM-5194 > Project: Beam > Issue Type: Bug > Components: sdk-py-core > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > [https://github.com/apache/beam/blob/7c41e0a915083bd3b1fe52c2a417fa38a00e6463/sdks/python/apache_beam/options/pipeline_options.py#L171] > > Multiple options are converted to strings and added to flags which causes > wrong deserialization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-5156) Apache Beam on dataflow runner can't find Tensorflow for workers
[ https://issues.apache.org/jira/browse/BEAM-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589137#comment-16589137 ] Ankur Goenka edited comment on BEAM-5156 at 8/22/18 5:17 PM: - Check the thread [https://lists.apache.org/thread.html/0cbf73d696e0d3a5bb8e93618ac9d6bb81daecf2c9c8e11ee220c8ae@%3Cdev.beam.apache.org%3E] I am suspecting the same issue here. Please try using --experiments worker_threads=100 after fixing the setup.py, please use was (Author: angoenka): Check the thread [https://lists.apache.org/thread.html/0cbf73d696e0d3a5bb8e93618ac9d6bb81daecf2c9c8e11ee220c8ae@%3Cdev.beam.apache.org%3E] Please try using --experiments worker_threads=100 after fixing the setup.py, please use > Apache Beam on dataflow runner can't find Tensorflow for workers > > > Key: BEAM-5156 > URL: https://issues.apache.org/jira/browse/BEAM-5156 > Project: Beam > Issue Type: Bug > Components: beam-model > Environment: google cloud compute instance running linux >Reporter: Thomas Johns >Assignee: Kenneth Knowles >Priority: Major > Fix For: 2.5.0, 2.6.0 > > > Adding serialized tensorflow model to apache beam pipeline with python sdk > but it can not find any version of tensorflow when applied to dataflow runner > although it is not a problem locally. Tried various versions of tensorflow > from 1.6 to 1.10. I thought it might be a conflicting package some where so I > removed all other packages and tried to just install tensorflow and same > problem. > Could not find a version that satisfies the requirement tensorflow==1.6.0 > (from -r reqtest.txt (line 59)) (from versions: )No matching distribution > found for tensorflow==1.6.0 (from -r reqtest.txt (line 59)) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5156) Apache Beam on dataflow runner can't find Tensorflow for workers
[ https://issues.apache.org/jira/browse/BEAM-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589137#comment-16589137 ] Ankur Goenka commented on BEAM-5156: Check the thread [https://lists.apache.org/thread.html/0cbf73d696e0d3a5bb8e93618ac9d6bb81daecf2c9c8e11ee220c8ae@%3Cdev.beam.apache.org%3E] Please try using --experiments worker_threads=100 after fixing the setup.py, please use > Apache Beam on dataflow runner can't find Tensorflow for workers > > > Key: BEAM-5156 > URL: https://issues.apache.org/jira/browse/BEAM-5156 > Project: Beam > Issue Type: Bug > Components: beam-model > Environment: google cloud compute instance running linux >Reporter: Thomas Johns >Assignee: Kenneth Knowles >Priority: Major > Fix For: 2.5.0, 2.6.0 > > > Adding serialized tensorflow model to apache beam pipeline with python sdk > but it can not find any version of tensorflow when applied to dataflow runner > although it is not a problem locally. Tried various versions of tensorflow > from 1.6 to 1.10. I thought it might be a conflicting package some where so I > removed all other packages and tried to just install tensorflow and same > problem. > Could not find a version that satisfies the requirement tensorflow==1.6.0 > (from -r reqtest.txt (line 59)) (from versions: )No matching distribution > found for tensorflow==1.6.0 (from -r reqtest.txt (line 59)) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5219) Expose OutboundMessage in PubSub client
Ankur Goenka created BEAM-5219: -- Summary: Expose OutboundMessage in PubSub client Key: BEAM-5219 URL: https://issues.apache.org/jira/browse/BEAM-5219 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Ankur Goenka Assignee: Chamikara Jayalath publish method in org/apache/beam/sdk/io/gcp/pubsub/PubsubClient.java is public but the argument OutboundMessage is not public which makes the api unusable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5219) Expose OutboundMessage in PubSub client
[ https://issues.apache.org/jira/browse/BEAM-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-5219: -- Assignee: Ankur Goenka (was: Chamikara Jayalath) > Expose OutboundMessage in PubSub client > --- > > Key: BEAM-5219 > URL: https://issues.apache.org/jira/browse/BEAM-5219 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Minor > > publish method in org/apache/beam/sdk/io/gcp/pubsub/PubsubClient.java is > public but the argument OutboundMessage is not public which makes the api > unusable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4834) Validate cycles in org.apache.beam.runners.core.construction.graph.Network before doing topological sort
[ https://issues.apache.org/jira/browse/BEAM-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-4834: -- Assignee: Ankur Goenka (was: Kenneth Knowles) > Validate cycles in org.apache.beam.runners.core.construction.graph.Network > before doing topological sort > > > Key: BEAM-4834 > URL: https://issues.apache.org/jira/browse/BEAM-4834 > Project: Beam > Issue Type: Bug > Components: runner-core >Reporter: Ankur Goenka > Assignee: Ankur Goenka >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Cyclic graphs will never finish the topological sort so we should check the > cycle before doing the topological sort. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4834) Validate cycles in org.apache.beam.runners.core.construction.graph.Network before doing topological sort
Ankur Goenka created BEAM-4834:
----------------------------------

Summary: Validate cycles in org.apache.beam.runners.core.construction.graph.Network before doing topological sort
Key: BEAM-4834
URL: https://issues.apache.org/jira/browse/BEAM-4834
Project: Beam
Issue Type: Bug
Components: runner-core
Reporter: Ankur Goenka
Assignee: Kenneth Knowles

The topological sort never finishes on a cyclic graph, so we should check for cycles before doing the topological sort.
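As a rough illustration of the check proposed in BEAM-4834, the sketch below detects cycles by counting how many nodes a Kahn-style traversal can visit. It is written in Python with a plain node list and edge list, not against the Java Network class; the function name has_cycle and the example graph are assumptions made for this example.

```python
from collections import deque


def has_cycle(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains a cycle.

    Kahn-style traversal: repeatedly remove nodes with no incoming edges.
    Any node that can never be removed lies on, or downstream of, a cycle.
    """
    successors = {node: [] for node in nodes}
    in_degree = {node: 0 for node in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1

    ready = deque(node for node in nodes if in_degree[node] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
    return visited != len(nodes)


# Hypothetical graphs used only for this illustration.
acyclic = has_cycle(['a', 'b', 'c'], [('a', 'b'), ('b', 'c')])
cyclic = has_cycle(['a', 'b', 'c'], [('a', 'b'), ('b', 'c'), ('c', 'a')])
print(acyclic, cyclic)
```

A caller could run such a check first and fail fast with a clear error instead of starting a sort that cannot terminate.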
[jira] [Created] (BEAM-4810) Flaky test BeamFileSystemArtifactServicesTest
Ankur Goenka created BEAM-4810:
----------------------------------

Summary: Flaky test BeamFileSystemArtifactServicesTest
Key: BEAM-4810
URL: https://issues.apache.org/jira/browse/BEAM-4810
Project: Beam
Issue Type: Bug
Components: runner-core
Reporter: Ankur Goenka
Assignee: Ankur Goenka

The test is flaky because we do not wait for putArtifact to complete.

Here is a failing build:
https://builds.apache.org/job/beam_PreCommit_Java_Commit/340/testReport/org.apache.beam.runners.fnexecution.artifact/BeamFileSystemArtifactServicesTest/putArtifactsSingleSmallFileTest/
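The general pattern behind this kind of flakiness can be sketched as follows. This is a Python illustration with concurrent.futures, not the Java test itself; put_artifact, upload_and_wait, and the in-memory store are hypothetical stand-ins for the asynchronous putArtifact call and the artifact storage.

```python
import concurrent.futures
import time


def put_artifact(store, name, data):
    """Stand-in for an asynchronous upload; the sleep simulates latency."""
    time.sleep(0.05)
    store[name] = data


def upload_and_wait(store, name, data, timeout=5):
    """Start the upload and block until it has completed."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(put_artifact, store, name, data)
        # Without this call, a check made right after submit() could run
        # before the upload has finished, which makes the outcome timing
        # dependent.
        future.result(timeout=timeout)


store = {}
upload_and_wait(store, 'file.txt', b'payload')
print(store)
```

Waiting on the future before inspecting the store removes the race between the upload and the verification step.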
[jira] [Created] (BEAM-5273) Local file system does not work as expected on Portability Framework with Docker
Ankur Goenka created BEAM-5273:
----------------------------------

Summary: Local file system does not work as expected on Portability Framework with Docker
Key: BEAM-5273
URL: https://issues.apache.org/jira/browse/BEAM-5273
Project: Beam
Issue Type: Bug
Components: sdk-go, sdk-java-harness, sdk-py-harness
Reporter: Ankur Goenka
Assignee: Ankur Goenka

With the portability framework, the local file system reads from and writes to the Docker container's file system. This makes it impossible to use local files with the portability framework.
[jira] [Created] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins
Ankur Goenka created BEAM-5284: -- Summary: Enable Java Portable Flink PostCommit Tests to Jenkins Key: BEAM-5284 URL: https://issues.apache.org/jira/browse/BEAM-5284 Project: Beam Issue Type: Test Components: testing Reporter: Ankur Goenka Assignee: Ankur Goenka -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins
Ankur Goenka created BEAM-5283: -- Summary: Enable Python Portable Flink PostCommit Tests to Jenkins Key: BEAM-5283 URL: https://issues.apache.org/jira/browse/BEAM-5283 Project: Beam Issue Type: Test Components: testing Reporter: Ankur Goenka Assignee: Jason Kuster -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5308) JobBundleFactory BindException with FlinkRunner and remote cluster
[ https://issues.apache.org/jira/browse/BEAM-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604840#comment-16604840 ] Ankur Goenka commented on BEAM-5308: I agree, the bug is setting the port range. > JobBundleFactory BindException with FlinkRunner and remote cluster > -- > > Key: BEAM-5308 > URL: https://issues.apache.org/jira/browse/BEAM-5308 > Project: Beam > Issue Type: Task > Components: runner-flink >Reporter: Thomas Weise >Assignee: Maximilian Michels >Priority: Major > Labels: portability > Time Spent: 20m > Remaining Estimate: 0h > > Repeated execution of the same job on remote Flink cluster (not embedded in > job server) fails with bind exception. There seem to be 2 issues: > * Multiple instances of job bundle factory cannot be created (port conflict) > * Job bundle factory is not released after job completes (and Docker > container keeps on running). That's not the case in embedded mode). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604833#comment-16604833 ]

Ankur Goenka commented on BEAM-5283:
------------------------------------

The root cause seems to be related to permissions.

:beam-sdks-python:setupVirtualenv FAILED
New python executable in /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_PortableValidatesRunner_Flink_Gradle/src/sdks/python/build/gradleenv/bin/python2
Also creating executable in /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_PortableValidatesRunner_Flink_Gradle/src/sdks/python/build/gradleenv/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
Collecting tox==3.0.0
  Using cached https://files.pythonhosted.org/packages/e6/41/4dcfd713282bf3213b0384320fa8841e4db032ddcb80bc08a540159d42a8/tox-3.0.0-py2.py3-none-any.whl
Collecting grpcio-tools==1.3.5
  Using cached https://files.pythonhosted.org/packages/05/f6/0296e29b1bac6f85d2a8556d48adf825307f73109a3c2c17fb734292db0a/grpcio_tools-1.3.5-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pluggy<1.0,>=0.3.0 (from tox==3.0.0)
  Using cached https://files.pythonhosted.org/packages/f5/f1/5a93c118663896d83f7bcbfb7f657ce1d0c0d617e6b4a443a53abcc658ca/pluggy-0.7.1-py2.py3-none-any.whl
Requirement not upgraded as not directly required: six in /usr/local/lib/python2.7/dist-packages (from tox==3.0.0) (1.11.0)
Requirement not upgraded as not directly required: virtualenv>=1.11.2 in /usr/lib/python2.7/dist-packages (from tox==3.0.0) (15.0.1)
Collecting py>=1.4.17 (from tox==3.0.0)
  Using cached https://files.pythonhosted.org/packages/c8/47/d179b80ab1dc1bfd46a0c87e391be47e6c7ef5831a9c138c5c49d1756288/py-1.6.0-py2.py3-none-any.whl
Collecting grpcio>=1.3.5 (from grpcio-tools==1.3.5)
  Using cached https://files.pythonhosted.org/packages/bd/a6/4bad0d1a49071363dc6547a5178656fe375c80535128c12bb65c59d1a329/grpcio-1.14.2-cp27-cp27mu-manylinux1_x86_64.whl
Collecting protobuf>=3.2.0 (from grpcio-tools==1.3.5)
  Using cached https://files.pythonhosted.org/packages/b8/c2/b7f587c0aaf8bf2201405e8162323037fe8d17aa21d3c7dda811b8d01469/protobuf-3.6.1-cp27-cp27mu-manylinux1_x86_64.whl
Requirement not upgraded as not directly required: enum34>=1.0.4 in /usr/local/lib/python2.7/dist-packages (from grpcio>=1.3.5->grpcio-tools==1.3.5) (1.1.6)
Collecting futures>=2.2.0 (from grpcio>=1.3.5->grpcio-tools==1.3.5)
  Using cached https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
Requirement not upgraded as not directly required: setuptools in /usr/local/lib/python2.7/dist-packages (from protobuf>=3.2.0->grpcio-tools==1.3.5) (39.0.1)
Installing collected packages: pluggy, py, tox, futures, grpcio, protobuf, grpcio-tools
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/pluggy-0.7.1.dist-info'
Consider using the `--user` option or check the permissions.
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

> Enable Python Portable Flink PostCommit Tests to Jenkins
> --------------------------------------------------------
>
> Key: BEAM-5283
> URL: https://issues.apache.org/jira/browse/BEAM-5283
> Project: Beam
> Issue Type: Test
> Components: testing
> Reporter: Ankur Goenka
> Assignee: Jason Kuster
> Priority: Major
> Labels: CI
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Created] (BEAM-5262) JobState support for Reference Runner
Ankur Goenka created BEAM-5262: -- Summary: JobState support for Reference Runner Key: BEAM-5262 URL: https://issues.apache.org/jira/browse/BEAM-5262 Project: Beam Issue Type: Bug Components: runner-direct Reporter: Ankur Goenka Assignee: Ankur Goenka Reference runner does not support getStateStream which is needed by portable SDK -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3676) FlinkRunner: Portable state service
[ https://issues.apache.org/jira/browse/BEAM-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597133#comment-16597133 ]

Ankur Goenka commented on BEAM-3676:
------------------------------------

[~axelmagn] is there anything left on this jira?

> FlinkRunner: Portable state service
> -----------------------------------
>
> Key: BEAM-3676
> URL: https://issues.apache.org/jira/browse/BEAM-3676
> Project: Beam
> Issue Type: Sub-task
> Components: runner-flink
> Reporter: Ben Sidhom
> Assignee: Axel Magnuson
> Priority: Major
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> The State API is an implementation of BeamFnState that exposes pipeline state
> to SDK harnesses. Because it is used for side inputs, this service will also
> need to be tied into side inputs/outputs during the translation phase.
[jira] [Commented] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target
[ https://issues.apache.org/jira/browse/BEAM-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607583#comment-16607583 ] Ankur Goenka commented on BEAM-5337: Tried running the failing gradle task ./gradlew :beam-runners-flink_2.11:test locally but it always succeeds. Will try to reproduce it again to inspect the point at which it gets stuck. > [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] > Build times out in beam-runners-flink target > - > > Key: BEAM-5337 > URL: https://issues.apache.org/jira/browse/BEAM-5337 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mikhail Gryzykhin > Assignee: Ankur Goenka >Priority: Major > > Job times out. > Failing job url: > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull] > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull] > https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-4130: -- Assignee: Ankur Goenka > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Ankur Goenka >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
[ https://issues.apache.org/jira/browse/BEAM-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-4023. Resolution: Fixed Fix Version/s: Not applicable > Log warning for missing worker id in FnApiControlClientPoolService > -- > > Key: BEAM-4023 > URL: https://issues.apache.org/jira/browse/BEAM-4023 > Project: Beam > Issue Type: Bug > Components: sdk-java-core > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: Not applicable > > > We should log warning for missing worker id when connecting the GRPC channel. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM
[ https://issues.apache.org/jira/browse/BEAM-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-3418. Resolution: Fixed Fix Version/s: 2.6.0 > Python Fnapi - Support Multiple SDK workers on a single VM > -- > > Key: BEAM-3418 > URL: https://issues.apache.org/jira/browse/BEAM-3418 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: performance, portability > Fix For: 2.6.0 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > Support multiple python SDK process on a VM to fully utilize a machine. > Each SDK Process will work in isolation and interact with Runner Harness > independently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4773) Flink job fails if image pull fails.
[ https://issues.apache.org/jira/browse/BEAM-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542053#comment-16542053 ] Ankur Goenka commented on BEAM-4773: Fix in PR [https://github.com/apache/beam/pull/5933] > Flink job fails if image pull fails. > > > Key: BEAM-4773 > URL: https://issues.apache.org/jira/browse/BEAM-4773 > Project: Beam > Issue Type: Bug > Components: runner-flink > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > If docker pull request fails then flink job fails. > https://github.com/apache/beam/commit/316a9962eecfc55cf5537ee547a16c2bc75ee333 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4773) Flink job fails if image pull fails.
Ankur Goenka created BEAM-4773: -- Summary: Flink job fails if image pull fails. Key: BEAM-4773 URL: https://issues.apache.org/jira/browse/BEAM-4773 Project: Beam Issue Type: Bug Components: runner-flink Reporter: Ankur Goenka Assignee: Ankur Goenka If docker pull request fails then flink job fails. https://github.com/apache/beam/commit/316a9962eecfc55cf5537ee547a16c2bc75ee333 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4784) Python SDK harness container build fails
[ https://issues.apache.org/jira/browse/BEAM-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16543769#comment-16543769 ]

Ankur Goenka commented on BEAM-4784:
------------------------------------

I am not able to reproduce it. Here is a working build on master:
[https://scans.gradle.com/s/2cq26ubquy7ly/timeline?task=2a5wa25ihupj4]

Suggestion: try ./gradlew clean and then build with
./gradlew -p sdks/python/container docker --scan --no-daemon

> Python SDK harness container build fails
> ----------------------------------------
>
> Key: BEAM-4784
> URL: https://issues.apache.org/jira/browse/BEAM-4784
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Thomas Weise
> Assignee: Robert Bradshaw
> Priority: Major
> Fix For: 2.6.0
>
> A new build failure has surfaced for ./gradlew -p sdks/python/container
> docker (fails in :beam-sdks-python:sdist)
> It can be reproduced even on a commit that succeeded before. Basically it
> attempts to retrieve protobuf for several minutes before it finally fails
> with "RuntimeError: maximum recursion depth exceeded"
[jira] [Created] (BEAM-4787) Ignore generated vendored files for python container
Ankur Goenka created BEAM-4787:
----------------------------------

Summary: Ignore generated vendored files for python container
Key: BEAM-4787
URL: https://issues.apache.org/jira/browse/BEAM-4787
Project: Beam
Issue Type: Bug
Components: dependencies
Reporter: Ankur Goenka
Assignee: Ankur Goenka

The Python container build generates a number of vendor files, which should be ignored in git.
[jira] [Updated] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM
[ https://issues.apache.org/jira/browse/BEAM-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-3418: --- Description: Support multiple python SDK process on a VM to fully utilize a machine. Each SDK Process will work in isolation and interact with Runner Harness independently. was:Support multiple python SDK process on a VM to fully utilize a machine. > Python Fnapi - Support Multiple SDK workers on a single VM > -- > > Key: BEAM-3418 > URL: https://issues.apache.org/jira/browse/BEAM-3418 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: performance, portability > > Support multiple python SDK process on a VM to fully utilize a machine. > Each SDK Process will work in isolation and interact with Runner Harness > independently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM
[ https://issues.apache.org/jira/browse/BEAM-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-3418: --- Summary: Python Fnapi - Support Multiple SDK workers on a single VM (was: Python Fnapi - Support Multiple workers on a single VM) > Python Fnapi - Support Multiple SDK workers on a single VM > -- > > Key: BEAM-3418 > URL: https://issues.apache.org/jira/browse/BEAM-3418 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: performance, portability > > Support multiple python SDK process on a VM to fully utilize a machine. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-3418) Python Fnapi - Support Multiple workers on a single VM
[ https://issues.apache.org/jira/browse/BEAM-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-3418: --- Summary: Python Fnapi - Support Multiple workers on a single VM (was: Python Fnapi - Multiprocess worker) > Python Fnapi - Support Multiple workers on a single VM > -- > > Key: BEAM-3418 > URL: https://issues.apache.org/jira/browse/BEAM-3418 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: performance, portability > > Support multiple python SDK process on a VM to fully utilize a machine. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5369) Portable wordcount java broken because of create_view usage
Ankur Goenka created BEAM-5369:
----------------------------------

Summary: Portable wordcount java broken because of create_view usage
Key: BEAM-5369
URL: https://issues.apache.org/jira/browse/BEAM-5369
Project: Beam
Issue Type: Test
Components: sdk-java-core
Reporter: Ankur Goenka
Assignee: Ankur Goenka

Portable Wordcount is broken on Flink since
[https://github.com/apache/beam/pull/6208/files#diff-14d60e038b469ee6ce66ec6ec9d7d976L153]

The SDK should stop using the create_view URN, as it is deprecated.
[jira] [Commented] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
[ https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611413#comment-16611413 ]

Ankur Goenka commented on BEAM-5288:
------------------------------------

It would be good to expose args separately in the Process Environment, so that we escape them correctly when we issue the command.

message ProcessPayload {
  string os = 1;                 // "linux", "darwin", ..
  string arch = 2;               // "amd64", ..
  string command = 3;            // process to execute
  map<string, string> env = 4;   // environment variables
}

==>>

message ProcessPayload {
  string os = 1;                 // "linux", "darwin", ..
  string arch = 2;               // "amd64", ..
  string command = 3;            // process to execute
  repeated string args = 4;      // Arguments to the command
  map<string, string> env = 5;   // Environment variables
}

> Modify Environment to support non-dockerized SDK harness deployments
> ---------------------------------------------------------------------
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
> Issue Type: New Feature
> Components: beam-model
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to
> extend the Environment information. In addition to the Docker environment,
> the extended environment holds deployment options for 1) a process-based
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
> message Environment {
>   // (Required) The URN of the payload
>   string urn = 1;
>   // (Optional) The data specifying any parameters to the URN. If
>   // the URN does not require any arguments, this may be omitted.
>   bytes payload = 2;
> }
> message StandardEnvironments {
>   enum Environments {
>     DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>     PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>     EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>   }
> }
> // The payload of a Docker image
> message DockerPayload {
>   string container_image = 1; // implicitly linux_amd64.
> }
> message ProcessPayload {
>   string os = 1; // "linux", "darwin", ..
>   string arch = 2; // "amd64", ..
>   string command = 3; // process to execute
>   map<string, string> env = 4; // environment variables
> }
> {noformat}
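The motivation for a separate args field can be sketched in Python: when the command and its arguments are kept as separate strings, they can be handed to the process launcher as a list, and no shell quoting is needed. This is only an illustration with the standard subprocess and shlex modules, not the runner's implementation; build_argv and the example argument are hypothetical.

```python
import shlex
import subprocess
import sys


def build_argv(command, args):
    """Combine a command and its separately stored arguments into an argv."""
    return [command] + list(args)


# Hypothetical argument containing spaces and a shell metacharacter.
tricky = 'hello world; echo oops'
argv = build_argv(sys.executable,
                  ['-c', 'import sys; print(sys.argv[1])', tricky])

# Passing a list (no shell) delivers each argument to the child unchanged.
result = subprocess.run(argv, capture_output=True, text=True, check=True)
echoed = result.stdout.strip()
print(echoed)

# If a single command line is unavoidable, each part must be quoted first.
command_line = ' '.join(shlex.quote(part) for part in argv)
round_trip = shlex.split(command_line)
```

With a single combined command string, the launcher would have to guess where one argument ends and the next begins; the list form avoids that ambiguity.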
[jira] [Assigned] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
[ https://issues.apache.org/jira/browse/BEAM-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-4023: -- Assignee: Ankur Goenka (was: Henning Rohde) > Log warning for missing worker id in FnApiControlClientPoolService > -- > > Key: BEAM-4023 > URL: https://issues.apache.org/jira/browse/BEAM-4023 > Project: Beam > Issue Type: Bug > Components: sdk-go > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Blocker > > When running the maven build in a machine without the valid auth credentials > the module breaks like this: > {code:bash} > {{[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ > beam-runners-gcp-gcsproxy ---}} > {{[INFO] Prepared command line : bin/go build -buildmode=default -o > /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy > github.com/apache/beam/cmd/gcsproxy}} > {{[ERROR] }} > {{[ERROR] -Exec.Err-}} > {{[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx}} > {{[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: > undefined: option.WithoutAuthentication}} > {{[ERROR] }} > {{}} > {{[INFO] Apache Beam :: Runners :: Google Cloud Platform :: GCS artifact > proxy FAILURE [ 1.038 s]}} > {{}} > {{[INFO] BUILD FAILURE}} > {{}} > {{[ERROR] Failed to execute goal > com.igormaznitsa:mvn-golang-wrapper:2.1.6:build (go-build) on project > beam-runners-gcp-gcsproxy: Can't find generated target file : > /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy -> [Help 1]}} > {{[ERROR] }} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
[ https://issues.apache.org/jira/browse/BEAM-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-4023: --- Priority: Major (was: Blocker) > Log warning for missing worker id in FnApiControlClientPoolService > -- > > Key: BEAM-4023 > URL: https://issues.apache.org/jira/browse/BEAM-4023 > Project: Beam > Issue Type: Bug > Components: sdk-go > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > We should -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
Ankur Goenka created BEAM-4023: -- Summary: Log warning for missing worker id in FnApiControlClientPoolService Key: BEAM-4023 URL: https://issues.apache.org/jira/browse/BEAM-4023 Project: Beam Issue Type: Bug Components: sdk-go Reporter: Ankur Goenka Assignee: Henning Rohde When running the maven build in a machine without the valid auth credentials the module breaks like this: {code:bash} {{[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ beam-runners-gcp-gcsproxy ---}} {{[INFO] Prepared command line : bin/go build -buildmode=default -o /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy github.com/apache/beam/cmd/gcsproxy}} {{[ERROR] }} {{[ERROR] -Exec.Err-}} {{[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx}} {{[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: undefined: option.WithoutAuthentication}} {{[ERROR] }} {{}} {{[INFO] Apache Beam :: Runners :: Google Cloud Platform :: GCS artifact proxy FAILURE [ 1.038 s]}} {{}} {{[INFO] BUILD FAILURE}} {{}} {{[ERROR] Failed to execute goal com.igormaznitsa:mvn-golang-wrapper:2.1.6:build (go-build) on project beam-runners-gcp-gcsproxy: Can't find generated target file : /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy -> [Help 1]}} {{[ERROR] }} {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
[ https://issues.apache.org/jira/browse/BEAM-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-4023: --- Description: We should (was: When running the maven build in a machine without the valid auth credentials the module breaks like this: {code:bash} {{[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ beam-runners-gcp-gcsproxy ---}} {{[INFO] Prepared command line : bin/go build -buildmode=default -o /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy github.com/apache/beam/cmd/gcsproxy}} {{[ERROR] }} {{[ERROR] -Exec.Err-}} {{[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx}} {{[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: undefined: option.WithoutAuthentication}} {{[ERROR] }} {{}} {{[INFO] Apache Beam :: Runners :: Google Cloud Platform :: GCS artifact proxy FAILURE [ 1.038 s]}} {{}} {{[INFO] BUILD FAILURE}} {{}} {{[ERROR] Failed to execute goal com.igormaznitsa:mvn-golang-wrapper:2.1.6:build (go-build) on project beam-runners-gcp-gcsproxy: Can't find generated target file : /home/ismael/upstream/beam/runners/gcp/gcsproxy/target/gcsproxy -> [Help 1]}} {{[ERROR] }} {code}) > Log warning for missing worker id in FnApiControlClientPoolService > -- > > Key: BEAM-4023 > URL: https://issues.apache.org/jira/browse/BEAM-4023 > Project: Beam > Issue Type: Bug > Components: sdk-go >Reporter: Ankur Goenka > Assignee: Ankur Goenka >Priority: Blocker > > We should -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4023) Log warning for missing worker id in FnApiControlClientPoolService
[ https://issues.apache.org/jira/browse/BEAM-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-4023: --- Description: We should log a warning for a missing worker id when connecting the gRPC channel. (was: We should ) > Log warning for missing worker id in FnApiControlClientPoolService > -- > > Key: BEAM-4023 > URL: https://issues.apache.org/jira/browse/BEAM-4023 > Project: Beam > Issue Type: Bug > Components: sdk-go > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > We should log a warning for a missing worker id when connecting the gRPC channel. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
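The behaviour requested in BEAM-4023 can be pictured with a minimal Python sketch. It is not the FnApiControlClientPoolService code (that service is Java). The dict standing in for gRPC metadata, the "worker_id" key, the logger name, and the fallback to a generated id are all assumptions made for illustration.

```python
import logging
import uuid

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("control_client_pool")


def resolve_worker_id(headers):
    """Return the worker id from connection headers, warning if it is absent.

    `headers` is a plain dict standing in for gRPC metadata; the key name
    "worker_id" is an assumption, not taken from the issue.
    """
    worker_id = headers.get("worker_id")
    if not worker_id:
        # Assumed fallback: generate an id so the connection can proceed,
        # but make the missing header visible in the logs.
        worker_id = str(uuid.uuid4())
        logger.warning(
            "No worker_id in connection headers; using generated id %s",
            worker_id)
    return worker_id
```

With this shape a missing id no longer passes silently: the connection still proceeds, but the log shows which clients connected without one.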
[jira] [Created] (BEAM-4022) beam_PreCommit_Python_MavenInstall failing with cython error
Ankur Goenka created BEAM-4022: -- Summary: beam_PreCommit_Python_MavenInstall failing with cython error Key: BEAM-4022 URL: https://issues.apache.org/jira/browse/BEAM-4022 Project: Beam Issue Type: Bug Components: testing Reporter: Ankur Goenka Assignee: Udi Meiri Seems to only happen in this working directory: {{/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild}} but not this: {{/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild@2}} {{ERROR: invocation failed (errno 2), args: ['/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/target/.tox/py27-cython2/bin/pip', 'install', 'cython==0.26.1'], cwd: /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python}} {{ Traceback (most recent call last):}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/bin/tox", line 11, in }} {{ sys.exit(run_main())}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", line 40, in run_main}} {{ main(args)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", line 46, in main}} {{ retcode = Session(config).runcommand()}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", line 415, in runcommand}} {{ return self.subcommand_test()}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", line 599, in subcommand_test}} {{ if self.setupenv(venv):}} {{ File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", line 491, in setupenv}} {{ status = venv.update(action=action)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", line 171, in update}} {{ self.hook.tox_testenv_install_deps(action=action, venv=self)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", line 617, in __call__}} {{ return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", line 222, in _hookexec}} {{ return self._inner_hookexec(hook, methods, kwargs)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", line 216, in }} {{ firstresult=hook.spec_opts.get('firstresult'),}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", line 201, in _multicall}} {{ return outcome.get_result()}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", line 77, in get_result}} {{ _reraise(*ex) # noqa}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", line 180, in _multicall}} {{ res = hook_impl.function(*args)}} {{ File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", line 452, in tox_testenv_install_deps}} {{ venv._install(deps, action=action)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", line 331, in _install}} {{ action=action)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", line 303, in run_install_command}} {{ action=action, redirect=self.session.report.verbosity < 2)}} {{ File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", line 409,
[jira] [Assigned] (BEAM-4022) beam_PreCommit_Python_MavenInstall failing with cython error
[ https://issues.apache.org/jira/browse/BEAM-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-4022: -- Assignee: Ankur Goenka (was: Udi Meiri) > beam_PreCommit_Python_MavenInstall failing with cython error > > > Key: BEAM-4022 > URL: https://issues.apache.org/jira/browse/BEAM-4022 > Project: Beam > Issue Type: Bug > Components: testing > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > Seems to only happen in this working directory: > {{/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild}} > but not this: > {{/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild@2}} > {{ERROR: invocation failed (errno 2), args: > ['/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/target/.tox/py27-cython2/bin/pip', > 'install', 'cython==0.26.1'], cwd: > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python}} > {{ Traceback (most recent call last):}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/bin/tox", > line 11, in }} > {{ sys.exit(run_main())}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", > line 40, in run_main}} > {{ main(args)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", > line 46, in main}} > {{ retcode = Session(config).runcommand()}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", > line 415, in runcommand}} > {{ return self.subcommand_test()}} > {{ File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", > line 599, in subcommand_test}} > {{ if self.setupenv(venv):}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/session.py", > line 491, in setupenv}} > {{ status = venv.update(action=action)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", > line 171, in update}} > {{ self.hook.tox_testenv_install_deps(action=action, venv=self)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", > line 617, in __call__}} > {{ return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", > line 222, in _hookexec}} > {{ return self._inner_hookexec(hook, methods, kwargs)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/__init__.py", > line 216, in }} > {{ firstresult=hook.spec_opts.get('firstresult'),}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", > line 201, in _multicall}} > {{ return outcome.get_result()}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", > line 77, in get_result}} > {{ _reraise(*ex) # noqa}} > {{ File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/pluggy/callers.py", > line 180, in _multicall}} > {{ res = hook_impl.function(*args)}} > {{ File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_GradleBuild/src/sdks/python/build/gradleenv/local/lib/python2.7/site-packages/tox/venv.py", > line 452, in tox_testenv_install_deps}} > {{ venv._install(deps, action=action)}} > {{ File > "/home/jenkins/jenkins-slave/worksp
[jira] [Updated] (BEAM-4022) beam_PreCommit_Python_MavenInstall failing with cython error
[ https://issues.apache.org/jira/browse/BEAM-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-4022: --- Description: beam_PreCommit_Python_MavenInstall is failing with (link to failing job [https://builds.apache.org/view/A-D/view/ActiveMQ/job/beam_PreCommit_Python_MavenInstall/4425/console] ) OK (skipped=44) py27-gcp runtests: commands[5] | /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/run_tox_cleanup.sh py27-cython2 create: /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/.tox/py27-cython2 py27-cython2 installdeps: cython==0.26.1 ERROR: invocation failed (errno 2), args: ['/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/.tox/py27-cython2/bin/pip', 'install', 'cython==0.26.1'], cwd: /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python Traceback (most recent call last): File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/bin/tox", line 11, in sys.exit(run_main()) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/session.py", line 40, in run_main main(args) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/session.py", line 46, in main retcode = Session(config).runcommand() File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/session.py", line 415, in runcommand return self.subcommand_test() File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/session.py", line 599, in subcommand_test if self.setupenv(venv): File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/session.py", line 491, in setupenv status = venv.update(action=action) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/venv.py", line 171, in update self.hook.tox_testenv_install_deps(action=action, venv=self) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/__init__.py", line 617, in __call__ return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/__init__.py", line 222, in _hookexec return self._inner_hookexec(hook, methods, kwargs) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/__init__.py", line 216, in firstresult=hook.spec_opts.get('firstresult'), File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/callers.py", line 201, in _multicall return outcome.get_result() File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/callers.py", line 77, in get_result _reraise(*ex) # noqa File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/pluggy/callers.py", line 180, in _multicall res = hook_impl.function(*args) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/venv.py", line 452, in tox_testenv_install_deps venv._install(deps, 
action=action) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/venv.py", line 331, in _install action=action) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/venv.py", line 303, in run_install_command action=action, redirect=self.session.report.verbosity < 2) File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall@2/src/sdks/python/target/python/lib/python2.7/site-packages/tox/venv.py", line 409, in _pcall redirect=redirect, ignore
[jira] [Assigned] (BEAM-3883) Python SDK stages artifacts when talking to job server
[ https://issues.apache.org/jira/browse/BEAM-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-3883: -- Assignee: Ankur Goenka (was: Ahmet Altay) > Python SDK stages artifacts when talking to job server > -- > > Key: BEAM-3883 > URL: https://issues.apache.org/jira/browse/BEAM-3883 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Ben Sidhom >Assignee: Ankur Goenka >Priority: Major > > The Python SDK does not currently stage its user-defined functions or > dependencies when talking to the job API. Artifacts that need to be staged > include the user code itself, any SDK components not included in the > container image, and the list of Python packages that must be installed at > runtime. > > Artifacts that are currently expected can be found in the harness boot code: > [https://github.com/apache/beam/blob/58e3b06bee7378d2d8db1c8dd534b415864f63e1/sdks/python/container/boot.go#L52]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3996) Invalid test util ResourceIdTester#ValidateFailureResolvingIds
[ https://issues.apache.org/jira/browse/BEAM-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-3996: -- Assignee: Thomas Groh (was: Kenneth Knowles) > Invalid test util ResourceIdTester#ValidateFailureResolvingIds > -- > > Key: BEAM-3996 > URL: https://issues.apache.org/jira/browse/BEAM-3996 > Project: Beam > Issue Type: Bug > Components: sdk-java-core > Reporter: Ankur Goenka >Assignee: Thomas Groh >Priority: Major > > The test util linked here can never fail, because the call to fail is wrapped > in a try-catch block that catches Throwable. > https://github.com/apache/beam/blob/a1ef0aac298e10a04a8ee5afea4765374a9c7508/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java#L107 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-3996) Invalid test util ResourceIdTester#ValidateFailureResolvingIds
Ankur Goenka created BEAM-3996: -- Summary: Invalid test util ResourceIdTester#ValidateFailureResolvingIds Key: BEAM-3996 URL: https://issues.apache.org/jira/browse/BEAM-3996 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Ankur Goenka Assignee: Kenneth Knowles The test util linked here can never fail, because the call to fail is wrapped in a try-catch block that catches Throwable. https://github.com/apache/beam/blob/a1ef0aac298e10a04a8ee5afea4765374a9c7508/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java#L107 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
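The pitfall reported in BEAM-3996 is not specific to Java. The sketch below is a Python analogue, not the ResourceIdTester code: the function names and the choice of ValueError as the expected error are hypothetical. It shows why a failure signal raised inside a broad try block is swallowed, and one way to restructure the check.

```python
def broken_check(resolve):
    """Anti-pattern: the failure signal is raised inside the try block."""
    try:
        resolve()
        raise AssertionError("expected resolve() to raise")
    except Exception:
        # AssertionError is a subclass of Exception, so the failure raised
        # above is caught here as well and the check can never fail.
        pass


def fixed_check(resolve):
    """Only the call under test sits inside the try block."""
    try:
        resolve()
    except ValueError:
        return  # the expected error was observed
    # Reached only when resolve() did not raise; nothing catches this.
    raise AssertionError("expected resolve() to raise ValueError")
```

Catching only the expected exception type, and raising the failure outside the try block, means an unexpected success surfaces as a test failure instead of disappearing.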
[jira] [Created] (BEAM-3904) Don't use UUID when worker_id is missing
Ankur Goenka created BEAM-3904: -- Summary: Don't use UUID when worker_id is missing Key: BEAM-3904 URL: https://issues.apache.org/jira/browse/BEAM-3904 Project: Beam Issue Type: Task Components: sdk-py-harness Reporter: Ankur Goenka Assignee: Ankur Goenka Remove the default to a UUID when worker_id is not present and throw an exception in worker_id_interceptor.py, once the corresponding container changes have been rolled out. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
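A minimal sketch of the change described in BEAM-3904 follows. It does not reproduce worker_id_interceptor.py: the WORKER_ID variable name, the function name, and the choice of RuntimeError are assumptions for illustration. The mapping is passed in as a parameter so the behaviour can be exercised without touching the real process environment.

```python
def require_worker_id(environ):
    """Return the worker id from `environ`, raising if it is missing.

    `environ` is any mapping (for example os.environ). The key "WORKER_ID"
    is an assumed name for this sketch.
    """
    worker_id = environ.get("WORKER_ID")
    if not worker_id:
        # No silent UUID fallback: a missing id is treated as a setup error.
        raise RuntimeError(
            "WORKER_ID is not set; the container is expected to provide it")
    return worker_id
```

Failing fast makes a misconfigured container visible at startup, whereas a generated UUID would let the harness connect under an id the runner does not recognise.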
[jira] [Assigned] (BEAM-2930) Flink support for portable side input
[ https://issues.apache.org/jira/browse/BEAM-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-2930: -- Assignee: (was: Ben Sidhom) > Flink support for portable side input > - > > Key: BEAM-2930 > URL: https://issues.apache.org/jira/browse/BEAM-2930 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Henning Rohde >Priority: Major > Labels: portability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-2930) Flink support for portable side input
[ https://issues.apache.org/jira/browse/BEAM-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-2930: -- Assignee: Ben Sidhom > Flink support for portable side input > - > > Key: BEAM-2930 > URL: https://issues.apache.org/jira/browse/BEAM-2930 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Henning Rohde >Assignee: Ben Sidhom >Priority: Major > Labels: portability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-2930) Flink support for portable side input
[ https://issues.apache.org/jira/browse/BEAM-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526403#comment-16526403 ] Ankur Goenka commented on BEAM-2930: [~bsidhom] I think this is taken care of in the portable Flink runner using a broadcast variable. Do we need anything more here? > Flink support for portable side input > - > > Key: BEAM-2930 > URL: https://issues.apache.org/jira/browse/BEAM-2930 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Henning Rohde >Priority: Major > Labels: portability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5467) Python Flink ValidatesRunner job fixes
[ https://issues.apache.org/jira/browse/BEAM-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640350#comment-16640350 ] Ankur Goenka commented on BEAM-5467: The tests pass consistently on a local Jenkins setup. > Python Flink ValidatesRunner job fixes > -- > > Key: BEAM-5467 > URL: https://issues.apache.org/jira/browse/BEAM-5467 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Minor > Labels: portability-flink > Time Spent: 2.5h > Remaining Estimate: 0h > > Add status to README > Rename script and job for consistency > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-3904) Don't use UUID when worker_id is missing
[ https://issues.apache.org/jira/browse/BEAM-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-3904. Resolution: Fixed Fix Version/s: 2.8.0 > Don't use UUID when worker_id is missing > > > Key: BEAM-3904 > URL: https://issues.apache.org/jira/browse/BEAM-3904 > Project: Beam > Issue Type: Task > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Minor > Fix For: 2.8.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Remove the default to a UUID when worker_id is not present and throw an > exception in worker_id_interceptor.py, once the corresponding container > changes have been rolled out. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes
[ https://issues.apache.org/jira/browse/BEAM-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-4176: -- Assignee: Ankur Goenka > Java: Portable batch runner passes all ValidatesRunner tests that > non-portable runner passes > > > Key: BEAM-4176 > URL: https://issues.apache.org/jira/browse/BEAM-4176 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Ben Sidhom > Assignee: Ankur Goenka >Priority: Major > Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 > PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png > > Time Spent: 26h 40m > Remaining Estimate: 0h > > We need this as a sanity check that runner execution is correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes
[ https://issues.apache.org/jira/browse/BEAM-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637390#comment-16637390 ] Ankur Goenka commented on BEAM-4176: !81VxNWtFtke.png! We are able to run all the test cases with a hack on create_view now. > Java: Portable batch runner passes all ValidatesRunner tests that > non-portable runner passes > > > Key: BEAM-4176 > URL: https://issues.apache.org/jira/browse/BEAM-4176 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Ben Sidhom >Priority: Major > Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 > PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png > > Time Spent: 26h 40m > Remaining Estimate: 0h > > We need this as a sanity check that runner execution is correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes
[ https://issues.apache.org/jira/browse/BEAM-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-4176: --- Attachment: 81VxNWtFtke.png > Java: Portable batch runner passes all ValidatesRunner tests that > non-portable runner passes > > > Key: BEAM-4176 > URL: https://issues.apache.org/jira/browse/BEAM-4176 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Ben Sidhom >Priority: Major > Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 > PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png > > Time Spent: 26h 40m > Remaining Estimate: 0h > > We need this as a sanity check that runner execution is correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5683) [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Failure summary
[ https://issues.apache.org/jira/browse/BEAM-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642484#comment-16642484 ] Ankur Goenka commented on BEAM-5683: As mentioned earlier, the test case seems to be failing because of pip. pip is executed as a subprocess from python. Can we access pip subprocess logs? > [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Failure summary > -- > > Key: BEAM-5683 > URL: https://issues.apache.org/jira/browse/BEAM-5683 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness, test-failures >Reporter: Scott Wegner >Assignee: Ankur Goenka >Priority: Major > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1289/] > * [Gradle Build > Scan|https://scans.gradle.com/s/hjmzvh4ylhs6y/console-log?task=:beam-sdks-python:validatesRunnerBatchTests] > * [Test source > code|https://github.com/apache/beam/blob/303a4275eb0a323761e1a4dec6a22fde9863acf8/sdks/python/apache_beam/runners/portability/stager.py#L390] > Initial investigation: > Seems to be failing on pip download. 
> == > ERROR: test_multiple_empty_outputs > (apache_beam.transforms.ptransform_test.PTransformTest) > -- > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/transforms/ptransform_test.py", > line 277, in test_multiple_empty_outputs > pipeline.run() > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 104, in run > result = super(TestPipeline, self).run(test_runner_api) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", > line 403, in run > self.to_runner_api(), self.runner, self._options).run(False) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", > line 416, in run > return self.runner.run_pipeline(self) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", > line 50, in run_pipeline > self.result = super(TestDataflowRunner, self).run_pipeline(pipeline) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", > line 389, in run_pipeline > self.dataflow_client.create_job(self.job), self) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/retry.py", > line 184, in wrapper > return fun(*args, **kwargs) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 490, in create_job > self.create_job_description(job) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 519, in create_job_description > resources = 
self._stage_resources(job.options) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 452, in _stage_resources > staging_location=google_cloud_options.staging_location) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py", > line 161, in stage_job_resources > requirements_cache_path) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py", > line 411, in _populate_requirements_cache > processes.check_call(cmd_args) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/processes.py", > line 46, in check_call > return subprocess.check_call(*args, **kwargs) > File "/usr/lib/python2.7/subprocess.py", line 541, in check_call >
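On the question of reaching the pip subprocess logs: one possible approach, sketched below, is to capture the child's output and attach it to the error that is raised. This is not the code in apache_beam/utils/processes.py. The function name and the choice of RuntimeError are hypothetical, and the sketch uses subprocess.run, which needs Python 3.5 or later, while the traceback above comes from Python 2.7.

```python
import subprocess
import sys


def check_call_with_logs(cmd_args):
    """Run a command and include its captured output in any failure.

    stderr is merged into stdout so that the error message shows the
    child's output in the order it was written.
    """
    result = subprocess.run(
        cmd_args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True)
    if result.returncode != 0:
        raise RuntimeError(
            "Command %s failed with exit code %d. Output:\n%s"
            % (cmd_args, result.returncode, result.stdout))
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a pip invocation: a child process that prints and exits 0.
    print(check_call_with_logs([sys.executable, "-c", "print('ok')"]))
```

A plain subprocess.check_call reports only the command and exit code, so wrapping the call this way puts pip's own message into the test failure itself.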
[jira] [Commented] (BEAM-5683) [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Failure summary
[ https://issues.apache.org/jira/browse/BEAM-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642450#comment-16642450 ] Ankur Goenka commented on BEAM-5683: Looking at it. > [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Failure summary > -- > > Key: BEAM-5683 > URL: https://issues.apache.org/jira/browse/BEAM-5683 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness, test-failures >Reporter: Scott Wegner >Assignee: Ankur Goenka >Priority: Major > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1289/] > * [Gradle Build > Scan|https://scans.gradle.com/s/hjmzvh4ylhs6y/console-log?task=:beam-sdks-python:validatesRunnerBatchTests] > * [Test source > code|https://github.com/apache/beam/blob/303a4275eb0a323761e1a4dec6a22fde9863acf8/sdks/python/apache_beam/runners/portability/stager.py#L390] > Initial investigation: > Seems to be failing on pip download. 
> == > ERROR: test_multiple_empty_outputs > (apache_beam.transforms.ptransform_test.PTransformTest) > -- > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/transforms/ptransform_test.py", > line 277, in test_multiple_empty_outputs > pipeline.run() > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 104, in run > result = super(TestPipeline, self).run(test_runner_api) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", > line 403, in run > self.to_runner_api(), self.runner, self._options).run(False) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", > line 416, in run > return self.runner.run_pipeline(self) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", > line 50, in run_pipeline > self.result = super(TestDataflowRunner, self).run_pipeline(pipeline) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", > line 389, in run_pipeline > self.dataflow_client.create_job(self.job), self) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/retry.py", > line 184, in wrapper > return fun(*args, **kwargs) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 490, in create_job > self.create_job_description(job) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 519, in create_job_description > resources = 
self._stage_resources(job.options) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py", > line 452, in _stage_resources > staging_location=google_cloud_options.staging_location) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py", > line 161, in stage_job_resources > requirements_cache_path) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py", > line 411, in _populate_requirements_cache > processes.check_call(cmd_args) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/processes.py", > line 46, in check_call > return subprocess.check_call(*args, **kwargs) > File "/usr/lib/python2.7/subprocess.py", line 541, in check_call > raise CalledProcessError(retcode, cmd) > CalledProcessError: Command > '['/home/jenkins/jenkins-slave/workspace/beam_PostCommi
[jira] [Created] (BEAM-5697) Support parsing legacy options in FnHarness
Ankur Goenka created BEAM-5697: -- Summary: Support parsing legacy options in FnHarness Key: BEAM-5697 URL: https://issues.apache.org/jira/browse/BEAM-5697 Project: Beam Issue Type: Bug Components: sdk-java-harness Reporter: Ankur Goenka Assignee: Ankur Goenka Legacy pipeline options which do not have the format "beam:option::v1" are not de-serialized correctly in PipelineOptionsTranslation -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636285#comment-16636285 ] Ankur Goenka commented on BEAM-5187: This seems to be done. Shall we close it? > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 6h 40m > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4826) Flink runner sends bad flatten to SDK
[ https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-4826. Resolution: Fixed Fix Version/s: 2.8.0 > Flink runner sends bad flatten to SDK > - > > Key: BEAM-4826 > URL: https://issues.apache.org/jira/browse/BEAM-4826 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Henning Rohde >Assignee: Ankur Goenka >Priority: Major > Labels: portability > Fix For: 2.8.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > For a Go flatten test w/ 3 input, the Flink runner splits this into 3 bundle > descriptors. But it sends the original 3-input flatten but w/ 1 actual input > present in each bundle descriptor. This is inconsistent and the SDK shouldn't > expect dangling PCollections. In contrast, Dataflow removes the flatten when > it does the same split. > Snippet: > register: < > process_bundle_descriptor: < > id: "3" > transforms: < > key: "e4" > value: < > unique_name: "github.com/apache/beam/sdks/go/pkg/beam.createFn'1" > spec: < > urn: "urn:beam:transform:pardo:v1" > payload: [...] > > > inputs: < > key: "i0" > value: "n3" > > > outputs: < > key: "i0" > value: "n4" > > > > > > > transforms: < > key: "e7" > value: < > unique_name: "Flatten" > spec: < > urn: "beam:transform:flatten:v1" > > > inputs: < > key: "i0" > value: "n2" > > > inputs: < > key: "i1" > value: "n4" . // <--- only one present. > > > inputs: < > key: "i2" > value: "n6" > > > outputs: < > key: "i0" > value: "n7" > > > > > > > [...] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
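The inconsistency described in BEAM-4826 can be checked mechanically. What follows is a minimal Python sketch, not Beam code: it models a bundle descriptor as plain dictionaries (an assumption made for illustration; the real descriptor is a protobuf message) and reports transform inputs that refer to PCollections missing from the descriptor. The helper name find_dangling_inputs and the set of PCollections assumed to be present are hypothetical.

```python
def find_dangling_inputs(transforms, pcollections):
    """Return {transform_id: [dangling pcollection ids]} for one descriptor.

    transforms: dict of transform id -> {"inputs": {local_name: pcollection_id}}
    pcollections: set of PCollection ids actually defined in the descriptor
    """
    dangling = {}
    for transform_id, transform in transforms.items():
        missing = sorted(
            pcoll for pcoll in transform.get("inputs", {}).values()
            if pcoll not in pcollections)
        if missing:
            dangling[transform_id] = missing
    return dangling


# Shaped after the snippet in the issue: a 3-input flatten ("e7") where only
# "n4" is assumed to be present in this bundle descriptor.
transforms = {
    "e4": {"inputs": {"i0": "n3"}},
    "e7": {"inputs": {"i0": "n2", "i1": "n4", "i2": "n6"}},
}
pcollections = {"n3", "n4", "n7"}

print(find_dangling_inputs(transforms, pcollections))
```

A runner could use a check of this kind either to drop the dangling inputs or, as the issue says Dataflow does, to remove the flatten altogether; which of the two is appropriate is not decided by the sketch.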
[jira] [Updated] (BEAM-5090) Use topological sort during ProcessBundle in Java SDKHarness
[ https://issues.apache.org/jira/browse/BEAM-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-5090: --- Labels: newbie (was: ) > Use topological sort during ProcessBundle in Java SDKHarness > > > Key: BEAM-5090 > URL: https://issues.apache.org/jira/browse/BEAM-5090 > Project: Beam > Issue Type: Improvement > Components: sdk-java-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: newbie > > In reference to comment > [https://github.com/apache/beam/pull/6093#issuecomment-410831830] > * Use QueryablePipeline#getTopologicallyOrderedTransforms and execute > processBundle requests. > * Explore: is it worth caching the sorted structure when the RegisterRequest is > received? > * Also, explore how we can handle cycles in the execution stage in the process > bundle request, if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
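For readers unfamiliar with the ordering requested in BEAM-5090, here is a minimal Python sketch of a topological sort with cycle detection (Kahn's algorithm). It is an illustration only, not QueryablePipeline#getTopologicallyOrderedTransforms; the function name and the transform names are hypothetical, and raising an error on a cycle is just one possible answer to the open question in the issue.

```python
from collections import deque


def topological_order(edges, nodes):
    """Kahn's algorithm. edges maps a node to the nodes consuming its output."""
    in_degree = {node: 0 for node in nodes}
    for producer in edges:
        for consumer in edges[producer]:
            in_degree[consumer] += 1

    # Start from the nodes that have no unprocessed producers.
    ready = deque(sorted(n for n in nodes if in_degree[n] == 0))
    ordered = []
    while ready:
        node = ready.popleft()
        ordered.append(node)
        for consumer in edges.get(node, ()):
            in_degree[consumer] -= 1
            if in_degree[consumer] == 0:
                ready.append(consumer)

    if len(ordered) != len(nodes):
        # Nodes never reaching in-degree 0 are on, or downstream of, a cycle.
        remaining = sorted(set(nodes) - set(ordered))
        raise ValueError("cycle detected among: %s" % remaining)
    return ordered


order = topological_order(
    {"read": ["parse"], "parse": ["write"]}, ["write", "parse", "read"])
print(order)
```

Because the result depends only on the registered graph, it could be computed once and reused for later bundles, which is the caching question the issue raises.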
[jira] [Resolved] (BEAM-5022) Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to BeamModulePlugin
[ https://issues.apache.org/jira/browse/BEAM-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5022. Resolution: Fixed Fix Version/s: 2.8.0 > Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to > BeamModulePlugin > -- > > Key: BEAM-5022 > URL: https://issues.apache.org/jira/browse/BEAM-5022 > Project: Beam > Issue Type: Improvement > Components: build-system, runner-flink >Reporter: Ankur Goenka > Assignee: Ankur Goenka >Priority: Major > Fix For: 2.8.0 > > > Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to > BeamModulePlugin so that it can be used by other portable runner tests. > > Also create an interface TestJobserverDriver and make the drivers extend it > instead of using reflection to start the Jobserver. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
[ https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5288. Resolution: Fixed Fix Version/s: 2.8.0 > Modify Environment to support non-dockerized SDK harness deployments > - > > Key: BEAM-5288 > URL: https://issues.apache.org/jira/browse/BEAM-5288 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Maximilian Michels >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.8.0 > > Time Spent: 16h 40m > Remaining Estimate: 0h > > From the mailing list discussions and BEAM-5187, it has become clear that we need to > extend the Environment information. In addition to the Docker environment, > the extended environment holds deployment options for 1) a process-based > environment, 2) an externally managed environment. > The proto definition, as of now, looks as follows: > {noformat} > message Environment { >// (Required) The URN of the payload >string urn = 1; >// (Optional) The data specifying any parameters to the URN. If >// the URN does not require any arguments, this may be omitted. >bytes payload = 2; > } > message StandardEnvironments { >enum Environments { > DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; > PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; > EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; >} > } > // The payload of a Docker image > message DockerPayload { >string container_image = 1; // implicitly linux_amd64. > } > message ProcessPayload { >string os = 1; // "linux", "darwin", .. >string arch = 2; // "amd64", .. >string command = 3; // process to execute >map<string, string> env = 4; // environment variables > } > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
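The URN-plus-payload pattern in the BEAM-5288 proto can be illustrated with a small Python sketch. It is not generated protobuf code: plain dataclasses mirror the fields quoted in the issue, JSON stands in for protobuf serialization of the payload, and the helper names, the Docker payload being a bare image string, and the sample command path are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field

# URNs copied from the StandardEnvironments enum quoted in the issue.
DOCKER_URN = "beam:env:docker:v1"
PROCESS_URN = "beam:env:process:v1"
EXTERNAL_URN = "beam:env:external:v1"


@dataclass
class ProcessPayload:
    os: str
    arch: str
    command: str
    env: dict = field(default_factory=dict)


@dataclass
class Environment:
    urn: str
    payload: bytes = b""


def process_environment(payload):
    # JSON is a stand-in for protobuf serialization in this sketch.
    return Environment(PROCESS_URN, json.dumps(asdict(payload)).encode("utf-8"))


def describe(environment):
    """Dispatch on the URN, then interpret the payload accordingly."""
    if environment.urn == DOCKER_URN:
        return "docker image " + environment.payload.decode("utf-8")
    if environment.urn == PROCESS_URN:
        payload = ProcessPayload(**json.loads(environment.payload.decode("utf-8")))
        return "process %r on %s/%s" % (payload.command, payload.os, payload.arch)
    if environment.urn == EXTERNAL_URN:
        return "externally managed"
    raise ValueError("unknown environment URN: " + environment.urn)


# Hypothetical command path and environment variable.
env = process_environment(
    ProcessPayload("linux", "amd64", "/opt/harness/boot", {"LOG_LEVEL": "info"}))
print(describe(env))
```

The point of the design is that a consumer only needs to understand the URNs it supports; the payload stays opaque bytes until the URN says how to read it.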
[jira] [Resolved] (BEAM-2769) Java SDK support for submitting a Portable Pipeline
[ https://issues.apache.org/jira/browse/BEAM-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-2769. Resolution: Fixed Fix Version/s: 2.8.0 > Java SDK support for submitting a Portable Pipeline > --- > > Key: BEAM-2769 > URL: https://issues.apache.org/jira/browse/BEAM-2769 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Thomas Groh >Assignee: Ankur Goenka >Priority: Major > Labels: portability > Fix For: 2.8.0 > > > The Java codebase should provide a way to submit a Job to a Job Service. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-2769) Java SDK support for submitting a Portable Pipeline
[ https://issues.apache.org/jira/browse/BEAM-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636280#comment-16636280 ] Ankur Goenka commented on BEAM-2769: Yes. We can submit portable pipelines now. > Java SDK support for submitting a Portable Pipeline > --- > > Key: BEAM-2769 > URL: https://issues.apache.org/jira/browse/BEAM-2769 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Thomas Groh >Assignee: Ankur Goenka >Priority: Major > Labels: portability > Fix For: 2.8.0 > > > The Java codebase should provide a way to submit a Job to a Job Service. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5190) Python pipeline options are not picked correctly by PortableRunner
[ https://issues.apache.org/jira/browse/BEAM-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5190. Resolution: Fixed Fix Version/s: 2.8.0 > Python pipeline options are not picked correctly by PortableRunner > -- > > Key: BEAM-5190 > URL: https://issues.apache.org/jira/browse/BEAM-5190 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.8.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Python SDK worker is deserializing the pipeline options to dictionary instead > of PipelineOptions > Sample log > [grpc-default-executor-2] INFO sdk_worker_main.main - Python sdk harness > started with pipeline_options: \{u'beam:option:flink_master:v1': u'[auto]', > u'beam:option:streaming:v1': False, u'beam:option:experiments:v1': > [u'beam_fn_api', u'worker_threads=50'], u'beam:option:dry_run:v1': False, > u'beam:option:runner:v1': None, u'beam:option:profile_memory:v1': False, > u'beam:option:runtime_type_check:v1': False, u'beam:option:region:v1': > u'us-central1', u'beam:option:options_id:v1': 1, u'beam:option:no_auth:v1': > False, u'beam:option:dataflow_endpoint:v1': > u'https://dataflow.googleapis.com', u'beam:option:sdk_location:v1': > u'/usr/local/google/home/goenka/d/work/beam/beam/sdks/python/dist/apache-beam-2.7.0.dev0.tar.gz', > u'beam:option:direct_runner_use_stacked_bundle:v1': True, > u'beam:option:save_main_session:v1': True, > u'beam:option:type_check_strictness:v1': u'DEFAULT_TO_ANY', > u'beam:option:profile_cpu:v1': False, u'beam:option:job_endpoint:v1': > u'localhost:8099', u'beam:option:job_name:v1': > u'BeamApp-goenka-0822071645-48ae1008', u'beam:option:temp_location:v1': > u'gs://clouddfe-goenka/tmp/', u'beam:option:app_name:v1': None, > u'beam:option:project:v1': u'google.com:clouddfe', > u'beam:option:pipeline_type_check:v1': True, > u'beam:option:staging_location:v1': u'gs://clouddfe-goenka/tmp/staging'} > -- This 
message was sent by Atlassian JIRA (v7.6.3#76005)
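The sample log in BEAM-5190 shows every option name wrapped in a key of the form beam:option:NAME:v1. As a minimal sketch of unwrapping such a dictionary back into plain option names, assuming that key shape holds for all wrapped options (it is inferred from the log, not from the Beam source), one could write the following. The helper name to_plain_options is hypothetical, and keys without the wrapper are passed through unchanged.

```python
import re

# Key shape inferred from the sample log, e.g. "beam:option:job_endpoint:v1".
OPTION_URN = re.compile(r"^beam:option:(?P<name>.+):v1$")


def to_plain_options(urn_options):
    """Map 'beam:option:NAME:v1' keys to 'NAME'; keep other keys unchanged."""
    plain = {}
    for key, value in urn_options.items():
        match = OPTION_URN.match(key)
        plain[match.group("name") if match else key] = value
    return plain


sample = {
    "beam:option:job_endpoint:v1": "localhost:8099",
    "beam:option:experiments:v1": ["beam_fn_api", "worker_threads=50"],
    "streaming": False,  # no URN wrapper: passed through as-is
}
print(to_plain_options(sample))
```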
[jira] [Resolved] (BEAM-5194) Pipeline options with multi value are not deserialized correctly from map
[ https://issues.apache.org/jira/browse/BEAM-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5194. Resolution: Fixed Fix Version/s: 2.8.0 > Pipeline options with multi value are not deserialized correctly from map > - > > Key: BEAM-5194 > URL: https://issues.apache.org/jira/browse/BEAM-5194 > Project: Beam > Issue Type: Bug > Components: sdk-py-core > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.8.0 > > Time Spent: 40m > Remaining Estimate: 0h > > [https://github.com/apache/beam/blob/7c41e0a915083bd3b1fe52c2a417fa38a00e6463/sdks/python/apache_beam/options/pipeline_options.py#L171] > > Multiple options are converted to strings and added to flags which causes > wrong deserialization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
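The failure mode described in BEAM-5194 can be reproduced outside Beam. The sketch below uses only the standard argparse module and is not the code in pipeline_options.py: the parser, the two helper functions, and the option values are assumptions chosen for illustration. It contrasts turning a list value into a single string flag with emitting one flag per element.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # 'append' models a multi-valued option: the flag may be repeated.
    parser.add_argument("--experiments", action="append", default=None)
    parser.add_argument("--job_name", default=None)
    return parser


def flags_stringified(options):
    # Problematic: a list value goes through str() and becomes one flag.
    return ["--%s=%s" % (key, value) for key, value in options.items()]


def flags_expanded(options):
    # One flag per element for list values, one flag otherwise.
    flags = []
    for key, value in options.items():
        if isinstance(value, list):
            flags.extend("--%s=%s" % (key, item) for item in value)
        else:
            flags.append("--%s=%s" % (key, value))
    return flags


options = {"experiments": ["beam_fn_api", "worker_threads=50"], "job_name": "demo"}
bad = build_parser().parse_args(flags_stringified(options))
good = build_parser().parse_args(flags_expanded(options))

print(bad.experiments)   # a single element holding the list's repr
print(good.experiments)  # the original two elements
```

With the stringified flags, the parser sees a single value that happens to look like a Python list, so anything checking whether "beam_fn_api" is among the experiments would not find it.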
[jira] [Resolved] (BEAM-5697) Support parsing legacy options in FnHarness
[ https://issues.apache.org/jira/browse/BEAM-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5697. Resolution: Fixed Fix Version/s: 2.8.0 > Support parsing legacy options in FnHarness > -- > > Key: BEAM-5697 > URL: https://issues.apache.org/jira/browse/BEAM-5697 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.8.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Legacy pipeline options which do not have the format > "beam:option::v1" are not de-serialized correctly in > PipelineOptionsTranslation -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5708) Support caching of SDKHarness environments in flink
Ankur Goenka created BEAM-5708: -- Summary: Support caching of SDKHarness environments in flink Key: BEAM-5708 URL: https://issues.apache.org/jira/browse/BEAM-5708 Project: Beam Issue Type: Improvement Components: runner-flink Reporter: Ankur Goenka Assignee: Ankur Goenka Cache and reuse environment to improve performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5683) [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Fails due to pip download flake
[ https://issues.apache.org/jira/browse/BEAM-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645718#comment-16645718 ] Ankur Goenka commented on BEAM-5683: I agree. Shall we expose the pip logs? I think exposing when failed should be ok. > [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Fails due to > pip download flake > -- > > Key: BEAM-5683 > URL: https://issues.apache.org/jira/browse/BEAM-5683 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness, test-failures >Reporter: Scott Wegner > Assignee: Ankur Goenka >Priority: Major > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1289/] > * [Gradle Build > Scan|https://scans.gradle.com/s/hjmzvh4ylhs6y/console-log?task=:beam-sdks-python:validatesRunnerBatchTests] > * [Test source > code|https://github.com/apache/beam/blob/303a4275eb0a323761e1a4dec6a22fde9863acf8/sdks/python/apache_beam/runners/portability/stager.py#L390] > Initial investigation: > Seems to be failing on pip download.
[jira] [Assigned] (BEAM-2594) Python shim for submitting to a JobService
[ https://issues.apache.org/jira/browse/BEAM-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-2594: -- Assignee: Ankur Goenka > Python shim for submitting to a JobService > -- > > Key: BEAM-2594 > URL: https://issues.apache.org/jira/browse/BEAM-2594 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Kenneth Knowles >Assignee: Ankur Goenka >Priority: Minor > Labels: portability > Fix For: 2.8.0 > > > Python SDK should support submission of portable pipelines to the ULR, as per > https://s.apache.org/beam-job-api. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5683) [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Fails due to pip download flake
[ https://issues.apache.org/jira/browse/BEAM-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645730#comment-16645730 ] Ankur Goenka commented on BEAM-5683: check_output does that by default. > [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Fails due to > pip download flake > -- > > Key: BEAM-5683 > URL: https://issues.apache.org/jira/browse/BEAM-5683 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness, test-failures >Reporter: Scott Wegner > Assignee: Ankur Goenka >Priority: Major > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1289/] > * [Gradle Build > Scan|https://scans.gradle.com/s/hjmzvh4ylhs6y/console-log?task=:beam-sdks-python:validatesRunnerBatchTests] > * [Test source > code|https://github.com/apache/beam/blob/303a4275eb0a323761e1a4dec6a22fde9863acf8/sdks/python/apache_beam/runners/portability/stager.py#L390] > Initial investigation: > Seems to be failing on pip download.
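On the question of exposing pip logs only when the download fails: the standard library's subprocess.check_output captures the child's output and attaches it to the CalledProcessError it raises. The following is a minimal sketch of surfacing that output on failure. It is not the stager.py or processes.py code; the wrapper name run_and_capture and the stand-in failing command are assumptions for illustration.

```python
import subprocess
import sys


def run_and_capture(cmd_args):
    """Run a command; on failure, put its combined stdout/stderr in the error."""
    try:
        return subprocess.check_output(cmd_args, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as error:
        raise RuntimeError(
            "Command %s failed with exit code %d. Output:\n%s"
            % (error.cmd, error.returncode,
               error.output.decode("utf-8", "replace")))


# Stand-in for a failing pip invocation: a child process that prints a
# message and exits with status 3.
failing = [sys.executable, "-c",
           "import sys; print('no matching distribution'); sys.exit(3)"]
try:
    run_and_capture(failing)
except RuntimeError as error:
    message = str(error)
    print(message)
```

On success nothing extra is logged, which matches the suggestion in the thread to expose the logs only when the command fails.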
[jira] [Resolved] (BEAM-2594) Python shim for submitting to a JobService
[ https://issues.apache.org/jira/browse/BEAM-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-2594. Resolution: Fixed Fix Version/s: 2.8.0 > Python shim for submitting to a JobService > -- > > Key: BEAM-2594 > URL: https://issues.apache.org/jira/browse/BEAM-2594 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Kenneth Knowles >Assignee: Ankur Goenka >Priority: Minor > Labels: portability > Fix For: 2.8.0 > > > Python SDK should support submission of portable pipelines to the ULR, as per > https://s.apache.org/beam-job-api. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-2594) Python shim for submitting to a JobService
[ https://issues.apache.org/jira/browse/BEAM-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645738#comment-16645738 ] Ankur Goenka commented on BEAM-2594: This PR splits the ULR and makes it possible to submit jobs to any jobServer. [https://github.com/apache/beam/pull/5301] > Python shim for submitting to a JobService > -- > > Key: BEAM-2594 > URL: https://issues.apache.org/jira/browse/BEAM-2594 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Kenneth Knowles >Priority: Minor > Labels: portability > Fix For: 2.8.0 > > > Python SDK should support submission of portable pipelines to the ULR, as per > https://s.apache.org/beam-job-api. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5433) Cleanup Environment.url from beam_runner_api.proto
[ https://issues.apache.org/jira/browse/BEAM-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-5433: --- Issue Type: Task (was: Test) > Cleanup Environment.url from beam_runner_api.proto > -- > > Key: BEAM-5433 > URL: https://issues.apache.org/jira/browse/BEAM-5433 > Project: Beam > Issue Type: Task > Components: beam-model > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > > The Environment URL field is deprecated and should be removed ASAP. > The current blocker in removing the field is compatibility with Dataflow, as > Dataflow has internal code which relies on it. > A vote has also passed to move the affected Dataflow code to open source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5433) Cleanup Environment.url from beam_runner_api.proto
Ankur Goenka created BEAM-5433: -- Summary: Cleanup Environment.url from beam_runner_api.proto Key: BEAM-5433 URL: https://issues.apache.org/jira/browse/BEAM-5433 Project: Beam Issue Type: Test Components: beam-model Reporter: Ankur Goenka Assignee: Ankur Goenka The Environment URL field is deprecated and should be removed ASAP. The current blocker in removing the field is compatibility with Dataflow, as Dataflow has internal code which relies on it. A vote has also passed to move the affected Dataflow code to open source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5467) Python Flink ValidatesRunner job fixes
[ https://issues.apache.org/jira/browse/BEAM-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631002#comment-16631002 ] Ankur Goenka commented on BEAM-5467: Anecdotally, tasks are failing because of a segfault with the following error: 04:18:38 Segmentation fault (core dumped) 04:18:38 04:18:38 > Task :beam-sdks-python:flinkCompatibilityMatrixStreaming FAILED 04:18:38 :beam-sdks-python:flinkCompatibilityMatrixStreaming (Thread[Task worker for ':' Thread 6,5,main]) completed. Took 1 mins 13.28 secs. 04:18:38 04:18:38 FAILURE: Build completed with 2 failures. 04:18:38 04:18:38 1: Task failed with an exception. 04:18:38 --- 04:18:38 * Where: 04:18:38 Build file '/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_VR_Flink/src/sdks/python/build.gradle' line: 340 04:18:38 04:18:38 * What went wrong: 04:18:38 Execution failed for task ':beam-sdks-python:flinkCompatibilityMatrixBatch'. 04:18:38 > Process 'command 'sh'' finished with non-zero exit value 139 04:18:38 04:18:38 * Try: 04:18:38 Run with --stacktrace option to get the stack trace. Run with --debug option to get more log output. Run with --scan to get full insights. 04:18:38 == > Python Flink ValidatesRunner job fixes > -- > > Key: BEAM-5467 > URL: https://issues.apache.org/jira/browse/BEAM-5467 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Minor > Labels: portability-flink > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add status to README > Rename script and job for consistency > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5467) Python Flink ValidatesRunner job fixes
[ https://issues.apache.org/jira/browse/BEAM-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631049#comment-16631049 ] Ankur Goenka commented on BEAM-5467: I verified that they get executed sequentially, so that should not be a problem. :beam-sdks-python:flinkCompatibilityMatrixBatch FAILED Started: 5m 27.699s Duration: 4m 0.393s :beam-sdks-python:flinkCompatibilityMatrixStreaming FAILED Started: 9m 28.093s Duration: 1m 13.280s > Python Flink ValidatesRunner job fixes > -- > > Key: BEAM-5467 > URL: https://issues.apache.org/jira/browse/BEAM-5467 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Minor > Labels: portability-flink > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add status to README > Rename script and job for consistency > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5219) Expose OutboundMessage in PubSub client
[ https://issues.apache.org/jira/browse/BEAM-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5219. Resolution: Fixed Fix Version/s: 2.8.0 > Expose OutboundMessage in PubSub client > --- > > Key: BEAM-5219 > URL: https://issues.apache.org/jira/browse/BEAM-5219 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Minor > Fix For: 2.8.0 > > Time Spent: 1h > Remaining Estimate: 0h > > publish method in org/apache/beam/sdk/io/gcp/pubsub/PubsubClient.java is > public but the argument OutboundMessage is not public which makes the api > unusable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5262) JobState support for Reference Runner
[ https://issues.apache.org/jira/browse/BEAM-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5262. Resolution: Fixed Fix Version/s: 2.8.0 > JobState support for Reference Runner > - > > Key: BEAM-5262 > URL: https://issues.apache.org/jira/browse/BEAM-5262 > Project: Beam > Issue Type: Bug > Components: runner-direct > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Minor > Fix For: 2.8.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Reference runner does not support getStateStream which is needed by portable > SDK -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5396) Flink portable runner savepoint / upgrade support
[ https://issues.apache.org/jira/browse/BEAM-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618330#comment-16618330 ] Ankur Goenka commented on BEAM-5396: Thanks for the reference link. It's very relevant. This feature requires quite a bit of effort. We can start thinking about a design while we are improving portability, and once we are stable, we can get to this feature. > Flink portable runner savepoint / upgrade support > - > > Key: BEAM-5396 > URL: https://issues.apache.org/jira/browse/BEAM-5396 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Priority: Major > > The Flink runner overall and the new portable implementation specifically > need to support Flink savepoints for production use. Specifically, it should > be possible to upgrade a stateful portable Beam pipeline that runs on Flink, > which involves taking a savepoint and then starting the new version of the > pipeline from that savepoint. The potential issues with pipeline evolution > and migration are similar to those when using the Flink DataStream API > (schema / name changes etc.). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka reassigned BEAM-5283: -- Assignee: Ankur Goenka (was: Jason Kuster) > Enable Python Portable Flink PostCommit Tests to Jenkins > > > Key: BEAM-5283 > URL: https://issues.apache.org/jira/browse/BEAM-5283 > Project: Beam > Issue Type: Test > Components: testing > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > Time Spent: 10.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614179#comment-16614179 ] Ankur Goenka commented on BEAM-5283: Reference bug: [https://github.com/pypa/virtualenv/issues/997]. The shebang's limit on argument length broke the test. > Enable Python Portable Flink PostCommit Tests to Jenkins > > > Key: BEAM-5283 > URL: https://issues.apache.org/jira/browse/BEAM-5283 > Project: Beam > Issue Type: Test > Components: testing > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > Fix For: 2.8.0 > > Time Spent: 10.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
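For readers unfamiliar with the referenced virtualenv bug: scripts installed into a virtualenv start with a shebang that holds the absolute path of the environment's interpreter, and deeply nested CI workspace paths can push that line past what the kernel will read. The following is a minimal sketch of a pre-flight check, not part of Beam or virtualenv. The 127-byte limit is an assumption (a figure commonly cited for older Linux kernels), and both function names are hypothetical.

```python
# Assumed limit in bytes for the whole shebang line, including "#!".
# This value is an assumption for illustration; check the target kernel.
ASSUMED_SHEBANG_LIMIT = 127


def shebang_for(interpreter_path: str) -> str:
    """Build the shebang line a script would carry for this interpreter."""
    return "#!" + interpreter_path


def is_shebang_too_long(interpreter_path: str,
                        limit: int = ASSUMED_SHEBANG_LIMIT) -> bool:
    """Return True if the shebang line would exceed the assumed limit."""
    return len(shebang_for(interpreter_path).encode("utf-8")) > limit


if __name__ == "__main__":
    # Hypothetical paths: a short system interpreter and a deeply nested one.
    short_path = "/usr/bin/python"
    deep_path = "/" + "/".join(["workspace_dir"] * 12) + "/venv/bin/python"
    for path in (short_path, deep_path):
        print(len(shebang_for(path)), is_shebang_too_long(path))
```

A check like this could run before creating the virtualenv, so that a too-deep workspace path fails with a clear message instead of a confusing interpreter error.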
[jira] [Resolved] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka resolved BEAM-5283. Resolution: Fixed Fix Version/s: 2.8.0 > Enable Python Portable Flink PostCommit Tests to Jenkins > > > Key: BEAM-5283 > URL: https://issues.apache.org/jira/browse/BEAM-5283 > Project: Beam > Issue Type: Test > Components: testing > Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > Fix For: 2.8.0 > > Time Spent: 10.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
[ https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614240#comment-16614240 ] Ankur Goenka commented on BEAM-5288: We are thinking of removing the args for now to avoid any confusion and potential name collisions. All the relevant information can be conveyed to the SDKHarness using the Environment. > Modify Environment to support non-dockerized SDK harness deployments > - > > Key: BEAM-5288 > URL: https://issues.apache.org/jira/browse/BEAM-5288 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Maximilian Michels >Assignee: Ankur Goenka >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > As of mailing discussions and BEAM-5187, it has become clear that we need to > extend the Environment information. In addition to the Docker environment, > the extended environment holds deployment options for 1) a process-based > environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
> message Environment {
>   // (Required) The URN of the payload
>   string urn = 1;
>   // (Optional) The data specifying any parameters to the URN. If
>   // the URN does not require any arguments, this may be omitted.
>   bytes payload = 2;
> }
> message StandardEnvironments {
>   enum Environments {
>     DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>     PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>     EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>   }
> }
> // The payload of a Docker image
> message DockerPayload {
>   string container_image = 1; // implicitly linux_amd64.
> }
> message ProcessPayload {
>   string os = 1; // "linux", "darwin", ..
>   string arch = 2; // "amd64", ..
>   string command = 3; // process to execute
>   map<string, string> env = 4; // environment variables
> }
> {noformat}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
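To make the URN-plus-payload design in the quoted proto concrete, here is a minimal Python sketch of how a runner might dispatch on the environment URN. It is an illustration under stated assumptions, not Beam's implementation: plain dataclasses stand in for the generated protobuf classes, JSON stands in for protobuf serialization of the payload, and `make_environment` and `describe` are hypothetical helper names. Only the URN strings and field names come from the issue text.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional

# URN strings as quoted in the issue description.
DOCKER_URN = "beam:env:docker:v1"
PROCESS_URN = "beam:env:process:v1"
EXTERNAL_URN = "beam:env:external:v1"


@dataclass
class Environment:
    urn: str              # (Required) identifies how to read the payload
    payload: bytes = b""  # (Optional) parameters for the URN


@dataclass
class DockerPayload:
    container_image: str


@dataclass
class ProcessPayload:
    os: str
    arch: str
    command: str
    env: Dict[str, str] = field(default_factory=dict)


def make_environment(urn: str, payload_obj: Optional[object] = None) -> Environment:
    """Hypothetical helper: JSON replaces protobuf encoding in this sketch."""
    if payload_obj is None:
        return Environment(urn=urn)
    data = json.dumps(asdict(payload_obj), sort_keys=True).encode("utf-8")
    return Environment(urn=urn, payload=data)


def describe(environment: Environment) -> str:
    """Hypothetical dispatch: decode the payload according to the URN."""
    fields = json.loads(environment.payload) if environment.payload else {}
    if environment.urn == DOCKER_URN:
        return "docker run " + fields["container_image"]
    if environment.urn == PROCESS_URN:
        prefix = " ".join(f"{k}={v}" for k, v in sorted(fields["env"].items()))
        return (prefix + " " + fields["command"]).strip()
    if environment.urn == EXTERNAL_URN:
        return "externally managed"
    raise ValueError("unknown environment URN: " + environment.urn)
```

The point of the design is that the `Environment` message itself stays fixed while new deployment modes only add a URN and a payload type, so everything the SDK harness needs can travel in the payload rather than in separate args.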
[jira] [Commented] (BEAM-5396) Flink portable runner savepoint / upgrade support
[ https://issues.apache.org/jira/browse/BEAM-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618442#comment-16618442 ] Ankur Goenka commented on BEAM-5396: Makes sense. It's runner-specific functionality, so it's a good idea to track it in the feature matrix. > Flink portable runner savepoint / upgrade support > - > > Key: BEAM-5396 > URL: https://issues.apache.org/jira/browse/BEAM-5396 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Priority: Major > > The Flink runner overall and the new portable implementation specifically > need to support Flink savepoints for production use. Specifically, it should > be possible to upgrade a stateful portable Beam pipeline that runs on Flink, > which involves taking a savepoint and then starting the new version of the > pipeline from that savepoint. The potential issues with pipeline evolution > and migration are similar to those when using the Flink DataStream API > (schema / name changes etc.). -- This message was sent by Atlassian JIRA (v7.6.3#76005)