[ https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640114#comment-16640114 ]
Udi Meiri commented on BEAM-5442:
---------------------------------

I believe PR 6557 broke integration tests using Dataflow.

Cloud console:
"Parsing unknown args: [u'--dataflowJobId=2018-10-05_07_00_20-5526009939236014896', u'--autoscalingAlgorithm=NONE', u'--direct_runner_use_stacked_bundle', u'--maxNumWorkers=0', u'--style=scrambled', u'--sleep_secs=20', u'--pipeline_type_check', u'--gcpTempLocation=gs://temp-storage-for-end-to-end-tests/temp-it/beamapp-jenkins-1005140012-917021.1538748012.917145', u'--numWorkers=1', u'--beam_plugins=apache_beam.io.filesystem.FileSystem', u'--beam_plugins=apache_beam.io.hadoopfilesystem.HadoopFileSystem', u'--beam_plugins=apache_beam.io.localfilesystem.LocalFileSystem', u'--beam_plugins=apache_beam.io.gcp.gcsfilesystem.GCSFileSystem', u'--beam_plugins=apache_beam.io.filesystem_test.TestingFileSystem', u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.PipelineGraphRenderer', u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.MuteRenderer', u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.TextRenderer', u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.PydotRenderer', u'--pipelineUrl=gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1005140012-917021.1538748012.917145/pipeline.pb']"

"Python sdk harness failed:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 133, in main
    sdk_pipeline_options.get_all_options(drop_default=True))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/options/pipeline_options.py", line 224, in get_all_options
    parser.add_argument(arg.split('=', 1)[0], nargs='?')
  File "/usr/lib/python2.7/argparse.py", line 1308, in add_argument
    return self._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1682, in _add_action
    self._optionals._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1509, in _add_action
    action = super(_ArgumentGroup, self)._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1322, in _add_action
    self._check_conflict(action)
  File "/usr/lib/python2.7/argparse.py", line 1460, in _check_conflict
    conflict_handler(action, confl_optionals)
  File "/usr/lib/python2.7/argparse.py", line 1467, in _handle_conflict_error
    raise ArgumentError(action, message % conflict_string)
ArgumentError: argument --beam_plugins: conflicting option string(s): --beam_plugins"

----
Test output:
07:28:37 ======================================================================
07:28:37 FAIL: test_streaming_with_attributes (apache_beam.io.gcp.pubsub_integration_test.PubSubIntegrationTest)
07:28:37 ----------------------------------------------------------------------
07:28:37 Traceback (most recent call last):
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_integration_test.py", line 172, in test_streaming_with_attributes
07:28:37     self._test_streaming(with_attributes=True)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_integration_test.py", line 164, in _test_streaming
07:28:37     timestamp_attribute=self.TIMESTAMP_ATTRIBUTE)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_it_pipeline.py", line 91, in run_pipeline
07:28:37     result = p.run()
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/pipeline.py", line 416, in run
07:28:37     return self.runner.run_pipeline(self)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", line 65, in run_pipeline
07:28:37     hc_assert_that(self.result, pickler.loads(on_success_matcher))
07:28:37 AssertionError:
07:28:37 Expected: (Test pipeline expected terminated in state: RUNNING and Expected 2 messages.)
07:28:37      but: Expected 2 messages. Got 0 messages. Diffs (item, count):
07:28:37   Expected but not in actual: [(PubsubMessage(data001-seen, {'processed': 'IT'}), 1), (PubsubMessage(data002-seen, {'timestamp_out': '2018-07-11T02:02:50.149000Z', 'processed': 'IT'}), 1)]
07:28:37   Unexpected: []
07:28:37   Stripped attributes: ['id', 'timestamp']
07:28:37
07:28:37 -------------------- >> begin captured stdout << ---------------------
07:28:37 Found: https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-10-05_07_00_20-5526009939236014896?project=apache-beam-testing.
07:28:37
07:28:37 --------------------- >> end captured stdout << ----------------------

> PortableRunner swallows custom options for Runner
> -------------------------------------------------
>
>                 Key: BEAM-5442
>                 URL: https://issues.apache.org/jira/browse/BEAM-5442
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core, sdk-py-core
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>              Labels: portability, portability-flink
>             Fix For: 2.8.0
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the
> PortableRunner.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
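
For reference, the ArgumentError in the SDK harness traceback above can be reproduced outside Beam. The traceback shows add_argument being called once per unknown arg, and the unknown args list contains --beam_plugins several times; argparse rejects a second registration of the same option string. The sketch below is not Beam code: reproduce_conflict and add_unknown_args are hypothetical helper names, and registering each flag once with action='append' is only one possible workaround, assumed here for illustration rather than taken from the actual fix.

```python
import argparse


def reproduce_conflict(unknown_args):
    """Mimic the failing loop: one add_argument call per unknown arg.

    Returns the ArgumentError message, or None if nothing conflicted.
    """
    parser = argparse.ArgumentParser()
    try:
        for arg in unknown_args:
            parser.add_argument(arg.split('=', 1)[0], nargs='?')
    except argparse.ArgumentError as e:
        return str(e)
    return None


def add_unknown_args(parser, unknown_args):
    """Hypothetical workaround: register each distinct flag only once.

    action='append' (an assumption of this sketch) keeps every value of a
    repeated flag instead of only the last one.
    """
    seen = set()
    for arg in unknown_args:
        if not arg.startswith('--'):
            continue
        flag = arg.split('=', 1)[0]
        if flag in seen:
            continue  # a second add_argument for this flag would raise
        seen.add(flag)
        parser.add_argument(flag, nargs='?', action='append')
    return parser
```

With args such as ['--beam_plugins=a.B', '--beam_plugins=c.D', '--numWorkers=1'], reproduce_conflict returns a message naming --beam_plugins, while a parser built by add_unknown_args parses the same list and collects both plugin values under beam_plugins.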