Currently 'sdk_harness_container_image_overrides' is Dataflow-only. Also, I think options might be getting dropped due to https://issues.apache.org/jira/browse/BEAM-9449.
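[For reference, the flags under discussion can be assembled as below. This is a minimal sketch: the helper function is hypothetical (not part of Beam), the image names and job endpoint are placeholders, and with BEAM-9449 unresolved the two default_environment_* flags may be reported as "unparseable" and dropped by the Python SDK's option parser.]

```python
# Hypothetical helper: assembles the portable-runner flags discussed in
# this thread. The image names and endpoint are illustrative placeholders.
def portable_pipeline_args(job_endpoint, python_image, java_image):
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        # Overrides the *Python* SDK harness container:
        "--environment_type=DOCKER",
        f"--environment_config={python_image}",
        # Intended to override the *Java* (cross-language) harness
        # container; may be dropped due to BEAM-9449:
        "--default_environment_type=DOCKER",
        f"--default_environment_config={java_image}",
    ]

args = portable_pipeline_args("localhost:8099",
                              "beam_python_sdk:0.0.1",
                              "beam_java_sdk:0.0.1")
```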
Unfortunately, until one of the above is fixed, I think you might have to update the code at the following location to override the container and rebuild the SDK:
https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/Environments.java#L397

Thanks,
Cham

On Tue, Feb 2, 2021 at 2:54 PM Paul Nimusiima <[email protected]> wrote:

> Those two arguments appear to be discarded:
>
> WARNING:apache_beam.options.pipeline_options:Discarding unparseable args:
> ['--default_environment_type=DOCKER',
> '--default_environment_config=beam_java_sdk:0.0.1']
>
> and the default apache/beam_java11_sdk:2.27.0 is used.
>
> INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory -
> Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for
> worker id 1-2
>
> and when the harness is being removed:
>
> INFO org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory -
> Closing environment urn: "beam:env:docker:v1"
> payload: "\n\035apache/beam_java11_sdk:2.27.0"
> capabilities: "beam:coder:bytes:v1"
> ...
> capabilities: "beam:transform:sdf_truncate_sized_restrictions:v1"
> dependencies {
>   type_urn: "beam:artifact:type:file:v1"
>   type_payload: "\n\354\001/var/folders/yj/njlsyhss5sb7t36fklhxjbvc0000gp/T/beam-artifact-staging/11d9c289a5c41d77ecf6d4eff0cc95928c241ec416c8e88ea67fbc931463a0c5/1-external_1beam:env:docker-beam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lT"
>   role_urn: "beam:artifact:role:staging_to:v1"
>   role_payload: "\nZbeam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lTDuSw_K_Lc45B5c.jar"
> }
>
> On Tue, Feb 2, 2021 at 10:14 PM Kyle Weaver <[email protected]> wrote:
>
>> > Regarding Kyle Weaver's suggestion, I tried a combination of
>> "--environment_type=DOCKER" and
>> '--environment_config=beam_python_sdk:0.0.1', where beam_python_sdk:0.0.1 is
>> an image I built.
>> That made it possible to override the python sdk harness, but I was
>> wondering if there is something similar for overriding the Java sdk harness?
>>
>> The Java options are named differently, despite appearing to have
>> identical semantics. Can you try default_environment_type and
>> default_environment_config?
>>
>> On Tue, Feb 2, 2021 at 2:04 PM Paul Nimusiima <[email protected]> wrote:
>>
>>> Hi all,
>>> Thanks for the responses.
>>> Regarding Kyle Weaver's suggestion, I tried a combination of
>>> "--environment_type=DOCKER" and '--environment_config=beam_python_sdk:0.0.1',
>>> where beam_python_sdk:0.0.1 is an image I built. That made it possible to
>>> override the python sdk harness, but I was wondering if there is something
>>> similar for overriding the Java sdk harness?
>>>
>>> Thanks again.
>>>
>>> On Tue, Feb 2, 2021 at 9:08 PM Kyle Weaver <[email protected]> wrote:
>>>
>>>> AFAIK sdk_harness_container_image_overrides only works for the Dataflow
>>>> runner. For other runners I think you will have to change the default
>>>> environment config:
>>>> https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L53-L71
>>>>
>>>> On Tue, Feb 2, 2021 at 11:56 AM Brian Hulette <[email protected]> wrote:
>>>>
>>>>> Hm, I would expect that to work. Can you tell what container Flink is
>>>>> using, if it's not the one you specified? +Chamikara Jayalath
>>>>> <[email protected]> may have some insight here.
>>>>>
>>>>> Brian
>>>>>
>>>>> On Tue, Feb 2, 2021 at 3:27 AM Paul Nimusiima <[email protected]> wrote:
>>>>>
>>>>>> Hello Beam community,
>>>>>>
>>>>>> I am wondering whether it is possible to connect to a secure Kafka
>>>>>> broker with the python beam SDK.
>>>>>> So, for example, in the code below, how would I make the
>>>>>> "ssl.truststore.location" and "ssl.keystore.location" files
>>>>>> accessible inside the Java SDK harness which runs the code?
>>>>>>
>>>>>> ReadFromKafka(
>>>>>>     consumer_config={
>>>>>>         "bootstrap.servers": "bootstrap-server:17032",
>>>>>>         "security.protocol": "SSL",
>>>>>>         # how do I make this available to the Java SDK harness?
>>>>>>         "ssl.truststore.location": "/opt/keys/client.truststore.jks",
>>>>>>         "ssl.truststore.password": "password",
>>>>>>         "ssl.keystore.type": "PKCS12",
>>>>>>         # how do I make this available to the Java SDK harness?
>>>>>>         "ssl.keystore.location": "/opt/keys/client.keystore.p12",
>>>>>>         "ssl.keystore.password": "password",
>>>>>>         "group.id": "group",
>>>>>>         "basic.auth.credentials.source": "USER_INFO",
>>>>>>         "schema.registry.basic.auth.user.info": "user:password",
>>>>>>     },
>>>>>>     topics=["topic"],
>>>>>>     max_num_records=2,
>>>>>>     # expansion_service="localhost:56938"
>>>>>> )
>>>>>>
>>>>>> I tried building a custom image based on "apache/beam_java11_sdk:2.27.0"
>>>>>> that pulls the credentials into the container, and using the
>>>>>> "--sdk_harness_container_image_overrides=.*java.*,{image}:{tag}"
>>>>>> argument, but it does not appear to be used when the pipeline runs on
>>>>>> Flink.
>>>>>>
>>>>>> Thank you for the great tool; any help or advice is greatly appreciated.
>>>>>>
>>>>>> --
>>>>>> paul
>>>>>
>>>
>>> --
>>> paul
>>
>
> --
> paul
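[A note on the failure mode described above: the truststore/keystore paths in consumer_config are opened by the Java SDK harness container, not by the Python process that builds the pipeline, so the files must exist at those paths inside that container (e.g. baked into a custom harness image). A hypothetical helper that makes the config explicit; the function, paths, and defaults below are illustrative, not a Beam API:]

```python
# Hypothetical helper (illustrative only): builds the consumer_config
# dict that would be passed to ReadFromKafka. The truststore/keystore
# paths are interpreted *inside the Java SDK harness container*, not on
# the machine running this Python code.
def ssl_consumer_config(bootstrap_servers,
                        truststore="/opt/keys/client.truststore.jks",
                        keystore="/opt/keys/client.keystore.p12",
                        password="password"):
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore,  # path inside Java harness
        "ssl.truststore.password": password,
        "ssl.keystore.type": "PKCS12",
        "ssl.keystore.location": keystore,      # path inside Java harness
        "ssl.keystore.password": password,
    }

config = ssl_consumer_config("bootstrap-server:17032")
```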
