Thanks for the information, Chamikara.

On Wed, Feb 3, 2021 at 12:04 AM Chamikara Jayalath <[email protected]> wrote:

> Currently 'sdk_harness_container_image_overrides' is Dataflow only.
>
> Also I think options might be getting dropped due to
> https://issues.apache.org/jira/browse/BEAM-9449.
>
> Unfortunately, until one of the above is fixed, I think you might have to
> update the code at the following location to override the container and
> rebuild the SDK:
>
> https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/Environments.java#L397
>
> Thanks,
> Cham
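For reference, the Dataflow-only flag Cham mentions takes "regex,replacement" pairs that are matched against the default SDK harness image names, which is the same form Paul uses later in this thread. A minimal sketch from the Python side; the project, region, and replacement image are placeholders, not a tested configuration:

    from apache_beam.options.pipeline_options import PipelineOptions

    # Honored by Dataflow only, per Cham's note above. Each override is a
    # "regex,replacement" pair; the image name below is a placeholder.
    options = PipelineOptions([
        "--runner=DataflowRunner",
        "--project=my-project",      # placeholder
        "--region=us-central1",      # placeholder
        "--sdk_harness_container_image_overrides=.*java.*,myrepo/beam_java_sdk:0.0.1",
    ])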
> On Tue, Feb 2, 2021 at 2:54 PM Paul Nimusiima <[email protected]> wrote:
>
>> Those two arguments appear to be discarded:
>>
>> WARNING:apache_beam.options.pipeline_options:Discarding unparseable args:
>> ['--default_environment_type=DOCKER',
>> '--default_environment_config=beam_java_sdk:0.0.1']
>>
>> and the default apache/beam_java11_sdk:2.27.0 is used:
>>
>> INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory -
>> Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for
>> worker id 1-2
>>
>> and when the harness is being removed:
>>
>> INFO org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory -
>> Closing environment urn: "beam:env:docker:v1"
>> payload: "\n\035apache/beam_java11_sdk:2.27.0"
>> capabilities: "beam:coder:bytes:v1"
>> ...
>> capabilities: "beam:transform:sdf_truncate_sized_restrictions:v1"
>> dependencies {
>>   type_urn: "beam:artifact:type:file:v1"
>>   type_payload: "\n\354\001/var/folders/yj/njlsyhss5sb7t36fklhxjbvc0000gp/T/beam-artifact-staging/11d9c289a5c41d77ecf6d4eff0cc95928c241ec416c8e88ea67fbc931463a0c5/1-external_1beam:env:docker-beam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lT"
>>   role_urn: "beam:artifact:role:staging_to:v1"
>>   role_payload: "\nZbeam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lTDuSw_K_Lc45B5c.jar"
>> }
>>
>> On Tue, Feb 2, 2021 at 10:14 PM Kyle Weaver <[email protected]> wrote:
>>
>>> > Regarding Kyle Weaver's suggestion, I tried a combination of
>>> > "--environment_type=DOCKER" and
>>> > '--environment_config=beam_python_sdk:0.0.1' where beam_python_sdk:0.0.1
>>> > is an image I built. That made it possible to override the python sdk
>>> > harness, but I was wondering if there is something similar for
>>> > overriding the Java sdk harness?
>>>
>>> The Java options are named differently, despite appearing to have
>>> identical semantics. Can you try default_environment_type and
>>> default_environment_config?
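For concreteness, a minimal sketch of how both pairs of flags would be passed from the Python side; the job endpoint and image tags are placeholders taken from this thread, and, as the WARNING above shows, the two default_environment_* flags are discarded by the 2.27.0 Python option parser (the BEAM-9449 issue Cham mentions):

    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",  # placeholder endpoint
        # Python SDK harness image; this override worked for Paul:
        "--environment_type=DOCKER",
        "--environment_config=beam_python_sdk:0.0.1",
        # Intended for the Java SDK harness, but dropped by the Python
        # option parser in 2.27.0 (BEAM-9449):
        "--default_environment_type=DOCKER",
        "--default_environment_config=beam_java_sdk:0.0.1",
    ])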
>>>
>>> On Tue, Feb 2, 2021 at 2:04 PM Paul Nimusiima <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thanks for the responses.
>>>>
>>>> Regarding Kyle Weaver's suggestion, I tried a combination of
>>>> "--environment_type=DOCKER" and
>>>> '--environment_config=beam_python_sdk:0.0.1', where
>>>> beam_python_sdk:0.0.1 is an image I built. That made it possible to
>>>> override the python sdk harness, but I was wondering if there is
>>>> something similar for overriding the Java sdk harness?
>>>>
>>>> Thanks again.
>>>>
>>>> On Tue, Feb 2, 2021 at 9:08 PM Kyle Weaver <[email protected]> wrote:
>>>>
>>>>> AFAIK sdk_harness_container_image_overrides only works for the
>>>>> Dataflow runner. For other runners I think you will have to change the
>>>>> default environment config:
>>>>> https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L53-L71
>>>>>
>>>>> On Tue, Feb 2, 2021 at 11:56 AM Brian Hulette <[email protected]> wrote:
>>>>>
>>>>>> Hm, I would expect that to work. Can you tell what container Flink is
>>>>>> using, if it's not using the one you specified? +Chamikara Jayalath
>>>>>> <[email protected]> may have some insight here.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> On Tue, Feb 2, 2021 at 3:27 AM Paul Nimusiima <[email protected]> wrote:
>>>>>>
>>>>>>> Hello Beam community,
>>>>>>>
>>>>>>> I am wondering whether it is possible to connect to a secure Kafka
>>>>>>> broker with the Python Beam SDK. For example, in the code below, how
>>>>>>> would I make the "ssl.truststore.location" and "ssl.keystore.location"
>>>>>>> files accessible inside the Java SDK harness which runs the code?
>>>>>>>
>>>>>>> ReadFromKafka(
>>>>>>>     consumer_config={
>>>>>>>         "bootstrap.servers": "bootstrap-server:17032",
>>>>>>>         "security.protocol": "SSL",
>>>>>>>         # how do I make this available to the Java SDK harness?
>>>>>>>         "ssl.truststore.location": "/opt/keys/client.truststore.jks",
>>>>>>>         "ssl.truststore.password": "password",
>>>>>>>         "ssl.keystore.type": "PKCS12",
>>>>>>>         # how do I make this available to the Java SDK harness?
>>>>>>>         "ssl.keystore.location": "/opt/keys/client.keystore.p12",
>>>>>>>         "ssl.keystore.password": "password",
>>>>>>>         "group.id": "group",
>>>>>>>         "basic.auth.credentials.source": "USER_INFO",
>>>>>>>         "schema.registry.basic.auth.user.info": "user:password",
>>>>>>>     },
>>>>>>>     topics=["topic"],
>>>>>>>     max_num_records=2,
>>>>>>>     # expansion_service="localhost:56938"
>>>>>>> )
>>>>>>>
>>>>>>> I tried building a custom image, based on "apache/beam_java11_sdk:2.27.0",
>>>>>>> that pulls the credentials into the container, and passing the
>>>>>>> "--sdk_harness_container_image_overrides=.*java.*,{image}:{tag}"
>>>>>>> argument, but it does not appear to be used when the pipeline runs
>>>>>>> on Flink.
>>>>>>>
>>>>>>> Thank you for the great tool; any help or advice is greatly
>>>>>>> appreciated.
>>>>>>>
>>>>>>> --
>>>>>>> paul
>>>>>>
>>>>
>>>> --
>>>> paul
>>>
>>
>> --
>> paul
>

--
paul
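Pulling the thread together, a minimal runnable skeleton around the original ReadFromKafka snippet, combined with the Python-side harness override that Paul confirmed works. Everything here is a placeholder from the thread (broker, passwords, paths, image tag, job endpoint), and the truststore/keystore files must already exist at those paths inside the Java SDK harness container, e.g. baked into a custom image:

    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",  # placeholder
        "--environment_type=DOCKER",
        "--environment_config=beam_python_sdk:0.0.1",  # custom Python harness
    ])

    with beam.Pipeline(options=options) as p:
        (
            p
            | ReadFromKafka(
                consumer_config={
                    "bootstrap.servers": "bootstrap-server:17032",
                    "security.protocol": "SSL",
                    # These paths must resolve inside the *Java* SDK harness
                    # container, which is the crux of this thread:
                    "ssl.truststore.location": "/opt/keys/client.truststore.jks",
                    "ssl.truststore.password": "password",
                    "ssl.keystore.type": "PKCS12",
                    "ssl.keystore.location": "/opt/keys/client.keystore.p12",
                    "ssl.keystore.password": "password",
                    "group.id": "group",
                },
                topics=["topic"],
                max_num_records=2,
            )
            | beam.Map(print)
        )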
