Currently 'sdk_harness_container_image_overrides' is Dataflow only.

Also I think options might be getting dropped due to
https://issues.apache.org/jira/browse/BEAM-9449.
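
As a quick check (a minimal sketch; the flag values are the ones from your
run, and exact behavior may vary by SDK version), you can confirm whether the
Python-side parser drops those Java-side flags:

from apache_beam.options.pipeline_options import PipelineOptions

# The Python SDK does not register these Java-side flags, so
# get_all_options() logs "Discarding unparseable args: [...]".
opts = PipelineOptions([
    "--default_environment_type=DOCKER",
    "--default_environment_config=beam_java_sdk:0.0.1",
])
print(opts.get_all_options(drop_default=True))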

Unfortunately, until one of the above is fixed, I think you might have to
update the code at the following location to override the container image and
rebuild the SDK:
https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/Environments.java#L397

Thanks,
Cham

On Tue, Feb 2, 2021 at 2:54 PM Paul Nimusiima <[email protected]> wrote:

> Those two arguments appear to be discarded:
> WARNING:apache_beam.options.pipeline_options:Discarding unparseable args:
> ['--default_environment_type=DOCKER',
> '--default_environment_config=beam_java_sdk:0.0.1']
> and the default apache/beam_java11_sdk:2.27.0 is used.
>
> INFO
> org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory -
> Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for
> worker id 1-2
>
> and when the harness is being removed, the following is logged:
>
> INFO  org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory
>  - Closing environment urn: "beam:env:docker:v1"
> payload: "\n\035apache/beam_java11_sdk:2.27.0"
> capabilities: "beam:coder:bytes:v1"
> ...
> capabilities: "beam:transform:sdf_truncate_sized_restrictions:v1"
> dependencies {
>   type_urn: "beam:artifact:type:file:v1"
>   type_payload:
> "\n\354\001/var/folders/yj/njlsyhss5sb7t36fklhxjbvc0000gp/T/beam-artifact-staging/11d9c289a5c41d77ecf6d4eff0cc95928c241ec416c8e88ea67fbc931463a0c5/1-external_1beam:env:docker-beam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lT"
>   role_urn: "beam:artifact:role:staging_to:v1"
>   role_payload:
> "\nZbeam-sdks-java-io-expansion-service-2.27.0-OrWYGnEG6c2yQRgdLP3AI3hLQ_1lTDuSw_K_Lc45B5c.jar"
> }
>
> On Tue, Feb 2, 2021 at 10:14 PM Kyle Weaver <[email protected]> wrote:
>
>> > Regarding Kyle Weaver's suggestion, I tried a combination of
>> "--environment_type=DOCKER" and
>> '--environment_config=beam_python_sdk:0.0.1' where beam_python_sdk:0.0.1 is
>> an image I built. That made it possible to override the python sdk harness,
>> but I was wondering if there is something similar for overriding the Java
>> sdk harness?
>>
>> The Java options are named differently, despite appearing to have
>> identical semantics. Can you try default_environment_type and
>> default_environment_config?
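>>
>> For example, something like this (an untested sketch; the runner flag and
>> image names are illustrative, substitute the images you built):
>>
>> from apache_beam.options.pipeline_options import PipelineOptions
>>
>> options = PipelineOptions([
>>     "--runner=FlinkRunner",
>>     # Python SDK harness container:
>>     "--environment_type=DOCKER",
>>     "--environment_config=beam_python_sdk:0.0.1",
>>     # Java SDK harness container (the Java-side defaults):
>>     "--default_environment_type=DOCKER",
>>     "--default_environment_config=beam_java_sdk:0.0.1",
>> ])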
>>
>> On Tue, Feb 2, 2021 at 2:04 PM Paul Nimusiima <[email protected]>
>> wrote:
>>
>>> Hi all,
>>> Thanks for the responses.
>>> Regarding Kyle Weaver's suggestion, I tried a combination of
>>> "--environment_type=DOCKER" and
>>> '--environment_config=beam_python_sdk:0.0.1' where beam_python_sdk:0.0.1
>>> is an image I built. That made it possible to override the python sdk
>>> harness, but I was wondering if there is something similar for overriding
>>> the Java sdk harness?
>>>
>>> Thanks again.
>>>
>>> On Tue, Feb 2, 2021 at 9:08 PM Kyle Weaver <[email protected]> wrote:
>>>
>>>> AFAIK sdk_harness_container_image_overrides only works for the Dataflow
>>>> runner. For other runners I think you will have to change the default
>>>> environment config:
>>>> https://github.com/apache/beam/blob/0078bb35ba4aef9410d681d8b4e2c16d9f56433d/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L53-L71
>>>>
>>>> On Tue, Feb 2, 2021 at 11:56 AM Brian Hulette <[email protected]>
>>>> wrote:
>>>>
>>>>> Hm, I would expect that to work. Can you tell what container Flink is
>>>>> using if it's not using the one you specified? +Chamikara Jayalath
>>>>> <[email protected]> may have some insight here.
>>>>>
>>>>> Brian
>>>>>
>>>>> On Tue, Feb 2, 2021 at 3:27 AM Paul Nimusiima <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello Beam community,
>>>>>>
>>>>>> I am wondering whether it is possible to connect to a secure Kafka
>>>>>> broker with the Python Beam SDK. For example, in the code below, how
>>>>>> would I make the "ssl.truststore.location" and "ssl.keystore.location"
>>>>>> accessible inside the Java SDK harness, which runs the code?
>>>>>>
>>>>>> ReadFromKafka(
>>>>>>         consumer_config={
>>>>>>             "bootstrap.servers": "bootstrap-server:17032",
>>>>>>             "security.protocol": "SSL",
>>>>>>             # how do I make this available to the Java SDK harness?
>>>>>>             "ssl.truststore.location": "/opt/keys/client.truststore.jks",
>>>>>>             "ssl.truststore.password": "password",
>>>>>>             "ssl.keystore.type": "PKCS12",
>>>>>>             # how do I make this available to the Java SDK harness?
>>>>>>             "ssl.keystore.location": "/opt/keys/client.keystore.p12",
>>>>>>             "ssl.keystore.password": "password",
>>>>>>             "group.id": "group",
>>>>>>             "basic.auth.credentials.source": "USER_INFO",
>>>>>>             "schema.registry.basic.auth.user.info": "user:password"
>>>>>>         },
>>>>>>         topics=["topic"],
>>>>>>         max_num_records=2,
>>>>>>         # expansion_service="localhost:56938"
>>>>>>     )
>>>>>>
>>>>>> I tried building a custom image that pulls credentials into the
>>>>>> container based on "apache/beam_java11_sdk:2.27.0" and using the 
>>>>>> "--sdk_harness_container_image_overrides=.*java.*,{image}:{tag}"
>>>>>> argument, but it does not appear to be used when the pipeline runs on
>>>>>> Flink.
>>>>>>
>>>>>> Thank you for the great tool; any help or advice is greatly
>>>>>> appreciated.
>>>>>>
>>>>>> --
>>>>>> paul
>>>>>>
>>>>>
>>>
>>> --
>>> paul
>>>
>>
>
> --
> paul
>
