In an offline chat with Hai, it seemed useful for users to be able to provide
custom authentication, such as a secret that is distributed out of band by
the infrastructure and supplied via the file system, an RPC to another
service, etc.
gRPC already has mechanisms for standard and custom authentication [1].
Configuring the gRPC channel through a command-line option or environment
variable on the worker machines could be useful.
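
To make the idea concrete, here is a rough sketch of what this could look
like on the Python worker side. It is only an illustration under stated
assumptions, not existing Beam code: the environment variable names
WORKER_AUTH_SECRET and WORKER_AUTH_SECRET_FILE and the helpers load_secret
and create_channel are hypothetical, and the secret is sent as a bearer
token using gRPC's standard access-token call credentials over TLS.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names for illustration; these are not Beam options.
SECRET_ENV = "WORKER_AUTH_SECRET"
SECRET_FILE_ENV = "WORKER_AUTH_SECRET_FILE"


def load_secret(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the secret from the environment, else from a file, else None."""
    if env.get(SECRET_ENV):
        return env[SECRET_ENV]
    path = env.get(SECRET_FILE_ENV)
    if path:
        # The infrastructure is assumed to have placed the secret in this file.
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    return None


def create_channel(target: str,
                   secret: Optional[str],
                   root_certs: Optional[bytes] = None):
    """Build a gRPC channel, authenticated when a secret is available."""
    # Imported here so load_secret can be used without grpcio installed.
    import grpc

    if secret is None:
        # Matches today's behaviour: no TLS and no authentication.
        return grpc.insecure_channel(target)

    # TLS for the transport; root_certs=None uses gRPC's default trust roots.
    channel_creds = grpc.ssl_channel_credentials(root_certificates=root_certs)
    # Attaches "authorization: Bearer <secret>" metadata to every call.
    call_creds = grpc.access_token_call_credentials(secret)
    return grpc.secure_channel(
        target, grpc.composite_channel_credentials(channel_creds, call_creds))
```

A worker could then call create_channel(endpoint_url, load_secret()) when it
connects to the runner. The runner side would still need a matching check of
the token (for example in a server interceptor), which is not shown here.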

[1] https://grpc.io/docs/guides/auth/

On Fri, Apr 26, 2019 at 4:33 PM Lukasz Cwik <lc...@google.com> wrote:

> The link to the ApiServiceDescriptor is
> https://github.com/apache/beam/blob/476e17ed6badd4d5c06c4caf8a824805f40a8e7a/model/pipeline/src/main/proto/endpoints.proto#L31
>
> On Fri, Apr 26, 2019 at 4:32 PM Lukasz Cwik <lc...@google.com> wrote:
>
>> I had originally taken a look at this a while ago but not much has
>> progressed since then. The original idea was that the ApiServiceDescriptor
>> would be extended to support secure ways of authentication/communication. I
>> was prototyping with an OAuth2 client credentials grant at the time but
>> dropped it as other things were more important. The only currently
>> supported mode across all SDKs is an implicit authenticated/secure mode
>> where all communication is assumed to already be encrypted/private (e.g.
>> over VPN that is managed externally with trusted services) and hence the
>> gRPC channel itself is insecure and there is no authentication being
>> performed.
>>
>> Even though sdk_worker.py seems like it supports credentials, no one
>> invokes the constructor with credentials enabled as can be seen by this
>> comment by Robert[1].
>>
>> For SSL/TLS support it seems like we need some way to configure a runner
>> to be told to use SSL/TLS (potentially with a custom private key and trust
>> chain). Do you have some suggestions on how we add support for passing
>> around channel/call[2] credentials?
>>
>> 1:
>> https://github.com/apache/beam/blob/476e17ed6badd4d5c06c4caf8a824805f40a8e7a/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L139
>> 2: https://grpc.io/docs/guides/auth/
>>
>> On Tue, Apr 23, 2019 at 5:06 PM Hai Lu <lhai...@apache.org> wrote:
>>
>>> Hi,
>>>
>>> This is Hai from LinkedIn. Daniel and I have been working on
>>> productionizing the Samza portable runner. BTW, Daniel didn't mention in his
>>> previous email that he has enabled and validated Python 3 for the Samza runner
>>> and it worked smoothly. Kudos to the team!
>>>
>>> Here I have a few security related questions about portability. At
>>> LinkedIn, we enable SSL/TLS and ACLs for Kafka data and any data exchange.
>>> In the case of portable runner, we're required to secure the data channels
>>> between Java and Python processes as well because our Samza jobs are
>>> running in a multi-tenant environment. While I'm currently working on this
>>> on our internal branch, I do want to keep it clean and consistent with the
>>> master branch.
>>>
>>> My questions are: were there any plans/thoughts around security for
>>> portability? I see that sdk_worker.py does have some code to create
>>> secured gRPC channels; is anyone actually using that code? I don't
>>> see any work done on the Java side, though.
>>>
>>> Thanks,
>>> Hai Lu
>>>
>>
