Github user vijikarthi commented on the issue:
https://github.com/apache/flink/pull/2275
Adding some more context on the implementation details, which are based on
the design proposal
(https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing)
The current security implementation works in a subtle way: it uses the Kerberos
cache of the user who starts the Flink process/jobs, and only in the context of
supporting secure access to a Hadoop cluster. The underlying UGI
(UserGroupInformation) implementation of the Hadoop infrastructure is used to
harden the security using the keytab cache.
For the Yarn mode of deployment, delegation tokens are created and populated to
the container environment (App Master/JM and TM).
The current implementation falls short in two areas:
1) Tokens eventually expire, which affects long-running jobs
2) There is no support for secure connections to Kafka and ZK
(Kafka 0.9 and recent ZK versions support Kerberos-based authentication
using SASL/JAAS)
This PR addresses the above gaps by providing keytab support for secure
communication with Hadoop and Kafka/ZK services.
1) Additional Configurations:
The following new security-specific configurations are added to the Flink
configuration file.
a) security.principal - user principal that Flink process/connectors should
authenticate as
b) security.keytab - keytab file location
In standalone mode, it is assumed that the configuration already exists (set
up manually) on all cluster nodes where the JM and TMs will be running.
In Yarn mode, the configuration (and keytab file) is expected only on the
node from which YarnCLI or FlinkCLI is invoked. Application code takes
care of copying the keytab file to the JM/TM Yarn containers as a local
resource for lookup.
If the security configurations are not provided, the delegation token
mechanism still works, which keeps backward compatibility (manual kinit before
starting JM/TMs).
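The choice between the two paths can be sketched roughly as follows. This is
only an illustration, not the code in the PR: the class SecurityModeSketch and
its resolve method are hypothetical names, and rejecting a configuration where
only one of the two keys is set is an assumption on my part, not something the
PR description states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Flink class): picks the login path from the
// two configuration keys described above.
final class SecurityModeSketch {

    enum Mode { KEYTAB, TICKET_CACHE }

    static Mode resolve(Map<String, String> conf) {
        String principal = trimToNull(conf.get("security.principal"));
        String keytab = trimToNull(conf.get("security.keytab"));

        // Neither key set: fall back to the existing behavior
        // (manual kinit, ticket cache, delegation tokens).
        if (principal == null && keytab == null) {
            return Mode.TICKET_CACHE;
        }
        // Assumption: a half-configured setup is treated as an error.
        if (principal == null || keytab == null) {
            throw new IllegalArgumentException(
                "security.principal and security.keytab must be set together");
        }
        return Mode.KEYTAB;
    }

    private static String trimToNull(String value) {
        if (value == null) {
            return null;
        }
        String trimmed = value.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf));
        // Placeholder values for illustration only.
        conf.put("security.principal", "flink@EXAMPLE.COM");
        conf.put("security.keytab", "/path/to/flink.keytab");
        System.out.println(resolve(conf));
    }
}
```

The principal and keytab path in main are placeholders.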
2) Process-wide in-memory JAAS configuration to enable Kafka/ZK secure
authentication.
The JAAS configuration plays a critical role in authentication for a
Kerberized application. The Kafka/ZK login module code is expected to construct
a login context based on the supplied JAAS configuration file entries and to
authenticate to produce a subject. The context is constructed with an
application name which acts as a lookup key into the configuration, yielding
one or more login modules. The login module implements the specific strategy,
such as using a configured keytab or using the user's ticket cache.
Instead of managing a per-connector JAAS configuration file, a process-wide
JAAS configuration object is initialized during the Flink bootstrap phase. It
provides a single login module, configured to log in using the supplied
keytab, to all callers.
(https://docs.oracle.com/javase/7/docs/api/javax/security/auth/login/Configuration.html#setConfiguration(javax.security.auth.login.Configuration))
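To make the idea concrete, here is a minimal sketch of such a process-wide
configuration object. It is not the class from the PR: the name
KeytabJaasConfiguration and the install method are hypothetical, and the
choice of the JDK's com.sun.security.auth.module.Krb5LoginModule with these
particular options is an assumption. The sketch returns the same entry for
every application name, so a lookup under "KafkaClient" and one under "Client"
both get the keytab-based login module.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical class (not from the PR): an in-memory JAAS configuration that
// answers every lookup with one keytab-based Kerberos login module.
final class KeytabJaasConfiguration extends Configuration {

    private final AppConfigurationEntry[] entries;

    KeytabJaasConfiguration(String principal, String keytabPath) {
        // Options understood by the JDK's Krb5LoginModule (assumed module).
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);

        entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    // The application name is ignored on purpose: all callers share the
    // same login module.
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        return entries;
    }

    // Installs the configuration for the whole JVM process.
    static void install(String principal, String keytabPath) {
        Configuration.setConfiguration(
            new KeytabJaasConfiguration(principal, keytabPath));
    }
}
```

Calling KeytabJaasConfiguration.install(principal, keytabPath) once during
bootstrap would be enough; no login is attempted until a client actually
creates a LoginContext, so the keytab is not read at install time.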
To summarize, the following sequence happens when the secure configuration is
enabled.
The Flink bootstrap code (both Yarn and Standalone) initializes the security
context by
a) Initializing UGI with the supplied keytab and principal, which takes care
of handling Kerberos authentication and login renewal for Hadoop services.
b) Creating a process-wide JAAS configuration object for the Kafka/ZK login
modules to support Kerberos/SASL authentication. Login renewals are
taken care of automatically by the ZK and Kafka login module implementations.
Some additional details are provided in the documentation page as well, which
can be found here:
(https://github.com/vijikarthi/flink/blob/FLINK-3929/docs/internals/flink_security.md)