Hi all! A quick question: Is this a special case of the security improvements proposed in this thread [1], or a separate proposal altogether?

Stephan

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html

On Tue, Dec 18, 2018 at 8:06 PM Rong Rong <walter...@gmail.com> wrote:

> Hi Shuyi,
>
> Yes, I think impersonation is a very valid question! It can actually be considered as 2 questions, as I stated in the doc.
> 1. In the doc I stated that impersonation should be implemented in the user-side code, which should only invoke the cluster client as the actual user 'joe'.
> 2. However, since the cluster client currently assumes no impersonation at all, much of the code assumes that a fully authorized client can be instantiated with the same authority that the actual Flink cluster has. When impersonation is enabled, this might not be the case. For example, if impersonation is in place, the cluster client running on joe's behalf most likely will not, and should not, have access to the keytab file of 'joe'. Instead, a delegation token is used. The second part of the doc tries to address this issue.
>
> --
> Rong
>
> On Mon, Dec 17, 2018 at 11:41 PM Shuyi Chen <suez1...@gmail.com> wrote:
>
> > Hi Rong, thanks a lot for the proposal. Currently, Flink assumes the keytab is located in a remote DFS. Pre-installing keytabs statically in the YARN node local filesystem is a common approach, so I think we should support this mode in Flink natively. As an optimization to reduce the KDC access frequency, we should also support method 3 (the DT approach) as discussed in [1]. One question: why do we need to implement impersonation in Flink? I assume the superuser can do the impersonation for 'joe', and 'joe' can then invoke the Flink client to deploy the job. Thanks a lot.
> >
> > Shuyi
> >
> > [1] https://docs.google.com/document/d/10V7LiNlUJKeKZ58mkR7oVv1t6BrC6TZi3FGf2Dm6-i8/edit
> >
> > On Mon, Dec 17, 2018 at 5:49 PM Rong Rong <walter...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > We have been experimenting with integrating Kerberos with Flink in our corporate environment and found some limitations in the current Flink-Kerberos security mechanism when running on Apache YARN.
> > >
> > > According to the Hadoop Kerberos security guide [1], only a subset of the suggested security mechanisms for long-running services is supported in Flink. Furthermore, the current model does not work well with a superuser impersonating actual users [2] for deployment purposes, which is a widely adopted way to launch applications in corporate environments.
> > >
> > > We would like to propose an improvement [3] that introduces the other common methods [1] for securing long-running applications on YARN and enables an impersonation mode. Any comments and suggestions are highly appreciated.
> > >
> > > Many thanks,
> > > Rong
> > >
> > > [1] https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services
> > > [2] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> > > [3] https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit?usp=sharing
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."