[ 
https://issues.apache.org/jira/browse/FLINK-17641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168775#comment-17168775
 ] 

Robert Metzger edited comment on FLINK-17641 at 7/31/20, 1:26 PM:
------------------------------------------------------------------

Besides the approach you've already mentioned (using SSL), the Flink community 
recommends setting up a service in front of the Flink HTTP endpoints that 
controls access to them.
You could for example use nginx configured as a reverse proxy for that.
I agree that this solution is not very elegant in YARN, where the Flink 
sessions are probably rather short-lived, and you would need to dynamically 
figure out the HTTP endpoint of each Flink application.
Also, you need to either prevent users in your intranet from accessing the 
ports / IP range of the Flink sessions, or set up SSL between nginx and the 
Flink instances (they can use the same certs).

This is a fairly frequent feature request for the Flink REST interfaces. In the 
past, we have rejected this because it is out of the scope of Flink ("feature 
creep").

However, I see that it is fairly difficult to implement this for a YARN setup, 
in particular figuring out the right ip:port of the current JM leader (which 
relies on ZooKeeper if you are using HA).
What I could imagine as an addition to Flink is a new command in the CLI 
frontend that returns some cluster information, including the leader ip:port.
Ideally, nginx is able to query this information from the command line.
I've filed a ticket for adding this command to the CLI: FLINK-17641.

Alternatively, you could set up a cron job that updates the nginx configuration 
based on the available JobManagers from YARN. Maybe this is helpful as well: 
https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/yarn-service/ServiceDiscovery.html
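
As a rough illustration of the cron-job idea, here is a minimal Python sketch 
that renders nginx location blocks from a mapping of YARN application IDs to 
Flink REST endpoints. The function name, the /flink/<application id>/ URL 
layout, and the htpasswd path are assumptions made for this example. How the 
mapping is discovered (YARN, ZooKeeper, or a future CLI command) is left out 
on purpose.

```python
import re

# Hypothetical input format: YARN application id -> "host:port" of the
# Flink REST endpoint. Discovering these values is not shown here.
_APP_ID = re.compile(r"^application_\d+_\d+$")
_HOST_PORT = re.compile(r"^[A-Za-z0-9.-]+:\d{1,5}$")


def render_nginx_locations(endpoints, auth_file="/etc/nginx/flink.htpasswd"):
    """Render one nginx `location` block per Flink application.

    Each block requires HTTP basic auth and forwards requests to the
    application's REST endpoint. The auth_file path is an assumption.
    """
    blocks = []
    for app_id, host_port in sorted(endpoints.items()):
        # Validate before writing anything into the nginx configuration.
        if not _APP_ID.match(app_id) or not _HOST_PORT.match(host_port):
            raise ValueError(
                f"refusing to render entry: {app_id!r} -> {host_port!r}"
            )
        blocks.append(
            f"location /flink/{app_id}/ {{\n"
            '    auth_basic "Flink";\n'
            f"    auth_basic_user_file {auth_file};\n"
            # The trailing slash makes nginx strip the location prefix.
            f"    proxy_pass http://{host_port}/;\n"
            "}\n"
        )
    return "\n".join(blocks)
```

A cron job could write the returned text to a file that the main nginx 
configuration includes and then reload nginx. If SSL is enabled on the Flink 
REST endpoint, the proxy_pass scheme would need to be https instead.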




> How to secure flink applications on yarn on multi-tenant environment
> --------------------------------------------------------------------
>
>                 Key: FLINK-17641
>                 URL: https://issues.apache.org/jira/browse/FLINK-17641
>             Project: Flink
>          Issue Type: Wish
>          Components: Deployment / YARN
>            Reporter: Ethan Li
>            Priority: Major
>
> This is a question I wish to get some insights on. 
> We are trying to support and secure flink on shared yarn cluster. Besides the 
> security provided by yarn side (queueACL, kerberos), what I noticed is that 
> flink CLI can still interact with the flink job as long as it knows the 
> jobmanager rpc port/hostname and rest.port, which can be obtained easily with 
> yarn command. 
> Also on the UI side, on yarn cluster, users can visit flink job UI via yarn 
> proxy using browser. As long as the user can authenticate and view yarn 
> resourcemanager webpage, he/she can visit the flink UI without any problem. 
> This basically means Flink UI is wide-open to corp internal users.
> On the internal connection side, I am aware of the support added in 1.10 to 
> limit the mTLS connection by configuring 
> security.ssl.internal.cert.fingerprint 
> (https://ci.apache.org/projects/flink/flink-docs-stable/ops/security-ssl.html)
> This works but it is not very flexible. Users need to update the config if 
> the cert changes before they submit a new job.
> I asked a similar question on the mailing list before. I am really 
> interested in how other folks deal with this issue. Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
