[ 
https://issues.apache.org/jira/browse/FLINK-28291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiulong.zhu updated FLINK-28291:
--------------------------------
    Attachment:     (was: 
0001-add-KerberosDelegationTokenManager-to-renew-token-pe.patch)

> Add kerberos delegation token renewer feature instead of logged from keytab 
> individually
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-28291
>                 URL: https://issues.apache.org/jira/browse/FLINK-28291
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / YARN
>    Affects Versions: 1.13.5
>            Reporter: jiulong.zhu
>            Priority: Minor
>             Fix For: 1.13.5
>
>
> h2. 1. Design
> Lifecycle of the delegation token in the RM:
>  # The container starts with the DT provided by the client.
>  # Enable the delegation token renewer by:
>  ## setting {{security.kerberos.token.renew.enabled}} to true (default: false), and
>  ## specifying {{security.kerberos.login.keytab}} and 
> {{security.kerberos.login.principal}}.
>  # When the delegation token renewer is enabled, the renewer thread re-obtains 
> tokens from the DelegationTokenProvider (currently only 
> HadoopFSDelegationTokenProvider). The renewer thread then broadcasts the new 
> tokens to the RM locally, and to all JMs and all TMs via RPCGateway.
>  # The RM process adds the new tokens to its context via UserGroupInformation.
> Lifecycle of the delegation token in the JM / TM:
>  # The TaskManager starts with the keytab stored in remote HDFS.
>  # After registering successfully, the JM / TM gets the RM's current tokens, 
> carried in {{JobMasterRegistrationSuccess}} / {{TaskExecutorRegistrationSuccess}}.
>  # The JM / TM process adds the new tokens to its context via UserGroupInformation.
> Retrieving the ResourceManager leader through the HA service would be too heavy 
> and is unnecessary, so the DelegationTokenManager is instantiated by the 
> ResourceManager. The DelegationTokenManager can therefore hold a direct 
> reference to the ResourceManager instead of an RM RPCGateway or self gateway.
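
The lifecycle above can be sketched in plain Java. This is only an illustration of the renew-and-broadcast flow under stated assumptions, not the code in the attached patch: `TokenRenewalSketch`, `register`, and `renewOnce` are hypothetical names, the `Supplier` stands in for the token provider, and each `Consumer` stands in for the RM itself or a JM / TM gateway. Scheduling and the real Hadoop calls are left out.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of the renew-and-broadcast cycle; none of these names are Flink APIs. */
final class TokenRenewalSketch {
    // Stand-in for the token provider: returns freshly obtained, serialized tokens.
    private final Supplier<byte[]> provider;
    // Stand-ins for the RM itself and for the JM / TM gateways.
    private final List<Consumer<byte[]>> receivers = new CopyOnWriteArrayList<>();
    // Latest tokens held by the RM; null until the first renewal round.
    private volatile byte[] current;

    TokenRenewalSketch(Supplier<byte[]> provider) {
        this.provider = provider;
    }

    /** Mirrors registration: a newly registered JM / TM gets the RM's current tokens, if any. */
    void register(Consumer<byte[]> receiver) {
        receivers.add(receiver);
        byte[] tokens = current;
        if (tokens != null) {
            receiver.accept(tokens);
        }
    }

    /** One renewal round: re-obtain tokens, remember them, and push them to every receiver. */
    void renewOnce() {
        byte[] fresh = provider.get();
        current = fresh;
        for (Consumer<byte[]> receiver : receivers) {
            receiver.accept(fresh);
        }
    }
}
```

In this sketch a receiver that registers after a renewal round gets the current tokens at once and every later round as well, which is the behaviour the registration step describes.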
> h2. 2. Test
>  # No local JUnit test: building a JUnit environment that includes a KDC and 
> a local Hadoop cluster is too heavy.
>  # Cluster test
> step 1: Specify a krb5.conf with a short token lifetime (ticket_lifetime, 
> renew_lifetime) when submitting the Flink application.
> ```
> {{flink run .... -yD security.kerberos.token.renew.enabled=true -yD 
> security.kerberos.krb5-conf.path=/home/work/krb5.conf -yD 
> security.kerberos.login.use-ticket-cache=false ...}}
> ```
> step 2: Watch the logs for token identifier changes and for token 
> synchronization between the RM and the workers.
> >> 
> In RM / JM log, 
> 2022-06-28 15:13:03,509 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> New token (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) 
> created in KerberosDelegationToken, and next schedule delay is 64799880 ms. 
> 2022-06-28 15:13:03,529 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> Updating delegation tokens for current user. 
> 2022-06-28 15:13:04,729 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> JobMaster receives new token (HDFS_DELEGATION_TOKEN token 52101 for work on 
> ha-hdfs:newfyyy) from RM.
> … 
> 2022-06-29 09:13:03,732 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> New token (HDFS_DELEGATION_TOKEN token 52310 for work on ha-hdfs:newfyyy) 
> created in KerberosDelegationToken, and next schedule delay is 64800045 ms.
> 2022-06-29 09:13:03,805 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> Updating delegation tokens for current user. 
> 2022-06-29 09:13:03,806 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> JobMaster receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on 
> ha-hdfs:newfyyy) from RM.
> >> 
> In TM log, 
> 2022-06-28 15:13:17,983 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52101 for work on 
> ha-hdfs:newfyyy) from RM. 
> 2022-06-28 15:13:18,016 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> Updating delegation tokens for current user. 
> … 
> 2022-06-29 09:13:03,809 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on 
> ha-hdfs:newfyyy) from RM.
> 2022-06-29 09:13:03,836 INFO org.apache.flink.runtime.util.HadoopUtils [] - 
> Updating delegation tokens for current user.
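
The two "New token ... created" entries in the RM / JM log are 18 hours apart, and both logged schedule delays are close to 64,800,000 ms, which is also 18 hours. That would be consistent with renewing at 75% of a 24-hour token lifetime, but the ratio and the lifetime are assumptions here; the issue does not state them. A minimal sketch for checking the arithmetic, with `RenewalDelayCheck` and its methods being hypothetical helper names:

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for sanity-checking renewal intervals taken from the logs. */
final class RenewalDelayCheck {
    // Timestamp layout used by the log lines, e.g. "2022-06-28 15:13:03,509".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Elapsed time between two log timestamps. */
    static Duration between(String first, String second) {
        return Duration.between(
                LocalDateTime.parse(first, LOG_TS), LocalDateTime.parse(second, LOG_TS));
    }

    /** Delay until the next renewal, assuming renewal at a fixed fraction of the lifetime. */
    static long delayMillis(long lifetimeMillis, double ratio) {
        return Math.round(lifetimeMillis * ratio);
    }
}
```

For example, `between("2022-06-28 15:13:03,509", "2022-06-29 09:13:03,732")` gives the gap between the two token creations, and `delayMillis(86_400_000L, 0.75)` gives the delay under the assumed 24-hour lifetime and 0.75 ratio.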



--
This message was sent by Atlassian Jira
(v8.20.10#820010)