Hey 杨光,

thanks for looking into this in such a detail. Unfortunately, I'm not
sure what the expected behaviour is (whether the change in behaviour
was accidental or on purpose).

Let me pull in Gordon who has worked quite a bit on the Kerberos
related components in Flink.

@Gordon:
1) Do you know what the expected behaviour is here?
2) How can he work around this issue in 1.4?

– Ufuk

On Fri, Dec 15, 2017 at 11:34 AM, 杨光 <laolang...@gmail.com> wrote:
> Hi,
> I am using flink single-job mode on YARN to read data from a kafka
> cluster  installation configured for Kerberos. When i upgrade flink to
> 1.4.0 , the yarn application can not run normally and logs th error
> like this:
>
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.flink.configuration.IllegalConfigurationException: Kerberos
> login configuration is invalid; keytab is unreadable
>         at 
> org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:160)
>         at 
> org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
>         at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException:
> Kerberos login configuration is invalid; keytab is unreadable
>         at 
> org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:139)
>         at 
> org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
>         at 
> org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
>         at 
> org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139
>
>
> So i add some logs for the method "SecurityConfiguration.validate()"
> and rebuild the flink  package.
>
> private void validate() {
>    if (!StringUtils.isBlank(keytab)) {
>       // principal is required
>       if (StringUtils.isBlank(principal)) {
>          throw new IllegalConfigurationException("Kerberos login
> configuration is invalid; keytab requires a principal.");
>       }
>
>       // check the keytab is readable
>       File keytabFile = new File(keytab);
>
>       if (!keytabFile.exists()) {
>          throw new IllegalConfigurationException("WTF! keytabFile is
> not exist ! keytab:" + keytab);
>       }
>
>       if (!keytabFile.isFile()) {
>          throw new IllegalConfigurationException("WTF! keytabFile is
> not file ! keytab:" + keytab);
>       }
>
>       if (!keytabFile.canRead()) {
>          throw new IllegalConfigurationException("WTF! keytabFile is
> not readalbe ! keytab:" + keytab);
>       }
>
>       if (!keytabFile.exists() || !keytabFile.isFile() ||
> !keytabFile.canRead()) {
>          throw new IllegalConfigurationException("Kerberos login
> configuration is invalid; keytab is unreadable");
>       }
>    }
> }
>
> After that , the yarn logs error  like  this :
> 017-12-15 17:14:36,314 INFO
> org.apache.flink.yarn.YarnTaskManagerRunner                   -
> localKeytabPath:
> /data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000002/krb5.keytab
> 2017-12-15 17:14:36,315 INFO
> org.apache.flink.yarn.YarnTaskManagerRunner                   - YARN
> daemon is running as: hadoop Yarn client user obtainer: hadoop
> 2017-12-15 17:14:36,315 INFO
> org.apache.flink.yarn.YarnTaskManagerRunner                   -
> ResourceID assigned for this container:
> container_e05_1513310528578_0009_01_000002
> 2017-12-15 17:14:36,321 ERROR
> org.apache.flink.yarn.YarnTaskManagerRunner                   -
> Exception occurred while launching Task Manager
> org.apache.flink.configuration.IllegalConfigurationException: WTF!
> keytabFile is not exist !
> keytab:/data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000001/krb5.keytab
> at 
> org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:140)
> at 
> org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
> at 
> org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
> at 
> org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139)
> at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
> at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
>
>
> These logs tell the "keytabFile" value is different from the
> "localKeytabPath".  I searched the
> "org.apache.flink.yarn.YarnTaskManagerRunner" class source code and
> found  there are
> something different betwee 1.3.2 and 1.4.0
>
> 1.3.2
>
> //To support Yarn Secure Integration Test Scenario
> File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
>
> if (krb5Conf.exists() && krb5Conf.canRead()) {
>    String krb5Path = krb5Conf.getAbsolutePath();
>    LOG.info("KRB5 Conf: {}", krb5Path);
>    hadoopConfiguration = new org.apache.hadoop.conf.Configuration();
>    
> hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION,
> "kerberos");
>    
> hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
> "true");
> }
>
> // set keytab principal and replace path with the local path of the
> shipped keytab file in NodeManager
> if (localKeytabPath != null && remoteKeytabPrincipal != null) {
>    configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB,
> localKeytabPath);
>    configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL,
> remoteKeytabPrincipal);
> }
>
>
> 1.4.0
>
> //To support Yarn Secure Integration Test Scenario
> File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
>
> if (krb5Conf.exists() && krb5Conf.canRead()) {
>    String krb5Path = krb5Conf.getAbsolutePath();
>    LOG.info("KRB5 Conf: {}", krb5Path);
>    org.apache.hadoop.conf.Configuration hadoopConfiguration = new
> org.apache.hadoop.conf.Configuration();
>    
> hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION,
> "kerberos");
>    
> hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
> "true");
>
>    // set keytab principal and replace path with the local path of the
> shipped keytab file in NodeManager
>    if (localKeytabPath != null && remoteKeytabPrincipal != null) {
>       configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB,
> localKeytabPath);
>       configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL,
> remoteKeytabPrincipal);
>    }
>
>    sc = new SecurityConfiguration(configuration,
>       Collections.singletonList(securityConfig -> new
> HadoopModule(securityConfig, hadoopConfiguration)));
>
> } else {
>    sc = new SecurityConfiguration(configuration);
>
> }
>
>
>
> In the previous version ,the "SecurityOptions.KERBEROS_LOGIN_KEYTAB"
> is always set the same with "localKeytabPath" but in 1.4.0 only if the
> "krb5Conf.exists() && krb5Conf.canRead()" retrun true . And in my test
> case ,it looks like  the code only run the else  default code。
>
>
> Are there something i counld do  to work around this problem ?
>
> Thanks!

Reply via email to