Hey 杨光, thanks for looking into this in such detail. Unfortunately, I'm not sure what the expected behaviour is (whether the change in behaviour was accidental or on purpose).
Let me pull in Gordon, who has worked quite a bit on the Kerberos-related components in Flink.

@Gordon:
1) Do you know what the expected behaviour is here?
2) How can he work around this issue in 1.4?

– Ufuk

On Fri, Dec 15, 2017 at 11:34 AM, 杨光 <laolang...@gmail.com> wrote:
> Hi,
> I am using Flink single-job mode on YARN to read data from a Kafka
> cluster installation configured for Kerberos. After I upgraded Flink to
> 1.4.0, the YARN application no longer runs normally and logs an error
> like this:
>
> Exception in thread "main" java.lang.RuntimeException: org.apache.flink.configuration.IllegalConfigurationException: Kerberos login configuration is invalid; keytab is unreadable
>     at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:160)
>     at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
>     at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
> Caused by: org.apache.flink.configuration.IllegalConfigurationException: Kerberos login configuration is invalid; keytab is unreadable
>     at org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:139)
>     at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
>     at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
>     at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139)
>
> So I added some logging to the method "SecurityConfiguration.validate()"
> and rebuilt the Flink package.
>
> private void validate() {
>     if (!StringUtils.isBlank(keytab)) {
>         // principal is required
>         if (StringUtils.isBlank(principal)) {
>             throw new IllegalConfigurationException("Kerberos login configuration is invalid; keytab requires a principal.");
>         }
>
>         // check the keytab is readable
>         File keytabFile = new File(keytab);
>
>         if (!keytabFile.exists()) {
>             throw new IllegalConfigurationException("WTF! keytabFile is not exist ! keytab:" + keytab);
>         }
>
>         if (!keytabFile.isFile()) {
>             throw new IllegalConfigurationException("WTF! keytabFile is not file ! keytab:" + keytab);
>         }
>
>         if (!keytabFile.canRead()) {
>             throw new IllegalConfigurationException("WTF! keytabFile is not readable ! keytab:" + keytab);
>         }
>
>         if (!keytabFile.exists() || !keytabFile.isFile() || !keytabFile.canRead()) {
>             throw new IllegalConfigurationException("Kerberos login configuration is invalid; keytab is unreadable");
>         }
>     }
> }
>
> After that, the YARN log shows this error:
>
> 017-12-15 17:14:36,314 INFO org.apache.flink.yarn.YarnTaskManagerRunner - localKeytabPath: /data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000002/krb5.keytab
> 2017-12-15 17:14:36,315 INFO org.apache.flink.yarn.YarnTaskManagerRunner - YARN daemon is running as: hadoop Yarn client user obtainer: hadoop
> 2017-12-15 17:14:36,315 INFO org.apache.flink.yarn.YarnTaskManagerRunner - ResourceID assigned for this container: container_e05_1513310528578_0009_01_000002
> 2017-12-15 17:14:36,321 ERROR org.apache.flink.yarn.YarnTaskManagerRunner - Exception occurred while launching Task Manager
> org.apache.flink.configuration.IllegalConfigurationException: WTF! keytabFile is not exist ! keytab:/data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000001/krb5.keytab
>     at org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:140)
>     at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
>     at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
>     at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139)
>     at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
>     at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
>
> These logs show that the "keytabFile" value differs from the
> "localKeytabPath". I searched the source code of the
> "org.apache.flink.yarn.YarnTaskManagerRunner" class and found a
> difference between 1.3.2 and 1.4.0.
>
> 1.3.2
>
> //To support Yarn Secure Integration Test Scenario
> File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
>
> if (krb5Conf.exists() && krb5Conf.canRead()) {
>     String krb5Path = krb5Conf.getAbsolutePath();
>     LOG.info("KRB5 Conf: {}", krb5Path);
>     hadoopConfiguration = new org.apache.hadoop.conf.Configuration();
>     hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos");
>     hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true");
> }
>
> // set keytab principal and replace path with the local path of the shipped keytab file in NodeManager
> if (localKeytabPath != null && remoteKeytabPrincipal != null) {
>     configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB, localKeytabPath);
>     configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL, remoteKeytabPrincipal);
> }
>
> 1.4.0
>
> //To support Yarn Secure Integration Test Scenario
> File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
>
> if (krb5Conf.exists() && krb5Conf.canRead()) {
>     String krb5Path = krb5Conf.getAbsolutePath();
>     LOG.info("KRB5 Conf: {}", krb5Path);
>     org.apache.hadoop.conf.Configuration hadoopConfiguration = new org.apache.hadoop.conf.Configuration();
>     hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos");
>     hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true");
>
>     // set keytab principal and replace path with the local path of the shipped keytab file in NodeManager
>     if (localKeytabPath != null && remoteKeytabPrincipal != null) {
>         configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB, localKeytabPath);
>         configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL, remoteKeytabPrincipal);
>     }
>
>     sc = new SecurityConfiguration(configuration,
>         Collections.singletonList(securityConfig -> new HadoopModule(securityConfig, hadoopConfiguration)));
> } else {
>     sc = new SecurityConfiguration(configuration);
> }
>
> In the previous version, "SecurityOptions.KERBEROS_LOGIN_KEYTAB" was
> always set to the same value as "localKeytabPath", but in 1.4.0 this
> only happens if "krb5Conf.exists() && krb5Conf.canRead()" returns true.
> In my test case, it looks like only the else branch runs.
>
> Is there something I could do to work around this problem?
>
> Thanks!
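
To make the difference between the two quoted snippets easier to see, here is a minimal, self-contained sketch of the control flow only. It is not Flink code: a plain Map stands in for Flink's Configuration, the class and method names (KeytabOverrideSketch, flow132, flow140) and the paths are hypothetical, and the krb5.conf check is reduced to a boolean. It only illustrates that, when krb5.conf is not readable, the 1.4.0-style flow leaves the shipped keytab path in place instead of replacing it with the container-local one.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the two quoted code paths; NOT the actual Flink implementation.
public class KeytabOverrideSketch {
    // Config keys that SecurityOptions.KERBEROS_LOGIN_KEYTAB / KERBEROS_LOGIN_PRINCIPAL refer to.
    public static final String KEYTAB_KEY = "security.kerberos.login.keytab";
    public static final String PRINCIPAL_KEY = "security.kerberos.login.principal";

    // 1.3.2-style flow: the keytab override runs regardless of krb5.conf.
    public static Map<String, String> flow132(Map<String, String> shipped, boolean krb5ConfReadable,
                                              String localKeytabPath, String principal) {
        Map<String, String> conf = new HashMap<>(shipped);
        // krb5ConfReadable would only affect the Hadoop configuration here (omitted).
        if (localKeytabPath != null && principal != null) {
            conf.put(KEYTAB_KEY, localKeytabPath);
            conf.put(PRINCIPAL_KEY, principal);
        }
        return conf;
    }

    // 1.4.0-style flow: the keytab override is nested inside the krb5.conf branch.
    public static Map<String, String> flow140(Map<String, String> shipped, boolean krb5ConfReadable,
                                              String localKeytabPath, String principal) {
        Map<String, String> conf = new HashMap<>(shipped);
        if (krb5ConfReadable) {
            if (localKeytabPath != null && principal != null) {
                conf.put(KEYTAB_KEY, localKeytabPath);
                conf.put(PRINCIPAL_KEY, principal);
            }
        }
        // else branch: the shipped (possibly stale) keytab path is left untouched.
        return conf;
    }

    public static void main(String[] args) {
        // Hypothetical paths: the shipped config still points at another container's directory.
        Map<String, String> shipped = new HashMap<>();
        shipped.put(KEYTAB_KEY, "/container_01/krb5.keytab");
        String local = "/container_02/krb5.keytab";
        String principal = "user@EXAMPLE.COM";

        // krb5.conf is assumed to be absent in the container's working directory.
        System.out.println("1.3.2 flow: " + flow132(shipped, false, local, principal).get(KEYTAB_KEY));
        System.out.println("1.4.0 flow: " + flow140(shipped, false, local, principal).get(KEYTAB_KEY));
    }
}
```

Under these assumptions the first line prints the container-local path and the second prints the shipped one, which matches the mismatch between "localKeytabPath" and the path in the exception in the logs above.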