[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21178 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185432085 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- So shall we change to `Utils.getUGI`? Basically it is the same. But as @mridulm mentioned, someone may set this "HADOOP_USER_NAME", though doAs is not worked in STS. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185427751 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -92,7 +95,26 @@ public String getAuthName() { public static final String HS2_PROXY_USER = "hive.server2.proxy.user"; public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken"; - public HiveAuthFactory(HiveConf conf) throws TTransportException { + private static Field keytabFile = null; + private static Method getKeytab = null; + static { +Class clz = UserGroupInformation.class; +try { + keytabFile = clz.getDeclaredField("keytabFile"); + keytabFile.setAccessible(true); +} catch (NoSuchFieldException nfe) { + keytabFile = null; --- End diff -- Ok, I just remember Steve commented on some JIRAs about this thing, but I cannot remember where it is. Anyway I will log some logs here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185423983 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- Ah ok, Utils.getUGI handles proxying in HS2 iirc - this is, currently, not relevant in our case imo. You are right, we could have simply used `Utils.getUGI` instead of `UserGroupInformation.getCurrentUser` (it would have been more appropriate as well). Something to keep in mind in case users are relying on HADOOP_USER_NAME for STS (highly doubt it, but you never know !) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185423072 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -92,7 +95,26 @@ public String getAuthName() { public static final String HS2_PROXY_USER = "hive.server2.proxy.user"; public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken"; - public HiveAuthFactory(HiveConf conf) throws TTransportException { + private static Field keytabFile = null; + private static Method getKeytab = null; + static { +Class clz = UserGroupInformation.class; +try { + keytabFile = clz.getDeclaredField("keytabFile"); + keytabFile.setAccessible(true); +} catch (NoSuchFieldException nfe) { + keytabFile = null; --- End diff -- I am not sure what you mean by that statement. ``` $ $ find sql/hive-thriftserver/ -type f | grep scala$ | xargs grep -i 'log.*"' | wc -l 42 ``` Log messages help us debug issues without needing to modify code; and this is an excellent case of where logging at appropriate level will help us unearth issues in production. I am not sure what the concern here is. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185413718 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- @mridulm thanks for your answer. Actually my question was a bit different, but @jerryshao answered me: which is the difference between `UserGroupInformation.getCurrentUser` and `Utils.getUGI()`? Since `Utils.getUGI()` does the same unless `HADOOP_USER_NAME` is set, my feeling was that using either one or the other function is the same here, so I just wondered why using one on one place and the other few lines after and not always the same one: I just wanted to make sure that they are actually the same. Thanks for you kind answers. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185407534 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -92,7 +95,26 @@ public String getAuthName() { public static final String HS2_PROXY_USER = "hive.server2.proxy.user"; public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken"; - public HiveAuthFactory(HiveConf conf) throws TTransportException { + private static Field keytabFile = null; + private static Method getKeytab = null; + static { +Class clz = UserGroupInformation.class; +try { + keytabFile = clz.getDeclaredField("keytabFile"); + keytabFile.setAccessible(true); +} catch (NoSuchFieldException nfe) { + keytabFile = null; --- End diff -- In the current STS code, we avoid using any log method. You can check the code, all the log method which used before (in Spark1) is removed now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185404453 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -92,7 +95,26 @@ public String getAuthName() { public static final String HS2_PROXY_USER = "hive.server2.proxy.user"; public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken"; - public HiveAuthFactory(HiveConf conf) throws TTransportException { + private static Field keytabFile = null; + private static Method getKeytab = null; + static { +Class clz = UserGroupInformation.class; +try { + keytabFile = clz.getDeclaredField("keytabFile"); + keytabFile.setAccessible(true); +} catch (NoSuchFieldException nfe) { + keytabFile = null; --- End diff -- nit: We should include a debug message (not INFO) indicating the exception. Same for getKeytab below. This will help debug in case the signatures of these private methods changes later, and we are not sure what is happening. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r185406504 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- Assuming I understood your question @mgaido91, we use Utils.getUGI only when we do an explicit login; else fallback on originalUgi (the current user). This is because after a login, the instance of ugi changes from underneath us in UserGroupInformation class. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184844613 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- Sure, I will change it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184841250 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- It is called twice right now; but as a util method which can be used in other place, let us not intentionally introduce known inefficiencies. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184833705 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- This will only be called twice in the initialization stage, so there should not be large overhead. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184833443 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- I don't think there's any particular reason, we just copy what HS2 did before. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184833381 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -18,14 +18,11 @@ package org.apache.hive.service.auth; import java.io.IOException; +import java.lang.reflect.Field; +import java.lang.reflect.Method; import java.net.InetSocketAddress; import java.net.UnknownHostException; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.HashMap; -import java.util.List; -import java.util.Locale; -import java.util.Map; +import java.util.*; --- End diff -- This is automatically done by my intellij idea, will revert back. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184784052 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- You dont want to reflect on class to get to field/method each time method is invoked. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184774268 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -18,14 +18,11 @@ package org.apache.hive.service.auth; import java.io.IOException; +import java.lang.reflect.Field; +import java.lang.reflect.Method; import java.net.InetSocketAddress; import java.net.UnknownHostException; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.HashMap; -import java.util.List; -import java.util.Locale; -import java.util.Map; +import java.util.*; --- End diff -- nit: I have always seen that we are trying to avoid to have imports of `.*` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184774034 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: HiveServer2, sqlContext: SQLC if (UserGroupInformation.isSecurityEnabled) { try { -HiveAuthFactory.loginFromKeytab(hiveConf) -sparkServiceUGI = Utils.getUGI() +val principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL) +val keyTabFile = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB) +if (principal.isEmpty || keyTabFile.isEmpty) { + throw new IOException( +"HiveServer2 Kerberos principal or keytab is not correctly configured") +} + +val originalUgi = UserGroupInformation.getCurrentUser --- End diff -- just curious, why here we are using `UserGroupInformation.getCurrentUser` and a few lines after `Utils.getUGI()`? I see that they basically do the same, but I am interested if there are reasons for doing this. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184672313 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- What is the purpose of moving both field and method out of this method? I'm not sure is there any difference. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184625530 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); --- End diff -- `Object.equals` for keytab check ? I dont think we invoke this when `keytab` is null currently, but would be good to future proof this. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184626752 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); +} + } catch (Throwable t) { +return null; + } +} catch (Throwable t) { + return null; +} --- End diff -- Let us catch specific exceptions for method/field invocation; not throwable. The code I had initially written was poor in this regard btw, and was catching throwable - was bad form. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184625003 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -107,9 +109,16 @@ public HiveAuthFactory(HiveConf conf) throws TTransportException { authTypeStr = AuthTypes.NONE.getAuthName(); } if (authTypeStr.equalsIgnoreCase(AuthTypes.KERBEROS.getAuthName())) { -saslServer = ShimLoader.getHadoopThriftAuthBridge() - .createServer(conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB), - conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)); +String principal = conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL); +String keytab = conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB); +if (needUgiLogin(UserGroupInformation.getCurrentUser(), + SecurityUtil.getServerPrincipal(principal, "0.0.0.0"), keytab)) { --- End diff -- This will always return `false` : though the additional check will not hurt. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184626560 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); +field.setAccessible(true); +return (String) field.get(null); + } +} catch (NoSuchFieldException e) { + try { +synchronized (clz) { + // In Hadoop 3 we don't have "keytabFile" field, instead we should use private method + // getKeytab(). + Method method = clz.getDeclaredMethod("getKeytab"); + method.setAccessible(true); + return (String) method.invoke(UserGroupInformation.getCurrentUser()); --- End diff -- I was not aware of this change, thanks for fixing this ! Let us move both field and method out of getKeytabFromUgi. ``` synchronized(UserGroupInformation.class) { if (null != keyTabField) { return keyTabField.get(null); } else { return (String) method.invoke(UserGroupInformation.getCurrentUser()) } } ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/21178#discussion_r184625977 --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java --- @@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, String proxyUser, String i } } + public static boolean needUgiLogin(UserGroupInformation ugi, String principal, String keytab) { +return null == ugi || !ugi.hasKerberosCredentials() || !ugi.getUserName().equals(principal) || + !keytab.equals(getKeytabFromUgi()); + } + + private static String getKeytabFromUgi() { +Class clz = UserGroupInformation.class; +try { + synchronized (clz) { +Field field = clz.getDeclaredField("keytabFile"); --- End diff -- The field lookup + setAccessible can be pulled out of the method and into a static variable. The field.get() itself should be within the sync block. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/21178 [SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeytab in STS ## What changes were proposed in this pull request? Spark ThriftServer will call UGI.loginUserFromKeytab twice in initialization. This is unnecessary and will cause various potential problems, like Hadoop IPC failure after 7 days, or RM failover issue and so on. So here we need to remove all the unnecessary login logics and make sure UGI in the context never be created again. Note this is actually a HS2 issue, If later on we upgrade supported Hive version, the issue may already be fixed in Hive side. ## How was this patch tested? Local verification in secure cluster. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerryshao/apache-spark SPARK-24110 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21178.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21178 commit fd48235e5a835f4a701c4ce303fbef1d74b2f095 Author: jerryshao Date: 2018-04-27T07:21:52Z Avoid UGI.loginUserFromKeytab in STS --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org