unsubscribe

2023-01-19 Thread 김병찬
Subject: unsubscribe


[Spark Standalone Mode] How to read from kerberised HDFS in spark standalone mode

2023-01-19 Thread Bansal, Jaimita
Hi Spark Team,

We are facing an issue when trying to read from HDFS via Spark running in a
standalone cluster. The issue comes from the executor node not being able to
authenticate: it falls back to auth:SIMPLE even though we have set up
authentication as Kerberos. Could you please help in resolving this?

Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:778) ~[hadoop-common-3.1.1.7.1.7.1000-141.jar:na]


18:57:44.726 [main] DEBUG o.a.spark.deploy.SparkHadoopUtil - creating UGI for user: 
18:57:45.045 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login
18:57:45.046 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login commit
18:57:45.047 [main] DEBUG o.a.h.security.UserGroupInformation - using kerberos user: @GS.COM
18:57:45.047 [main] DEBUG o.a.h.security.UserGroupInformation - Using user: "@GS.COM" with name @GS.COM
18:57:45.047 [main] DEBUG o.a.h.security.UserGroupInformation - User entry: " @GS.COM"
18:57:45.047 [main] DEBUG o.a.h.security.UserGroupInformation - UGI loginUser:@GS.COM (auth:KERBEROS)
18:57:45.056 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as: (auth:SIMPLE) from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
18:57:45.078 [TGT Renewer for @GS.COM] DEBUG o.a.h.security.UserGroupInformation - Current time is 1674068265078
18:57:45.079 [TGT Renewer for @GS.COM] DEBUG o.a.h.security.UserGroupInformation - Next refresh is 1674136785000
18:57:45.092 [main] INFO  org.apache.spark.SecurityManager - Changing view acls to: root,
18:57:45.092 [main] INFO  org.apache.spark.SecurityManager - Changing modify acls to: root,
18:57:45.093 [main] INFO  org.apache.spark.SecurityManager - Changing view acls groups to:
18:57:45.093 [main] INFO  org.apache.spark.SecurityManager - Changing modify acls groups to:
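[Editor's note, a hedged sketch rather than an official fix: standalone mode does not distribute Kerberos delegation tokens to executors the way YARN does, so each worker host generally needs its own valid credentials (a keytab, or a kinit'd ticket cache) and HADOOP_CONF_DIR pointing at the Kerberized Hadoop config. Assuming that is in place, the keytab/principal pair can be passed through the session config. The master URL, keytab path, principal, and HDFS path below are placeholders.]

```python
from pyspark.sql import SparkSession

# Placeholder keytab path, principal, and hosts -- substitute your own.
spark = (
    SparkSession.builder
    .master("spark://master-host:7077")
    # Log in from a keytab so the driver holds a Kerberos TGT.
    .config("spark.kerberos.keytab", "/etc/security/keytabs/spark.keytab")
    .config("spark.kerberos.principal", "spark/master-host@GS.COM")
    .getOrCreate()
)

# If the UGI login succeeded, reads against Kerberized HDFS should
# authenticate as KERBEROS rather than falling back to SIMPLE.
df = spark.read.text("hdfs://namenode:8020/some/path")
```

This only covers the driver side; if the executors still log auth:SIMPLE, it is worth verifying the workers themselves can obtain tickets for the same realm.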

Thanks,
Jaimita

Vice President, Data Lake Engineering
Goldman Sachs






How to check the liveness of a SparkSession

2023-01-19 Thread Yeachan Park
Hi all,

We have a long-running PySpark session in client mode that occasionally dies.

We'd like to check whether the session is still alive. One solution we came
up with was checking whether the UI is still up, but we were wondering if
there's an easier way than that.

Maybe something like spark.getActiveSession() might do the same; I noticed
that it throws a connection-refused error if the current Spark session dies.

Are there any official/suggested ways to check this? I couldn't find much
in the docs/previous mailing lists.
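[Editor's note: there does not appear to be a dedicated liveness API, so one pragmatic, unofficial probe (a sketch, not from the Spark docs) is to attempt a trivial job and treat any failure as the session being dead:]

```python
def spark_is_alive(spark):
    """Best-effort liveness probe for a SparkSession.

    Runs a trivial job; a stopped session or a dead driver raises,
    which we treat as "not alive". This is a heuristic, not an
    official Spark API.
    """
    try:
        return spark.sparkContext.parallelize([1, 2]).count() == 2
    except Exception:
        return False
```

Called periodically, this catches both "UI up but context stopped" and a fully dead driver, at the cost of one tiny job per check.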

Kind regards,
Yeachan