We have a case where we interact with a Kerberized service and found a simple 
workaround: distribute the driver's Kerberos credential cache file and make 
use of it in the executors. Maybe some of the ideas there can help in this 
case too? Our case is on Linux, though. Details: 
https://github.com/LucaCanali/Miscellaneous/blob/master/Spark_Notes/Spark_Executors_Kerberos_HowTo.md
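The gist of the workaround can be sketched roughly like this (a hedged sketch, not the note's exact code; the cache path /tmp/krb5cc_1000 and app name are placeholder assumptions - the driver's ticket cache is shipped to each executor's working directory via spark.files, and KRB5CCNAME points the Kerberos libraries at the shipped copy):

```scala
import org.apache.spark.sql.SparkSession

// Assumption: /tmp/krb5cc_1000 is the driver's ticket cache, created by kinit.
// spark.files ships it to every executor's working directory; the executor-side
// KRB5CCNAME then resolves to that local copy by its bare file name.
val spark = SparkSession.builder()
  .appName("kerberized-jdbc")                          // placeholder name
  .config("spark.files", "/tmp/krb5cc_1000")           // ship the driver's cache
  .config("spark.executorEnv.KRB5CCNAME", "krb5cc_1000") // point executors at it
  .getOrCreate()
```

Note the cache expires with the ticket lifetime, so long-running jobs need the cache refreshed and re-shipped.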

Regards,
Luca

From: Marcelo Vanzin <van...@cloudera.com.INVALID>
Sent: Monday, October 15, 2018 18:32
To: foster.langb...@riskfrontiers.com
Cc: user <user@spark.apache.org>
Subject: Re: kerberos auth for MS SQL server jdbc driver

Spark only does Kerberos authentication on the driver. For executors it 
currently only supports Hadoop's delegation tokens for Kerberos.

To use something that does not support delegation tokens you have to manually 
manage the Kerberos login in your code that runs in executors, which might be 
tricky. It means distributing the keytab yourself (not with Spark's --keytab 
argument) and calling into the UserGroupInformation API directly.
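A minimal sketch of that manual login, assuming the keytab was already shipped to the executors (e.g. via --files) and using a hypothetical principal, keytab name, and JDBC URL:

```scala
import java.security.PrivilegedExceptionAction
import java.sql.DriverManager
import org.apache.hadoop.security.UserGroupInformation

// Runs on the executors: log in from the shipped keytab, then do the JDBC
// work inside doAs so it carries the Kerberos identity. All names below
// (principal, keytab file, host, database) are placeholders.
df.foreachPartition { rows =>
  val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
    "svc_spark@EXAMPLE.COM", "svc_spark.keytab")
  ugi.doAs(new PrivilegedExceptionAction[Unit] {
    override def run(): Unit = {
      val conn = DriverManager.getConnection(
        "jdbc:sqlserver://dbhost:1433;databaseName=mydb;" +
        "integratedSecurity=true;authenticationScheme=JavaKerberos")
      try {
        // ... write the partition's rows through conn ...
      } finally conn.close()
    }
  })
}
```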

I don't have any examples of that, though, maybe someone does. (We have a 
similar example for Kafka on our blog somewhere, but not sure how far that will 
get you with MS SQL.)


On Mon, Oct 15, 2018 at 12:04 AM Foster Langbein 
<foster.langb...@riskfrontiers.com> wrote:
Has anyone gotten Spark to write to SQL Server using Kerberos authentication 
with Microsoft's JDBC driver? I'm having limited success, though in theory it 
should work.

I'm using a YARN-mode 4-node Spark 2.3.0 cluster and trying to write a simple 
table to SQL Server 2016. I can get it to work if I use SQL Server credentials; 
however, this is not an option in my application. I need to use Windows 
authentication - so-called integratedSecurity - and in particular I want to use 
a keytab file.
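For reference, the kind of write being attempted looks roughly like this (host, port, database, and table names are placeholders; with Microsoft's JDBC driver, pure-Java Kerberos is selected by integratedSecurity=true together with authenticationScheme=JavaKerberos):

```scala
// Placeholder connection settings - dbhost, mydb, and dbo.my_table are
// assumptions, not values from the actual cluster.
val url = "jdbc:sqlserver://dbhost:1433;databaseName=mydb;" +
  "integratedSecurity=true;authenticationScheme=JavaKerberos"

df.write
  .format("jdbc")
  .option("url", url)
  .option("dbtable", "dbo.my_table")
  .mode("append")
  .save()
```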

The solution half works: the Spark driver creates a table on SQL Server, so 
I'm pretty confident the Kerberos implementation/credentials etc. are set up 
correctly and valid. However, the executors then fail to write any data to the 
table, with the exception: "java.security.PrivilegedActionException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)"

After much tracing/debugging, it seems the executors behave differently from 
the Spark driver: they ignore the instruction to use the credentials supplied 
in the keytab and instead try to use the default Spark cluster user. I simply 
haven't been able to force them to use what's in the keytab, after trying 
many, many variations.

Very grateful if anyone has any help/suggestions/ideas on how to get this to 
work.


--



Dr Foster Langbein | Chief Technology Officer | Risk Frontiers

Level 2, 100 Christie St, St Leonards, NSW, 2065



Telephone: +61 2 8459 9777

Email: foster.langb...@riskfrontiers.com | Website: www.riskfrontiers.com



Risk Modelling | Risk Management | Resilience | Disaster Management | Social 
Research
Australia | New Zealand | Asia Pacific





--
Marcelo
