My understanding is that token generation is handled by Spark itself, as
long as you were authenticated with Kerberos when submitting the job and
spark.authenticate is set to true.
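
For example, a minimal sketch (the principal, class and jar names below are
just placeholders, not anything from your setup):

    # authenticate against the KDC first
    kinit user@EXAMPLE.COM
    klist    # confirm a valid TGT is present

    # then submit normally; Spark obtains the HDFS delegation tokens itself
    spark-submit \
      --master yarn \
      --conf spark.authenticate=true \
      --class com.example.MyApp \
      myapp.jar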

The --keytab and --principal options should be used for long-running jobs,
where you may need ticket renewal; Spark will then handle it for you. I may
be wrong though.
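
Something like this, if I remember the options correctly (the principal and
keytab path are again placeholders):

    # long-running job: Spark can re-login from the keytab and renew tokens
    spark-submit \
      --master yarn \
      --principal user@EXAMPLE.COM \
      --keytab /path/to/user.keytab \
      --class com.example.MyStreamingApp \
      myapp.jar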

I guess it gets even more complicated if you need to access other secured
services from Spark, like HBase or Phoenix, but I guess that is for another
discussion.

Regards,
Marcin

On Thu, Dec 8, 2016, 08:40 Gerard Casey <gerardhughca...@gmail.com> wrote:

> I just read an interesting comment on cloudera:
>
> What does it mean by “when the job is submitted, and you have a kinit, you
> will have TOKEN to access HDFS, you would need to pass that on, or the
> KERBEROS ticket”?
>
> Reference
> <https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/org-apache-hadoop-security-AccessControlException-SIMPLE/td-p/28082>
>  and
> full quote:
>
> In a cluster which is kerberised there is no SIMPLE authentication. Make
> sure that you have run kinit before you run the application.
> Second thing to check: In your application you need to do the right thing
> and either pass on the TOKEN or a KERBEROS ticket.
> When the job is submitted, and you have done a kinit, you will have TOKEN
> to access HDFS you would need to pass that on, or the KERBEROS ticket.
> You will need to handle this in your code. I can not see exactly what you
> are doing at that point in the startup of your code but any HDFS access
> will require a TOKEN or KERBEROS ticket.
>
>
> Cheers,
> Wilfred
>
> On 8 Dec 2016, at 08:35, Gerard Casey <gerardhughca...@gmail.com> wrote:
>
> Thanks Marcelo.
>
> I’ve completely removed it. Ok - even if I read/write from HDFS?
>
> Trying to run the SparkPi example now
>
> G
>
> On 7 Dec 2016, at 22:10, Marcelo Vanzin <van...@cloudera.com> wrote:
>
> Have you removed all the code dealing with Kerberos that you posted?
> You should not be setting those principal / keytab configs.
>
> Literally all you have to do is login with kinit then run spark-submit.
>
> Try with the SparkPi example for instance, instead of your own code.
> If that doesn't work, you have a configuration issue somewhere.
>
> On Wed, Dec 7, 2016 at 1:09 PM, Gerard Casey <gerardhughca...@gmail.com>
> wrote:
>
> Thanks.
>
> I’ve checked the TGT, principal and keytab. Where to next?!
>
> On 7 Dec 2016, at 22:03, Marcelo Vanzin <van...@cloudera.com> wrote:
>
> On Wed, Dec 7, 2016 at 12:15 PM, Gerard Casey <gerardhughca...@gmail.com>
> wrote:
>
> Can anyone point me to a tutorial or a run-through of how to use Spark with
> Kerberos? This is proving to be quite confusing. Most search results on the
> topic point to what needs to be input at the point of `spark-submit` and not
> the changes needed in the actual src/main/.scala file
>
>
> You don't need to write any special code to run Spark with Kerberos.
> Just write your application normally, and make sure you're logged in
> to the KDC (i.e. "klist" shows a valid TGT) before running your app.
>
>
> --
> Marcelo
>
>
>
>
>
> --
> Marcelo
>
>
>
>
