Hi Jay,

So yes, most of YarnAppLauncher still relies on Kerberos login. We are using
something similar to OTHER_NAMENODES; it just happens to be the case that
Azkaban is taking over the heavy-lifting part.


Regards,
Lei
________________________________
From: Jay Sen <[email protected]>
Sent: Friday, September 11, 2020 12:23 AM
To: [email protected] <[email protected]>
Subject: Re: Yarn token mgmt

Hi Lei, yes, that was helpful.

The more I looked into it, the more I see that every current method in
YarnAppLauncher is limited to the local Hadoop cluster with Kerberos only (it
uses getDelegationToken for the local HDFS).
Moreover, it does not honor multiple tokens like KMS-dt or possibly others
(even though I saw other token-fetching functions, they are all for the local
cluster, not for the remote one).

The right way to do this is probably to register the remote namenodes
dynamically at runtime from the CopySource, as a long-term solution.
For now, I was able to solve it via a bunch of code changes with functionality
similar to OTHER_NAMENODES.
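For anyone following the thread, the OTHER_NAMENODES-style approach typically
boils down to a job property listing the additional clusters to fetch HDFS
delegation tokens for. A minimal sketch follows; the key name matches the
constant in Gobblin's TokenUtils, and the namenode URIs are hypothetical
placeholders:

```properties
# Comma-separated list of extra namenodes to fetch HDFS delegation tokens for
# (hostnames below are placeholders, not real clusters)
other.namenodes=hdfs://remote-nn-1.example.com:8020,hdfs://remote-nn-2.example.com:8020
```

With a Kerberos login already in place, the token-fetching code can then
iterate over these URIs, call FileSystem.addDelegationTokens for each remote
filesystem, and add the resulting Credentials to the current UGI so that later
FileSystem calls against those clusters authenticate with the tokens.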

Thanks for the pointers.
-Jay

On Tue, Sep 8, 2020 at 8:59 PM Lei Sun <[email protected]> wrote:

> Hi Jay,
>
>
> Our workflow scheduler (Azkaban) does that for Gobblin: it fetches the
> remote tokens at the beginning of the job and adds them to the UGI.
>
> Hope it helps.
>
>
> Lei
>
>
> ________________________________
> From: Jay Sen <[email protected]>
> Sent: Tuesday, September 8, 2020 6:46 PM
> To: [email protected] <[email protected]>
> Subject: Yarn token mgmt
>
> Hi Gobblin Dev team,
>
> I see the configs and functionality around creating and renewing the token
> off of a provided keytab file, but didn't find any functionality that
> creates tokens for a remote system.
>
> So the question is: if we run Gobblin for a Hadoop-to-Hadoop job (source =
> CopySource), how does it manage creating and renewing tokens for the remote
> Hadoop cluster?
>
> Thanks
> jay
>
