Re: Communicating between yarn and tasks after delegation token renewal

2022-10-08 Thread Vinod Kumar Vavilapalli
There’s no way to do that.

Once YARN launches containers, it doesn’t communicate with them again. 
The tasks / containers can obviously always reach out to YARN services. 
But even that doesn’t help in this case, because YARN never exposes through 
its APIs what it is doing with the tokens or when it renews them.

What are you trying to do? What new information are you trying to share 
with the tasks? What framework is this? A custom YARN app, or MapReduce / Tez / 
Spark / Flink, etc.?

Thanks
+Vinod

> On Oct 7, 2022, at 10:40 PM, Julien Phalip wrote:
> 
> Hi,
> 
> IIUC, when a distributed job is started, YARN first obtains a delegation 
> token from the target resource, then securely pushes the delegation token to 
> the individual tasks. If the job lasts longer than a given period of time, 
> then YARN renews the delegation token (or more precisely, extends its 
> lifetime), thereby allowing the tasks to continue using the delegation 
> token. This is based on the assumption that the delegation token itself is 
> static and doesn't change (only its lifetime can be extended on the target 
> resource's server).
> 
> I'm building a custom service where I'd like to share new information with 
> the tasks once the delegation token has been renewed. Is there a way to let 
> YARN push new information to the running tasks right after renewing the token?
> 
> Thanks,
> 
> Julien



Re: Communicating between yarn and tasks after delegation token renewal

2022-10-07 Thread Julien Phalip
Hi Vinod,

Thank you for the quick response.

To give you more context, I would like to reduce the load on the remote
target service. My idea was that, when YARN renews the token, the remote
service would also return an updated piece of data to YARN. Then YARN would
somehow share that piece of data securely with the containers.

This would avoid each container having to directly connect to the target
service, as they could just use the updated piece of data shared by YARN.
This way, the load on the target service would be drastically reduced as
only YARN would interact with it directly from time to time.

If there's no way in Hadoop proper to implement this behavior, perhaps I
could use a separate caching service that is local to the YARN cluster?
YARN would periodically update the local cache, and each container would
get the data from the local cache instead of the remote service. Obviously,
the local cache would need to authenticate the containers, probably using
the delegation token.
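
To make the caching idea concrete, here is a minimal sketch of what I have
in mind. Everything in it is hypothetical: the LocalCache class, the
fetch_from_remote callable, and the use of raw token bytes as the credential
are my own placeholders, not existing Hadoop or YARN APIs. Which component
would call refresh() (and when) is left open.

```python
import hmac
import threading


class LocalCache:
    """Hypothetical cluster-local cache sitting between containers and the remote service."""

    def __init__(self, fetch_from_remote, valid_tokens):
        # fetch_from_remote: callable that contacts the remote target service (assumed).
        # valid_tokens: iterable of token bytes the cache will accept (assumed).
        self._fetch = fetch_from_remote
        self._valid_tokens = list(valid_tokens)
        self._lock = threading.Lock()
        self._value = None

    def refresh(self):
        """Pull the latest data from the remote service; meant to be called periodically."""
        value = self._fetch()
        with self._lock:
            self._value = value

    def get(self, token):
        """Return the cached data to a container that presents a known token."""
        # compare_digest avoids leaking token contents through timing differences.
        if not any(hmac.compare_digest(token, t) for t in self._valid_tokens):
            raise PermissionError("unknown token")
        with self._lock:
            return self._value


# Usage: one remote call serves many container reads.
remote_calls = []

def fake_remote():
    remote_calls.append(1)
    return {"version": len(remote_calls)}

cache = LocalCache(fake_remote, [b"example-token"])
cache.refresh()
values = [cache.get(b"example-token") for _ in range(100)]
```

In this sketch the remote service is contacted once per refresh, no matter
how many containers read from the cache in between.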

Is there perhaps a local caching service already available in the Hadoop
ecosystem that I could use for this? Or could I build my own somewhat
easily using some existing Hadoop features?

Thank you,

Julien



Communicating between yarn and tasks after delegation token renewal

2022-10-07 Thread Julien Phalip
Hi,

IIUC, when a distributed job is started, YARN first obtains a delegation
token from the target resource, then securely pushes the delegation token
to the individual tasks. If the job lasts longer than a given period of
time, then YARN renews the delegation token (or more precisely, extends its
lifetime), thereby allowing the tasks to continue using the delegation
token. This is based on the assumption that the delegation token itself is
static and doesn't change (only its lifetime can be extended on the target
resource's server).
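
To illustrate the model I'm assuming, here is a toy sketch in which renewal
only moves the expiry time and never touches the token's identifier. The
ToyDelegationToken class and its fields are hypothetical names, not Hadoop's
actual token classes, and the hard cap on total lifetime is an extra
assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDelegationToken:
    """Toy model only: renewal extends the lifetime, the identifier stays fixed."""

    identifier: str        # static for the life of the token
    issue_time: float      # seconds, arbitrary clock
    renew_interval: float  # how far each renewal pushes the expiry (assumed)
    max_lifetime: float    # hard cap measured from issue_time (assumed)
    expiry: float = field(init=False, default=0.0)

    def __post_init__(self):
        # The first expiry is one renew interval after issuance, capped by max_lifetime.
        self.expiry = self.issue_time + min(self.renew_interval, self.max_lifetime)

    def renew(self, now):
        """Extend the expiry; refuse if the token has already expired."""
        if now > self.expiry:
            raise ValueError("token already expired")
        cap = self.issue_time + self.max_lifetime
        self.expiry = min(now + self.renew_interval, cap)
        return self.expiry


# Usage: the expiry moves forward, the identifier does not.
token = ToyDelegationToken("tok-1", issue_time=0.0, renew_interval=100.0, max_lifetime=250.0)
first_expiry = token.expiry
second_expiry = token.renew(now=90.0)
```

The tasks keep holding the same token throughout; only the server-side
expiry changes.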

I'm building a custom service where I'd like to share new information with
the tasks once the delegation token has been renewed. Is there a way to let
YARN push new information to the running tasks right after renewing the
token?

Thanks,

Julien