Hi Vinod,
Thank you for the quick response.
To give you more context, I would like to reduce the load on the remote
target service. My idea was that, when YARN renews the token, the remote
service would also return an updated piece of data to YARN. Then YARN would
somehow share that piece of data securely with the containers.
This would avoid each container having to directly connect to the target
service, as they could just use the updated piece of data shared by YARN.
This way, the load on the target service would be drastically reduced as
only YARN would interact with it directly from time to time.
If there's no way in Hadoop proper to implement this behavior, perhaps I
could use a separate caching service that is local to the YARN cluster?
YARN would periodically update the local cache, and each container would
get the data from the local cache instead of the remote service. Obviously,
the local cache would need to authenticate the containers, probably using
the delegation token.
Is there already a local caching service in the Hadoop ecosystem that
could do this? Or could I build my own fairly easily on top of existing
Hadoop features?
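
For concreteness, the local-cache idea above could be sketched roughly as
follows in Python. This is only an illustration under my own assumptions, not
an existing Hadoop component: `LocalCache`, `fetch_remote`, and the shared
`secret_key` are hypothetical names, and the HMAC check merely stands in for
real delegation-token verification.

```python
import hashlib
import hmac
import threading
import time


class LocalCache:
    """Hypothetical cluster-local cache of data fetched from the remote service."""

    def __init__(self, secret_key, fetch_remote):
        self._secret_key = secret_key      # assumed shared by token issuer and cache only
        self._fetch_remote = fetch_remote  # hypothetical callable that hits the remote service
        self._lock = threading.Lock()
        self._data = None
        self._updated_at = None

    def sign(self, token_id):
        """Derive a token password from its identifier (HMAC over the id)."""
        return hmac.new(self._secret_key, token_id, hashlib.sha256).digest()

    def refresh(self):
        """Called periodically by a single refresher, e.g. around token renewal."""
        data = self._fetch_remote()
        with self._lock:
            self._data = data
            self._updated_at = time.time()

    def get(self, token_id, token_password):
        """Called by containers; rejects callers whose token does not verify."""
        expected = self.sign(token_id)
        if not hmac.compare_digest(expected, token_password):
            raise PermissionError("invalid delegation token")
        with self._lock:
            return self._data


if __name__ == "__main__":
    remote_calls = []

    def fetch_remote():
        # Stand-in for the remote target service; counts how often it is hit.
        remote_calls.append(1)
        return {"version": len(remote_calls)}

    cache = LocalCache(b"cluster-secret", fetch_remote)
    token_id = b"job_0001"
    token_password = cache.sign(token_id)

    cache.refresh()                      # one remote call
    for _ in range(100):                 # 100 container reads, all served locally
        cache.get(token_id, token_password)
    print(len(remote_calls))             # prints 1
```

The point of the sketch is that the remote service is contacted only by
`refresh()`, while any number of container reads are served locally after the
token check.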
Thank you,
Julien
On Fri, Oct 7, 2022 at 10:18 AM Vinod Kumar Vavilapalli wrote:
> There’s no way to do that.
>
> Once YARN launches containers, it doesn’t communicate with them for
> anything after that. The tasks / containers can obviously always reach out
> to YARN services. But even that in this case is not helpful because YARN
> never exposes through APIs what it is doing with the tokens or when it is
> renewing them.
>
> What is it that you are doing? What new information are you trying to
> share with the tasks? What framework is this? A custom YARN app or
> MapReduce / Tez / Spark / Flink etc.?
>
> Thanks
> +Vinod
>
> On Oct 7, 2022, at 10:40 PM, Julien Phalip wrote:
>
> Hi,
>
> IIUC, when a distributed job is started, YARN first obtains a delegation
> token from the target resource, then securely pushes the delegation token
> to the individual tasks. If the job lasts longer than a given period of
> time, then YARN renews the delegation token (or more precisely, extends its
> lifetime), thereby allowing the tasks to continue using the delegation
> token. This is based on the assumption that the delegation token itself is
> static and doesn't change (only its lifetime can be extended on the target
> resource's server).
>
> I'm building a custom service where I'd like to share new information with
> the tasks once the delegation token has been renewed. Is there a way to let
> YARN push new information to the running tasks right after renewing the
> token?
>
> Thanks,
>
> Julien