Hi Jie,

Thanks for the investigation. I think we can implement pluggable DT
providers first and add renewal capabilities incrementally. I'm also
curious where Spark runs its HadoopDelegationTokenManager when renewal is
enabled. Since the HadoopDelegationTokenManager seems to need access to a
keytab to create new tokens, does that mean it can only run on the client
side?
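
For context, a minimal Java sketch of what keytab-based login for token
renewal looks like with the Hadoop APIs (not Spark's or Flink's actual
code; the principal and keytab path are placeholders):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KeytabRenewalSketch {
        public static void main(String[] args) throws Exception {
            // Renewal needs the keytab file, so whichever process runs
            // this must be able to read it; that is why it may be limited
            // to the client side unless the keytab is shipped elsewhere.
            UserGroupInformation ugi =
                UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                    "flink/host@EXAMPLE.COM", "/path/to/flink.keytab");
            ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
                // obtain fresh delegation tokens here (HDFS, Hive, ...)
                return null;
            });
        }
    }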

On Mon, Jan 25, 2021 at 10:32 AM 王 杰 <jackwan...@outlook.com> wrote:

> Hi Till,
>
> Sorry for the late response; I just did some investigation into Spark.
> Spark adopts an SPI approach to obtain delegation tokens for different
> components. It has a HadoopDelegationTokenManager.scala<
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala>
> that manages all Hadoop delegation tokens, including obtaining and
> renewing them.
>
> When the HadoopDelegationTokenManager initializes, it uses a
> ServiceLoader to load the HadoopDelegationTokenProviders from the
> different connectors. For Hive, the provider implements the
> HadoopDelegationTokenProvider interface<
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala
> >.
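>
> For illustration, a ServiceLoader-based lookup in Java looks roughly
> like this (a sketch, assuming a provider interface shaped like Spark's
> HadoopDelegationTokenProvider is on the classpath):
>
>     import java.util.ServiceLoader;
>
>     // Each connector registers its provider implementation in a
>     // META-INF/services file; ServiceLoader discovers them at runtime.
>     ServiceLoader<HadoopDelegationTokenProvider> loader =
>         ServiceLoader.load(HadoopDelegationTokenProvider.class);
>     for (HadoopDelegationTokenProvider provider : loader) {
>         // ask each provider to obtain its delegation tokens
>     }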
>
> Thanks,
> Jie
>
>
> On 2021/01/13 08:51:29, Till Rohrmann <trohrm...@apache.org<mailto:
> trohrm...@apache.org>> wrote:
> > Hi Jie Wang,
> >
> > thanks for starting this discussion. To me the SPI approach sounds
> > better because it is not as brittle as using reflection. Concerning the
> > configuration, we could think about introducing some Hive-specific
> > configuration options which allow us to specify these paths. How are
> > other projects that integrate with Hive solving this problem?
> >
> > Cheers,
> > Till
> >
> > On Tue, Jan 12, 2021 at 4:13 PM 王 杰 <jackwan...@outlook.com<mailto:
> jackwan...@outlook.com>> wrote:
> >
> > > Hi everyone,
> > >
> > > Currently, the Hive delegation token is not obtained when Flink
> > > submits the application in YARN mode using kinit. The ticket is
> > > https://issues.apache.org/jira/browse/FLINK-20714. I'd like to start a
> > > discussion about how to support this feature.
> > >
> > > Maybe we have two options:
> > > 1. Use reflection to construct a Hive client and obtain the token,
> > > the same way as the org.apache.flink.yarn.Utils.obtainTokenForHBase
> > > implementation.
> > > 2. Introduce a pluggable delegation token provider via SPI. The
> > > provider could be placed in connector code, so no reflection is
> > > needed and the mechanism is more extensible (see the sketch below).
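> > >
> > > As a sketch of what option 2 could look like (these names are
> > > hypothetical, not an existing Flink API):
> > >
> > >     // Hypothetical SPI a connector could implement and register via
> > >     // a META-INF/services entry, so the core needs no reflection.
> > >     public interface DelegationTokenProvider {
> > >         String serviceName();
> > >         void obtainDelegationTokens(
> > >                 org.apache.hadoop.conf.Configuration hadoopConf,
> > >                 org.apache.hadoop.security.Credentials credentials)
> > >                 throws Exception;
> > >     }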
> > >
> > >
> > >
> > > Both options have to handle how to specify the HiveConf to use. In
> > > the Hive connector, users can specify both hiveConfDir and
> > > hadoopConfDir when creating a HiveCatalog, and that hadoopConfDir may
> > > not be the same as the Hadoop configuration in HadoopModule.
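> > >
> > > For example, building the HiveConf from the user-specified directory
> > > might look like this (a sketch; the path is a placeholder):
> > >
> > >     import org.apache.hadoop.fs.Path;
> > >     import org.apache.hadoop.hive.conf.HiveConf;
> > >
> > >     // Load hive-site.xml from the hiveConfDir passed to HiveCatalog,
> > >     // rather than whatever Hadoop configuration HadoopModule loaded.
> > >     HiveConf hiveConf = new HiveConf();
> > >     hiveConf.addResource(
> > >         new Path("/path/to/hiveConfDir", "hive-site.xml"));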
> > >
> > > Looking forward to your suggestions.
> > >
> > > --
> > > Best regards!
> > > Jie Wang
> > >
> > >
> >
>


-- 
Best regards!
Rui Li
