Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

Gyula Fóra Thu, 03 Feb 2022 09:57:34 -0800

Hi Till!

The delegation token framework solves several production problems; KDC
scalability is just one of them, and probably not the most important.


As Gabor has explained, among other things it:
 - Solves token renewal for long-running jobs, which would
currently time out and die
 - Improves security by not exposing keytabs on each node
 - Reduces KDC load

I do not think we should reject the design just because one of the things
it solves is not primarily Flink's responsibility.
Even if that is the case, the other issues, such as security and general
token renewal, seem very important to me.

Cheers,
Gyula

On Thu, Feb 3, 2022 at 6:34 PM Till Rohrmann <[email protected]> wrote:

> I don't have a good alternative solution but it sounds to me a bit as if we
> are trying to solve Kerberos' scalability problems within Flink. And even
> if we do it like this, there is no guarantee that it works because there
> can be other applications bombing the KDC with requests. From a
> maintainability and separation of concerns perspective I'd rather have this
> as some kind of external tool/service that makes KDC scale better and that
> Flink processes can talk to to obtain the tokens.
>
> Cheers,
> Till
>
> On Thu, Feb 3, 2022 at 6:01 PM Gabor Somogyi <[email protected]>
> wrote:
>
> > Oh, and the most important reason, which I forgot:
> > without the feature in the FLIP, all secure workloads that use delegation
> > tokens will stop when the tokens reach their max lifetime 🙂
> > This is around 7 days with the default config...
> >
> > On Thu, Feb 3, 2022 at 5:30 PM Gabor Somogyi <[email protected]>
> > wrote:
> >
> > > That's not the sole purpose of the feature, but it has caused problems
> > > in some environments.
> > > The main intention is to avoid deploying the keytab to all nodes (which
> > > enlarges the attack surface) and to reduce the KDC load.
> > > I've already described the situation earlier in this thread, so I'm
> > > copying it here.
> > >
> > > --------COPY--------
> > > "KDC *may* collapse under some circumstances" is the proper wording.
> > >
> > > We have several customers who are executing workloads on Spark/Flink.
> > > Most of the time I'm dealing with their daily issues, which are heavily
> > > environment and use-case dependent. I've seen various cases:
> > > * where the mentioned ~1k nodes were working fine
> > > * where the KDC took the volume of requests for a DDoS attack and
> > > stopped authenticating
> > > * where the KDC was simply not responding because of the load
> > > * where the KDC had intermittent outages (this was the nastiest one)
> > >
> > > Since you're managing a relatively big cluster, you know that the KDC
> > > is used not only by Spark/Flink workloads; the whole company IT
> > > infrastructure is bombing it, so whether the KDC reaches its limit
> > > also depends on other factors. I'm not sure what kind of evidence you
> > > are looking for, but I'm not authorized to share any information about
> > > our clients' data.
> > >
> > > One thing is for sure: the more external system types a workload uses
> > > (e.g. HDFS, HBase, Hive, Kafka) that authenticate through the KDC, the
> > > more likely it is to reach this threshold once the cluster is big enough.
> > > --------COPY--------
> > >
> > > > The FLIP mentions scaling issues with 200 nodes; it's really
> surprising
> > > to me that such a small number of requests can already cause issues.
> > >
> > > One node/task doesn't mean 1 request. I've seen the following Kerberos
> > > auth types running at the same time:
> > > HDFS, HBase, Hive, Kafka, all DBs (Oracle, MariaDB, etc.). Additionally,
> > > one task does not necessarily open just 1 connection.
> > >
> > > All in all I don't have steps to reproduce but we've faced this
> > already...
> > >
> > > G
> > >
> > >
> > > On Thu, Feb 3, 2022 at 5:15 PM Chesnay Schepler <[email protected]>
> > > wrote:
> > >
> > >> What I don't understand is how this could overload the KDC. Aren't
> > >> tokens valid for a relatively long time period?
> > >>
> > >> For new deployments where many TMs are started at once I could imagine
> > >> it temporarily, but shouldn't the accesses to the KDC eventually
> > >> naturally spread out?
> > >>
> > >> The FLIP mentions scaling issues with 200 nodes; it's really
> surprising
> > >> to me that such a small number of requests can already cause issues.
> > >>
> > >> On 03/02/2022 16:14, Gabor Somogyi wrote:
> > >> >> I would prefer not choosing the first option
> > >> > Then only the second option remains.
> > >> >
> > >> >> I am not a Kerberos expert but is it really so that every
> application
> > >> that
> > >> > wants to use Kerberos needs to implement the token propagation
> itself?
> > >> This
> > >> > somehow feels as if there is something missing.
> > >> >
> > >> > OK, so first some kerberos + token intro.
> > >> >
> > >> > Some basics:
> > >> > * TGT can be created from keytab
> > >> > * TGT is needed to obtain TGS (called token)
> > >> > * Authentication only works with TGS -> every place that talks to an
> > >> > external system needs either a TGT or a TGS
> > >> >
> > >> > There are basically 2 ways to authenticate to a kerberos secured
> > >> external
> > >> > system:
> > >> > 1. One needs a kerberos TGT which MUST be propagated to all JVMs.
> Here
> > >> each
> > >> > and every JVM obtains a TGS by itself which bombs the KDC that may
> > >> collapse.
> > >> > 2. One needs a kerberos TGT which exists only on a single place (in
> > this
> > >> > case JM). JM gets a TGS which MUST be propagated to all TMs because
> > >> > otherwise authentication fails.
> > >> >
> > >> > Currently the whole system works in a way that the keytab file (we can
> > >> > think of it as a plaintext password) is reachable on all nodes.
> > >> > This is a relatively huge attack surface. The main intention is:
> > >> > * Instead of propagating the keytab file to all nodes, propagate a TGS,
> > >> > which has a limited lifetime (more secure)
> > >> > * Do the TGS generation in a single place so the KDC may not collapse;
> > >> > a keytab that lives on a single node can also be better protected
> > >> > As a final conclusion: any place that is expected to do Kerberos
> > >> > authentication MUST have either a TGT or a TGS.
> > >> > Currently this is done in a pretty insecure way. The questions are the
> > >> > following:
> > >> > * Do we want to leave this insecure keytab propagation as it is and
> > >> > keep bombing the KDC?
> > >> > * If not, how do we propagate the more secure token to the TMs?
> > >> >
> > >> > If the answer to the first question is yes, then the FLIP can be
> > >> > abandoned and isn't worth further effort.
> > >> > If the answer is no, then we can talk about the how part.
> > >> >
> > >> > G
> > >> >
> > >> >
> > >> > On Thu, Feb 3, 2022 at 3:42 PM Till Rohrmann <[email protected]>
> > >> wrote:
> > >> >
> > >> >> I would prefer not choosing the first option
> > >> >>
> > >> >>> Make the TM accept tasks only after registration(not sure if it's
> > >> >> possible or makes sense at all)
> > >> >>
> > >> >> because it effectively means that we change how Flink's component
> > >> lifecycle
> > >> >> works for distributing Kerberos tokens. It also effectively means
> > that
> > >> a TM
> > >> >> cannot make progress until connected to a RM.
> > >> >>
> > >> >> I am not a Kerberos expert but is it really so that every
> application
> > >> that
> > >> >> wants to use Kerberos needs to implement the token propagation
> > itself?
> > >> This
> > >> >> somehow feels as if there is something missing.
> > >> >>
> > >> >> Cheers,
> > >> >> Till
> > >> >>
> > >> >> On Thu, Feb 3, 2022 at 3:29 PM Gabor Somogyi <
> > >> [email protected]>
> > >> >> wrote:
> > >> >>
> > >> >>>>   Isn't this something the underlying resource management system
> > >> could
> > >> >> do
> > >> >>> or which every process could do on its own?
> > >> >>>
> > >> >>> I was looking for such a feature but did not find one.
> > >> >>> Maybe the propagation can be solved more easily, but then I'm waiting
> > >> >>> for a better suggestion.
> > >> >>> If anybody has a better/simpler idea, please point to a specific
> > >> >>> feature which works on all resource management systems.
> > >> >>>
> > >> >>>> Here's an example for the TM to run workloads without being
> > connected
> > >> >>> to the RM, without ever having a valid token
> > >> >>>
> > >> >>> All in all I see the main problem. I'm not sure why a TM accepts
> > >> >>> tasks w/o registration, but it's clearly not helping here.
> > >> >>> I basically see 2 possible solutions:
> > >> >>> * Make the TM accept tasks only after registration (not sure if it's
> > >> >>> possible or makes sense at all)
> > >> >>> * We send tokens right after container creation with
> > >> >>> "updateDelegationTokens"
> > >> >>> Not sure which one is more realistic to do since I'm not involved in
> > >> >>> the new feature.
> > >> >>> WDYT?
> > >> >>>
> > >> >>>
> > >> >>> On Thu, Feb 3, 2022 at 3:09 PM Till Rohrmann <
> [email protected]>
> > >> >> wrote:
> > >> >>>> Hi everyone,
> > >> >>>>
> > >> >>>> Sorry for joining this discussion late. I also did not read all
> > >> >> responses
> > >> >>>> in this thread so my question might already be answered: Why does
> > >> Flink
> > >> >>>> need to be involved in the propagation of the tokens? Why do we
> > need
> > >> >>>> explicit RPC calls in the Flink domain? Isn't this something the
> > >> >> underlying
> > >> >>>> resource management system could do or which every process could
> do
> > >> on
> > >> >> its
> > >> >>>> own? I am a bit worried that we are making Flink responsible for
> > >> >>>> something that it is not really designed to do.
> > >> >>>>
> > >> >>>> Cheers,
> > >> >>>> Till
> > >> >>>>
> > >> >>>> On Thu, Feb 3, 2022 at 2:54 PM Chesnay Schepler <
> > [email protected]>
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> Here's an example for the TM to run workloads without being
> > >> connected
> > >> >> to
> > >> >>>>> the RM, while potentially having a valid token:
> > >> >>>>>
> > >> >>>>>   1. TM registers at RM
> > >> >>>>>   2. JobMaster requests slot from RM -> TM gets notified
> > >> >>>>>   3. JM fails over
> > >> >>>>>   4. TM re-offers the slot to the failed over JobMaster
> > >> >>>>>   5. TM reconnects to RM at some point
> > >> >>>>>
> > >> >>>>> Here's an example for the TM to run workloads without being
> > >> connected
> > >> >> to
> > >> >>>>> the RM, without ever having a valid token:
> > >> >>>>>
> > >> >>>>>   1. TM1 has a valid token and is running some tasks.
> > >> >>>>>   2. TM1 crashes
> > >> >>>>>   3. TM2 is started to take over, and re-uses the working
> > directory
> > >> of
> > >> >>>>>      TM1 (new feature in 1.15!)
> > >> >>>>>   4. TM2 recovers the previous slot allocations
> > >> >>>>>   5. TM2 is informed about leading JM
> > >> >>>>>   6. TM2 starts registration with RM
> > >> >>>>>   7. TM2 offers slots to JobMaster
> > >> >>>>>   8. TM2 accepts task submission from JobMaster
> > >> >>>>>   9. ...some time later the registration completes...
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> On 03/02/2022 14:24, Gabor Somogyi wrote:
> > >> >>>>>>> but it can happen that the JobMaster+TM collaborate to run
> stuff
> > >> >>>>>> without the TM being registered at the RM
> > >> >>>>>>
> > >> >>>>>> Honestly I don't know Flink well enough to give an example of
> > >> >>>>>> such a scenario.
> > >> >>>>>> Until now I thought the JM defines the tasks to be done and the TM
> > >> >>>>>> just blindly connects to external systems and does the processing.
> > >> >>>>>> All in all, if external systems can be touched when the JM + TM
> > >> >>>>>> collaboration happens, then we need to consider that in the design.
> > >> >>>>>> Since I don't have an example scenario I don't know what exactly
> > >> >>>>>> needs to be solved.
> > >> >>>>>> I think we need an example case to decide whether we face a real
> > >> >>>>>> issue or the design has no such leak.
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>> On Thu, Feb 3, 2022 at 2:12 PM Chesnay Schepler <
> > >> [email protected]>
> > >> >>>>>> wrote:
> > >> >>>>>>
> > >> >>>>>>      > Just to learn something new: as I understand it, local
> > >> >>>>>>      recovery does not touch external systems like Kafka
> > >> >>>>>>      (correct me if I'm wrong). Is it possible that in such a
> > >> >>>>>>      case the user code just starts to run blindly w/o JM
> > >> >>>>>>      coordination and connects to external systems to do data
> > >> >>>>>>      processing?
> > >> >>>>>>
> > >> >>>>>>      Local recovery itself shouldn't touch external systems;
> the
> > TM
> > >> >>>>>>      cannot just run user-code without the JobMaster being
> > >> involved,
> > >> >>>>>>      but it can happen that the JobMaster+TM collaborate to run
> > >> stuff
> > >> >>>>>>      without the TM being registered at the RM.
> > >> >>>>>>
> > >> >>>>>>      On 03/02/2022 13:48, Gabor Somogyi wrote:
> > >> >>>>>>>      > Any error in loading the provider (be it by accident or
> > >> >>>>>>>      explicit checks) then is a setup error and we can fail
> the
> > >> >>>>> cluster.
> > >> >>>>>>>      Fail fast is a good direction in my view. In Spark I wanted
> > >> >>>>>>>      to go in this direction, but there were other opinions, so
> > >> >>>>>>>      there the workload continues even if a provider is not
> > >> >>>>>>>      loaded.
> > >> >>>>>>>      Of course the processing will fail if the token is missing...
> > >> >>>>>>>
> > >> >>>>>>>      > Requiring HBase (and Hadoop for that matter) to be on
> the
> > >> JM
> > >> >>>>>>>      system classpath would be a bit unfortunate. Have you
> > >> considered
> > >> >>>>>>>      loading the providers as plugins?
> > >> >>>>>>>
> > >> >>>>>>>      Even if it's unfortunate, the current implementation already
> > >> >>>>>>>      depends on that. Moving HBase and/or all token providers
> > >> >>>>>>>      into plugins is a possibility.
> > >> >>>>>>>      That way, if one wants to use a specific provider, a plugin
> > >> >>>>>>>      needs to be added. If we would like to go in this direction,
> > >> >>>>>>>      I would do that in a separate FLIP to avoid feature creep
> > >> >>>>>>>      here. The current FLIP already covers several thousand lines
> > >> >>>>>>>      of code changes.
> > >> >>>>>>>
> > >> >>>>>>>      > This is missing from the FLIP. From my experience with
> > the
> > >> >>>>>>>      metric reporters, having the implementation rely on the
> > >> >>>>>>>      configuration is really annoying for testing purposes.
> > That's
> > >> >> why
> > >> >>>>>>>      I suggested factories; they can take care of extracting
> all
> > >> >>>>>>>      parameters that the implementation needs, and then pass
> > them
> > >> >>>>>>>      nicely via the constructor.
> > >> >>>>>>>
> > >> >>>>>>>      ServiceLoader-provided services must have a no-arg
> > >> >>>>>>>      constructor, so no parameters can be passed to them.
> > >> >>>>>>>      As a side note, testing delegation token providers is a pain
> > >> >>>>>>>      in the ass and not possible with automated tests without
> > >> >>>>>>>      creating a fully featured Kerberos cluster with KDC, HDFS,
> > >> >>>>>>>      HBase, Kafka, etc.
> > >> >>>>>>>      We made several attempts in Spark but gave up because of
> > >> >>>>>>>      the complexity and flakiness, so I wouldn't care much
> > >> >>>>>>>      about unit testing.
> > >> >>>>>>>      The sad truth is that most of the token providers can only
> > >> >>>>>>>      be tested manually on a cluster.
> > >> >>>>>>>
> > >> >>>>>>>      Of course this doesn't mean that the code as a whole is not
> > >> >>>>>>>      intended to be covered with tests. A couple of parts can be
> > >> >>>>>>>      tested automatically, but providers are not among them.
> > >> >>>>>>>
> > >> >>>>>>>      > This also implies that any fields of the provider
> > wouldn't
> > >> >>>>>>>      inherently have to be mutable.
> > >> >>>>>>>
> > >> >>>>>>>      I think this is not an issue. A provider connects to a
> > >> >>>>>>>      service, obtains token(s) and then closes the connection;
> > >> >>>>>>>      I've never seen the need for intermediate state.
> > >> >>>>>>>      I only mentioned the singleton behavior to be clear.
> > >> >>>>>>>
> > >> >>>>>>>      > One example is a TM restart + local recovery, where the
> > >> >>>>>>>      TM eagerly offers the previous set of slots to the leading
> > >> >>>>>>>      JM.
> > >> >>>>>>>
> > >> >>>>>>>      Just to learn something new: as I understand it, local
> > >> >>>>>>>      recovery does not touch external systems like Kafka
> > >> >>>>>>>      (correct me if I'm wrong).
> > >> >>>>>>>      Is it possible that in such a case the user code just starts
> > >> >>>>>>>      to run blindly w/o JM coordination and connects to external
> > >> >>>>>>>      systems to do data processing?
> > >> >>>>>>>
> > >> >>>>>>>
> > >> >>>>>>>      On Thu, Feb 3, 2022 at 1:09 PM Chesnay Schepler
> > >> >>>>>>>      <[email protected]> wrote:
> > >> >>>>>>>
> > >> >>>>>>>          1)
> > >> >>>>>>>          The manager certainly shouldn't check for specific
> > >> >>>>>>>          implementations.
> > >> >>>>>>>          The problem with classpath-based checks is it can
> > easily
> > >> >>>>>>>          happen that the provider can't be loaded in the first
> > >> place
> > >> >>>>>>>          (e.g., if you don't use reflection, which you
> currently
> > >> >> kinda
> > >> >>>>>>>          force), and in that case Flink can't tell whether the
> > >> token
> > >> >>>>>>>          is not required or the cluster isn't set up
> correctly.
> > >> >>>>>>>          As I see it we shouldn't try to be clever; if users want
> > >> >>>>>>>          Kerberos, then have them enable the providers. Any error
> > >> >>>>>>>          in loading the provider (be it by accident or explicit
> > >> >>>>>>>          checks) then is a setup error and we can fail the cluster.
> > >> >>>>>>>          If we still want to auto-detect whether the provider
> > >> should
> > >> >>>>>>>          be used, note that using factories would make this
> > >> easier;
> > >> >>>>>>>          the factory can check the classpath (not having any
> > >> direct
> > >> >>>>>>>          dependencies on HBase avoids the case above), and the
> > >> >>>>>>>          provider no longer needs reflection because it will
> > only
> > >> be
> > >> >>>>>>>          used iff HBase is on the CP.
> > >> >>>>>>>
> > >> >>>>>>>          Requiring HBase (and Hadoop for that matter) to be on
> > >> the JM
> > >> >>>>>>>          system classpath would be a bit unfortunate. Have you
> > >> >>>>>>>          considered loading the providers as plugins?
> > >> >>>>>>>
> > >> >>>>>>>          2) > DelegationTokenProvider#init method
> > >> >>>>>>>
> > >> >>>>>>>          This is missing from the FLIP. From my experience
> with
> > >> the
> > >> >>>>>>>          metric reporters, having the implementation rely on
> the
> > >> >>>>>>>          configuration is really annoying for testing
> purposes.
> > >> >> That's
> > >> >>>>>>>          why I suggested factories; they can take care of
> > >> extracting
> > >> >>>>>>>          all parameters that the implementation needs, and
> then
> > >> pass
> > >> >>>>>>>          them nicely via the constructor. This also implies
> that
> > >> any
> > >> >>>>>>>          fields of the provider wouldn't inherently have to be
> > >> >> mutable.
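
A minimal sketch of the factory idea described above, under stated assumptions: the type names (TokenProvider, TokenProviderFactory, HBaseLikeProvider) and the "example.*" config keys are hypothetical and are not part of Flink or the FLIP. Only the factory needs a no-arg constructor, which is what ServiceLoader requires; the provider receives its parameters through its constructor and can keep them in final fields.

```java
import java.util.Map;

/** Hypothetical sketch; none of these types are Flink's real API. */
final class ProviderFactorySketch {

    /** Stand-in for the provider contract discussed in the thread. */
    interface TokenProvider {
        String name();

        String describe();
    }

    /** The discoverable type: a service loader only needs its no-arg constructor. */
    interface TokenProviderFactory {
        TokenProvider create(Map<String, String> config);
    }

    /** Provider whose state is immutable and passed in via the constructor. */
    static final class HBaseLikeProvider implements TokenProvider {
        private final String principal;
        private final String quorum;

        HBaseLikeProvider(String principal, String quorum) {
            this.principal = principal;
            this.quorum = quorum;
        }

        @Override
        public String name() {
            return "hbase";
        }

        @Override
        public String describe() {
            return name() + ":" + principal + "@" + quorum;
        }
    }

    /** Factory that extracts the parameters the provider needs from the config. */
    static final class HBaseLikeProviderFactory implements TokenProviderFactory {
        public HBaseLikeProviderFactory() {}

        @Override
        public TokenProvider create(Map<String, String> config) {
            // The keys below are made up for this example.
            String principal = config.getOrDefault("example.principal", "flink");
            String quorum = config.getOrDefault("example.quorum", "localhost");
            return new HBaseLikeProvider(principal, quorum);
        }
    }

    public static void main(String[] args) {
        TokenProviderFactory factory = new HBaseLikeProviderFactory();
        TokenProvider provider = factory.create(Map.of("example.principal", "alice"));
        System.out.println(provider.describe()); // prints "hbase:alice@localhost"
    }
}
```

In a test, the provider can be constructed directly with fixed arguments, without building any configuration object. In a real setup the factories could be discovered with java.util.ServiceLoader instead of being instantiated by hand.
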
> > >> >>>>>>>          > workloads are not running until the initial token set
> > >> >>>>>>>          has been propagated.
> > >> >>>>>>>
> > >> >>>>>>>          This isn't necessarily true. It can happen that tasks
> > are
> > >> >>>>>>>          being deployed to the TM without it having registered
> > >> with
> > >> >>>>>>>          the RM; there is currently no requirement that a TM
> > must
> > >> be
> > >> >>>>>>>          registered before it may offer slots / accept task
> > >> >>>>> submissions.
> > >> >>>>>>>          One example is a TM restart + local recovery, where the
> > >> >>>>>>>          TM eagerly offers the previous set of slots to the
> > >> >>>>>>>          leading JM.
> > >> >>>>>>>
> > >> >>>>>>>          On 03/02/2022 12:39, Gabor Somogyi wrote:
> > >> >>>>>>>>          Thanks for the quick response!
> > >> >>>>>>>>          Appreciate your invested time...
> > >> >>>>>>>>
> > >> >>>>>>>>          G
> > >> >>>>>>>>
> > >> >>>>>>>>          On Thu, Feb 3, 2022 at 11:12 AM Chesnay Schepler
> > >> >>>>>>>>          <[email protected]> wrote:
> > >> >>>>>>>>
> > >> >>>>>>>>              Thanks for answering the questions!
> > >> >>>>>>>>
> > >> >>>>>>>>              1) Does the HBase provider require HBase to be
> on
> > >> the
> > >> >>>>>>>>              classpath?
> > >> >>>>>>>>
> > >> >>>>>>>>
> > >> >>>>>>>>          To be instantiated no, to obtain a token yes.
> > >> >>>>>>>>
> > >> >>>>>>>>                  If so, then could it even be loaded if HBase is
> > >> >>>>>>>>              not on the classpath?
> > >> >>>>>>>>
> > >> >>>>>>>>
> > >> >>>>>>>>          The provider can be loaded but inside the provider
> it
> > >> would
> > >> >>>>>>>>          detect whether HBase is on classpath.
> > >> >>>>>>>>          Just to be crystal clear, this is the existing
> > >> >>>>>>>>          implementation that I would like to move into the
> > >> >>>>>>>>          Provider.
> > >> >>>>>>>>          Please see:
> > >> >>>>>>>>          https://github.com/apache/flink/blob/e6210d40491ff28c779b8604e425f01983f8a3d7/flink-yarn/src/main/java/org/apache/flink/yarn/Utils.java#L243-L254
> > >> >>>>>>>>          I've considered loading only the necessary Providers,
> > >> >>>>>>>>          but that would mean the generic Manager needs to know
> > >> >>>>>>>>          that if the newly loaded Provider is an
> > >> >>>>>>>>          instance of HBaseDelegationTokenProvider, it needs to be
> > >> >>>>>>>>          skipped.
> > >> >>>>>>>>          I think it would add unnecessary complexity to the
> > >> >>>>>>>>          Manager, and it would contain ugly code (ugly in my
> > >> >>>>>>>>          view, at least), like this:
> > >> >>>>>>>>          if (provider instanceof HBaseDelegationTokenProvider
> > >> >>>>>>>>              && hbaseIsNotOnClasspath()) {
> > >> >>>>>>>>            // Skip intentionally
> > >> >>>>>>>>          } else if (provider instanceof SomethingElseDelegationTokenProvider
> > >> >>>>>>>>              && somethingElseIsNotOnClasspath()) {
> > >> >>>>>>>>            // Skip intentionally
> > >> >>>>>>>>          } else {
> > >> >>>>>>>>            providers.put(provider.serviceName(), provider);
> > >> >>>>>>>>          }
> > >> >>>>>>>>          I think the approach with the least code and the most
> > >> >>>>>>>>          clarity is to load the providers and let each one decide
> > >> >>>>>>>>          internally whether everything needed to obtain a token
> > >> >>>>>>>>          is in place.
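
The "decide inside the provider" approach could look roughly like the sketch below. It is an illustration under assumptions, not the PoC code: TokenProvider, ClasspathGatedProvider and selectActive are hypothetical names, and a plain list stands in for whatever the service loader returns. Only serviceName() and delegationTokensRequired() are names taken from this thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; the types here are stand-ins, not Flink classes. */
final class ProviderSelfCheckSketch {

    interface TokenProvider {
        String serviceName();

        /** Each provider decides on its own whether it can and should obtain tokens. */
        boolean delegationTokensRequired();
    }

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ProviderSelfCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Provider that only wants tokens when a marker class of its service is present. */
    static final class ClasspathGatedProvider implements TokenProvider {
        private final String serviceName;
        private final String requiredClass;

        ClasspathGatedProvider(String serviceName, String requiredClass) {
            this.serviceName = serviceName;
            this.requiredClass = requiredClass;
        }

        @Override
        public String serviceName() {
            return serviceName;
        }

        @Override
        public boolean delegationTokensRequired() {
            return isOnClasspath(requiredClass);
        }
    }

    /** Generic manager logic: no instanceof checks, no knowledge of concrete providers. */
    static Map<String, TokenProvider> selectActive(List<TokenProvider> loaded) {
        Map<String, TokenProvider> active = new LinkedHashMap<>();
        for (TokenProvider provider : loaded) {
            if (provider.delegationTokensRequired()) {
                active.put(provider.serviceName(), provider);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        List<TokenProvider> loaded = List.of(
                // java.lang.String is always present, so this provider is kept.
                new ClasspathGatedProvider("always-there", "java.lang.String"),
                // Made-up class name that does not exist, so this provider is skipped.
                new ClasspathGatedProvider("missing", "org.example.hypothetical.DoesNotExist"));
        System.out.println(selectActive(loaded).keySet()); // prints "[always-there]"
    }
}
```

Compared with the instanceof chain, adding a new provider here does not require touching the manager loop at all.
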
> > >> >>>>>>>>
> > >> >>>>>>>>                  If not, then you're assuming the classpath
> of
> > >> the
> > >> >>>>>>>>              JM/TM to be the same, which isn't necessarily
> true
> > >> (in
> > >> >>>>>>>>              general; and also if Hbase is loaded from the
> > >> >> user-jar).
> > >> >>>>>>>>
> > >> >>>>>>>>          I'm not assuming that the classpaths of JM and TM must
> > >> >>>>>>>>          be the same. If the HBase jar comes from the user-jar,
> > >> >>>>>>>>          then the HBase code is going to use UGI within the JVM
> > >> >>>>>>>>          when authentication is required.
> > >> >>>>>>>>          Of course I've not yet tested this within Flink, but in
> > >> >>>>>>>>          Spark it is working fine.
> > >> >>>>>>>>          All in all the JM/TM classpaths may differ, but the
> > >> >>>>>>>>          HBase jar must exist on both sides somehow.
> > >> >>>>>>>>
> > >> >>>>>>>>              2) None of the /Providers/ in your PoC get
> access
> > to
> > >> >> the
> > >> >>>>>>>>              configuration. Only the /Manager/ is. Note that
> I
> > do
> > >> >> not
> > >> >>>>>>>>              know whether there is a need for the providers
> to
> > >> have
> > >> >>>>>>>>              access to the config, as that's very
> > implementation
> > >> >>>>>>>>              specific I suppose.
> > >> >>>>>>>>
> > >> >>>>>>>>
> > >> >>>>>>>>          You're right. Since this is just a POC and I don't have
> > >> >>>>>>>>          a green light, I've not put too much effort into a
> > >> >>>>>>>>          proper self-review. The DelegationTokenProvider#init
> > >> >>>>>>>>          method must get the Flink configuration.
> > >> >>>>>>>>          The reason is that further configuration can be derived
> > >> >>>>>>>>          from it. A good example is getting the Hadoop conf.
> > >> >>>>>>>>          The rationale is the same as before: it would be good
> > >> >>>>>>>>          to keep the Manager as generic as possible.
> > >> >>>>>>>>          To be more specific, some code must load the Hadoop
> > >> >>>>>>>>          conf, which could be either the Manager or the Provider.
> > >> >>>>>>>>          If the Manager does that, then the generic Manager must
> > >> >>>>>>>>          be modified every time something special is needed
> > >> >>>>>>>>          for a new provider.
> > >> >>>>>>>>          This could be super problematic when a custom provider
> > >> >>>>>>>>          is written.
> > >> >>>>>>>>
> > >> >>>>>>>>              10) I'm not sure myself. It could be something
> as
> > >> >>>>>>>>              trivial as creating some temporary directory in
> > >> HDFS I
> > >> >>>>>>>>              suppose.
> > >> >>>>>>>>
> > >> >>>>>>>>
> > >> >>>>>>>>          I've not found any such task. YARN and K8S are not
> > >> >>>>>>>>          expecting such things from executors, and workloads are
> > >> >>>>>>>>          not running until the initial token set has been
> > >> >>>>>>>>          propagated.
> > >> >>>>>>>>
> > >> >>>>>>>>
> > >> >>>>>>>>              On 03/02/2022 10:23, Gabor Somogyi wrote:
> > >> >>>>>>>>>              Please see my answers inline. I hope they answer
> > >> >>>>>>>>>              all the questions satisfactorily.
> > >> >>>>>>>>>
> > >> >>>>>>>>>              G
> > >> >>>>>>>>>
> > >> >>>>>>>>>              On Thu, Feb 3, 2022 at 9:17 AM Chesnay Schepler
> > >> >>>>>>>>>              <[email protected]> wrote:
> > >> >>>>>>>>>>              I have a few questions that I'd appreciate you
> > >> >>>>>>>>>>              answering.
> > >> >>>>>>>>>>                  1. How does the Provider know whether it is
> > >> >>>>>>>>>>                  required or not?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>              All properly registered providers are going to be
> > >> >>>>>>>>>              loaded and asked to obtain tokens. It's worth
> > >> >>>>>>>>>              mentioning that every provider has the right to
> > >> >>>>>>>>>              decide whether it wants to obtain tokens or not
> > >> >>>>>>>>>              (bool delegationTokensRequired()). For instance, if
> > >> >>>>>>>>>              a provider detects that HBase is not on the
> > >> >>>>>>>>>              classpath or not configured properly, then no
> > >> >>>>>>>>>              tokens are obtained from that specific provider.
> > >> >>>>>>>>>
> > >> >>>>>>>>>              You may ask how a provider is registered. Here it
> > >> >>>>>>>>>              is:
> > >> >>>>>>>>>              The provider is on classpath + there is a META-INF
> > >> >>>>>>>>>              file which contains the name of the provider, for
> > >> >>>>>>>>>              example:
> > >> >>>>>>>>>              META-INF/services/org.apache.flink.runtime.security.token.DelegationTokenProvider
> > >> >>>>>>>>>              <https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1#diff-b65ee7e64c5d2dfbb683d3569fc3e42f4b5a8052ab83d7ac21de5ab72f428e0b>
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. How does the configuration of Providers
> > >> >>>>>>>>>>                  work (how do they get access to a
> > >> >>>>>>>>>>                  configuration)?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>              The Flink configuration is going to be passed to
> > >> >>>>>>>>>              all providers. Please see the POC here:
> > >> >>>>>>>>>              https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1
> > >> >>>>>>>>>              Service-specific configurations are loaded on the
> > >> >>>>>>>>>              fly. For example, in the HBase case it looks for
> > >> >>>>>>>>>              the HBase configuration class, which will be
> > >> >>>>>>>>>              instantiated within the provider.
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. How does a user select providers? (Is it
> > >> >>>>>>>>>>                  purely based on the provider being on the
> > >> >>>>>>>>>>                  classpath?)
> > >> >>>>>>>>>>
> > >> >>>>>>>>>              Providers can be explicitly turned off with the
> > >> >>>>>>>>>              following config:
> > >> >>>>>>>>>              "security.kerberos.tokens.${name}.enabled". I've
> > >> >>>>>>>>>              never seen 2 different implementations exist for a
> > >> >>>>>>>>>              specific external service, but if this edge case
> > >> >>>>>>>>>              came up, then the mentioned config needs to be set
> > >> >>>>>>>>>              and a new provider with a different name needs to
> > >> >>>>>>>>>              be implemented and registered.
> > >> >>>>>>>>>              All in all, we've seen that provider handling is
> > >> >>>>>>>>>              not a user-specific task but a cluster admin one.
> > >> >>>>>>>>>              If a specific provider is needed, then it's
> > >> >>>>>>>>>              implemented once per company, registered once to
> > >> >>>>>>>>>              the clusters, and then all users may or may not use
> > >> >>>>>>>>>              the obtained tokens.
> > >> >>>>>>>>>              It's worth mentioning that the system will know
> > >> >>>>>>>>>              which token needs to be used when HDFS is accessed;
> > >> >>>>>>>>>              this part is automatic.
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. How can a user override an existing
> > >> provider?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Please see the previous bullet point.
> > >> >>>>>>>>>>                  1. What is DelegationTokenProvider#name()
> > used
> > >> >> for?
> > >> >>>>>>>>>>              By default all providers which are registered
> > >> >> properly
> > >> >>>>> (on classpath +
> > >> >>>>>>>>>              META-INF entry) are on by default. With
> > >> >>>>>>>>>              "security.kerberos.tokens.${name}.enabled" a
> > >> specific
> > >> >>>>> provider can be
> > >> >>>>>>>>>              turned off.
> > >> >>>>>>>>>              Additionally I'm intended to use this in log
> > >> entries
> > >> >>>>> later on for debugging
> > >> >>>>>>>>>              purposes. For example "hadoopfs provider
> > obtained 2
> > >> >>>>> tokens with ID...".
> > >> >>>>>>>>>              This would help to follow what is happening
> > >> >>>>>>>>>              with tokens and when. The same applies to TaskManager
> > side:
> > >> "2
> > >> >>>>> hadoopfs provider
> > >> >>>>>>>>>              tokens arrived with ID...". Important to note
> > that
> > >> the
> > >> >>>>> secret part will be
> > >> >>>>>>>>>              hidden in the mentioned log entries to keep the
> > >> >>>>>>>>>              attack surface low.
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. What happens if the names of 2
> providers
> > >> are
> > >> >>>>> identical?
> > >> >>>>>>>>>>              Presume you mean 2 different classes which
> both
> > >> >>>>> are registered and have the
> > >> >>>>>>>>>              same logic inside. In this case both will be
> loaded
> > >> and
> > >> >>>>> both are going to
> > >> >>>>>>>>>              obtain token(s) for the same service.
> > >> >>>>>>>>>              Both obtained token(s) are going to be added to
> > the
> > >> >>>>> UGI. As a result the
> > >> >>>>>>>>>              second will overwrite the first but the order
> is
> > >> not
> > >> >>>>> defined. Since both
> > >> >>>>>>>>>              token(s) are valid no matter which one is
> > >> >>>>>>>>>              used, access to the external system will
> > work.
> > >> >>>>>>>>>
> > >> >>>>>>>>>              When the class names are the same then the service
> loader
> > >> only
> > >> >>>>> loads a single entry
> > >> >>>>>>>>>              because services are singletons. That's the
> > reason
> > >> why
> > >> >>>>> state inside
> > >> >>>>>>>>>              providers is not advised.
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. Will we directly load the provider, or
> > >> first
> > >> >>>>> load a factory
> > >> >>>>>>>>>>                  (usually preferable)?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Intended to load a provider directly by DTM.
> We
> > >> can
> > >> >>>>> add an extra layer to
> > >> >>>>>>>>>              have factory but after consideration I came to
> a
> > >> >>>>> conclusion that it would
> > >> >>>>>>>>>              be an overkill in this case.
> > >> >>>>>>>>>              Please have a look how it's planned to load
> > >> providers
> > >> >>>>> now:
> > >> >>
> > >>
> >
> https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1#diff-d56a0bc77335ff23c0318f6dec1872e7b19b1a9ef6d10fff8fbaab9aecac94faR54-R81
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. What is the Credentials class (it would
> > >> >>>>> necessarily have to be a
> > >> >>>>>>>>>>                  public api as well)?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Credentials class is coming from Hadoop. My
> main
> > >> >>>>> intention was not to bind
> > >> >>>>>>>>>              the implementation to Hadoop completely. It is
> > not
> > >> >>>>> possible because of the
> > >> >>>>>>>>>              following reasons:
> > >> >>>>>>>>>              * Several functionalities are a must because
> there
> > >> are
> > >> >> no
> > >> >>>>> alternatives,
> > >> >>>>>>>>>              including but not limited to login from keytab,
> > >> proper
> > >> >>>>> TGT cache handling,
> > >> >>>>>>>>>              passing tokens to Hadoop services like HDFS,
> > HBase,
> > >> >>>>> Hive, etc.
> > >> >>>>>>>>>              * The partial win is that the whole delegation
> > >> token
> > >> >>>>> framework is going to
> > >> >>>>>>>>>              be initiated if hadoop-common is on classpath
> > >> (Hadoop
> > >> >>>>> is optional in core
> > >> >>>>>>>>>              libraries)
> > >> >>>>>>>>>              The possibility to eliminate Credentials from
> API
> > >> >> could
> > >> >>>>> be:
> > >> >>>>>>>>>              * to convert Credentials to byte array forth
> and
> > >> back
> > >> >>>>> while a provider
> > >> >>>>>>>>>              gives back token(s): I think this would be an
> > >> overkill
> > >> >>>>> and would make the
> > >> >>>>>>>>>              API less clear about what to give back and what the Manager
> > >> >>>>> understands
> > >> >>>>>>>>>              * to re-implement Credentials internal
> structure
> > >> in a
> > >> >>>>> POJO, here the same
> > >> >>>>>>>>>              convert forth and back would happen between
> > >> provider
> > >> >>>>> and manager. I think
> > >> >>>>>>>>>              this case would be the reinventing-the-wheel
> > scenario
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. What does the TaskManager do with the
> > >> received
> > >> >>>>> token?
> > >> >>>>>>>>>>              Puts the tokens into the UserGroupInformation
> > >> >> instance
> > >> >>>>> for the current
> > >> >>>>>>>>>              user. This way Hadoop compatible services can
> > pick
> > >> up
> > >> >>>>> the tokens from there
> > >> >>>>>>>>>              properly.
> > >> >>>>>>>>>              This is an existing pattern inside Spark.
> > >> >>>>>>>>>
> > >> >>>>>>>>>>                  1. Is there any functionality in the
> > >> TaskManager
> > >> >>>>> that could require a
> > >> >>>>>>>>>>                  token on startup (i.e., before registering
> > >> with
> > >> >>>>> the RM)?
> > >> >>>>>>>>>>              Never seen such functionality in Spark and
> after
> > >> >>>>> analysis not seen in
> > >> >>>>>>>>>              Flink either. If you have something in mind which
> > I've
> > >> >>>>> missed please help me out.
> > >> >>>>>>>>>
> > >> >>>>>>>>>
> > >> >>>>>>>>>              On 11/01/2022 14:58, Gabor Somogyi wrote:
> > >> >>>>>>>>>>              Hi All,
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Hope all of you have enjoyed the holiday
> season.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              I would like to start the discussion on
> > >> >>>>>>>>>>              FLIP-211
> > >> >>>>>>>>>>              <https://cwiki.apache.org/confluence/display/FLINK/FLIP-211%3A+Kerberos+delegation+token+framework>
> > >> >>>>>>>>>>              which
> > >> >>>>>>>>>>              aims to provide a
> > >> >>>>>>>>>>              Kerberos delegation token framework that
> > >> >>>>> /obtains/renews/distributes tokens
> > >> >>>>>>>>>>              out-of-the-box.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Please be aware that the FLIP wiki area is not
> > >> fully
> > >> >>>>> done since the
> > >> >>>>>>>>>>              discussion may
> > >> >>>>>>>>>>              change the feature in major ways. The proposal
> > >> can be
> > >> >>>>> found in a google doc
> > >> >>>>>>>>>>              here
> > >> >>>>>>>>>>              <https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ>.
> > >> >>>>>>>>>>              Once the community agrees on the approach the
> > >> content
> > >> >>>>> will be moved to the
> > >> >>>>>>>>>>              wiki page.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>              Feel free to add your thoughts to make this
> > >> feature
> > >> >>>>> better!
> > >> >>>>>>>>>>              BR,
> > >> >>>>>>>>>>              G
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >>
> > >>
> >
>
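
For readers following the quoted answers above: the per-provider switch
"security.kerberos.tokens.${name}.enabled" and the "on unless turned off"
behaviour can be sketched roughly as below. This is only an illustration
under assumptions, not the code from the linked branch: the TokenProvider
interface, the class name and the plain Map used as configuration are
hypothetical stand-ins, and real discovery would go through the service
loader rather than a hand-built list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ProviderSelectionSketch {

    /** Hypothetical stand-in for the provider interface discussed in the thread. */
    interface TokenProvider {
        String name();
    }

    /** Builds the per-provider switch key quoted in the thread. */
    static String enabledKey(String providerName) {
        return "security.kerberos.tokens." + providerName + ".enabled";
    }

    /**
     * Returns the names of the providers that stay active.
     * A provider is kept unless its switch is explicitly set to "false"
     * (assumption: a missing key means enabled).
     */
    static List<String> enabledProviders(List<TokenProvider> discovered,
                                         Map<String, String> config) {
        List<String> result = new ArrayList<>();
        for (TokenProvider provider : discovered) {
            String value = config.getOrDefault(enabledKey(provider.name()), "true");
            if (Boolean.parseBoolean(value)) {
                result.add(provider.name());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built list standing in for the providers found on the classpath.
        List<TokenProvider> discovered = List.of(() -> "hadoopfs", () -> "hbase");
        // Only the hbase provider is switched off; hadoopfs has no entry.
        Map<String, String> config = Map.of(enabledKey("hbase"), "false");
        System.out.println(enabledProviders(discovered, config));
    }
}
```

Keeping the switch keyed by name() is what makes the name useful beyond log
entries: a cluster admin can disable one provider without touching the
classpath.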
