yeah, the hadoop DT interface doesn't say anything about kerberos being
required; it is just that spark doesn't ask for DTs unless security is
enabled. the s3a and abfs connectors will, if delegation tokens are enabled
for them, happily issue their tokens.

HadoopDelegationTokenProvider *does not require kerberos*. it is just that
spark doesn't ask filesystems for tokens without it.

You can see this with the fetchdt command of cloudstore
https://github.com/steveloughran/cloudstore
https://github.com/steveloughran/cloudstore/blob/main/src/site/markdown/fetchdt.md

point it at an FS and it'll ask for them;
https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/delegation_tokens.html

azure can do the same with fs.azure.enable.delegation.token and a token
type, such as oauth.

for example, s3a set to ask for session tokens in an extra restricted role
  <property>
    <name>fs.s3a.delegation.token.role.arn</name>
    <value>${ARN-restricted}</value>
  </property>

  <property>
    <name>fs.s3a.delegation.token.binding</name>
    <value>org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding</value>
  </property>


then you can use cloudstore (which is soon to have its first asf release,
but currently needs to be downloaded from
https://github.com/steveloughran/cloudstore )

> bin/hadoop jar $CLOUDSTORE fetchdt out.tokens s3a://stevel-london/

Collecting tokens for 1 filesystem to to
file:/Users/stevel/Projects/Releases/hadoop-3.5.0/out.tokens
2026-06-22 19:39:30,652 [main] INFO  commands.FetchTokens
(StoreDurationInfo.java:<init>(84)) - Starting: Fetching tokens for
s3a://stevel-london/
2026-06-22 19:39:32,204 [main] INFO  delegation.S3ADelegationTokens
(DurationInfo.java:<init>(77)) - Starting: Creating New Delegation Token
2026-06-22 19:39:32,282 [main] INFO  auth.STSClientFactory
(STSClientFactory.java:lambda$requestSessionCredentials$0(227)) -
Requesting Amazon STS Session credentials
2026-06-22 19:39:32,936 [main] INFO  delegation.S3ADelegationTokens
(S3ADelegationTokens.java:noteTokenCreated(443)) - Created S3A Delegation
Token: Kind: S3ADelegationToken/Session, Service: s3a://stevel-london,
Ident: (S3ATokenIdentifier{S3ADelegationToken/Session;
uri=s3a://stevel-london; timestamp=1782153572890; renewer=stevel;
encryption=SSE-KMS; 7eef8fbd-7a09-4a7e-a9ee-79a631c9f469; Created on
VXM63P4JG2/192.168.50.99 at time 2026-06-22T18:39:32.212268Z.}; session
credentials, expiry 2026-06-23T06:39:32Z; (valid))
2026-06-22 19:39:32,938 [main] INFO  delegation.S3ADelegationTokens
(DurationInfo.java:close(98)) - Creating New Delegation Token: duration
0:00.732s
*Fetched token: Kind: S3ADelegationToken/Session, Service:
s3a://stevel-london, Ident: (S3ATokenIdentifier{S3ADelegationToken/Session;
uri=s3a://stevel-london; timestamp=1782153572890; renewer=stevel;
encryption=SSE-KMS; 7eef8fbd-7a09-4a7e-a9ee-79a631c9f469; Created on
VXM63P4JG2/192.168.50.99 at time
2026-06-22T18:39:32.212268Z.}; session credentials, expiry
2026-06-23T06:39:32Z; (valid))*
2026-06-22 19:39:32,941 [main] INFO  commands.FetchTokens
(StoreDurationInfo.java:close(190)) - Duration of Fetching tokens for
s3a://stevel-london/: 00:00:02.290
2026-06-22 19:39:32,941 [main] INFO  commands.FetchTokens
(StoreDurationInfo.java:<init>(84)) - Starting: Saving 1 token to
file:/Users/stevel/Projects/Releases/hadoop-3.5.0/out.tokens
2026-06-22 19:39:33,233 [main] INFO  commands.FetchTokens
(StoreDurationInfo.java:close(190)) - Duration of Saving 1 token to
file:/Users/stevel/Projects/Releases/hadoop-3.5.0/out.tokens: 00:00:00.292
Saved 1 token to
file:/Users/stevel/Projects/Releases/hadoop-3.5.0/out.tokens

Token issued; no kerberos around and the file now has session credentials
valid for 12h and fs encryption settings included.

Accordingly, I'm going to suggest a different design


   1. Don't bother with a new subclass
   2. Simply add a switch to enable token collection even if kerberos is off
   3. For a test, add a subclass of file:// with a new url and see if you
   can issue tokens off it. More rigorously, add a switch enabling it to blow
   up during creation/unmarshalling to test spark resilience.
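
A minimal sketch of what the switch in item 2 could look like, not how spark
actually implements its gate. Assumptions: the option name
spark.security.credentials.collectWithoutKerberos is hypothetical (invented
here for illustration), the boolean argument stands in for whatever
"is security enabled" check spark does today, and a plain Map stands in for
the job configuration.

```java
import java.util.Map;

/** Sketch only: decides whether delegation tokens should be collected. */
final class TokenGate {

  // Hypothetical option name, invented for this sketch; not an existing Spark setting.
  static final String COLLECT_WITHOUT_KERBEROS =
      "spark.security.credentials.collectWithoutKerberos";

  private TokenGate() {
  }

  /**
   * @param securityEnabled stand-in for the existing kerberos/security check
   * @param conf stand-in for the job configuration
   * @return true if token providers should be asked for tokens
   */
  static boolean shouldCollectTokens(boolean securityEnabled, Map<String, String> conf) {
    // The switch defaults to off, so behaviour is unchanged unless it is set.
    boolean force =
        Boolean.parseBoolean(conf.getOrDefault(COLLECT_WITHOUT_KERBEROS, "false"));
    return securityEnabled || force;
  }

  public static void main(String[] args) {
    // No kerberos, switch unset: no collection, as today.
    System.out.println(shouldCollectTokens(false, Map.of()));
    // No kerberos, switch on: collection goes ahead.
    System.out.println(shouldCollectTokens(false, Map.of(COLLECT_WITHOUT_KERBEROS, "true")));
    // Kerberos on: collection regardless of the switch.
    System.out.println(shouldCollectTokens(true, Map.of()));
  }
}
```

With the switch defaulting to false the existing kerberos path behaves as
before; only deployments that opt in get tokens collected without it.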

The hard part here is actually implementing any test DT implementation; if
you can avoid that your life is better.

On Thu, 18 Jun 2026 at 11:55, Peter Toth <[email protected]> wrote:

> Hi Parth,
>
> Thanks for the SPIP and for answering our questions.
> Does anyone have any other questions or points they'd like to discuss?
>
> Peter
>
> On Thu, Jun 4, 2026 at 10:02 PM Parth Chandra <[email protected]> wrote:
>
>> Hi all,
>>   I've a proposal to enhance the current mechanism to distribute
>> delegation tokens  and other secure tokens.
>>
>>   The summary is that the current mechanism is gated behind Kerberos
>> even though the actual distribution does not require Kerberos except where
>> the tokens themselves are Kerberos tokens. Cloud environments may not have
>> a Kerberos setup and this creates an unnecessary setup step that users may
>> have to perform. The current implementation of KafkaDelegationTokenProvider
>> illustrates this. The implementation does not require Kerberos, yet it has
>> to pass the Kerberos gates.
>>
>>   The proposal then is to allow a second path that does not require the
>> Kerberos gates unless the provider indicates that it be required. the
>> design has minimal change to the existing code and is fully backward
>> compatible.
>>
>>   The proposal and corresponding JIRA are in [1], [2]
>>
>>   I'd greatly appreciate it if committers can take some time to review
>> and provide feedback
>>
>> Thanks
>>
>> Parth
>> [1]
>> https://docs.google.com/document/d/1PPqAoJAj48MdjMJNc7DlytXi745z-imFpVaFDnt18Xg/edit?tab=t.0#heading=h.21tncge82jbl
>> [2] https://issues.apache.org/jira/browse/SPARK-57252
>>
>
