On 4/9/21 11:35 AM, Osipov, Michael (LDA IT PLM) wrote:
> I am quite sure that this is a race condition where stat() is performed,
> file does not exist, open() with write is performed, in parallel it is
> already created and the later call returns in EEXIST.

I agree, except I think it's just unlink() and open(O_CREAT|O_EXCL) calls with no stat(). I had erroneously assumed that the unexpected error was happening inside fcc_store() because of "Failed to store credentials" in the message, but that string turns out to be from get_in_tkt.c in a block of code that also calls krb5_cc_initialize().

The fcc_initialize() EEXIST self-race has existed since 1.0. I'd speculate that the original developers' assumption was that lots of processes might be competing to use a file ccache, but that creating ccaches would be a rare and one-at-a-time affair (happening at login or when a user runs "kinit"). With client keytab support, that is no longer the case; it's easy to have multiple threads or processes competing to create or refresh a cache as part of gss_acquire_cred() or gss_init_sec_context().

Just fixing the fcc_initialize() race wouldn't really solve the problem; there would still be a window between krb5_cc_initialize() and krb5_cc_store_cred() where other threads (or processes) would see an initialized cache with no TGT in it, and would fail the gss_init_sec_context() call. This ticket describes that problem and some possible solutions:

https://krbdev.mit.edu/rt/Ticket/Display.html?id=7707

Heimdal has implemented option 5. I'm not wild about it and it won't work with other ccache types, but it's a working stopgap and it can always be backed out in favor of a different solution later.
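
For anyone who wants to see the unlink() / open(O_CREAT|O_EXCL) interleaving concretely, here is a minimal sketch. It is not the actual fcc_initialize() code; the helper names step_unlink(), step_create(), and replay_race() are made up for illustration, and the two competing initializers "A" and "B" are replayed in a single thread so that the losing order of operations is deterministic rather than timing-dependent.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: first half of the pattern, remove any existing
 * file. A missing file (ENOENT) is not treated as an error.
 * Returns 0 on success or the errno value on failure. */
static int step_unlink(const char *path)
{
    if (unlink(path) == -1 && errno != ENOENT)
        return errno;
    return 0;
}

/* Hypothetical helper: second half of the pattern, exclusive create.
 * Returns 0 on success or the errno value on failure. */
static int step_create(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}

/* Replay the losing interleaving of two initializers, A and B.
 * Returns the result of A's exclusive create. */
static int replay_race(const char *path)
{
    int a_result;

    step_unlink(path);              /* A: unlink */
    step_unlink(path);              /* B: unlink (nothing left to remove) */
    step_create(path);              /* B: create, succeeds */
    a_result = step_create(path);   /* A: create, file already exists */
    unlink(path);                   /* clean up the demo file */
    return a_result;
}
```

Calling replay_race() on a scratch path returns EEXIST for A even though A unlinked the file itself a moment earlier, which is the self-race in question. Note that this only illustrates the initialization race; it says nothing about the later window between krb5_cc_initialize() and krb5_cc_store_cred().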