Hi,
We have been testing out CAS 7.1.0 for the past few weeks and believe we have
discovered some sort of issue with delegating logins to Entra (potentially
other OIDC IdPs too). The issue started after upgrading from 7.0.4 to 7.1.0,
with sporadic reports from users getting an "Unauthorized Access" CAS error
page. We were able to replicate the issue by logging in and out of
applications being delegated to Entra. Initially, we were getting the error
on roughly 1 out of every 10 delegated logins; the other ~9 times it worked
perfectly. We turned up logging levels and captured these logs when the
error occurred:
2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: access_token / value: [redacted]
2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: expiration / value: 1727214663132 / class java.lang.Long>
2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: id_token / value: [redacted]
2024-09-24 15:24:16,132 WARN [org.apereo.cas.util.function.FunctionUtils] - <Cannot invoke "org.pac4j.oidc.profile.creator.TokenValidator.validate(com.nimbusds.jwt.JWT, com.nimbusds.openid.connect.sdk.Nonce)" because the return value of "org.pac4j.oidc.metadata.OidcOpMetadataResolver.getTokenValidator()" is null
OidcProfileCreator.java:create:115
BaseClient.java:getUserProfile:146
DelegatedClientAuthenticationHandler.java:lambda$doAuthentication$2:89
FunctionUtils.java:lambda$doAndHandle$12:425
>
2024-09-24 15:24:16,132 ERROR [org.apereo.cas.authentication.DefaultAuthenticationManager] - <Authentication has failed. Credentials may be incorrect or CAS cannot find authentication handler that supports [ClientCredential(credentials=OidcCredentials
I'll also note that we have read through the CAS 7.1 release notes and made all
the appropriate changes in the overlay regarding the consolidation of the
pac4j libraries.
At this point we assumed it was related to something in our config or other
libraries we were pulling in. We pulled the latest 7.1 overlay and added
the absolute bare minimum to try to replicate the issue. The only thing we
changed in the overlay was adding the minimum libraries:
implementation "org.apereo.cas:cas-server-support-json-service-registry"
implementation "org.apereo.cas:cas-server-webapp"
implementation "org.apereo.cas:cas-server-support-pac4j-oidc"
The bare minimum in cas.properties:
cas.server.tomcat.http[0].enabled=true
cas.server.name=http://localhost:8080
cas.server.prefix=http://localhost:8080/cas
logging.config=file:/etc/cas/config/log4j2.xml
cas.service-registry.json.location=file:/etc/cas/services
cas.authn.pac4j.oidc[0].azure.tenant=[redacted]
cas.authn.pac4j.oidc[0].azure.id=[redacted]
cas.authn.pac4j.oidc[0].azure.secret=[redacted]
cas.authn.pac4j.oidc[0].azure.client-name=AADAuth
cas.authn.pac4j.oidc[0].generic.discovery-uri=https://login.microsoftonline.com/2a00728e-f0d0-40b4-a4e8-ce433f3fbca7/v2.0/.well-known/openid-configuration
A simple service definition:
{
  "@class": "org.apereo.cas.services.CasRegisteredService",
  "serviceId": ".*",
  "name": "local",
  "id": 1,
  "description": "Login"
}
We then ran CAS locally on the embedded Tomcat to test. Initially we could not
replicate the error; delegation worked every time. On a whim, we decided to
put a little load on the locally running instance of CAS using Locust (25
users, ramping up 3 at a time). We immediately started getting the same
error locally. The error rate seems to be correlated with the number of
requests CAS is handling. For instance, if we increase the number of users
in Locust, we get the error much more frequently, to the point where it
occurs constantly when there are enough requests to CAS. I should also point
out that I don't think it is a resource issue: any amount of load will
eventually produce the error. Also, the CAS instance on the server has
a 4-core CPU and 16 GB of RAM available to the JVM, and hardly any of that
is being consumed when the error occurs. It almost seems there is some
sort of concurrency issue happening when CAS handles the response from
Entra (which we confirmed contains a valid access/ID token) while there are
multiple requests in flight. We have replicated the error on CAS 7.1.1 and CAS
7.2.0-SNAPSHOT as well.
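For anyone who wants to approximate that kind of load without Locust, here is a
rough sketch using only the Python standard library. It is not our actual
Locust file: the function names (ramp_schedule, run_step, error_rate) are made
up for illustration, the one-second pause between ramp steps is an assumption,
and the target URL assumes the standard /login endpoint under the
cas.server.prefix shown above. It only generates background GET traffic; it
does not complete the delegated OIDC flow with Entra, so delegated logins
still have to be exercised separately (for example, in a browser) while it
runs.

```python
import concurrent.futures
import time
import urllib.error
import urllib.request


def ramp_schedule(total_users, spawn_rate):
    """Return the number of active simulated users after each ramp step."""
    if total_users < 0 or spawn_rate <= 0:
        raise ValueError("total_users must be >= 0 and spawn_rate must be > 0")
    steps = []
    active = 0
    while active < total_users:
        active = min(active + spawn_rate, total_users)
        steps.append(active)
    return steps


def fetch_status(url, timeout=10.0):
    """GET the URL and return the HTTP status code, or None on a network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; keep their status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def run_step(url, users, fetch=fetch_status):
    """Issue one request per simulated user concurrently and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(fetch, [url] * users))


def error_rate(statuses):
    """Fraction of requests that failed (network error or HTTP status >= 400)."""
    if not statuses:
        return 0.0
    failures = sum(1 for s in statuses if s is None or s >= 400)
    return failures / len(statuses)


if __name__ == "__main__":
    # Assumed target: the CAS login page of the local test instance.
    target = "http://localhost:8080/cas/login"
    # 25 users, ramping up 3 at a time, as in the Locust run described above.
    for users in ramp_schedule(25, 3):
        statuses = run_step(target, users)
        print(f"{users} users: error rate {error_rate(statuses):.0%}")
        time.sleep(1)  # assumed pause between ramp steps
```

Note that the error rate printed here only reflects the background requests
themselves, not whether a delegated login failed.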
It could still be that we have something misconfigured, but we are unsure
where to go from here. Any help would be greatly appreciated!
Thanks!