Hi,

Disclaimer: this has not been explicitly verified on the JNDIRealm, but the code is very similar to my ActiveDirectoryRealm, which exhibits the same behavior.

The code in question:
            try {

                // Authenticate the specified username if possible
                principal = getPrincipal(connection, username, gssCredential);

            } catch (CommunicationException | ServiceUnavailableException e) {
                // log the exception so we know it's there.
                containerLog.info(sm.getString("jndiRealm.exception.retry"), e);

                // close the connection so we know it will be reopened.
                close(connection);
                closePooledConnections();

                // open a new directory context.
                connection = get();

                // Try the authentication again.
                principal = getPrincipal(connection, username, gssCredential);
            }

            // Release this context
            release(connection);

            // Return the authenticated Principal (if any)
            return principal;

        } catch (NamingException e) {
            // Log the problem for posterity
            containerLog.error(sm.getString("jndiRealm.exception"), e);

            // close the connection so we know it will be reopened.
            close(connection);
            closePooledConnections();

            // Return "not authenticated" for this request
            return null;
        }

The connection has been verified to be valid and getPrincipal() is called, but the LDAP request can still fail in flight. The catch block for CommunicationException and ServiceUnavailableException is supposed to handle this case: drop the connection, open a new one and retry. This does not happen.
The issue is rooted in Sun's JNDI code. From LdapRequest#getReplyBer(long):

        // poll from 'replies' blocking queue ended-up with timeout
        if (result == null) {
            throw new NamingException(String.format(TIMEOUT_MSG_FMT, millis));
        }
        // Unexpected EOF can be caused by connection closure or cancellation
        if (result == EOF) {
            throw new NamingException(CLOSE_MSG);
        }

Both are actually communication issues which, for some strange reason, do not use CommunicationException. They are thrown as plain NamingException, so the inner catch block is skipped and the retry never happens, although this is a valid request.
(I can provide a verbose log if necessary.)
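
To illustrate why the retry branch is skipped, here is a minimal, self-contained sketch that mirrors only the shape of the nested try/catch above. The class CatchDemo and the method handledBy are hypothetical names for this example and are not part of Tomcat:

```java
import javax.naming.CommunicationException;
import javax.naming.NamingException;
import javax.naming.ServiceUnavailableException;

// Hypothetical demo class; it only reproduces the catch structure quoted above.
final class CatchDemo {

    // Returns which branch would handle the given exception.
    static String handledBy(NamingException toThrow) {
        try {
            try {
                throw toThrow;
            } catch (CommunicationException | ServiceUnavailableException e) {
                // Only these two subclasses reach the retry branch.
                return "retry";
            }
        } catch (NamingException e) {
            // A plain NamingException lands here: no retry, authentication fails.
            return "fail";
        }
    }
}
```

Under this sketch, a CommunicationException yields "retry", while a plain NamingException carrying the "LDAP connection has been closed" message yields "fail".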

I see two ways to solve this:
1. Short term: catch NamingException and parse out those two messages:
  * LDAP connection has been closed
  * LDAP response read timed out
2. Long term: report a bug/RFE with Java to throw CommunicationException instead of NamingException in these two places. I believe this can easily be backported to Java 8 and 11 since the external signature does not change.
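
A rough sketch of what option 1 could look like, assuming the two message prefixes listed above stay stable across JDK versions (which is not guaranteed, since they are not part of any API). The class RetryableNamingExceptions and the method isRetryable are hypothetical names, not existing Tomcat code:

```java
import javax.naming.CommunicationException;
import javax.naming.NamingException;
import javax.naming.ServiceUnavailableException;

// Hypothetical helper; message prefixes are taken from the two cases listed above.
final class RetryableNamingExceptions {

    private static final String CLOSED_PREFIX = "LDAP connection has been closed";
    private static final String TIMEOUT_PREFIX = "LDAP response read timed out";

    private RetryableNamingExceptions() {
    }

    // True if the exception looks like a communication failure worth one retry.
    static boolean isRetryable(NamingException e) {
        if (e instanceof CommunicationException || e instanceof ServiceUnavailableException) {
            return true;
        }
        // Only inspect plain NamingException; other subclasses keep their meaning.
        if (e.getClass() != NamingException.class) {
            return false;
        }
        String message = e.getMessage();
        return message != null
                && (message.startsWith(CLOSED_PREFIX) || message.startsWith(TIMEOUT_PREFIX));
    }
}
```

The realm could then catch NamingException around the first getPrincipal() call, retry when isRetryable(e) returns true, and rethrow otherwise. Matching on message text is fragile, which is why this is only meant as a stopgap until option 2 lands.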

My question is: Mark, you have direct access to JBS. Would you be willing to file this issue directly, or do you want me to file it through bugreport.java.com first, so that once it arrives in JBS you could add a comment that this also affects Tomcat?

Michael
