On 2020-10-15 at 16:48, Rémy Maucherat wrote:
On Thu, Oct 15, 2020 at 2:31 PM Michael Osipov <1983-01...@gmx.net> wrote:

On Wed, Oct 14, 2020 at 6:32 PM Michael Osipov <micha...@apache.org>
wrote:

On 2020-10-14 at 12:32, Rémy Maucherat wrote:
On Tue, Oct 13, 2020 at 8:27 PM Michael Osipov <micha...@apache.org>
wrote:

On 2020-10-13 at 16:05, r...@apache.org wrote:
This is an automated email from the ASF dual-hosted git repository.

remm pushed a commit to branch 8.5.x
in repository https://gitbox.apache.org/repos/asf/tomcat.git


The following commit(s) were added to refs/heads/8.5.x by this push:
        new 883600b  Always retry on a new connection, even when pooling
883600b is described below

commit 883600b8a77c0be93b3a8dc2404e8d1110970ee7
Author: remm <r...@apache.org>
AuthorDate: Tue Oct 13 14:19:54 2020 +0200

       Always retry on a new connection, even when pooling

       This keeps the same very simple design as for the single
connection
       scenario, for now.
---
    java/org/apache/catalina/realm/JNDIRealm.java | 22
+++++++++++++++++++---
    webapps/docs/changelog.xml                    |  5 +++++
    2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/java/org/apache/catalina/realm/JNDIRealm.java
b/java/org/apache/catalina/realm/JNDIRealm.java
index 72087ab..98007f7 100644
--- a/java/org/apache/catalina/realm/JNDIRealm.java
+++ b/java/org/apache/catalina/realm/JNDIRealm.java
@@ -1311,7 +1311,7 @@ public class JNDIRealm extends RealmBase {
                    close(connection);

                    // open a new directory context.
-                connection = get();
+                connection = get(true);

                    // Try the authentication again.
                    principal = authenticate(connection, username, credentials);
@@ -2389,12 +2389,28 @@ public class JNDIRealm extends RealmBase {
         * @exception NamingException if a directory server error occurs
         */
        protected JNDIConnection get() throws NamingException {
+        return get(false);
+    }
+
+    /**
+     * Open (if necessary) and return a connection to the configured
+     * directory server for this Realm.
+     * @param create when pooling, this forces creation of a new connection,
+     *   for example after an error
+     * @return the connection
+     * @exception NamingException if a directory server error occurs
+     */
+    protected JNDIConnection get(boolean create) throws NamingException {
            JNDIConnection connection = null;
            // Use the pool if available, otherwise use the single connection
            if (connectionPool != null) {
-            connection = connectionPool.pop();
-            if (connection == null) {
+            if (create) {
                    connection = create();
+            } else {
+                connection = connectionPool.pop();
+                if (connection == null) {
+                    connection = create();
+                }
                }
            } else {
                singleConnectionLock.lock();

That's a suitable and simple approach.


If you have the code for adding a max idle check on hand and tested, you
should add it IMO, it will be more efficient.

I will need to give this a couple more weeks of testing. This is what I
have observed today:
2020-10-14T16:01:47.039 147.54.155.198 w99sezx0... "GET
/x2tc-proxy-bln/rest/info/targetSystem HTTP/1.1" 200 8 92132

20609 2020-10-14T16:00:14 FEIN [https-openssl-apr-8444-exec-166]
net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.acquire Acquiring
directory server connection from pool
20610 2020-10-14T16:00:14 FEIN [https-openssl-apr-8444-exec-166]
net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.acquire Directory server
connection from pool exceeds max idle time, closing it
20611 2020-10-14T16:00:14 FEIN [https-openssl-apr-8444-exec-166]
net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.close Closing
directory
server connection
20612 2020-10-14T16:00:14 FEIN [https-openssl-apr-8444-exec-166]
net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.open Opening new
directory server connection
20613 2020-10-14T16:01:47 FEIN [https-openssl-apr-8444-exec-166]
net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.getUser Searching for
username 'w99sezx0' in base ...

As you can see, it took 90 seconds to serve the request because the
connection had expired and closing it took way too long. On average the
request takes:
2020-10-14T13:57:06.730 10.81.50.232 osipovmi@... "GET
/x2tc-proxy-bln/rest/info/targetSystem HTTP/1.1" 200 8 70

when the connection is healthy.


Ok, so there's some real incentive to avoid reusing a connection that was
idle for too long.

I did further analysis and was partially wrong in my statement. The cause
for the 90 s was a faulty KDC which did not respond in time with a service
ticket. Java Kerberos does 3 retries with a 30 s timeout. I have set it to
1 retry and a 1000 ms wait until the next KDC is tried.

Anyway, I am still convinced that some idle timeout is the right choice to
avoid resource depletion on both sides. If hundreds of clients keep tens of
connections open to a directory server for no reason, sooner or later
everything will stall. I will let you know in a couple of weeks.


Yes, no problem, it sounds like a good idea (with a good default).
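The max idle check under discussion could look roughly like this; a
minimal, self-contained sketch, with made-up class and field names
(IdleAwarePool, Conn, maxIdleMillis), not the actual JNDIRealm or
ActiveDirectoryRealm code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class IdleAwarePool {
    static final class Conn {
        long lastUsed;
        Conn(long lastUsed) { this.lastUsed = lastUsed; }
    }

    private final Deque<Conn> pool = new ArrayDeque<>();
    private final long maxIdleMillis;

    IdleAwarePool(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    // Return a connection to the pool, stamping its last-use time.
    void release(Conn c) {
        c.lastUsed = System.currentTimeMillis();
        pool.push(c);
    }

    // Pop until a connection younger than maxIdleMillis is found;
    // stale ones are discarded (real code would close the directory
    // context here). Fall back to opening a new connection.
    Conn acquire() {
        long now = System.currentTimeMillis();
        Conn c;
        while ((c = pool.poll()) != null) {
            if (now - c.lastUsed < maxIdleMillis) {
                return c; // still fresh, reuse it
            }
            // stale: close the underlying directory context here
        }
        return new Conn(now); // pool empty or all stale: open a new one
    }
}
```

The point of the check is exactly what the log above shows: a pooled
connection past its idle limit is closed and replaced instead of being
handed to the request.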



But related to this, I have made a very interesting observation when the
webapp is shut down:
2020-10-15T14:24:15.120 WARNUNG [deblndw024v...-startStop-2]
org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The
web application [x2tc-proxy-dsv] appears to have started a thread named
[Thread-7] but has failed to stop it. This is very likely to create a
memory leak. Stack trace of thread:
  java.lang.Object.wait(Native Method)
  java.lang.Object.wait(Object.java:502)
  com.sun.jndi.ldap.Connection.pauseReader(Connection.java:804)
  com.sun.jndi.ldap.Connection.run(Connection.java:944)
  java.lang.Thread.run(Thread.java:748)

When you look at how LdapCtx#close() is implemented, you'll see that it
holds a reference count of all naming enumerations. As long as they aren't
really closed, the connection is not closed immediately. I need to analyze
why this is happening. The result could be that stopping and starting an
application drowns the entire VM sooner or later.


That's not nice at all. I suppose it means the context class loader must be
switched when opening the LDAP connection or it could leak. This is of
course going to be less of an issue with a single connection. I'll need to
fix it.

I think you don't need to do anything, except for one thing; I will explain
why. I did further testing and reviewed my code. All private LDAP classes in
com.sun have a debug flag which can only be enabled when recompiled. I did
that and put the JAR on the boot classpath. The crucial point is that
immediately closing a connection requires *all* NamingEnumerations to be
closed properly. A reference counter is held, and if that counter is not 0,
the connection will only be scheduled to be closed at some later point in
time. Here is your webapp class loader leak:
LdapCtx.removeUnsolicited: false
LdapCtx: calling clnt.close() com.sun.jndi.ldap.LdapCtx@34d8f0
LdapClient: com.sun.jndi.ldap.LdapClient@1b223fc
LdapClient: close() called: 1
java.lang.Throwable
        at com.sun.jndi.ldap.LdapClient.close(LdapClient.java:433)
        at com.sun.jndi.ldap.LdapCtx.closeConnection(LdapCtx.java:2856)
        at com.sun.jndi.ldap.LdapCtx.close(LdapCtx.java:2641)
2020-10-19T14:40:34.393 WARNUNG [https-openssl-apr-8444-exec-215] 
org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web 
application [x2tc-proxy-bln] appears to have started a thread named 
[Thread-1671] but has failed to stop it. This is very likely to create a memory 
leak. Stack trace of thread:
 java.net.SocketInputStream.socketRead0(Native Method)
 java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
 java.net.SocketInputStream.read(SocketInputStream.java:172)
 java.net.SocketInputStream.read(SocketInputStream.java:141)
 java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 com.sun.jndi.ldap.sasl.SaslInputStream.readFully(SaslInputStream.java:166)
 com.sun.jndi.ldap.sasl.SaslInputStream.fill(SaslInputStream.java:123)
 com.sun.jndi.ldap.sasl.SaslInputStream.read(SaslInputStream.java:90)
 com.sun.jndi.ldap.Connection.run(Connection.java:848)
 java.lang.Thread.run(Thread.java:748)


It should say "called: 0" when properly implemented. This means that all
naming enumeration instances have been closed properly before the
connection is closed. Changing the class loader will just move the problem,
not solve it.

In my case the issue is within an application. It uses Spring's
PooledContextSource and retrieves a few attributes for a user from Active
Directory. Either I have incorrectly implemented my anonymous
AbstractContextMapper or it is a bug in Spring LDAP.

So all you have to do is review NamingEnumeration#close() calls.
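In practice that review boils down to making sure every enumeration is
closed in a finally block. A minimal runnable sketch of the pattern; the
FakeEnum stand-in is invented so it runs without a directory server, while
real code would get the enumeration from DirContext#search:

```java
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;

public class EnumClosing {

    // Invented stand-in so the pattern is testable offline; tracks
    // whether close() was actually called.
    static final class FakeEnum implements NamingEnumeration<String> {
        boolean closed = false;
        int remaining = 2;
        public boolean hasMore() { return remaining > 0; }
        public String next() { remaining--; return "cn=user" + remaining; }
        public boolean hasMoreElements() { return hasMore(); }
        public String nextElement() { return next(); }
        public void close() { closed = true; }
    }

    // Drain the enumeration, guaranteeing close() runs even if
    // iteration throws; close() is what drops LdapCtx's reference
    // count so the connection can really be closed.
    static int consume(NamingEnumeration<String> results) {
        int count = 0;
        try {
            try {
                while (results.hasMore()) {
                    results.next();
                    count++;
                }
            } finally {
                results.close();
            }
        } catch (NamingException e) {
            throw new IllegalStateException(e);
        }
        return count;
    }
}
```

Any code path that abandons the enumeration early (exception, break after
the first hit) without the finally-close is a candidate for the non-zero
reference count seen in the debug output above.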

Michael


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
