https://bz.apache.org/bugzilla/show_bug.cgi?id=61313
            Bug ID: 61313
           Summary: JNDIRealm LDAP server failover to alternateURL takes very long 15m32s
           Product: Tomcat 8
           Version: 8.5.16
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: peter.malo...@brockmann-consult.de
  Target Milestone: ----

Created attachment 35146
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35146&action=edit
my hacky patch

I have worked with csutherl on #tomcat on irc.freenode.net, who decided this is a bug and should be reported here rather than on the ML. The JDK version tested was Oracle JDK 1.8.0_66.

I have set connectionURL and alternateURL to try to get LDAP server failover to work. If both servers are up, it "works well".

If only the connectionURL server is down (firewall is set to REJECT), then a newly restarted Tomcat works fine, but if the server goes down while Tomcat is already running, the next LDAP lookup takes 15m32s.

If only the alternateURL server is down (firewall is set to REJECT), then a newly restarted Tomcat works fine. But if it goes down while Tomcat is already running, and the connectionURL server was down earlier and is up again now (so the JNDIRealm's instance variable "context" is currently set using alternateURL), the next LDAP lookup takes 15m32s.

Setting connectionTimeout has no effect on the time. I have verified that the value ends up in the Hashtable returned by getDirectoryContextEnvironment().

If I apply my hacky patch (attached; applies to the tomcat85 git repo, tag TOMCAT_8_5_16) to the method "JNDIRealm.open()" so that it works like a fresh Tomcat startup and never returns the old context (it closes the context, sets it to null, then lets the rest of the code run), it always "works well", taking 4-7s on a fresh Tomcat or less than 0.1s on a warmed-up Tomcat.
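The idea behind the patch can be sketched roughly as follows. This is not the attached patch and not Tomcat's code: the class ReconnectSketch, the Ctx type and the connector function are hypothetical stand-ins, used only to show "always discard the cached context, then try connectionURL and fall back to alternateURL".

```java
import java.util.function.Function;

public class ReconnectSketch {

    /** Hypothetical stand-in for a directory context; not a Tomcat or JNDI type. */
    public static final class Ctx {
        public final String url;
        public boolean closed;

        public Ctx(String url) {
            this.url = url;
        }

        public void close() {
            closed = true;
        }
    }

    private final String connectionURL;
    private final String alternateURL;
    // Assumption: the connector throws a RuntimeException when a server is unreachable.
    private final Function<String, Ctx> connector;
    private Ctx context;

    public ReconnectSketch(String connectionURL, String alternateURL,
                           Function<String, Ctx> connector) {
        this.connectionURL = connectionURL;
        this.alternateURL = alternateURL;
        this.connector = connector;
    }

    /** Never hands back the cached context: close it, forget it, reconnect. */
    public Ctx open() {
        if (context != null) {
            context.close();
            context = null;
        }
        try {
            // Same order as on a fresh startup: primary first.
            context = connector.apply(connectionURL);
        } catch (RuntimeException primaryFailure) {
            // Primary unreachable: fall back to the alternate server.
            context = connector.apply(alternateURL);
        }
        return context;
    }
}
```

The cost of this approach is a new connection on every call to open(), which is presumably why the existing code keeps the context around.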

server.xml snippets:

    <Realm className="org.apache.catalina.realm.MemoryRealm"
        digest="MD5" />
    </Realm>

    <Realm className="org.apache.catalina.realm.JNDIRealm"
        connectionURL="ldap://auth1:389"
        connectionTimeout="1000"
        connectionAttempt="0"
        alternateURL="ldap://auth2:389"
        userPattern="uid={0},ou=People,dc=example,dc=com"
        userRoleAttribute="gidNumber"
        roleBase="ou=Group,dc=example,dc=com"
        roleName="cn"
        roleSearch="(|(gidNumber={2})(memberUid={1}))"
    />

An example webapp that can be used for testing is attached.

firewall test code (assuming an otherwise blank firewall with policy ACCEPT):

    t() { iptables -D INPUT 1 ; iptables -I INPUT 1 -p tcp -s 10.3.0.21 -j "$1" ; iptables -nvL; }

    # run this on machines that should work
    t LOG

    # run this on machines that should fail
    t REJECT

the test:

    # prerequisite for the test is a server named "auth1" and another named
    # "auth2", both listening on port 389. In our case it's slapd, and they
    # have start_tls enabled but not required.

    # on both LDAP servers:
    t LOG

    # then start tomcat, then

    # on the first LDAP server:
    t REJECT

    # on a test machine (the user and password here shouldn't matter... we
    # aren't testing authentication, only the time taken)
    time curl --user someuser:somepassword http://exampledomain:8080/dummy-service/test/ldap

    (ignore the response, but look at the time taken)

result:
Almost always takes 15m32.4s plus up to 4s or so, but usually within a few ms. When it "works well" (described in detail above), the first run takes about 4-7s, and after that it takes around 0.02s to 0.1s.

expected:
It should always be quick, within some small multiple of the connection timeout, preferably 1x. So in this case, it should take about 1s extra, or at most a few seconds: about 1.1s on a warmed-up Tomcat.

Side comment: other LDAP clients support an arbitrary number of URLs, or one line with all the URLs together... I find it limiting to have only 2 that you can set here. We have 3 LDAP servers.
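
For comparison, here is a sketch of how a plain JNDI client outside Tomcat might express the timeout and the multi-URL setup discussed above. The class and method names are made up for illustration. The property keys are those of the Sun/Oracle JNDI LDAP provider, which accepts a space-separated list of URLs as the provider URL and has a read timeout separate from the connect timeout. Whether a read timeout would shorten the 15m32s delay reported here is an assumption and has not been tested.

```java
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

public class LdapEnvSketch {

    /**
     * Builds a JNDI environment for the Sun/Oracle LDAP provider.
     * Timeouts are in milliseconds and must be passed as strings.
     */
    public static Hashtable<String, String> ldapEnvironment(
            List<String> urls, int connectTimeoutMs, int readTimeoutMs) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // Space-separated list: the provider tries each URL in turn.
        env.put(Context.PROVIDER_URL, String.join(" ", urls));
        // Bounds how long establishing a new connection may take.
        env.put("com.sun.jndi.ldap.connect.timeout", String.valueOf(connectTimeoutMs));
        // Bounds how long to wait for a response on an existing connection.
        env.put("com.sun.jndi.ldap.read.timeout", String.valueOf(readTimeoutMs));
        return env;
    }
}
```

Passing the result to new javax.naming.directory.InitialDirContext(env) would open the connection; that step is left out so the sketch runs without an LDAP server.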