Michael,

On 7/8/19 15:36, Osipov, Michael wrote:
> Christopher,
> 
> On 2019-07-08 at 19:55, Christopher Schultz wrote:
>> Michael,
>> 
>> On 7/8/19 03:58, Osipov, Michael wrote:
>>> Christopher,
>>> 
>>> On 2019-07-05 at 19:07, Christopher Schultz wrote:
>>>> Michael,
>>>> 
>>>> On 7/5/19 11:00, Osipov, Michael wrote:
>>>>> Hi Christopher,
>>>>> 
>>>>> On 2019-07-02 at 17:49, [ext] Osipov, Michael wrote:
>>>>>> 
>>>>>> [...]
>>>>>>> During your ~1min stall, Tomcat is still waiting for
>>>>>>> data, right? When the connection fails, Tomcat drops
>>>>>>> its error message at the same time, right? Can you post
>>>>>>> a stack trace of what the Tomcat thread is doing at
>>>>>>> that time? I assume it's blocked on a read of some
>>>>>>> kind.
>>>>>> 
>>>>>> I need to check this with jstack. I'll get back to you as
>>>>>> soon as possible.
>>>>> 
>>>>> So I checked this and was able to get the dump at the very
>>>>> moment the request stalled. To my disappointment, the
>>>>> offending thread was neither locked nor waiting in read() on
>>>>> the native socket.
>>>>> 
>>>>> I have noticed this:
>>>>>> "http-apr-127.0.1.2-8081-exec-3" #33 daemon prio=5 os_prio=15 tid=0x0000000a68036800 nid=0x188be runnable [0x00007fffdd1cc000]
>>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>>>>     - locked <0x0000000965edc140> (a java.net.SocksSocketImpl)
>>>>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>>>     at java.net.Socket.connect(Socket.java:589)
>>>>>>     at java.net.Socket.connect(Socket.java:538)
>>>>>>     at java.net.Socket.<init>(Socket.java:434)
>>>>>>     at java.net.Socket.<init>(Socket.java:211)
>>>>>>     at com.sun.jndi.ldap.Connection.createSocket(Connection.java:375)
>>>>>>     at com.sun.jndi.ldap.Connection.<init>(Connection.java:215)
>>>>>>     at com.sun.jndi.ldap.LdapClient.<init>(LdapClient.java:137)
>>>>>>     at com.sun.jndi.ldap.LdapClient.getInstance(LdapClient.java:1609)
>>>>>>     at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2749)
>>>>>>     at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:319)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:199)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:217)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:195)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:217)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:156)
>>>>>>     at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:86)
>>>>>>     at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
>>>>>>     at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:313)
>>>>>>     at javax.naming.InitialContext.init(InitialContext.java:244)
>>>>>>     at javax.naming.InitialContext.<init>(InitialContext.java:216)
>>>>>>     at javax.naming.directory.InitialDirContext.<init>(InitialDirContext.java:101)
>>>>>>     at net.sf.michaelo.dirctxsrc.DirContextSource$GSSInitialDirContext.<init>(DirContextSource.java:115)
>>>>>>     at net.sf.michaelo.dirctxsrc.DirContextSource$1.run(DirContextSource.java:606)
>>>>>>     at net.sf.michaelo.dirctxsrc.DirContextSource$1.run(DirContextSource.java:583)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>     at net.sf.michaelo.dirctxsrc.DirContextSource.getGssApiDirContext(DirContextSource.java:583)
>>>>>>     at net.sf.michaelo.dirctxsrc.DirContextSource.getDirContext(DirContextSource.java:692)
>>>>>>     at net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.open(ActiveDirectoryRealm.java:321)
>>>>>>     at net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.getPrincipal(ActiveDirectoryRealm.java:268)
>>>>>>     at net.sf.michaelo.tomcat.realm.ActiveDirectoryRealm.authenticate(ActiveDirectoryRealm.java:255)
>>>>>>     at net.sf.michaelo.tomcat.authenticator.SpnegoAuthenticator.doAuthenticate(SpnegoAuthenticator.java:166)
>>>>>>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:575)
>>>>>>     at org.apache.catalina.valves.rewrite.RewriteValve.invoke(RewriteValve.java:556)
>>>>> 
>>>>> We query the Active Directory via LDAP with the user's
>>>>> Kerberos principal. As you can see, the thread is waiting
>>>>> for a socket to connect. No DCs are hardcoded; they are all
>>>>> retrieved via DNS SRV lookups for our AD site. The point
>>>>> here is that we have major trouble with two of four DCs at
>>>>> our site not properly responding to services like DNS,
>>>>> Kerberos, and LDAP (completely out of my department's
>>>>> control). I made a quick standalone reproducer to try
>>>>> those faulty DCs on ports 389/3268 and got my
>>>>> confirmation: they block the thread for more than a
>>>>> minute (OS connect timeout).
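
A probe of this kind can be sketched with a plain socket and an explicit connect timeout. This is not the reproducer used in the thread, only a minimal illustration of the idea; the class name DcConnectProbe, the method canConnect, and the 1000 ms default are assumptions made up for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch, not the reproducer mentioned in this thread.
class DcConnectProbe {

    // Returns true if a TCP connection to host:port is established within
    // timeoutMs; returns false on timeout, refusal, or any other I/O failure.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // Unlike new Socket(host, port), this overload bounds the wait.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // SocketTimeoutException and ConnectException are IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: DcConnectProbe <host> [timeoutMs]");
            return;
        }
        int timeoutMs = args.length > 1 ? Integer.parseInt(args[1]) : 1000;
        // The LDAP and global catalog ports named in the thread.
        for (int port : new int[] {389, 3268}) {
            long start = System.nanoTime();
            boolean ok = canConnect(args[0], port, timeoutMs);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%s:%d -> %s after %d ms%n",
                    args[0], port, ok ? "connected" : "failed", elapsedMs);
        }
    }
}
```

Running it against each DC in turn shows which ones exhaust the timeout instead of answering or refusing quickly.
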
>>>>> 
>>>>> Our countermeasures were to reduce the default connect
>>>>> timeout for InitialDirContext down to 1000 ms and to query
>>>>> another local AD site which is not serving our subnet.
>>>>> 
>>>>> So, thank you very much for giving me the right pointer to
>>>>> start with!
>>>> 
>>>> Strange that everything seems to work well when you connect
>>>> directly to Tomcat. Can you confirm that you *never* have any
>>>> issues connecting directly to Tomcat? Or did you just get
>>>> lucky a few times?
>>> 
>>> The issue did not show up via Tomcat directly because Tomcat
>>> does not drop the request on a timeout; the client simply
>>> waits for it. Our previous services never used HTTPd as a
>>> reverse proxy. I started to use it to gain some experience and
>>> prepare for potential balancing requirements.
>>> 
>>>>> One question arises though: How do I properly size the 
>>>>> ProxyTimeout parameter? The longest possible request?
>>>> 
>>>> I think that's really up to you. If it's too low, you'll end
>>>> up with probably many hung LDAP queries with no client
>>>> waiting on them, right? If it's too high, you'll make users
>>>> wait and they might just stop and try again, which comes to
>>>> the same conclusion.
>>>> 
>>>> What if you add a timeout to your LDAP queries instead?
>>> 
>>> The queries do not hang. The connect does. As soon as the
>>> connection is established, it is pretty fast. I have now set
>>> "com.sun.jndi.ldap.connect.timeout=1000" and verified that it
>>> works. It will quickly fail over to the next available DC from
>>> the SRV RRs.
>> 
>> That's good that (a) it can be configured and (b) it actually
>> works. Is that LDAP connect timeout set as a system property? Can
>> it be set on a per-connection basis using e.g. a connection URL
>> parameter? I'm mostly asking for my own edification, and possibly
>> to help anyone with a similar problem in the future.
> 
> That's actually straightforward:
> 
>> <GlobalNamingResources>
>>   <Resource name="gc/ad001.siemens.net"
>>     type="net.sf.michaelo.dirctxsrc.DirContextSource"
>>     factory="net.sf.michaelo.dirctxsrc.DirContextSourceFactory"
>>     urls="ldap://ad001.siemens.net:3268" auth="gssapi"
>>     loginEntryName="tomcat-initiate" referral="ignore"
>>     additionalProperties="com.siemens.dymowerk.activedirectory.site=S-DEBLN-01;com.sun.jndi.ldap.connect.timeout=1000" />
>> </GlobalNamingResources>

Oh, interesting -- it's not a part of the URL. That's unfortunate that
it has to be set at the <Resource> level and it can't be set on a
per-server basis. Oh, well. At least it's not a system property
affecting the whole JVM!
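
For anyone finding this later: outside of a <Resource> definition, the same property can be passed in the JNDI environment of a single context, so it applies to contexts created from that environment rather than to the whole JVM. A minimal sketch, assuming plain JNDI without the DirContextSource wrapper; the class name LdapEnvSketch, its helper methods, and the host dc.example.invalid are made up for illustration, and no authentication settings are shown.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical sketch; names and host are placeholders, not from the thread.
class LdapEnvSketch {

    // Property name as used in the <Resource> definition quoted above.
    static final String CONNECT_TIMEOUT = "com.sun.jndi.ldap.connect.timeout";

    // Builds a JNDI environment with a connect timeout in milliseconds.
    // The value has to be passed as a string.
    static Hashtable<String, String> buildEnv(String providerUrl, int connectTimeoutMs) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put(CONNECT_TIMEOUT, Integer.toString(connectTimeoutMs));
        return env;
    }

    // Opens a context with the given environment; this is the call that
    // would otherwise block until the OS connect timeout expires.
    static DirContext open(Hashtable<String, String> env) throws NamingException {
        return new InitialDirContext(env);
    }

    public static void main(String[] args) {
        // Only prints the environment; no connection is attempted here.
        System.out.println(buildEnv("ldap://dc.example.invalid:3268", 1000));
    }
}
```
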

-chris
