[ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424727#comment-16424727
 ] 

Xiao Chen commented on HADOOP-15359:
------------------------------------

bq. JDK-7092821
Yes, I wasn't accurate, updated the description. That jira is the closest I can 
find. But if that's taken care of, (i.e. either the synchronized method isn't 
called any more, or the lock removed), then there is no deadlock in this case 
anyways.

One of its linked jiras also mentioned " In order to alleviate this problem 
applications should cache the result of the Cipher.getInstance() call per 
thread and reinitialise (Cipher.init(...)) the cached copy instead of calling 
Cipher.getInstance() again. " but the caller here would be krb5...

> IPC client hang in kerberized cluster due to JDK deadlock
> ---------------------------------------------------------
>
>                 Key: HADOOP-15359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15359
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.6.0, 2.8.0, 3.0.0
>            Reporter: Xiao Chen
>            Priority: Major
>         Attachments: 1.jstack, 2.jstack
>
>
> In a recent internal testing, we have found a DFS client hang. Further 
> inspecting jstack shows the following:
> {noformat}
> "IPC Client (552936351) connection toHOSTNAME:8020 from PRINCIPAL" #7468 
> daemon prio=5 os_prio=0 tid=0x00007f6bb306c000 nid=0x1c76e waiting for 
> monitor entry [0x00007f6bc2bd6000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.security.Provider.getService(Provider.java:1035)
>         - waiting to lock <0x0000000080277040> (a sun.security.provider.Sun)
>         at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:444)
>         at 
> sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
>         at 
> sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
>         at javax.crypto.Cipher.getInstance(Cipher.java:513)
>         at 
> sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:202)
>         at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
>         at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
>         at 
> sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
>         at 
> sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
>         at 
> sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
>         at 
> sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
>         at 
> sun.security.jgss.krb5.MessageToken.verifySignAndSeqNumber(MessageToken.java:374)
>         at 
> sun.security.jgss.krb5.WrapToken.getDataFromBuffer(WrapToken.java:284)
>         at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:209)
>         at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:182)
>         at sun.security.jgss.krb5.Krb5Context.unwrap(Krb5Context.java:1053)
>         at sun.security.jgss.GSSContextImpl.unwrap(GSSContextImpl.java:403)
>         at 
> com.sun.security.sasl.gsskerb.GssKrb5Base.unwrap(GssKrb5Base.java:77)
>         at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.readNextRpcPacket(SaslRpcClient.java:617)
>         at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.read(SaslRpcClient.java:583)
>         - locked <0x0000000083444878> (a java.nio.HeapByteBuffer)
>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>         at 
> org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:553)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>         - locked <0x00000000834448c0> (a java.io.BufferedInputStream)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1113)
>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1006)
> {noformat}
> and at the end of jstack:
> {noformat}
> Found one Java-level deadlock:
> =============================
> "IPC Parameter Sending Thread #29":
>   waiting to lock monitor 0x0000000017ff49f8 (object 0x0000000080277040, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x0000000050607000
> Java stack information for the threads listed above:
> ===================================================
> "IPC Parameter Sending Thread #29":
>         at java.security.Provider.getService(Provider.java:1035)
>         - waiting to lock <0x0000000080277040> (a sun.security.provider.Sun)
>         at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:437)
>         at 
> sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
>         at 
> sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
>         at javax.crypto.SecretKeyFactory.nextSpi(SecretKeyFactory.java:293)
>         - locked <0x00000000834386b8> (a java.lang.Object)
>         at javax.crypto.SecretKeyFactory.<init>(SecretKeyFactory.java:121)
>         at 
> javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:160)
>         at 
> sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:187)
>         at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
>         at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
>         at 
> sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
>         at 
> sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
>         at 
> sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
>         at 
> sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
>         at 
> sun.security.jgss.krb5.MessageToken.genSignAndSeqNumber(MessageToken.java:315)
>         at sun.security.jgss.krb5.WrapToken.<init>(WrapToken.java:422)
>         at sun.security.jgss.krb5.Krb5Context.wrap(Krb5Context.java:922)
>         at sun.security.jgss.GSSContextImpl.wrap(GSSContextImpl.java:385)
>         at 
> com.sun.security.sasl.gsskerb.GssKrb5Base.wrap(GssKrb5Base.java:103)
>         at 
> org.apache.hadoop.security.SaslRpcClient$WrappedOutputStream.write(SaslRpcClient.java:636)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>         - locked <0x00000000834389a8> (a java.io.BufferedOutputStream)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:123)
>         at org.apache.hadoop.ipc.Client$Connection$3.run(Client.java:1072)
>         - locked <0x00000000834389c0> (a java.io.DataOutputStream)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> After some research, the closest I found is 
> https://bugs.openjdk.java.net/browse/JDK-7092821 . 
> Filing this jira for discussions.
> JDK version used was 1.8.0_144.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to