Re: Cluster freeze with SSL enabled and JDK 11
Perfect, thank you for the update!

On Wed, Feb 13, 2019 at 5:58 AM Ilya Kasnacheev wrote:

> [...]
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

For TLSv1.2 on Windows the fix is ready and tests are running for it. Hope that it will be integrated soon.

Regards,
--
Ilya Kasnacheev

On Tue, Feb 12, 2019 at 20:46, Loredana Radulescu Ivanoff wrote:

> [...]
Re: Cluster freeze with SSL enabled and JDK 11
Thank you very much for the info, it was very helpful.

I assume it worked on Linux because I specifically set TLS v1.2 as a JVM argument, by specifying -Djdk.tls.server.protocols="TLSv1.2" -Djdk.tls.client.protocols="TLSv1.2"

Would you be able to provide a (very) loose estimate for the fix? Is it likely to go into 2.8?

Thank you again!

On Tue, Feb 12, 2019 at 7:10 AM Ilya Kasnacheev wrote:

> [...]
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

It seems that you have problems due to not just one but two issues:

1) Java 11 has TLSv1.3 by default and Ignite does not support that -
https://issues.apache.org/jira/browse/IGNITE-11298
Why it worked for you on CentOS is a mystery. For some reason Ubuntu ships Java 10 in its openjdk-11-jdk package, and with that it worked. When I manually installed a proper Java 11, it would not work on Linux, just the same as on Windows. Falling back to TLSv1.2 could help, but:

2) On Windows, SSL fails to work on Java 11 due to a mistake in Ignite's NIO code. I have also created a ticket and am currently devising a patch:
https://issues.apache.org/jira/browse/IGNITE-11299
More details are in the JIRA.

I'm afraid your options are limited on Windows - use an older Java or move to Linux.

Regards,
--
Ilya Kasnacheev

On Fri, Feb 8, 2019 at 02:31, Loredana Radulescu Ivanoff wrote:

> [...]
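The TLSv1.2 fallback that the thread settles on can be observed at the plain-JDK level, independent of Ignite. The sketch below is added for illustration (class and method names are my own, not from the thread): it prints the protocols a default SSLEngine enables, which on Java 11 includes TLSv1.3, and then restricts the engine to TLSv1.2, which is what -Djdk.tls.client.protocols="TLSv1.2" achieves globally.

```java
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class Tls12Check {
    // Restrict a fresh engine to TLSv1.2 only, mirroring what the
    // -Djdk.tls.client.protocols="TLSv1.2" system property does globally.
    static String[] restrictToTls12() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setEnabledProtocols(new String[] {"TLSv1.2"});
            return engine.getEnabledProtocols();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        SSLEngine def = SSLContext.getDefault().createSSLEngine();
        // On Java 11+ the default list should include TLSv1.3,
        // which Ignite 2.7 cannot handle.
        System.out.println("Default:      " + Arrays.toString(def.getEnabledProtocols()));
        System.out.println("TLSv1.2 only: " + Arrays.toString(restrictToTls12()));
    }
}
```

Running this on a Java 11 JDK should show TLSv1.3 in the default list and only TLSv1.2 after the restriction.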
Re: Cluster freeze with SSL enabled and JDK 11
Hello,

I would like to restart this topic because I can get a repro on Windows 10 with Java 11 and SSL enabled by starting two nodes using just the 2.7 Ignite distribution. I'm starting the Ignite nodes via ignite.bat, and I've only added a few extra JVM options to allow Ignite to start with Java 11, as follows:

--add-exports=java.base/jdk.internal.misc=ALL-UNNAMED --add-exports=java.base/sun.nio.ch=ALL-UNNAMED -Djdk.tls.server.protocols="TLSv1.2" -Djdk.tls.client.protocols="TLSv1.2" -Djdk.tls.acknowledgeCloseNotify=true -DIGNITE_QUIET=false -DIGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT=6

I'm attaching the logs from work/log and the configuration I've used. Could you please take a look and let me know if you see something wrong in the configuration, or a possible explanation?

What is also interesting is that I used the same setup on two CentOS machines, with the same type of configuration, and the nodes do connect (with SSL and Java 11) without any errors. Could there be a platform issue here?

Additionally, I confirmed that the nodes are able to connect as expected on both Windows and CentOS when SSL is disabled (I used the same configuration, but with the sslContextFactory bean commented out).

Any help on the issue would be greatly appreciated. Thank you!
On Thu, Oct 18, 2018 at 2:56 PM Loredana Radulescu Ivanoff wrote:

> Hello,
>
> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:
>
> - The second node that I bring up joins, and then shortly after freezes and prints this message every minute:
>
>   "WARN ... [Initialization] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"
>
> - Once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:
>
>   WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
>   WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
>   WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
>
> Other observations:
>
> - I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
> - The issue goes away if I disable SSL.
> - Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
> - It seems to be happening only with JDK 11, and not with JDK 8.
>
> Do you have any suggestions/known issues about this?
>
> Thank you,
> Loredana
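For reference, the sslContextFactory bean mentioned above can also be set up programmatically. The following is only a sketch of what such a configuration might look like, not the poster's actual configuration: it assumes ignite-core 2.7 on the classpath, and the keystore paths and passwords are placeholders. Pinning the factory's protocol to TLSv1.2 matches the intent of the JVM flags quoted in the message.

```java
// Sketch only: keystore paths and passwords below are placeholders,
// not values from this thread.
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.ssl.SslContextFactory;

public class SslNodeStart {
    public static void main(String[] args) {
        SslContextFactory ssl = new SslContextFactory();
        ssl.setKeyStoreFilePath("config/ssl/node.jks");
        ssl.setKeyStorePassword("changeit".toCharArray());
        ssl.setTrustStoreFilePath("config/ssl/trust.jks");
        ssl.setTrustStorePassword("changeit".toCharArray());
        // Pin the handshake to TLSv1.2 so Java 11 nodes do not negotiate TLSv1.3.
        ssl.setProtocol("TLSv1.2");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setSslContextFactory(ssl);

        Ignition.start(cfg);
    }
}
```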
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

I have tried running the SSL tests again and they seem to pass (only one test fails, for an unrelated reason).

Can you try running those two nodes in stand-alone processes and see if the problem persists? I can see that you have an SSL-enabled Tomcat running in the same JVM, which I imagine could interfere with Ignite's SSL. Note that you will need to generate some load (such as REST cache operations) to see whether communication indeed works (or doesn't).

Regards,
--
Ilya Kasnacheev

On Wed, Nov 28, 2018 at 16:01, Loredana Radulescu Ivanoff wrote:

> Hello again,
>
> I haven't been able to solve this issue on my own, so I'm hoping you'd be able to take another look.
>
> To recap: only with Java 11 and TLS enabled, the second node I bring into the cluster never starts up, and remains stuck at "Still waiting for initial partition map exchange". The first node keeps logging "Unable to await partitions release latch within timeout". To me, this looks like an Ignite issue, and no matter what causes the situation (arguably, in this case, an SSL error), there should be a more elegant exit out of it, i.e. the second node should give up after a while if there isn't a better way to retry and achieve successful communication. The two nodes are able to communicate, and increasing various network/failure detection timeouts does not help.
>
> Previously it was mentioned that the Ignite unit test did not show a repro. I suggest running a test that uses two different machines, because when I run the nodes on the same machine, I do not get a repro either.
>
> Attaching here logging from the two nodes, including SSL messages.
>
> Is Ignite support for Java 11 going to be available before Oracle ends free commercial support for Java 1.8 in Jan 2019?
>
> Thank you.
>
> [...]
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

I have tried to run the test with protocol "TLSv1.2" and didn't see any difference.

Regards,
--
Ilya Kasnacheev

On Wed, Oct 24, 2018 at 20:23, Loredana Radulescu Ivanoff wrote:

> [...]
Re: Cluster freeze with SSL enabled and JDK 11
Hello again,

I am working on getting the full SSL logs over to you, but I have one more question in between: TLS v1.3 is enabled by default in JDK 11, and my app is using TLS v1.2 specifically. There's a known issue, recently addressed in the JDK, related to the TLS v1.3 half-close policy; details here: https://bugs.java.com/view_bug.do?bug_id=8207009

Would you be able to confirm whether your SSL test runs successfully when the connecting client/server use TLS v1.2 specifically?

FYI, I have tried specifically disabling TLS v1.3 using both the "jdk.tls.client.protocols" and "jdk.tls.server.protocols" system properties, and also set "jdk.tls.acknowledgeCloseNotify" to true on both sides as indicated here - https://bugs.java.com/view_bug.do?bug_id=8208526

Based on my explorations so far, I think this may be a JDK issue (specifically in the JSSE provider) that has not been addressed yet. Either way, do you think there is anything that could be done in Ignite to explicitly close the connection on both sides in a scenario like this?
What I can safely share on the SSL logs so far is this (both nodes get stuck: node 1 fails to close the SSL connection, node 2 waits for the partition exchange).

Node 1:

"2018-10-23 14:18:40.981 PDT|SSLEngineImpl.java:715|Closing inbound of SSLEngine
javax.net.ssl|ERROR|3F|grid-nio-worker-tcp-comm-1-#26%%|2018-10-23 14:18:40.981 PDT|TransportContext.java:313|Fatal (INTERNAL_ERROR): closing inbound before receiving peer's close_notify (
"throwable" : {
javax.net.ssl.SSLException: closing inbound before receiving peer's close_notify
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:129)
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:117)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:308)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:264)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:255)
    at java.base/sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:724)
    at org.apache.ignite.internal.util.nio.ssl.GridNioSslHandler.shutdown(GridNioSslHandler.java:185)
    at org.apache.ignite.internal.util.nio.ssl.GridNioSslFilter.onSessionClosed(GridNioSslFilter.java:223)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClosed(GridNioFilterAdapter.java:95)
    at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionClosed(GridNioServer.java:3447)
    at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionClosed(GridNioFilterChain.java:149)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2689)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2002)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.base/java.lang.Thread.run(Thread.java:834)}
)
2018-10-23 14:18:40.981 PDT|SSLSessionImpl.java:753|Invalidated session: Session(1540329505125|TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384)
DEBUG 14:18:40,981-0700 [] communication.tcp.TcpCommunicationSpi: Unable to correctly close inbound data stream (will ignore) "

Server 2:

"javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 PDT|SSLSocketOutputRecord.java:310|WRITE: TLS12 application_data, length = 1
javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 PDT|SSLCipher.java:1727|Plaintext before ENCRYPTION (
: 01 .
)
javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 PDT|SSLSocketOutputRecord.java:324|Raw write (
: 17 03 03 00 19 00 00 00 00 00 00 00 A1 45 E5 C8 .E..
0010: 27 3D EE E2 89 4A 0F DD B6 9A 2A A8 5F 6B '=...J*._k
)
WARN 14:18:10,585-0700 [*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange "

On Mon, Oct 22, 2018 at 12:31 AM Ilya Kasnacheev wrote:

> Hello!
>
> I would suggest regular (or DEBUG) Ignite logs + SSL debug logs.
>
> https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/ReadDebug.html
>
> Regards,
> --
> Ilya Kasnacheev
>
> Fri, Oct 19, 2018 at 20:58, Loredana Radulescu Ivanoff <lradu...@tibco.com>:
>
>> Definitely - do you want debug logs, and if yes, should I enable them on all Ignite packages, or just for certain ones? Any other diagnostic flags that I should set?
>>
>> On Fri, Oct 19, 2018 at 7:48 AM Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
>>
>>> Hello!
>>>
>>> Apache Ignite does not claim support JDK 11 at this moment. However, I was able to run SSL test (TcpDiscoverySslTrustedUntrustedTest) and it turned out mostly fine.
>>>
>>> More info is needed from your side, such as full instances logs.
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

I would suggest regular (or DEBUG) Ignite logs + SSL debug logs.

https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/ReadDebug.html

Regards,
--
Ilya Kasnacheev

Fri, Oct 19, 2018 at 20:58, Loredana Radulescu Ivanoff:

> Definitely - do you want debug logs, and if yes, should I enable them on all Ignite packages, or just for certain ones? Any other diagnostic flags that I should set?
>
> On Fri, Oct 19, 2018 at 7:48 AM Ilya Kasnacheev wrote:
>
>> Hello!
>>
>> Apache Ignite does not claim support JDK 11 at this moment. However, I was able to run SSL test (TcpDiscoverySslTrustedUntrustedTest) and it turned out mostly fine.
>>
>> More info is needed from your side, such as full instances logs.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>> Fri, Oct 19, 2018 at 0:56, Loredana Radulescu Ivanoff <lradu...@tibco.com>:
>>
>>> Hello,
>>>
>>> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:
>>>
>>> - the second node that I bring up joins, and then shortly after freezes and prints this message every minute:
>>>
>>> "WARN ...[*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"
>>>
>>> - once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:
>>>
>>> WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
>>> WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
>>> WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
>>>
>>> Other observations:
>>>
>>> I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
>>>
>>> The issue goes away if I disable SSL.
>>>
>>> Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
>>>
>>> It seems to be happening only with JDK 11, and not with JDK 8.
>>>
>>> Do you have any suggestions/known issues about this?
>>>
>>> Thank you,
>>>
>>> Loredana
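[Editor's note: a sketch of how the suggested JSSE debug output might be enabled for a standalone Windows node launched via the stock ignite.bat, which picks up extra flags from the JVM_OPTS environment variable; the config file name and log path below are placeholders, not the poster's actual files.]

```bat
:: Hypothetical invocation; adjust paths to your installation.
:: The javax.net.debug value syntax is described in the ReadDebug guide linked above.
set "JVM_OPTS=-Djavax.net.debug=ssl:handshake -DIGNITE_QUIET=false"
bin\ignite.bat config\ssl-config.xml > node1-ssl-debug.log 2>&1
```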
Re: Cluster freeze with SSL enabled and JDK 11
Definitely - do you want debug logs, and if yes, should I enable them on all Ignite packages, or just for certain ones? Any other diagnostic flags that I should set?

On Fri, Oct 19, 2018 at 7:48 AM Ilya Kasnacheev wrote:

> Hello!
>
> Apache Ignite does not claim support JDK 11 at this moment. However, I was able to run SSL test (TcpDiscoverySslTrustedUntrustedTest) and it turned out mostly fine.
>
> More info is needed from your side, such as full instances logs.
>
> Regards,
> --
> Ilya Kasnacheev
>
> Fri, Oct 19, 2018 at 0:56, Loredana Radulescu Ivanoff:
>
>> Hello,
>>
>> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:
>>
>> - the second node that I bring up joins, and then shortly after freezes and prints this message every minute:
>>
>> "WARN ...[*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"
>>
>> - once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:
>>
>> WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
>> WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
>> WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
>>
>> Other observations:
>>
>> I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
>>
>> The issue goes away if I disable SSL.
>>
>> Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
>>
>> It seems to be happening only with JDK 11, and not with JDK 8.
>>
>> Do you have any suggestions/known issues about this?
>>
>> Thank you,
>>
>> Loredana
Re: Cluster freeze with SSL enabled and JDK 11
Hello!

Apache Ignite does not claim to support JDK 11 at the moment. However, I was able to run an SSL test (TcpDiscoverySslTrustedUntrustedTest) and it turned out mostly fine.

More info is needed from your side, such as full instance logs.

Regards,
--
Ilya Kasnacheev

Fri, Oct 19, 2018 at 0:56, Loredana Radulescu Ivanoff:

> Hello,
>
> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:
>
> - the second node that I bring up joins, and then shortly after freezes and prints this message every minute:
>
> "WARN ...[*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"
>
> - once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:
>
> WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
> WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
> WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
>
> Other observations:
>
> I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
>
> The issue goes away if I disable SSL.
>
> Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
>
> It seems to be happening only with JDK 11, and not with JDK 8.
>
> Do you have any suggestions/known issues about this?
>
> Thank you,
>
> Loredana
Re: Cluster freeze with SSL enabled and JDK 11
Hello,

I think that the only supported version is Java 8 right now, and Java 11 support should come with the 2.7 or 2.8 release.

https://issues.apache.org/jira/browse/IGNITE-8441

Cheers,
Gianluca

On Thu, Oct 18, 2018 at 11:56 PM Loredana Radulescu Ivanoff <lradu...@tibco.com> wrote:

> Hello,
>
> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:
>
> - the second node that I bring up joins, and then shortly after freezes and prints this message every minute:
>
> "WARN ...[*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"
>
> - once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:
>
> WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
> WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
> WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
>
> Other observations:
>
> I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
>
> The issue goes away if I disable SSL.
>
> Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
>
> It seems to be happening only with JDK 11, and not with JDK 8.
>
> Do you have any suggestions/known issues about this?
>
> Thank you,
>
> Loredana
Cluster freeze with SSL enabled and JDK 11
Hello,

I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and SSL enabled:

- the second node that I bring up joins, and then shortly after freezes and prints this message every minute:

"WARN ...[*Initialization*] processors.cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange"

- once the second node joins, the first node starts experiencing very frequent 100% CPU spikes; these are the messages I see:

WARN 2018-10-18T13:50:52,728-0700 [] communication.tcp.TcpCommunicationSpi: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.100.36.82:51620, writeTimeout=15000]
WARN 2018-10-18T13:50:52,737-0700 [] communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect SSL engine status after closeOutbound call [status=OK, handshakeStatus=NEED_WRAP,
WARN 2018-10-18T13:51:01,441-0700 [] dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]

Other observations:

- I can reproduce this every time I start the nodes, and it doesn't matter which node comes up first.
- The issue goes away if I disable SSL.
- Increasing the socketWriteTimeout, networkTimeout or the failureDetectionTimeout does not help.
- It seems to be happening only with JDK 11, and not with JDK 8.

Do you have any suggestions/known issues about this?

Thank you,

Loredana
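[Editor's note: for readers reconstructing the setup, the settings discussed in this thread (sslContextFactory, socketWriteTimeout, networkTimeout, failureDetectionTimeout, and pinning TLSv1.2) map onto Ignite's Spring XML configuration roughly as below. This is a hedged sketch: keystore paths, passwords and timeout values are placeholders, not the configuration attached to the original post.]

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Pin the SSL protocol; requesting a TLSv1.2 context sidesteps the
         TLSv1.3 default on JDK 11 (see IGNITE-11298 above). -->
    <property name="sslContextFactory">
        <bean class="org.apache.ignite.ssl.SslContextFactory">
            <property name="protocol" value="TLSv1.2"/>
            <property name="keyStoreFilePath" value="keystore/node.jks"/>
            <property name="keyStorePassword" value="changeit"/>
            <property name="trustStoreFilePath" value="keystore/trust.jks"/>
            <property name="trustStorePassword" value="changeit"/>
        </bean>
    </property>

    <!-- Timeouts the poster tried increasing (values here are illustrative
         and did not resolve the underlying JDK 11 issue). -->
    <property name="failureDetectionTimeout" value="30000"/>
    <property name="networkTimeout" value="10000"/>
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <property name="socketWriteTimeout" value="30000"/>
        </bean>
    </property>
</bean>
```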