Hello! I have tried to run the test with protocol "TLSv1.2", didn't see any difference.
Regards, -- Ilya Kasnacheev ср, 24 окт. 2018 г. в 20:23, Loredana Radulescu Ivanoff <lradu...@tibco.com >: > Hello again, > > I am working on getting the full SSL logs over to you, but I have one more > question in between: TLS v1.3 is enabled by default in JDK 11, and my app > is using TLS v1.2 specifically. There's a known issue that's recently > addressed by the JDK related to TLS v1.3 half close policy, details here: > https://bugs.java.com/view_bug.do?bug_id=8207009 > > Would you be able to confirm whether your SSL test runs successfully when > the connecting client/server use TLS v.12 specifically ? > > FYI, I have tried specifically disabling TLS v1.3 using both the > "jdk.tls.client.protocols" and "jdk.tls.server.protocols" system > properties, and also set "jdk.tls.acknowledgeCloseNotify" to true on both > sides as indicated here - https://bugs.java.com/view_bug.do?bug_id=8208526 > > Based on my explorations so far, I think this may be a JDK issue > (specifically in the JSSE provider) that has not been addressed yet. Either > way, do you think three is anything that could be done in Ignite to > explicitly close the connection on both sides in a scenario like this ? > > What I can safely share on the SSL logs so far is this (both nodes get > stuck, node 1 in failing to close the SSL connection, node 2 in waiting for > partition exchange) > > Node 1: > > "2018-10-23 14:18:40.981 PDT|SSLEngineImpl.java:715|Closing inbound of > SSLEngine > javax.net.ssl|ERROR|3F|grid-nio-worker-tcp-comm-1-#26%%|2018-10-23 > 14:18:40.981 PDT|TransportContext.java:313|Fatal (INTERNAL_ERROR): closing > inbound before receiving peer's close_notify ( > "throwable" : { > javax.net.ssl.SSLException: closing inbound before receiving peer's > close_notify > at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:129) > at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:117) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:308) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:264) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:255) > at > java.base/sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:724) > at > org.apache.ignite.internal.util.nio.ssl.GridNioSslHandler.shutdown(GridNioSslHandler.java:185) > at > org.apache.ignite.internal.util.nio.ssl.GridNioSslFilter.onSessionClosed(GridNioSslFilter.java:223) > at > org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClosed(GridNioFilterAdapter.java:95) > at > org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionClosed(GridNioServer.java:3447) > at > org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionClosed(GridNioFilterChain.java:149) > at > org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2689) > at > org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2002) > at > org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764) > at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) > at java.base/java.lang.Thread.run(Thread.java:834)} > > ) > 2018-10-23 14:18:40.981 PDT|SSLSessionImpl.java:753|Invalidated session: > Session(1540329505125|TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) > DEBUG 14:18:40,981-0700 [] communication.tcp.TcpCommunicationSpi: Unable > to correctly close inbound data stream (will ignore) " > > > Server 2 > > "javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 > PDT|SSLSocketOutputRecord.java:310|WRITE: TLS12 application_data, length = 1 > javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 > PDT|SSLCipher.java:1727|Plaintext before ENCRYPTION ( > 0000: 01 . > ) > javax.net.ssl|DEBUG|46|tcp-disco-sock-reader-#4%%|2018-10-23 14:18:09.194 > PDT|SSLSocketOutputRecord.java:324|Raw write ( > 0000: 17 03 03 00 19 00 00 00 00 00 00 00 A1 45 E5 C8 .............E.. > 0010: 27 3D EE E2 89 4A 0F DD B6 9A 2A A8 5F 6B '=...J....*._k > ) > WARN 14:18:10,585-0700 [*Initialization*] > processors.cache.GridCachePartitionExchangeManager: Still waiting for > initial partition map exchange " > > On Mon, Oct 22, 2018 at 12:31 AM Ilya Kasnacheev < > ilya.kasnach...@gmail.com> wrote: > >> Hello! >> >> I would suggest regular (or DEBUG) Ignite logs + SSL debug logs. >> >> >> https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/ReadDebug.html >> >> Regards, >> -- >> Ilya Kasnacheev >> >> >> пт, 19 окт. 2018 г. в 20:58, Loredana Radulescu Ivanoff < >> lradu...@tibco.com>: >> >>> Definitely - do you want debug logs, and if yes, should I enable them on >>> all Ignite packages, or just for certain ones? Any other diagnostic flags >>> that I should set? >>> >>> On Fri, Oct 19, 2018 at 7:48 AM Ilya Kasnacheev < >>> ilya.kasnach...@gmail.com> wrote: >>> >>>> Hello! >>>> >>>> Apache Ignite does not claim support JDK 11 at this moment. However, I >>>> was able to run SSL test (TcpDiscoverySslTrustedUntrustedTest) and it >>>> turned out mostly fine. >>>> >>>> More info is needed from your side, such as full instances logs. >>>> >>>> Regards, >>>> -- >>>> Ilya Kasnacheev >>>> >>>> >>>> пт, 19 окт. 2018 г. в 0:56, Loredana Radulescu Ivanoff < >>>> lradu...@tibco.com>: >>>> >>>>> Hello, >>>>> >>>>> I can consistently reproduce this issue with Ignite 2.6.0, JDK 11 and >>>>> SSL enabled: >>>>> >>>>> >>>>> - the second node that I bring up joins, and then shortly after >>>>> freezes and prints this message every minute: >>>>> >>>>> "WARN ...[*Initialization*] >>>>> processors.cache.GridCachePartitionExchangeManager: Still waiting for >>>>> initial partition map exchange" >>>>> >>>>> >>>>> - once the second node joins, the first node starts experiencing >>>>> very frequent 100% CPU spikes; these are the messages I see: >>>>> >>>>> WARN 2018-10-18T13:50:52,728-0700 [] >>>>> communication.tcp.TcpCommunicationSpi: Communication SPI session write >>>>> timed out (consider increasing 'socketWriteTimeout' configuration >>>>> property) >>>>> [remoteAddr=/10.100.36.82:51620, writeTimeout=15000] >>>>> WARN 2018-10-18T13:50:52,737-0700 [] >>>>> communication.tcp.TcpCommunicationSpi: Failed to shutdown SSL session >>>>> gracefully (will force close) [ex=javax.net.ssl.SSLException: Incorrect >>>>> SSL >>>>> engine status after closeOutbound call [status=OK, >>>>> handshakeStatus=NEED_WRAP, >>>>> WARN 2018-10-18T13:51:01,441-0700 [] >>>>> dht.preloader.GridDhtPartitionsExchangeFuture: Unable to await partitions >>>>> release latch within timeout: ServerLatch [permits=1, >>>>> pendingAcks=[aeba8bb7-c9b8-4d46-be8a-df361eaa8fc5], super=CompletableLatch >>>>> [id=exchange, topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]] >>>>> >>>>> Other observations: >>>>> >>>>> I can reproduce this every time I start the nodes, and it doesn't >>>>> matter which node comes up first. >>>>> >>>>> >>>>> The issue goes away if I disable SSL. >>>>> >>>>> >>>>> Increasing the socketWriteTimeout, networkTimeout or the >>>>> failureDetectionTimeout does not help. >>>>> >>>>> It seems to be happening only with JDK 11, and not with JDK 8. >>>>> >>>>> >>>>> Do you have any suggestions/known issues about this? >>>>> >>>>> Thank you, >>>>> >>>>> Loredana >>>>> >>>>> >>>>> >>>>> >>>>>