[ 
https://issues.apache.org/jira/browse/GEODE-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154754#comment-17154754
 ] 

ASF subversion and git services commented on GEODE-8020:
--------------------------------------------------------

Commit 120f94a3ee1b7934673978ae9c82f1d3e30cb9c8 in geode's branch 
refs/heads/support/1.13 from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=120f94a ]

GEODE-8020: buffer management problems (#5048)

* GEODE-8020: buffer management problems

This fixes some buffer handling in MsgStreamerList and alters
MsgStreamer to avoid creating MsgStreamerList and VersionedMsgStreamers
during normal, non-upgrade, operations.

It also changes NioSslEngine to use synchronization in more places,
notably the close() method, which was possibly allowing multiple threads to
change the state of the engine.

* revert unnecessary change to ClusterCommunicationsDUnitTest

* fixing another null version check

* renamed new BufferPool property

* restore logging of ssl exceptions

(cherry picked from commit 7375c591f25bbba413237aed1f56f8a9f70075df)
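
The commit message above says NioSslEngine now synchronizes in more places,
notably close(), because several threads could otherwise change the engine's
state at once. The following is a minimal sketch of that general idea, not the
code from the commit. GuardedEngine and all of its fields and methods are
hypothetical names invented for illustration, and a plain buffer copy stands in
for real TLS wrapping.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for an SSL engine wrapper; NOT Geode's NioSslEngine.
final class GuardedEngine {
  private ByteBuffer netBuffer = ByteBuffer.allocate(1024); // guarded by "this"
  private boolean closed;                                   // guarded by "this"
  private int cleanupRuns;                                  // guarded by "this"

  // Stages application data in the shared network buffer. Holding the same
  // lock as close() means the buffer cannot be released mid-operation.
  // The caller must finish with the returned buffer before calling wrap again.
  synchronized ByteBuffer wrap(ByteBuffer appData) {
    if (closed) {
      throw new IllegalStateException("engine is closed");
    }
    netBuffer.clear();
    netBuffer.put(appData); // throws BufferOverflowException if appData exceeds capacity
    netBuffer.flip();
    return netBuffer;
  }

  // Idempotent close: only the first caller performs the cleanup, so two
  // threads racing here cannot both release the buffer.
  synchronized void close() {
    if (closed) {
      return;
    }
    closed = true;
    cleanupRuns++;
    netBuffer = null;
  }

  synchronized int cleanupRuns() {
    return cleanupRuns;
  }
}
```

With this shape, a second close() is a no-op and a wrap() after close() fails
fast instead of touching a released buffer.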


> buffer corruption in SSL communications
> ---------------------------------------
>
>                 Key: GEODE-8020
>                 URL: https://issues.apache.org/jira/browse/GEODE-8020
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Update (May 8, 2020): the main problem described here appears to occur only 
> on JDK8 when TLSv1 is used. JDK11 with TLSv1 doesn't exhibit the problem, nor 
> is the problem apparent when TLSv1.2 is used on either JDK. This issue is 
> marked resolved, but the problem still occurs on JDK8 with TLSv1. Customers 
> are advised to use TLSv1.2 or later. Other buffering problems were found 
> during this investigation and a PR was merged to address those.
>
> When running an application with SSL enabled I ran into a hang with a lost 
> message.  The sender had a 15-second ack-wait warning pointing to another 
> server in the cluster.  That server had this in its log file at the time the 
> message would have been processed:
> {noformat}
> [info 2020/04/21 11:22:39.437 PDT <P2P message reader for rs-bschuchardt-1053-hydra-client-1(bridgegemfire4_host1_12599:12599)<ec><v1>:41003 unshared ordered uid=354 dom #2 port=55262> tid=0xad] P2P message reader@2580db5f io exception for rs-bschuchardt-1053-hydra-client-1(bridgegemfire4_host1_12599:12599)<ec><v1>:41003@354(GEODE 1.10.0)
> javax.net.ssl.SSLException: bad record MAC
>       at sun.security.ssl.Alerts.getSSLException(Alerts.java:214)
>       at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
>       at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:986)
>       at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:912)
>       at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:782)
>       at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
>       at org.apache.geode.internal.net.NioSslEngine.unwrap(NioSslEngine.java:275)
>       at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2894)
>       at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1745)
>       at org.apache.geode.internal.tcp.Connection.run(Connection.java:1577)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: javax.crypto.BadPaddingException: bad record MAC
>       at sun.security.ssl.InputRecord.decrypt(InputRecord.java:219)
>       at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:177)
>       at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:979)
>       ... 10 more
> {noformat}
> I bisected to see when this problem was introduced and found it was this 
> commit:
> {noformat}
> commit 418d929e3e03185cd6330c828c9b9ed395a76d4b
> Author: Mario Ivanac <48509724+miva...@users.noreply.github.com>
> Date:   Fri Nov 1 20:28:57 2019 +0100
>     GEODE-6661: Fixed use of Direct and Non-Direct buffers (#4267)
>     - Fixed use of Direct and Non-Direct buffers
> {noformat}
> That commit modified NioSslEngine to use a "direct" byte buffer instead 
> of a heap byte buffer.  If I revert that one part of the PR, the test works 
> okay.
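
The description above distinguishes "direct" byte buffers from heap byte
buffers. The issue does not say why the switch to a direct buffer exposed the
failure, so the sketch below only illustrates the distinction itself:
ByteBuffer.allocate returns a heap buffer, ByteBuffer.allocateDirect returns a
direct one, and code that reads through relative get() on a duplicate works for
both without relying on a backing array. BufferKinds and copyRemaining are
hypothetical names, not Geode code.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of Geode.
final class BufferKinds {
  // Copies the bytes between position and limit into a new array.
  // duplicate() shares content but has its own position and limit, so the
  // caller's buffer state is left untouched. array() is never called, which
  // matters because a direct buffer is not guaranteed to expose one.
  static byte[] copyRemaining(ByteBuffer buf) {
    ByteBuffer view = buf.duplicate();
    byte[] out = new byte[view.remaining()];
    view.get(out);
    return out;
  }

  // Reports which kind of buffer was supplied.
  static String describe(ByteBuffer buf) {
    return (buf.isDirect() ? "direct" : "heap") + ", hasArray=" + buf.hasArray();
  }
}
```

Calling describe on ByteBuffer.allocate(8) and on ByteBuffer.allocateDirect(8)
shows the two kinds side by side, while copyRemaining returns the same bytes
for either.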



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
