[jira] [Assigned] (CASSANDRA-5391) SSL problems with inter-DC communication
[ https://issues.apache.org/jira/browse/CASSANDRA-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita reassigned CASSANDRA-5391:
-----------------------------------------

    Assignee: Yuki Morishita  (was: T Jake Luciani)

> SSL problems with inter-DC communication
>
>                 Key: CASSANDRA-5391
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5391
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.3
>         Environment: $ /etc/alternatives/jre_1.6.0/bin/java -version
> java version "1.6.0_23"
> Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
> Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
> $ uname -a
> Linux hostname 2.6.32-358.2.1.el6.x86_64 #1 SMP Tue Mar 12 14:18:09 CDT 2013 x86_64 x86_64 x86_64 GNU/Linux
> $ cat /etc/redhat-release
> Scientific Linux release 6.3 (Carbon)
> $ facter | grep ec2
> ...
> ec2_placement => availability_zone=us-east-1d
> ...
> $ rpm -qi cassandra
> cassandra-1.2.3-1.el6.cmp1.noarch
> (custom built rpm from cassandra tarball distribution)
>            Reporter: Ondřej Černoš
>            Assignee: Yuki Morishita
>            Priority: Blocker
>         Attachments: 5391-1.2.txt
>
>
> I get SSL and snappy compression errors in a multiple-datacenter setup.
> The setup is simple: 3 nodes in AWS east, 3 nodes in Rackspace. I use a
> slightly modified Ec2MultiRegionSnitch in Rackspace (I just added a regex
> able to parse the Rackspace/Openstack availability zone, which happens to be
> in an unusual format).
> During {{nodetool rebuild}} tests I managed to (consistently) trigger the
> following error:
> {noformat}
> 2013-03-19 12:42:16.059+0100 [Thread-13] [DEBUG] IncomingTcpConnection.java(79) org.apache.cassandra.net.IncomingTcpConnection: IOException reading from socket; closing
> java.io.IOException: FAILED_TO_UNCOMPRESS(5)
>     at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78)
>     at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
>     at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391)
>     at org.apache.cassandra.io.compress.SnappyCompressor.uncompress(SnappyCompressor.java:93)
>     at org.apache.cassandra.streaming.compress.CompressedInputStream.decompress(CompressedInputStream.java:101)
>     at org.apache.cassandra.streaming.compress.CompressedInputStream.read(CompressedInputStream.java:79)
>     at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:337)
>     at org.apache.cassandra.utils.BytesReadTracker.readUnsignedShort(BytesReadTracker.java:140)
>     at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:361)
>     at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
>     at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:160)
>     at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
>     at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
>     at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
>     at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
> {noformat}
> The exception is raised during DB file download. What is strange is the
> following:
> * the exception is raised only when rebuilding from AWS into Rackspace
> * the exception is raised only when all nodes are up and running in AWS (all
> 3). In other words, if I bootstrap from one or two nodes in AWS, the command
> succeeds.
> Packet-level inspection revealed malformed packets _on both ends of
> communication_ (the packet is considered malformed on the machine it
> originates on).
> Further investigation raised two more concerns:
> * We managed to get another stacktrace when testing the scenario. The
> exception was raised only once during the tests and was raised when I
> throttled the inter-datacenter bandwidth to 1Mbps.
> {noformat}
> java.lang.RuntimeException: javax.net.ssl.SSLException: bad record MAC
>     at com.google.common.base.Throwables.propagate(Throwables.java:160)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: javax.net.ssl.SSLException: bad record MAC
>     at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:190)
>     at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1649)
>     at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1607)
>     at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:859)
>     at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:755)

[jira] [Assigned] (CASSANDRA-5391) SSL problems with inter-DC communication
[ https://issues.apache.org/jira/browse/CASSANDRA-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-5391:
-----------------------------------------

    Assignee: T Jake Luciani

Can you shed any light, Jake?

> SSL problems with inter-DC communication
>
>                 Key: CASSANDRA-5391
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5391
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.3
>         Environment: $ /etc/alternatives/jre_1.6.0/bin/java -version
> java version "1.6.0_23"
> Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
> Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
> $ uname -a
> Linux hostname 2.6.32-358.2.1.el6.x86_64 #1 SMP Tue Mar 12 14:18:09 CDT 2013 x86_64 x86_64 x86_64 GNU/Linux
> $ cat /etc/redhat-release
> Scientific Linux release 6.3 (Carbon)
> $ facter | grep ec2
> ...
> ec2_placement => availability_zone=us-east-1d
> ...
> $ rpm -qi cassandra
> cassandra-1.2.3-1.el6.cmp1.noarch
> (custom built rpm from cassandra tarball distribution)
>            Reporter: Ondřej Černoš
>            Assignee: T Jake Luciani
>            Priority: Blocker
>
> I get SSL and snappy compression errors in a multiple-datacenter setup.
> The setup is simple: 3 nodes in AWS east, 3 nodes in Rackspace. I use a
> slightly modified Ec2MultiRegionSnitch in Rackspace (I just added a regex
> able to parse the Rackspace/Openstack availability zone, which happens to be
> in an unusual format).