[ https://issues.apache.org/jira/browse/CASSANDRA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166908#comment-14166908 ]
J.B. Langston edited comment on CASSANDRA-8084 at 10/10/14 2:26 PM:
--------------------------------------------------------------------

I tested and it appears to work. Here is the cluster I am testing with:

{code}
Datacenter: DC1_EAST
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  54.165.222.3    711.26 MB  1       25.0%  dd449706-2059-4b65-ae98-0012d2cf8f67  rack1
UN  54.172.118.222  561.14 MB  1       25.0%  18cd7d0a-74ca-4835-a7ff-7ffaa92b35ef  rack1
Datacenter: DC1_WEST
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  54.183.192.248  721.2 MB   1       25.0%  c4dd37f1-d937-4876-8669-f0b01a3942db  rack1
UN  54.215.139.161  909.26 MB  1       25.0%  16499349-8cef-4a62-a99c-ab145cb70921  rack1
{code}

I wasn't sure initially because the logs and `nodetool netstats` still show the broadcast address. You can see here that nodetool netstats, when run on 54.215.139.161, shows we are streaming from 54.183.192.248 (the broadcast address of the other node in the same DC):

{code}
Mode: NORMAL
Repair dbc7ea40-5082-11e4-8190-c9fac3589773
    /54.183.192.248
        Receiving 9 files, 229856794 bytes total
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-100-Data.db 58878176/58878176 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-106-Data.db 97856/97856 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-109-Data.db 69407704/69407704 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-108-Data.db 3203116/3203116 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-102-Data.db 12545306/12545306 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-103-Data.db 69407704/69407704 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-104-Data.db 1536228/1536228 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-105-Data.db 12589230/12589230 bytes(100%) received from /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-107-Data.db 2191474/2191474 bytes(100%) received from /54.183.192.248
        Sending 5 files, 109645980 bytes total
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-87-Data.db 14323672/14323672 bytes(100%) sent to /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-97-Data.db 20581730/20581730 bytes(100%) sent to /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-98-Data.db 3161694/3161694 bytes(100%) sent to /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-95-Data.db 69407704/69407704 bytes(100%) sent to /54.183.192.248
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-99-Data.db 2171180/2171180 bytes(100%) sent to /54.183.192.248
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name   Active  Pending  Completed
Commands    n/a     0        1495191
Responses   n/a     0        714928
{code}

However, the output of `sudo netstat -anp | grep 7000 | sort -k5` shows that we are only connecting to the local node on its listen address (172.31.7.50):

{code}
tcp   0    0 172.31.5.143:7000   0.0.0.0:*             LISTEN      17279/java
tcp   0    0 172.31.5.143:7000   172.31.5.143:34936    ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   172.31.5.143:34937    ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   172.31.5.143:34938    ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:34936  172.31.5.143:7000     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:34937  172.31.5.143:7000     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:34938  172.31.5.143:7000     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   172.31.7.50:52125     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   172.31.7.50:52126     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:57502  172.31.7.50:7000      ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:57560  172.31.7.50:7000      ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:57601  172.31.7.50:7000      ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:57602  172.31.7.50:7000      ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   54.165.222.3:33876    ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   54.165.222.3:33878    ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:44120  54.165.222.3:7000     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:44198  54.165.222.3:7000     ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   54.172.118.222:54515  ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:7000   54.172.118.222:54518  ESTABLISHED 17279/java
tcp   0    0 172.31.5.143:35960  54.172.118.222:7000   ESTABLISHED 17279/java
tcp   0  161 172.31.5.143:35880  54.172.118.222:7000   ESTABLISHED 17279/java
unix  2 [ ] DGRAM 7000 613/acpid
{code}

The only connections established to the broadcast addresses are for the nodes in the other DC (54.165.222.3 and 54.172.118.222).

Is use of the broadcast address in netstats and the logs intentional? I can see some customers getting confused by this. On the other hand, it matches what we show for nodetool ring and status, so I could see arguments both ways.


> GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE
> clusters doesnt use the PRIVATE IPS for Intra-DC communications - When
> running nodetool repair
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-8084
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8084
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Config
>         Environment: Tested this in GCE and AWS clusters. Created multi
> region and multi dc cluster once in GCE and once in AWS and ran into the same
> problem.
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=12.04
> DISTRIB_CODENAME=precise
> DISTRIB_DESCRIPTION="Ubuntu 12.04.3 LTS"
> NAME="Ubuntu"
> VERSION="12.04.3 LTS, Precise Pangolin"
> ID=ubuntu
> ID_LIKE=debian
> PRETTY_NAME="Ubuntu precise (12.04.3 LTS)"
> VERSION_ID="12.04"
> Tried to install Apache Cassandra version ReleaseVersion: 2.0.10 and also
> latest DSE version which is 4.5 and which corresponds to 2.0.8.39.
>            Reporter: Jana
>            Assignee: Yuki Morishita
>              Labels: features
>             Fix For: 2.0.11
>
>         Attachments: 8084-2.0.txt
>
>
> Neither of these snitches (GossipingPropertyFileSnitch and
> Ec2MultiRegionSnitch) used the private IPs for communication between
> intra-DC nodes in my multi-region, multi-DC cloud cluster (on both AWS and
> GCE) when I ran "nodetool repair -local". It works fine during regular reads.
> Here are the cluster configurations I tried, all of which failed:
> AWS + multi-region + multi-DC + GossipingPropertyFileSnitch +
> prefer_local=true in the cassandra-rackdc.properties file.
> AWS + multi-region + multi-DC + Ec2MultiRegionSnitch + prefer_local=true in
> the cassandra-rackdc.properties file.
> GCE + multi-region + multi-DC + GossipingPropertyFileSnitch +
> prefer_local=true in the cassandra-rackdc.properties file.
> GCE + multi-region + multi-DC + Ec2MultiRegionSnitch + prefer_local=true in
> the cassandra-rackdc.properties file.
> With the above setup, I expect all nodes within a given DC to communicate
> via private IPs, since the cloud providers charge for traffic over public
> IPs but not over private IPs.
> Using public IPs for inter-DC communication is fine, and that part works as
> expected.
> Here is a snippet from my log files when I ran "nodetool repair -local".
> Node responding to the node running repair:
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,628 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/sessions
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,741 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/events
> Node running repair:
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,927 RepairSession.java (line 166) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Received merkle tree for events from /54.172.118.222
> Note: The IPs it is communicating with are all public IPs; it should have
> used the private IPs starting with 172.x.x.x.
> YAML file values:
> The listen address is set to: PRIVATE IP
> The broadcast address is set to: PUBLIC IP
> The seeds address is set to: PUBLIC IPs from both DCs
> The snitches tried: GossipingPropertyFileSnitch (GPFS) and Ec2MultiRegionSnitch
> RACK-DC: Had prefer_local set to true.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
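
The manual check in the comment above (reading `netstat -anp` output to see which peers are reached on private versus public addresses) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the patch: the function name `classify_peers` and the `SAMPLE` text are hypothetical, the storage port 7000 is taken from the output in this thread, and the parser assumes the Linux `netstat -an` column layout shown there (IPv4 `tcp` lines only).

```python
import ipaddress


def classify_peers(netstat_output, port=7000):
    """Group remote peers of ESTABLISHED TCP connections on `port` by address type.

    Hypothetical helper: assumes the columns
    proto, recv-q, send-q, local address, foreign address, state.
    """
    peers = {"private": set(), "public": set()}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip listening sockets, unix sockets, and anything malformed.
        if len(fields) < 6 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue
        local_ip, _, local_port = fields[3].rpartition(":")
        remote_ip, _, remote_port = fields[4].rpartition(":")
        # Keep only connections where either end uses the storage port.
        if str(port) not in (local_port, remote_port):
            continue
        # Ignore the node's connections to itself.
        if remote_ip == local_ip:
            continue
        is_private = ipaddress.ip_address(remote_ip).is_private
        peers["private" if is_private else "public"].add(remote_ip)
    return peers


# A few lines copied from the netstat output quoted in this thread.
SAMPLE = """\
tcp 0 0 172.31.5.143:7000 0.0.0.0:* LISTEN 17279/java
tcp 0 0 172.31.5.143:7000 172.31.5.143:34936 ESTABLISHED 17279/java
tcp 0 0 172.31.5.143:57502 172.31.7.50:7000 ESTABLISHED 17279/java
tcp 0 0 172.31.5.143:44120 54.165.222.3:7000 ESTABLISHED 17279/java
tcp 0 161 172.31.5.143:35880 54.172.118.222:7000 ESTABLISHED 17279/java
unix 2 [ ] DGRAM 7000 613/acpid
"""

if __name__ == "__main__":
    result = classify_peers(SAMPLE)
    print("private peers:", sorted(result["private"]))
    print("public peers:", sorted(result["public"]))
```

With prefer_local working as intended, the same-DC node should appear only under the private peers and the remote-DC nodes only under the public peers, which is what the netstat output in the comment shows.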