[ https://issues.apache.org/jira/browse/CASSANDRA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175588#comment-14175588 ]
J.B. Langston edited comment on CASSANDRA-8084 at 10/17/14 9:43 PM:
--------------------------------------------------------------------

I don't think sstableloader is working right. Here is the output for sstableloader itself:

{code}
automaton@ip-172-31-7-50:~/Keyspace1/Standard1$ sstableloader -d localhost `pwd`
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-320-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-326-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-325-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-283-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-267-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-211-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-301-Data.db /home/automaton/Keyspace1/Standard1/Keyspace1-Standard1-jb-316-Data.db to [/54.183.192.248, /54.215.139.161, /54.165.222.3, /54.172.118.222]
Streaming session ID: ac5dd440-5645-11e4-a813-3d13c3d3c540
progress: [/54.172.118.222 8/8 (100%)] [/54.183.192.248 8/8 (100%)] [/54.165.222.3 8/8 (100%)] [/54.215.139.161 8/8 (100%)] [total: 100% - 2147483647MB/s (avg: 30MB/s)
{code}

Here is netstats on the node where it is running (54.183.192.248):

{code}
Responses n/a 0 812
automaton@ip-172-31-7-50:~$ nodetool netstats
Mode: NORMAL
Bulk Load ac5dd440-5645-11e4-a813-3d13c3d3c540
    /172.31.7.50 (using /54.183.192.248)
        Receiving 8 files, 1059673728 bytes total
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-10-Data.db 56468194/164372226 bytes(34%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-4-Data.db 278000000/278000000 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-3-Data.db 50674396/50674396 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-5-Data.db 68597334/68597334 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-7-Data.db 139068110/139068110 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-6-Data.db 12682638/12682638 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-9-Data.db 278000000/278000000 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-8-Data.db 68279024/68279024 bytes(100%) received from /172.31.7.50
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 0
Responses n/a 0 970
{code}

Here's netstats on the other node in the same DC (54.215.139.161):

{code}
automaton@ip-172-31-40-169:~$ nodetool netstats
Mode: NORMAL
Bulk Load ac5dd440-5645-11e4-a813-3d13c3d3c540
    /172.31.7.50 (using /54.183.192.248)
        Receiving 8 files, 1059673728 bytes total
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-239-Data.db 68279024/68279024 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-245-Data.db 278000000/278000000 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-246-Data.db 43078602/50674396 bytes(85%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-240-Data.db 278000000/278000000 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-241-Data.db 12682638/12682638 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-243-Data.db 139068110/139068110 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-242-Data.db 164372226/164372226 bytes(100%) received from /172.31.7.50
            /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-244-Data.db 68597334/68597334 bytes(100%) received from /172.31.7.50
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 249589
Responses n/a 0 1390344
{code}

The IP addresses seem backwards in netstats output. Here is the output of netstat -anp | grep 7000 on the node where sstableloader is running:

{code}
tcp 0 0 172.31.7.50:7000 0.0.0.0:* LISTEN 21544/java
tcp 0 0 172.31.7.50:7000 172.31.5.143:44869 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:56991 172.31.5.143:7000 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:7000 54.165.222.3:50968 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:50599 54.165.222.3:7000 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:50624 54.165.222.3:7000 ESTABLISHED 22226/java
tcp 0 1132336 172.31.7.50:50626 54.165.222.3:7000 ESTABLISHED 22226/java
tcp 0 0 172.31.7.50:7000 54.172.118.222:58561 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:37769 54.172.118.222:7000 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:37796 54.172.118.222:7000 ESTABLISHED 22226/java
tcp 0 1149712 172.31.7.50:37798 54.172.118.222:7000 ESTABLISHED 22226/java
tcp 0 0 172.31.7.50:7000 54.183.192.248:47451 ESTABLISHED 21544/java
tcp 43688 0 172.31.7.50:7000 54.183.192.248:47453 ESTABLISHED 21544/java
tcp 0 0 172.31.7.50:47451 54.183.192.248:7000 ESTABLISHED 22226/java
tcp 0 98464 172.31.7.50:47453 54.183.192.248:7000 ESTABLISHED 22226/java
tcp 0 0 172.31.7.50:41240 54.215.139.161:7000 ESTABLISHED 22226/java
tcp 0 81088 172.31.7.50:41242 54.215.139.161:7000 ESTABLISHED 22226/java
{code}

It's establishing a connection to itself (54.183.192.248) and to the other node in the local DC (54.215.139.161) with the broadcast address instead of the listen address.

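For anyone who wants to repeat this check on another node, here is a rough sketch of how the netstat output above could be scanned automatically. It is not part of Cassandra or of the attached patches. The function name `same_dc_public_peers` and the `local_dc_broadcast` parameter are made up for illustration, and the script assumes the seven-column `netstat -anp` layout shown above and the storage port 7000 seen in that output. The set of same-DC broadcast addresses has to be supplied by hand, since netstat alone cannot tell a local-DC node from a remote one.

```python
import ipaddress

STORAGE_PORT = 7000  # inter-node port seen in the netstat output above


def same_dc_public_peers(netstat_output, local_dc_broadcast, port=STORAGE_PORT):
    """Return same-DC peers that are being reached on `port` via a public IP.

    `local_dc_broadcast` is a hand-supplied set of the broadcast (public)
    addresses of nodes in the local DC; it is an assumption of this sketch.
    """
    flagged = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state pid/program
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        host, _, remote_port = fields[4].rpartition(":")
        if remote_port != str(port):
            continue  # only outbound connections to the storage port
        if ipaddress.ip_address(host).is_private:
            continue  # private address, which is what intra-DC traffic should use
        if host in local_dc_broadcast:
            flagged.add(host)
    return sorted(flagged)


# A few lines copied from the netstat output above.
sample = """\
tcp 0 0 172.31.7.50:7000 0.0.0.0:* LISTEN 21544/java
tcp 0 0 172.31.7.50:56991 172.31.5.143:7000 ESTABLISHED 21544/java
tcp 0 1132336 172.31.7.50:50626 54.165.222.3:7000 ESTABLISHED 22226/java
tcp 0 98464 172.31.7.50:47453 54.183.192.248:7000 ESTABLISHED 22226/java
tcp 0 81088 172.31.7.50:41242 54.215.139.161:7000 ESTABLISHED 22226/java
"""

local_dc = {"54.183.192.248", "54.215.139.161"}
print(same_dc_public_peers(sample, local_dc))
```

With the sample lines, only the two local-DC broadcast addresses are reported. The connection to 54.165.222.3 is ignored because that node is in the other DC, where using the public address is expected.
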
> GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE
> clusters doesnt use the PRIVATE IPS for Intra-DC communications - When
> running nodetool repair
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8084
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8084
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Config
>         Environment: Tested this in GCE and AWS clusters. Created a
> multi-region, multi-DC cluster once in GCE and once in AWS and ran into the
> same problem.
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=12.04
> DISTRIB_CODENAME=precise
> DISTRIB_DESCRIPTION="Ubuntu 12.04.3 LTS"
> NAME="Ubuntu"
> VERSION="12.04.3 LTS, Precise Pangolin"
> ID=ubuntu
> ID_LIKE=debian
> PRETTY_NAME="Ubuntu precise (12.04.3 LTS)"
> VERSION_ID="12.04"
> Tried to install Apache Cassandra version ReleaseVersion: 2.0.10 and also
> the latest DSE version, which is 4.5 and corresponds to 2.0.8.39.
>            Reporter: Jana
>            Assignee: Yuki Morishita
>              Labels: features
>             Fix For: 2.0.12
>
>         Attachments: 8084-2.0-v2.txt, 8084-2.0-v3.txt, 8084-2.0-v4.txt,
> 8084-2.0.txt
>
>
> Neither of these snitches (GossipingPropertyFileSnitch and
> EC2MultiRegionSnitch) used the PRIVATE IPs for communication between
> INTRA-DC nodes in my multi-region, multi-DC cluster in the cloud (on both
> AWS and GCE) when I ran "nodetool repair -local". It works fine during
> regular reads.
> Here are the various cluster flavors I tried, all of which failed:
> AWS + Multi-REGION + Multi-DC + GossipingPropertyFileSnitch +
> (prefer_local=true) in rackdc-properties file.
> AWS + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (prefer_local=true)
> in rackdc-properties file.
> GCE + Multi-REGION + Multi-DC + GossipingPropertyFileSnitch +
> (prefer_local=true) in rackdc-properties file.
> GCE + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (prefer_local=true)
> in rackdc-properties file.
> I expect that, with the above setup, all of my nodes in a given DC
> communicate via private IPs, since the cloud providers don't charge us for
> using the private IPs but do charge for using public IPs.
> They can use PUBLIC IPs for INTER-DC communications, which is working as
> expected.
> Here is a snippet from my log files when I ran "nodetool repair -local":
> Node responding to 'node running repair' -
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,628 Validator.java (line 254)
> [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree
> to /54.172.118.222 for system_traces/sessions
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,741 Validator.java (line 254)
> [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree
> to /54.172.118.222 for system_traces/events
> Node running repair -
> INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,927 RepairSession.java (line
> 166) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Received merkle tree for
> events from /54.172.118.222
> Note: The IPs it is communicating with are all PUBLIC IPs, and it should
> have used the PRIVATE IPs starting with 172.x.x.x
> YAML file values:
> The listen address is set to: PRIVATE IP
> The broadcast address is set to: PUBLIC IP
> The SEEDs address is set to: PUBLIC IPs from both DCs
> The SNITCHES tried: GPFS and EC2MultiRegionSnitch
> RACK-DC: Had prefer_local set to true.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)