Re: nodetool repair keeping an empty cluster busy
Sven,

So basically, when you run a repair you are essentially telling your cluster to run a validation compaction, which generates a Merkle tree on all the nodes. These trees are used to identify the inconsistencies. So there is quite a bit of streaming, which you see as your network traffic.

Rahul

On Wed, Dec 11, 2013 at 11:02 AM, Sven Stark sven.st...@m-square.com.au wrote:

> Corollary: what is getting shipped over the wire? The ganglia screenshot shows the network traffic on all the three hosts on which I ran the nodetool repair.
>
> [image: Inline image 1]
>
> remember
>
> UN  10.1.2.11  107.47 KB  256  32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
> UN  10.1.2.10  127.67 KB  256  32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
> UN  10.1.2.12  107.62 KB  256  34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1
>
> Much appreciated.
> Sven
>
> On Wed, Dec 11, 2013 at 3:56 PM, Sven Stark sven.st...@m-square.com.au wrote:
>
>> Howdy!
>>
>> Not a matter of life or death, just curious.
>>
>> I've just stood up a three node cluster (v1.2.8) on three c3.2xlarge boxes in AWS. Silly me forgot the correct replication factor for one of the needed keyspaces. So I changed it via cli and ran a nodetool repair. Well .. there is no data at all in the keyspace yet, only the definition, and nodetool repair ran about 20 minutes using 2 of the 8 CPUs fully.
>>
>> Any hints what nodetool repair is doing on an empty cluster that makes the host spin so hard?
>>
>> Cheers,
>> Sven
>>
>> ==
>>
>> Tasks: 125 total, 1 running, 124 sleeping, 0 stopped, 0 zombie
>> Cpu(s): 22.7%us, 1.0%sy, 2.9%ni, 73.0%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st
>> Mem: 15339196k total, 7474360k used, 7864836k free, 251904k buffers
>> Swap: 0k total, 0k used, 0k free, 798324k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>> 10840 cassandr  20   0 8354m 4.1g  19m S  218 28.0  35:25.73 jsvc
>> 16675 kafka     20   0 3987m 192m  12m S    2  1.3   0:47.89 java
>> 20328 root      20   0 5613m 569m  16m S    2  3.8   1:35.13 jsvc
>>  5969 exhibito  20   0 6423m 116m  12m S    1  0.8   0:25.87 java
>> 14436 tomcat7   20   0 3701m 167m  11m S    1  1.1   0:25.80 java
>>  6278 exhibito  20   0 6487m 119m 9984 S    0  0.8   0:22.63 java
>> 17713 storm     20   0 6033m 159m  11m S    0  1.1   0:10.99 java
>> 18769 storm     20   0 5773m 156m  11m S    0  1.0   0:10.71 java
>>
>> root@xxx-01:~# nodetool -h `hostname` status
>> Datacenter: datacenter1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address    Load       Tokens  Owns   Host ID                               Rack
>> UN  10.1.2.11  107.47 KB  256     32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
>> UN  10.1.2.10  127.67 KB  256     32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
>> UN  10.1.2.12  107.62 KB  256     34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1
>>
>> root@xxx-01:~# nodetool -h `hostname` compactionstats
>> pending tasks: 1
>> compaction type  keyspace  column family  completed  total  unit  progress
>> Active compaction remaining time : n/a
>>
>> root@xxx-01:~# nodetool -h `hostname` netstats
>> Mode: NORMAL
>> Not sending any streams.
>> Not receiving any streams.
>> Read Repair Statistics:
>> Attempted: 0
>> Mismatch (Blocking): 0
>> Mismatch (Background): 0
>> Pool Name  Active  Pending  Completed
>> Commands   n/a     0        57155
>> Responses  n/a     0        14573
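Rahul's description (each replica builds a Merkle tree, and the trees are compared to find inconsistencies) can be illustrated with a small sketch. This is not Cassandra's implementation: the function names, the SHA-256 hashing, and the key-to-bucket mapping are assumptions chosen only to show the idea of hashing ranges and descending into subtrees whose hashes differ.

```python
import hashlib


def leaf_hashes(rows, num_leaves):
    """Hash a replica's rows (dict of key -> value) into num_leaves buckets.

    Assumption: a hash of the key picks the bucket, standing in for a token
    range. num_leaves must be a power of two.
    """
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        digest = hashlib.sha256(key.encode()).digest()
        idx = int.from_bytes(digest[:4], "big") % num_leaves
        buckets[idx].update(key.encode() + b"\x00" + rows[key].encode() + b"\x00")
    return [b.digest() for b in buckets]


def build_tree(leaves):
    """Return the tree as a list of levels: levels[0] = leaves, levels[-1] = [root]."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [hashlib.sha256(prev[i] + prev[i + 1]).digest() for i in range(0, len(prev), 2)]
        )
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only where the two hashes differ."""
    pending = [(len(tree_a) - 1, 0)]
    diffs = []
    while pending:
        level, idx = pending.pop()
        if tree_a[level][idx] == tree_b[level][idx]:
            continue  # identical subtree, nothing to repair below this node
        if level == 0:
            diffs.append(idx)
        else:
            pending.append((level - 1, 2 * idx))
            pending.append((level - 1, 2 * idx + 1))
    return sorted(diffs)


# Hypothetical replicas: the second one holds a stale value for "k2".
replica_a = {"k1": "v1", "k2": "v2"}
replica_b = {"k1": "v1", "k2": "STALE"}
tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
print("leaves needing repair:", differing_leaves(tree_a, tree_b))
```

In this toy model only the data in the leaves reported as different would need to be exchanged; identical replicas, including two empty ones, report nothing.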
Re: nodetool repair keeping an empty cluster busy
Hi Rahul,

thanks for replying. Could you please be a bit more specific, though? E.g., what exactly is being compacted? There is/was no data at all in the cluster save for a few hundred kB in the system CF (see the nodetool status output). Or: how can those few hundred kB of data generate Gb of network traffic?

Cheers,
Sven
Re: nodetool repair keeping an empty cluster busy
On Wed, Dec 11, 2013 at 1:35 AM, Sven Stark sven.st...@m-square.com.au wrote:

> thanks for replying. Could you please be a bit more specific, though. Eg what exactly is being compacted - there is/was no data at all in the cluster save for a few hundred kB in the system CF (see the nodetool status output). Or - how can those few hundred kB in data generate Gb of network traffic?

The only answer I can come up with is that the Merkle trees generated and compared by repair are of a fixed size and don't scale with the data present in the cluster. While I'm pretty sure each node can be aware that it has little to no data to repair, it generates and compares the trees anyway. It's a bit surprising that this might be Gbs of network traffic...

The system keyspace will always have some data in it. Have you tried repairing only your empty keyspace instead of the whole node? If so, and it exhibits the same behavior, that seems like a bug or at least unexpected behavior to me. If you're running a modern version of Cassandra, I would file a JIRA.

=Rob
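Rob's guess, that the trees have a fixed size regardless of how much data exists, can be turned into a back-of-envelope calculation. The sketch below is only that: the digest size, the tree depth, and the number of column families are made-up illustrative parameters, not values taken from Cassandra. The only number drawn from the thread is 3 nodes x 256 tokens from the nodetool status output.

```python
HASH_BYTES = 32  # assumption: one SHA-256-sized digest per tree node


def tree_node_count(depth):
    """Number of nodes in a complete binary tree with 2**depth leaves."""
    return 2 ** (depth + 1) - 1


def tree_bytes(depth, row_count=0):
    """Size of one fixed-depth tree.

    row_count is accepted but ignored on purpose: in this model the tree
    is the same size whether the range holds zero rows or millions.
    """
    return tree_node_count(depth) * HASH_BYTES


def repair_tree_traffic(num_ranges, num_column_families, depth):
    """Total bytes of tree data if one tree is built per range per column family."""
    return num_ranges * num_column_families * tree_bytes(depth)


# 3 nodes x 256 tokens comes from the status output in the thread.
ranges = 3 * 256
# Hypothetical values, chosen only to show the order of magnitude.
column_families = 10
depth = 15

total = repair_tree_traffic(ranges, column_families, depth)
print(f"empty vs. full range, same tree size: {tree_bytes(depth, 0) == tree_bytes(depth, 10**6)}")
print(f"tree data under these assumptions: {total / 1e9:.1f} GB")
```

If the model holds, the cost is driven by the number of ranges and column families rather than by the amount of data, which would be consistent with an empty cluster still producing a lot of CPU work and network traffic during repair.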
Re: nodetool repair keeping an empty cluster busy
Corollary: what is getting shipped over the wire? The ganglia screenshot shows the network traffic on all the three hosts on which I ran the nodetool repair.

[image: Inline image 1]

remember

UN  10.1.2.11  107.47 KB  256  32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
UN  10.1.2.10  127.67 KB  256  32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
UN  10.1.2.12  107.62 KB  256  34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1

Much appreciated.
Sven