Re: nodetool repair keeping an empty cluster busy

2013-12-11 Thread Sven Stark
Hi Rahul,

thanks for replying. Could you please be a bit more specific, though? E.g.
what exactly is being compacted - there is/was no data at all in the
cluster save for a few hundred kB in the system CF (see the nodetool status
output). Or: how can those few hundred kB of data generate Gb of network
traffic?

Cheers,
Sven



On Wed, Dec 11, 2013 at 7:56 PM, Rahul Menon ra...@apigee.com wrote:

 Sven

 So basically when you run a repair you are essentially telling your
 cluster to run a validation compaction, which generates a Merkle tree on
 each node. The trees are then exchanged and compared to identify
 inconsistencies, and mismatched ranges are streamed between replicas -
 that is the network traffic you see.
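
 If you want to watch it happen, the standard nodetool views should be
 enough (the host name below is just a placeholder):

     # validation compactions show up here while the Merkle trees are built
     nodetool -h <host> compactionstats
     # any ranges being streamed between replicas show up here
     nodetool -h <host> netstats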

 Rahul


 On Wed, Dec 11, 2013 at 11:02 AM, Sven Stark sven.st...@m-square.com.au wrote:

 Corollary:

 what is getting shipped over the wire? The ganglia screenshot shows the
 network traffic on all the three hosts on which I ran the nodetool repair.

 [image: ganglia screenshot - network traffic on the three nodes during the repair]

 remember

 UN  10.1.2.11  107.47 KB  256     32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
 UN  10.1.2.10  127.67 KB  256     32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
 UN  10.1.2.12  107.62 KB  256     34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1

 Much appreciated.
 Sven


 On Wed, Dec 11, 2013 at 3:56 PM, Sven Stark sven.st...@m-square.com.au wrote:

 Howdy!

 Not a matter of life or death, just curious.

 I've just stood up a three-node cluster (v1.2.8) on three c3.2xlarge
 boxes in AWS. Silly me forgot the correct replication factor for one of the
 needed keyspaces. So I changed it via the cli and ran a nodetool repair.
 Well ... there is no data at all in the keyspace yet, only the definition,
 and nodetool repair ran for about 20 minutes, fully using 2 of the 8 CPUs.
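
 (For reference, the change was roughly along these lines via cassandra-cli
 - keyspace name and strategy here are placeholders, not the real ones:

     update keyspace my_keyspace
       with placement_strategy = 'SimpleStrategy'
       and strategy_options = {replication_factor : 3};

 followed by a plain nodetool repair on each of the three nodes.)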

 Any hints on what nodetool repair is doing on an empty cluster that makes
 the host spin so hard?

 Cheers,
 Sven

 ==

 Tasks: 125 total,   1 running, 124 sleeping,   0 stopped,   0 zombie
 Cpu(s): 22.7%us,  1.0%sy,  2.9%ni, 73.0%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
 Mem:  15339196k total,  7474360k used,  7864836k free,   251904k buffers
 Swap:0k total,0k used,0k free,   798324k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 10840 cassandr  20   0 8354m 4.1g  19m S  218 28.0  35:25.73 jsvc
 16675 kafka 20   0 3987m 192m  12m S2  1.3   0:47.89 java
 20328 root  20   0 5613m 569m  16m S2  3.8   1:35.13 jsvc
  5969 exhibito  20   0 6423m 116m  12m S1  0.8   0:25.87 java
 14436 tomcat7   20   0 3701m 167m  11m S1  1.1   0:25.80 java
  6278 exhibito  20   0 6487m 119m 9984 S0  0.8   0:22.63 java
 17713 storm 20   0 6033m 159m  11m S0  1.1   0:10.99 java
 18769 storm 20   0 5773m 156m  11m S0  1.0   0:10.71 java

 root@xxx-01:~# nodetool -h `hostname` status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address    Load       Tokens  Owns   Host ID                               Rack
 UN  10.1.2.11  107.47 KB  256     32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
 UN  10.1.2.10  127.67 KB  256     32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
 UN  10.1.2.12  107.62 KB  256     34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1

 root@xxx-01:~# nodetool -h `hostname` compactionstats
 pending tasks: 1
    compaction type  keyspace  column family  completed  total  unit  progress
 Active compaction remaining time :n/a

 root@xxx-01:~# nodetool -h `hostname` netstats
 Mode: NORMAL
 Not sending any streams.
 Not receiving any streams.
 Read Repair Statistics:
 Attempted: 0
 Mismatch (Blocking): 0
 Mismatch (Background): 0
 Pool Name                    Active   Pending      Completed
 Commands                        n/a         0          57155
 Responses                       n/a         0          14573





Re: nodetool repair keeping an empty cluster busy

2013-12-10 Thread Sven Stark
Corollary:

what is getting shipped over the wire? The ganglia screenshot shows the
network traffic on all the three hosts on which I ran the nodetool repair.

[image: ganglia screenshot - network traffic on the three nodes during the repair]

remember

UN  10.1.2.11  107.47 KB  256     32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
UN  10.1.2.10  127.67 KB  256     32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
UN  10.1.2.12  107.62 KB  256     34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1

Much appreciated.
Sven


On Wed, Dec 11, 2013 at 3:56 PM, Sven Stark sven.st...@m-square.com.au wrote:

 Howdy!

 Not a matter of life or death, just curious.

 I've just stood up a three-node cluster (v1.2.8) on three c3.2xlarge boxes
 in AWS. Silly me forgot the correct replication factor for one of the
 needed keyspaces. So I changed it via the cli and ran a nodetool repair.
 Well ... there is no data at all in the keyspace yet, only the definition,
 and nodetool repair ran for about 20 minutes, fully using 2 of the 8 CPUs.

 Any hints on what nodetool repair is doing on an empty cluster that makes
 the host spin so hard?

 Cheers,
 Sven

 ==

 Tasks: 125 total,   1 running, 124 sleeping,   0 stopped,   0 zombie
 Cpu(s): 22.7%us,  1.0%sy,  2.9%ni, 73.0%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
 Mem:  15339196k total,  7474360k used,  7864836k free,   251904k buffers
 Swap:0k total,0k used,0k free,   798324k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 10840 cassandr  20   0 8354m 4.1g  19m S  218 28.0  35:25.73 jsvc
 16675 kafka 20   0 3987m 192m  12m S2  1.3   0:47.89 java
 20328 root  20   0 5613m 569m  16m S2  3.8   1:35.13 jsvc
  5969 exhibito  20   0 6423m 116m  12m S1  0.8   0:25.87 java
 14436 tomcat7   20   0 3701m 167m  11m S1  1.1   0:25.80 java
  6278 exhibito  20   0 6487m 119m 9984 S0  0.8   0:22.63 java
 17713 storm 20   0 6033m 159m  11m S0  1.1   0:10.99 java
 18769 storm 20   0 5773m 156m  11m S0  1.0   0:10.71 java

 root@xxx-01:~# nodetool -h `hostname` status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address    Load       Tokens  Owns   Host ID                               Rack
 UN  10.1.2.11  107.47 KB  256     32.9%  1f800723-10e4-4dcd-841f-73709a81d432  rack1
 UN  10.1.2.10  127.67 KB  256     32.4%  bd6b2059-e9dc-4b01-95ab-d7c4fc0ec639  rack1
 UN  10.1.2.12  107.62 KB  256     34.7%  5258f178-b20e-408f-a7bf-b6da2903e026  rack1

 root@xxx-01:~# nodetool -h `hostname` compactionstats
 pending tasks: 1
    compaction type  keyspace  column family  completed  total  unit  progress
 Active compaction remaining time :n/a

 root@xxx-01:~# nodetool -h `hostname` netstats
 Mode: NORMAL
 Not sending any streams.
 Not receiving any streams.
 Read Repair Statistics:
 Attempted: 0
 Mismatch (Blocking): 0
 Mismatch (Background): 0
 Pool Name                    Active   Pending      Completed
 Commands                        n/a         0          57155
 Responses                       n/a         0          14573


Re: Opscenter 3.2.2 (?) jmx auth issues

2013-10-20 Thread Sven Stark
Hi Nick,

thanks for getting back. Much appreciated.

Cheers,
Sven



On Sat, Oct 19, 2013 at 3:58 AM, Nick Bailey n...@datastax.com wrote:

 Sven,

 I've verified there is an issue with jmx authentication in the 3.2.2
 release. Thanks for the bug report! Sorry it's giving you issues. The bug
 should be fixed in the next release of OpsCenter.

 Nick


 On Wed, Oct 16, 2013 at 8:07 PM, Sven Stark sven.st...@m-square.com.au wrote:

 Hi guys,

 we have secured C* JMX with username/password. We upgraded OpsCenter from
 3.0.2 to 3.2.2 last week and noticed that the agents could no longer
 connect:

 ERROR [jmx-metrics-4] 2013-10-17 00:45:54,437 Error getting general
 metrics
 java.lang.SecurityException: Authentication failed! Credentials required
 at
 com.sun.jmx.remote.security.JMXPluggableAuthenticator.authenticationFailure(JMXPluggableAuthenticator.java:193)
  at
 com.sun.jmx.remote.security.JMXPluggableAuthenticator.authenticate(JMXPluggableAuthenticator.java:145)

 even though the credentials were correctly in
 /etc/opscenter/clusters/foo-cluster.conf

 [jmx]
 username = secret
 password = verysecret
 port = 20001

 Checks with other JMX-based tools (nodetool, jmxtrans) confirm that the
 JMX setup is correct.
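
 (As a concrete check - host name here is a placeholder - something like

     nodetool -h <host> -p 20001 -u secret -pw verysecret status

 succeeds with those credentials.)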

 Downgrading OpsCenter to 3.0.2 immediately resolved the issue. Could
 anybody confirm whether that's a known bug?


 Cheers,
 Sven





Opscenter 3.2.2 (?) jmx auth issues

2013-10-16 Thread Sven Stark
Hi guys,

we have secured C* JMX with username/password. We upgraded OpsCenter from
3.0.2 to 3.2.2 last week and noticed that the agents could no longer
connect:

ERROR [jmx-metrics-4] 2013-10-17 00:45:54,437 Error getting general metrics
java.lang.SecurityException: Authentication failed! Credentials required
at
com.sun.jmx.remote.security.JMXPluggableAuthenticator.authenticationFailure(JMXPluggableAuthenticator.java:193)
at
com.sun.jmx.remote.security.JMXPluggableAuthenticator.authenticate(JMXPluggableAuthenticator.java:145)

even though the credentials were correctly in
/etc/opscenter/clusters/foo-cluster.conf

[jmx]
username = secret
password = verysecret
port = 20001

Checks with other JMX-based tools (nodetool, jmxtrans) confirm that the JMX
setup is correct.

Downgrading OpsCenter to 3.0.2 immediately resolved the issue. Could
anybody confirm whether that's a known bug?


Cheers,
Sven