That is correct: I see those messages only when the node goes down or comes back up, and at no other time.
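One way to double-check this is to scan each server's log for the three gossip/state messages quoted in this thread and list when they occur for the restarted node. The following is only a rough sketch: the function name `gossip_events` is made up for illustration, and the patterns are taken solely from the message texts quoted in this thread, so other builds may word them differently.

```python
import re

# Patterns are based only on the log messages quoted in this thread
# (assumption: other Cassandra builds may phrase them differently).
PATTERNS = [
    ("dead", re.compile(r"InetAddress /(\S+) is now dead")),
    ("restarted", re.compile(r"Node /(\S+) has restarted, now UP again")),
    ("normal", re.compile(r"Node /(\S+) state jump to normal")),
]

# Timestamp format as it appears in the quoted log lines.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def gossip_events(lines, address):
    """Return (timestamp, event) tuples for one node address.

    `timestamp` is None when the line carries no timestamp, as with the
    "has restarted" line quoted in this thread.
    """
    events = []
    for line in lines:
        for name, pattern in PATTERNS:
            match = pattern.search(line)
            if match and match.group(1) == address:
                stamp = TIMESTAMP.search(line)
                events.append((stamp.group(0) if stamp else None, name))
    return events
```

To run it against a real log, something like `gossip_events(open(path), "10.6.168.20")` should work, where `path` is wherever your log file lives (not specified in this thread). If events show up only around the restart and not while the traffic fluctuates, that matches what is described above.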

On Tue, Dec 22, 2009 at 8:04 PM, Jaakko <rosvopaalli...@gmail.com> wrote:
> OK, just to make sure: you can see these gossip/state messages when
> the node is going down and coming back up again, but not afterwards?
> That is, after you restart the node, you see "10.6.168.20 UP" and
> "state jump to normal" only once and when the write rate goes to zero
> and/or comes back to 230?
>
>
> On Wed, Dec 23, 2009 at 12:01 PM, Ramzi Rabah <rra...@playdom.com> wrote:
>> Hi Jaako thanks for your response.
>>
>> I compiled the very latest from 0.5 branch yesterday (whatever
>> yesterday nights build was). I do see that Node X.X.X.X is dead, and
>> Node X.X.X.X has restarted.
>>
>> This show up on all the 3 other servers:
>> INFO [Timer-1] 2009-12-22 20:38:43,738 Gossiper.java (line 194)
>> InetAddress /10.6.168.20 is now dead.
>>
>> Node /10.6.168.20 has restarted, now UP again
>> INFO [GMFD:1] 2009-12-22 20:43:12,812 StorageService.java (line 475)
>> Node /10.6.168.20 state jump to normal
>>
>> This time the first time I restarted the node it seemed fine, but the
>> second time I restarted it, this is what cfstats is showing for
>> traffic on it :
>>
>> Column Family: Datastore
>> Memtable Columns Count: 407
>> Memtable Data Size: 42268
>> Memtable Switch Count: 1
>> Read Count: 0
>> Read Latency: NaN ms.
>> Write Count: 0
>> Write Latency: NaN ms.
>> Pending Tasks: 0
>>
>> and then it went up and now it's back to:
>>
>> Column Family: Datastore
>> Memtable Columns Count: 2331
>> Memtable Data Size: 242364
>> Memtable Switch Count: 1
>> Read Count: 107
>> Read Latency: 0.486 ms.
>> Write Count: 113
>> Write Latency: 0.000 ms.
>> Pending Tasks: 0
>>
>> which is half the traffic the other nodes are showing. The other 3
>> nodes are showing a consistent ~230 reads/writes per second, which
>> node 4 was showing before it was restarted. I hope data is not being
>> lost in the process?
>>
>>
>> On Tue, Dec 22, 2009 at 4:43 PM, Jaakko <rosvopaalli...@gmail.com> wrote:
>>> Hi,
>>>
>>> Which revision number you are running?
>>>
>>> Can you see any log lines related to node being UP or dead? (like
>>> "InetAddress X.X.X.X is now dead" or "Node X.X.X.X has restarted, now
>>> UP again"). These messages come from the Gossiper and indicate if it
>>> for some reason thinks the node is dead. Level of these messages is
>>> info.
>>>
>>> Another thing is: can you see any log messages like "Node X.X.X.X
>>> state normal, token XXX"? These are on debug level.
>>>
>>> -Jaakko
>>>
>>>
>>> On Wed, Dec 23, 2009 at 12:59 AM, Ramzi Rabah <rra...@playdom.com> wrote:
>>>> I just recently upgraded to latest in 0.5 branch, and I am running
>>>> into a serious issue. I have a cluster with 4 nodes, rackunaware
>>>> strategy, and using my own tokens distributed evenly over the hash
>>>> space. I am writing/reading equally to them at an equal rate of about
>>>> 230 reads/writes per second(and cfstats shows that). The first 3 nodes
>>>> are seeds, the last one isn't. When I start all the nodes together at
>>>> the same time, they all receive equal amounts of reads/writes (about
>>>> 230).
>>>> When I bring node 4 down and bring it back up again, node 4's load
>>>> fluctuates between the 230 it used to get to sometimes no traffic at
>>>> all. The other 3 still have the same amount of traffic. And no errors
>>>> what so ever seen in logs. Any ideas what can be causing this
>>>> fluctuation on node 4 after I restarted it?
>>>>
>>>
>>
>