one last comment about testing this: i stopped all the servers, wiped their data, and restarted. i allowed each node to get about 15gb on it, then repeated the test. nodetool repair does not repair the crashed node.
the only minorly interesting thing about my cluster is that i use random partitioner and assigned a token to each node.

________________________________________
From: Todd Burruss
Sent: Saturday, March 20, 2010 6:48 PM
To: Todd Burruss; user@cassandra.apache.org
Subject: RE: node repair

fyi ... i just compacted and node 105 is definitely not being repaired

________________________________________
From: Todd Burruss
Sent: Saturday, March 20, 2010 12:34 PM
To: user@cassandra.apache.org
Subject: RE: node repair

same IP, same token. i'm trying Handling Failure, #3.

it is running, a part of the ring, and seems to be handling reads/writes, but does not appear to have received a copy of its data (the last node below). i've searched all the logs for ERRORs but there are none. i will compact the other nodes, but i don't think it will make a difference.

[bburr...@kv-app05 ~]$ ~/cassandra/bin/nodetool -h localhost -p 9000 ring
Address          Status  Load       Range                                     Ring
                                    170141183460469231731687303715884105728
192.168.132.102  Up      130.22 GB  42535295865117307932921825928971026431    |<--|
192.168.132.103  Up      131.03 GB  85070591730234615865843651857942052863    |   |
192.168.132.104  Up      125.7 GB   127605887595351923798765477786913079295   |   |
192.168.132.105  Up      65.62 GB   170141183460469231731687303715884105728   |-->|

________________________________________
From: Jonathan Ellis [jbel...@gmail.com]
Sent: Saturday, March 20, 2010 11:23 AM
To: user@cassandra.apache.org
Subject: Re: node repair

if you bring up a new node w/ a different ip but the same token, it will confuse things.

http://wiki.apache.org/cassandra/Operations "handling failure" section covers best practices here.

On Sat, Mar 20, 2010 at 11:51 AM, Todd Burruss <bburr...@real.com> wrote:
> i had a node fail, lost all data. so i brought it back up fresh, but
> assigned it the same token in storage-conf.xml. then ran nodetool repair.
>
> all compactions have finished, no streams are happening. nothing. so i did
> it again. same thing. i don't think it's working. is there a log message i
> can search for? INFO is my log level. i could try it again with debug i
> suppose.
>
> thx
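
for anyone who wants to rule out token placement as the cause of the lower load on node 105, here is a rough sketch in Python. it assumes the random partitioner's tokens sit on a ring of size 2**127 and that each node's primary range runs from the previous token up to its own. the `ownership` helper and the `RING` dict are made up for illustration (they are not part of cassandra or nodetool), and the calculation ignores replication, so treat it as a sanity check on the tokens only.

```python
from fractions import Fraction

# Assumption: RandomPartitioner tokens live on a ring of size 2**127.
RING_SIZE = 2 ** 127

# Tokens copied from the `nodetool ring` output in this thread.
RING = {
    "192.168.132.102": 42535295865117307932921825928971026431,
    "192.168.132.103": 85070591730234615865843651857942052863,
    "192.168.132.104": 127605887595351923798765477786913079295,
    "192.168.132.105": 170141183460469231731687303715884105728,
}


def ownership(ring):
    """Map each address to the fraction of the ring that ends at its token.

    Hypothetical helper: it only looks at primary ranges and ignores
    replication, so it says nothing about actual on-disk load.
    """
    ordered = sorted(ring.items(), key=lambda item: item[1])
    shares = {}
    for i, (address, token) in enumerate(ordered):
        # For i == 0, index -1 wraps around to the node with the highest token.
        previous = ordered[i - 1][1]
        shares[address] = Fraction((token - previous) % RING_SIZE, RING_SIZE)
    return shares


shares = ownership(RING)
for address, share in shares.items():
    print(f"{address}: {float(share):.4%}")
```

under those assumptions the four primary ranges come out essentially equal, so the token assignment by itself would not explain 65.62 GB on node 105 against roughly 130 GB on the others.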