On Tue, Apr 3, 2012 at 1:55 PM, Nuno Jordao <nuno-m-jor...@telecom.pt> wrote:
> Ok, Thank you! :)
>
> One last question then, is "nodetool repair -pr" enough to recover a failed
> node?

It's not. "nodetool repair -pr" is meant for repairing the full cluster
(to ensure that all nodes are in sync), in which case you'd want to run
it on every node. However, it only repairs one range on each node (the
node's primary range), so to rebuild a failed node you'll want to stick
to a plain "nodetool repair" on the node being recovered. In that case,
expect RF repair sessions on that node, one per range it replicates.

--
Sylvain

>
> Nuno
>
> -----Original Message-----
> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
> Sent: Tuesday, April 3, 2012 12:38
> To: user@cassandra.apache.org
> Subject: Re: Repair in loop?
> Importance: Low
>
> On Tue, Apr 3, 2012 at 12:52 PM, Nuno Jordao <nuno-m-jor...@telecom.pt> wrote:
>> Thank you for your response.
>> My question is that it is repeating the same column family:
>>
>> INFO 19:12:24,656 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>> BlockData_b6 is fully synced (255 remaining column family to sync for this
>> session)
>> [...]
>> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>> BlockData_b6 is fully synced (255 remaining column family to sync for this
>> session)
>>
>> What I was showing in my previous email is the point where it restarted:
>
> Ok, then it's likely because those correspond to different
> ranges of the ring. Unless you've started the repair with "nodetool
> repair -pr", the repair will try to repair every range of the node and
> each repair will be a different repair session. I'll admit though that
> printing which range is being repaired would have avoided that
> confusion.
>
> --
> Sylvain
>
>>
>> INFO 09:54:51,112 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>> BlockData_e8 is fully synced (1 remaining column family to sync for this
>> session)
>> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>> BlockData_b6 is fully synced (255 remaining column family to sync for this
>> session)
>>
>> Notice the "1 remaining column family to sync for this session" indication
>> changes to "255 remaining column family to sync for this session".
>>
>> Regards,
>>
>> Nuno Jordão
>>
>> -----Original Message-----
>> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
>> Sent: Tuesday, April 3, 2012 11:36
>> To: user@cassandra.apache.org
>> Subject: Re: Repair in loop?
>> Importance: Low
>>
>> It just means that you have lots of column families and repair does 1
>> column family at a time. Each line is just saying it's done with one
>> of the column families. There is nothing wrong, but it does mean the
>> repair is *not* done yet.
>>
>> --
>> Sylvain
>>
>> On Tue, Apr 3, 2012 at 12:28 PM, Nuno Jordao <nuno-m-jor...@telecom.pt>
>> wrote:
>>> Hello,
>>>
>>> I'm doing some tests with Cassandra 1.0.8 using multiple data directories
>>> with individual disks in a three-node cluster (replica=3).
>>>
>>> One of the tests was to replace a couple of disks and start a repair
>>> process.
>>>
>>> It started ok and refilled the disks but I noticed that after the recovery
>>> process finished, it started a new one again:
>>>
>>> INFO 09:34:42,481 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_6f is fully synced (6 remaining column family to sync for this
>>> session)
>>> INFO 09:41:55,288 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_0d is fully synced (5 remaining column family to sync for this
>>> session)
>>> INFO 09:42:50,169 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_07 is fully synced (4 remaining column family to sync for this
>>> session)
>>> INFO 09:45:02,743 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_5a is fully synced (3 remaining column family to sync for this
>>> session)
>>> INFO 09:48:03,010 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_da is fully synced (2 remaining column family to sync for this
>>> session)
>>> INFO 09:54:51,112 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf]
>>> BlockData_e8 is fully synced (1 remaining column family to sync for this
>>> session)
>>> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_b6 is fully synced (255 remaining column family to sync for this
>>> session)
>>> INFO 10:05:42,803 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_13 is fully synced (254 remaining column family to sync for this
>>> session)
>>> INFO 10:08:43,354 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_8b is fully synced (253 remaining column family to sync for this
>>> session)
>>> INFO 10:12:09,599 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_31 is fully synced (252 remaining column family to sync for this
>>> session)
>>> INFO 10:15:43,426 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_0c is fully synced (251 remaining column family to sync for this
>>> session)
>>> INFO 10:21:47,156 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf]
>>> BlockData_1b is fully synced (250 remaining column family to sync for this
>>> session)
>>>
>>> Is this normal? To me it doesn't make much sense.
>>>
>>> Regards,
>>>
>>> Nuno
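
As a rough illustration of the point made in this thread (a plain
"nodetool repair" covers every range the node replicates, while "-pr"
covers only one range per node), here is a minimal Python sketch. It is
not Cassandra code. It assumes one token per node and simple ring
placement, where each range is stored on its owning node and on the next
RF-1 nodes around the ring. The token values and the function names
primary_range and replicated_ranges are hypothetical, chosen only for
this example.

```python
from typing import List, Tuple

# A range is (start_token, end_token]; the first node's range wraps around.
Range = Tuple[int, int]


def primary_range(tokens: List[int], node: int) -> Range:
    """Range ending at the node's own token (the one range "-pr" covers).

    Assumes `tokens` is sorted and holds exactly one token per node.
    """
    # tokens[node - 1] wraps to the last token when node == 0.
    return (tokens[node - 1], tokens[node])


def replicated_ranges(tokens: List[int], node: int, rf: int) -> List[Range]:
    """All ranges stored on a node under the assumed simple placement.

    These are the node's primary range plus the primary ranges of the
    rf - 1 nodes that precede it on the ring.
    """
    n = len(tokens)
    count = min(rf, n)  # a node cannot hold more ranges than the ring has
    return [primary_range(tokens, (node - k) % n) for k in range(count)]


if __name__ == "__main__":
    ring = [0, 100, 200]  # hypothetical tokens for a three-node cluster
    rf = 3
    for node in range(len(ring)):
        print("node", node)
        print("  repair -pr covers:", primary_range(ring, node))
        print("  plain repair covers:", replicated_ranges(ring, node, rf))
```

Under these assumptions, with three nodes and RF=3, a plain repair on
one node touches three ranges (hence RF sessions), whereas running "-pr"
on each of the three nodes touches each range exactly once.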