> As you can see, this node thinks lots of ranges are out of sync, which 
> shouldn't be the case as successful repairs were done every night prior to 
> the upgrade. 
Could this be explained by writes occurring during the upgrade process? 

> I found this bug, which touches timestamps and tombstones and was fixed in 
> 1.1.10, but I am not 100% sure if it could be related to this issue: 
> https://issues.apache.org/jira/browse/CASSANDRA-5153
Me neither, but the issue was fixed in 1.1.10.

> It appears that the repair task that I executed after the upgrade brought 
> lots of deleted rows back to life.
Was it entire rows or columns in a row?
Do you know if row level or column level deletes were used? 
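
For reference (the row key and column name below are just placeholders), a 
row level delete issued through cassandra-cli looks like

    del App['<row_key>'];

while a column level delete only removes the named column:

    del App['<row_key>']['<column_name>'];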

Can you look at the data in cassandra-cli and confirm that the timestamps on 
the columns make sense?  
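
Something along these lines should show it (the keyspace name and row key 
below are placeholders - pick a row you know was deleted and has come back):

    $ cassandra-cli -h localhost -p 9160
    [default@unknown] use <YourKeyspace>;
    [default@<YourKeyspace>] get App['<row_key>'];

Each column in the get output carries a timestamp= field. Since your clients 
write nanosecond precision timestamps, the values should be roughly 
seconds-since-epoch * 10^9 (about 19 digits at the moment); anything that 
looks like microseconds (about 16 digits) or is otherwise far off would point 
at a writer using a different precision.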

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/03/2013, at 2:31 PM, Arya Goudarzi <gouda...@gmail.com> wrote:

> Hi,
> 
> I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running 
> repairs. It appears that the repair task that I executed after the upgrade 
> brought lots of deleted rows back to life. Here are some logistics:
> 
> - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6 
> - Old cluster: 4 nodes, C* 1.1.6 with RF 3 using NetworkTopologyStrategy;
> - Upgraded to: 1.1.10 with all other settings the same;
> - Successful repairs were being done on this cluster every night;
> - Our clients use nanosecond precision timestamps for Cassandra calls;
> - After the upgrade, while running repair, I saw some log messages like 
> this on one node:
> 
> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 
> AntiEntropyService.java (line 1022) [repair 
> #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.207.56 
> have 2223 range(s) out of sync for App
> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 
> AntiEntropyService.java (line 1022) [repair 
> #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.250.43 and /23.20.207.56 
> have 161 range(s) out of sync for App
> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 
> AntiEntropyService.java (line 1022) [repair 
> #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.250.43 
> have 2294 range(s) out of sync for App
> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 
> AntiEntropyService.java (line 789) [repair 
> #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] App is fully synced (13 remaining 
> column family to sync for this session)
> 
> As you can see, this node thinks lots of ranges are out of sync, which 
> shouldn't be the case as successful repairs were done every night prior to 
> the upgrade. 
> 
> The App CF uses SizeTiered compaction with gc_grace of 10 days. It has 
> caching = 'ALL', and it is fairly small (11 MB on each node).
> 
> I found this bug, which touches timestamps and tombstones and was fixed in 
> 1.1.10, but I am not 100% sure if it could be related to this issue: 
> https://issues.apache.org/jira/browse/CASSANDRA-5153
> 
> Any advice on how to dig deeper into this would be appreciated.
> 
> Thanks,
> -Arya
> 
> 
