> I could imagine a scenario where a hint was replayed to a replica after all
> replicas had purged their tombstones

Scratch that, the hints are TTL'd with the lowest gc_grace. Ticket closed:
https://issues.apache.org/jira/browse/CASSANDRA-5379
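The TTL behaviour that closed the ticket can be sketched in miniature. This is a toy model, not Cassandra's actual hint code; the `Hint` class and `replay` function are illustrative names. The point is only that a hint TTL'd with the column family's gc_grace expires before any replica could have purged the matching tombstone, so a stale hint can never resurrect a delete:

```python
GC_GRACE_SECONDS = 10 * 24 * 3600  # example gc_grace for the target CF

class Hint:
    """Toy stand-in for a stored hint: the write it carries plus a TTL."""
    def __init__(self, write, created_at, ttl):
        self.write, self.created_at, self.ttl = write, created_at, ttl

    def is_live(self, now):
        return now - self.created_at < self.ttl

def replay(hint, replica, now):
    """Deliver the hinted write only if the hint has not expired."""
    if hint.is_live(now):
        replica[hint.write["key"]] = hint.write["value"]

# A replica that saw the delete and has long since purged the tombstone:
replica = {}

# Hint stored while the replica was down, TTL'd with the CF's gc_grace,
# mirroring the behaviour Aaron notes above.
hint = Hint({"key": "row1", "value": "v1"}, created_at=0, ttl=GC_GRACE_SECONDS)

# Replayed after gc_grace has elapsed: the hint is expired, nothing is
# delivered, and the deleted row cannot come back.
replay(hint, replica, now=GC_GRACE_SECONDS + 1)
assert "row1" not in replica

# Replayed within gc_grace the hint is delivered normally -- which is safe,
# because a delete issued in that window still has a live tombstone that
# would shadow the stale write (tombstones are not modelled here).
replay(hint, replica, now=3600)
assert replica["row1"] == "v1"
```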
Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/03/2013, at 6:24 AM, aaron morton <aa...@thelastpickle.com> wrote:

>> Beside the joke, would hinted handoff really have any role in this issue?
>
> I could imagine a scenario where a hint was replayed to a replica after all
> replicas had purged their tombstones. That seems like a long shot: it would
> need one node to be down for the write, all nodes up for the delete, and all
> of them to have purged the tombstone. But maybe we should have a max age on
> hints so it cannot happen.
>
> Created https://issues.apache.org/jira/browse/CASSANDRA-5379
>
> Ensuring no hints are in place during an upgrade would work around it. I
> tend to make sure hints and the commit log are clear during an upgrade.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/03/2013, at 7:54 AM, Arya Goudarzi <gouda...@gmail.com> wrote:
>
>> Beside the joke, would hinted handoff really have any role in this issue?
>> I have been struggling to reproduce this issue using the snapshot data
>> taken from our cluster and following the same upgrade process from 1.1.6
>> to 1.1.10. I know snapshots only link to active SSTables. What if these
>> returned rows belong to some inactive SSTables and some bug exposed itself
>> and marked them as active? What are the possibilities that could lead to
>> this? I am eager to find out, as this is blocking our upgrade.
>>
>> On Tue, Mar 19, 2013 at 2:11 AM, <moshe.kr...@barclays.com> wrote:
>>
>> This obscure feature of Cassandra is called "haunted handoff".
>> Happy (early) April Fools :)
>>
>> From: aaron morton [mailto:aa...@thelastpickle.com]
>> Sent: Monday, March 18, 2013 7:45 PM
>> To: user@cassandra.apache.org
>> Subject: Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
>>
>> As you see, this node thinks lots of ranges are out of sync, which
>> shouldn't be the case as successful repairs were done every night prior to
>> the upgrade.
>>
>> Could this be explained by writes occurring during the upgrade process?
>>
>> I found this bug which touches timestamps and tombstones and which was
>> fixed in 1.1.10, but am not 100% sure if it could be related to this
>> issue: https://issues.apache.org/jira/browse/CASSANDRA-5153
>>
>> Me neither, but the issue was fixed in 1.1.10.
>>
>> It appears that the repair task that I executed after the upgrade brought
>> back lots of deleted rows to life.
>>
>> Was it entire rows or columns in a row?
>>
>> Do you know if row-level or column-level deletes were used?
>>
>> Can you look at the data in cassandra-cli and confirm the timestamps on
>> the columns make sense?
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 16/03/2013, at 2:31 PM, Arya Goudarzi <gouda...@gmail.com> wrote:
>>
>> Hi,
>>
>> I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running
>> repairs. It appears that the repair task that I executed after the upgrade
>> brought back lots of deleted rows to life.
>> Here are some logistics:
>>
>> - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6
>> - Old cluster: 4 nodes, C* 1.1.6 with RF 3 using NetworkTopology
>> - Upgraded to: 1.1.10 with all other settings the same
>> - Successful repairs were being done on this cluster every night
>> - Our clients use nanosecond-precision timestamps for Cassandra calls
>> - After the upgrade, while running repair, I saw some log messages like
>>   this on one node:
>>
>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.207.56 have 2223 range(s) out of sync for App
>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.250.43 and /23.20.207.56 have 161 range(s) out of sync for App
>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.250.43 have 2294 range(s) out of sync for App
>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 AntiEntropyService.java (line 789) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] App is fully synced (13 remaining column family to sync for this session)
>>
>> As you see, this node thinks lots of ranges are out of sync, which
>> shouldn't be the case as successful repairs were done every night prior to
>> the upgrade.
>>
>> The App CF uses SizeTiered with a gc_grace of 10 days. It has
>> caching = 'ALL', and it is fairly small (11 MB on each node).
>> I found this bug which touches timestamps and tombstones and which was
>> fixed in 1.1.10, but am not 100% sure if it could be related to this
>> issue: https://issues.apache.org/jira/browse/CASSANDRA-5153
>>
>> Any advice on how to dig deeper into this would be appreciated.
>>
>> Thanks,
>> -Arya
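The classic mechanism by which repair resurrects deleted rows can be sketched as a toy model. This is not Cassandra code and the names are illustrative; it also does not by itself explain Arya's case (nightly repairs were running), but it shows in miniature why gc_grace matters: if one replica misses a delete and the others purge the tombstone after gc_grace, a later repair sees the stale copy as simply "missing" and streams it back:

```python
# Each cell is (value, timestamp_in_days, is_tombstone).
GC_GRACE = 10  # days, matching the App CF described in the thread

def compact(replica, now):
    """Purge tombstones older than gc_grace, as compaction would."""
    for key, (value, ts, deleted) in list(replica.items()):
        if deleted and now - ts > GC_GRACE:
            del replica[key]  # tombstone (and the data it shadowed) is gone

def repair(a, b):
    """Naive anti-entropy: copy any cell one replica has and the other lacks."""
    for key in set(a) | set(b):
        if key not in a:
            a[key] = b[key]
        elif key not in b:
            b[key] = a[key]

# Both replicas hold the row; replica B then misses the delete at day 0.
a = {"row1": ("v1", -5, False)}
b = {"row1": ("v1", -5, False)}
a["row1"] = (None, 0, True)      # delete lands on A only

compact(a, now=GC_GRACE + 1)     # A purges the tombstone after gc_grace
assert "row1" not in a           # A no longer remembers the delete at all

repair(a, b)                     # repair sees B's copy as "missing" on A
assert a["row1"] == ("v1", -5, False)   # the deleted row is back
```

This is why repair must complete on every replica within gc_grace of each delete, and why hints carrying writes older than gc_grace (the scenario in CASSANDRA-5379 above) would be dangerous if they were not TTL'd.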