My 0.02: you really don't need to wait for HEALTH_OK between your recovery 
steps; just go ahead. Every time a new map is generated and broadcast, the old 
map and any in-progress recovery will be cancelled.
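
(For what it's worth, you can watch the osdmap epoch advance after each
change with something like this; the output shown is illustrative:)

    ceph osd stat    # prints the current osdmap epoch, e.g. "e123: 12 osds: 12 up, 12 in"
    ceph -w          # streams new maps and pg state changes as they happen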

Sent from my iPhone

On 2013-6-2, at 11:30, "Nigel Williams" <nigel.d.willi...@gmail.com> wrote:

> Could I please have a critique of this approach: how could I have done it 
> better, or does what I experienced simply reflect work still to be done?
> 
> This is with Ceph 0.61.2 on a quite slow test cluster (logs shared with OSDs, 
> no separate journals, using CephFS).
> 
> I knocked the power cord out of a storage node, taking down the 4 OSDs hosted 
> there; all but one came back OK. The failed one is a single OSD out of a 
> total of 12, so 1/12 of the storage.
> 
> Losing an OSD put the cluster into recovery, so all good. The next step was 
> to get the missing (downed) OSD back online.
> 
> The OSD was XFS-based, so I had to throw away the XFS log to get it to 
> mount. Having done this and re-mounted it, Ceph then started hitting 
> issue #4855 (I added dmesg output and logs to that issue in case they help; 
> I wonder whether throwing away the XFS log caused an internal OSD 
> inconsistency, and whether that in turn causes issue #4855?). Given that I 
> could not "recover" this OSD as far as Ceph is concerned, I decided to 
> delete and rebuild it.
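> 
> (For reference, the usual way to discard a dirty XFS log is something along 
> these lines; the device name is illustrative:)
> 
>     xfs_repair -L /dev/sdX1   # -L zeroes the XFS log, at the cost of any metadata changes it held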
> 
> Several hours later, the cluster was back to HEALTH_OK. I proceeded to remove 
> and re-add the bad OSD, following the doc suggestions to do this.
> 
> The problem is that each change caused a slight change in the CRUSH map, 
> sending the cluster back into recovery and adding a several-hour wait per 
> change. I chose to wait until the cluster was back to HEALTH_OK before doing 
> the next step. Overall it has taken a few days to finally get a single OSD 
> back into the cluster.
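> 
> (For the record, the removal half per those docs is roughly the sequence 
> below; the osd id is illustrative, and note that the crush-remove and osd-rm 
> steps each change the map and so trigger some data movement:)
> 
>     ceph osd out 11                 # mark it out so its data is re-replicated elsewhere
>     service ceph stop osd.11        # stop the daemon (init command varies by distro)
>     ceph osd crush remove osd.11    # remove it from the CRUSH map
>     ceph auth del osd.11            # delete its cephx key
>     ceph osd rm 11                  # remove it from the OSD map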
> 
> At one point during recovery the full threshold was triggered on a single 
> OSD, causing recovery to stop; doing "ceph pg set_full_ratio 0.98" did not 
> help. I was not planning to add data to the cluster while doing recovery 
> operations, and I did not understand the suggestion that PGs could be deleted 
> to make space on a "full" OSD, so I expected raising the threshold was the 
> best option, but it had no (immediate) effect.
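> 
> (For context, the knobs I am aware of here are roughly these; the ratios and 
> osd id are illustrative:)
> 
>     ceph health detail                # shows which OSD(s) are full / near full
>     ceph pg set_nearfull_ratio 0.90   # raise the near-full warning threshold
>     ceph pg set_full_ratio 0.98       # raise the full (write-blocking) threshold
>     ceph osd reweight 7 0.8           # or temporarily lower the full OSD's weight to push data off it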
> 
> I am now back to having all 12 OSDs in, with the hopefully final recovery 
> under way while it re-balances the OSDs. Although I note I am still getting 
> the full-OSD warning, I expect this to disappear soon now that the 12th OSD 
> is back online.
> 
> During this recovery the percentage degraded has been a little confusing. 
> While the 12th OSD was offline the percentages were around 15-20% IIRC, but 
> now I see the percentage is 35% and slowly dropping; I am not sure I 
> understand the ratios, or why they are so high with a single missing OSD.
> 
> A few documentation errors caused confusion too.
> 
> This page still contains errors in the steps to create a new OSD (manually):
> 
> http://eu.ceph.com/docs/wip-3060/cluster-ops/add-or-rm-osds/#adding-an-osd-manual
> 
> "ceph osd create {osd-num}" should be "ceph osd create"
> 
> 
> and this:
> 
> http://eu.ceph.com/docs/wip-3060/cluster-ops/crush-map/#addosd
> 
> I had to put host= to get the command accepted.
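> 
> (i.e. something along these lines, with the host bucket spelled out; the id, 
> weight and hostname are illustrative, and the exact argument order is as per 
> that page:)
> 
>     ceph osd crush add osd.11 1.0 host=storage-node-3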
> 
> Suggestions and questions:
> 
> 1. Is there a way to get documentation pages fixed, or at least to put 
> health warnings on them: "This page badly needs updating since it is 
> wrong/misleading"?
> 
> 2. We need a small set of definitive succinct recipes that provide steps to 
> recover from common failures with a narrative around what to expect at each 
> step (your cluster will be in recovery here...).
> 
> 3. Some commands are throwing spurious errors that are actually benign: 
> "ceph-osd -i 10 --mkfs --mkkey" complains about failures that are expected 
> because the OSD is initially empty.
> 
> 4. An easier way to capture the state of the cluster for analysis. I don't 
> feel confident, when asked for "logs", that I am giving the most useful 
> snippets or the complete story. It seems we need a tool that can gather all 
> this in a neat bundle for later dissection or forensics.
> 
> 5. Is there a more straightforward (faster) way of getting an OSD back 
> online? It almost seems worth having a standby OSD ready to step in and 
> assume duties (a hot spare?).
> 
> 6. Is there a way to make the CRUSH map less sensitive to changes during 
> recovery operations? I would have liked to stall/slow recovery while I 
> replaced the OSD and then let it run at full speed.
> 
> Excuses:
> 
> I'd be happy to action suggestions, but my current level of Ceph 
> understanding is still limited enough that effort on my part would be 
> unproductive; I am prodding the community to see if there is consensus on 
> the need.
> 
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
