Thanks Irek. The number of replicas is 3. I have 3 servers with 2 OSDs each on a 1G switch (1 OSD already decommissioned), which is further connected to a new 10G switch/network with 3 servers on it, each with 12 OSDs. I'm decommissioning the old 3 nodes on the 1G network...

So you suggest removing the whole node with its 2 OSDs manually from the crush map? As far as I know, ceph never places 2 replicas on 1 node - all 3 replicas were originally distributed over all 3 nodes. So it should anyway be safe to remove both OSDs at once, together with the node itself, since the replica count is 3...?
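Just to be sure I understood you correctly, this is roughly the sequence I have in mind for one old node (only my own sketch, with made-up OSD IDs 4 and 5 and a made-up host name "oldnode1" - please correct me if the order is wrong):

ceph osd out 4
ceph osd out 5
(wait for recovery to finish and the cluster to return to HEALTH_OK)
ceph osd crush remove osd.4
ceph osd crush remove osd.5
ceph auth del osd.4
ceph auth del osd.5
ceph osd rm 4
ceph osd rm 5
ceph osd crush remove oldnode1

Or alternatively, decompile the crush map, delete the host bucket together with its OSDs in a single edit, compile it back in, and let it heal, so the data only moves once.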
Thanks again for your time.

On Mar 3, 2015 1:35 PM, "Irek Fasikhov" <malm...@gmail.com> wrote:

> Since you have only three nodes in the cluster, I recommend you add the
> new nodes to the cluster first, and then delete the old ones.
>
> 2015-03-03 15:28 GMT+03:00 Irek Fasikhov <malm...@gmail.com>:
>
>> What is your replication count?
>>
>> 2015-03-03 15:14 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
>>
>>> Hi Irek,
>>>
>>> yes, stopping the OSD (or setting it OUT) resulted in only 3% of data
>>> degraded and moved/recovered.
>>> When I afterwards removed it from the crush map with "ceph osd crush rm id",
>>> that's when the 37% happened.
>>>
>>> And thanks for the help, Irek - could you kindly let me know the
>>> preferred steps when removing a whole node?
>>> Do you mean I should first stop all its OSDs again, or just remove each
>>> OSD from the crush map, or perhaps decompile the crush map, delete the
>>> node completely, compile it back in, and let it heal/recover?
>>>
>>> Do you think this would result in less data being misplaced and moved
>>> around?
>>>
>>> Sorry for bugging you, I really appreciate your help.
>>>
>>> Thanks
>>>
>>> On 3 March 2015 at 12:58, Irek Fasikhov <malm...@gmail.com> wrote:
>>>
>>>> A large percentage of the cluster map gets rebuilt (but the degradation
>>>> percentage stays low). If you had not run "ceph osd crush rm id", the
>>>> percentage would have been low.
>>>> In your case, the correct option is to remove the entire node rather
>>>> than each disk individually.
>>>>
>>>> 2015-03-03 14:27 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
>>>>
>>>>> Another question - I mentioned here that 37% of objects were being
>>>>> moved around - these are MISPLACED objects (degraded objects were
>>>>> 0.001%), after I removed 1 OSD from the crush map (out of 44 OSDs or so).
>>>>>
>>>>> Can anybody confirm this is normal behaviour - and are there any
>>>>> workarounds?
>>>>>
>>>>> I understand this is because of CEPH's object placement algorithm, but
>>>>> still, 37% of objects misplaced just by removing 1 OSD out of 44 from
>>>>> the crush map makes me wonder why the percentage is this large.
>>>>>
>>>>> It seems not good to me, and I have to remove another 7 OSDs (we are
>>>>> demoting some old hardware nodes). This means I could potentially see
>>>>> 7 x the same number of misplaced objects...?
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Thanks
>>>>>
>>>>> On 3 March 2015 at 12:14, Andrija Panic <andrija.pa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Irek.
>>>>>>
>>>>>> Does this mean that after peering, each PG gets a 10 sec delay,
>>>>>> meaning that every once in a while I will have 10 sec of the cluster
>>>>>> NOT being stressed/overloaded, then recovery takes place for that PG,
>>>>>> then for another 10 sec the cluster is fine, and then it is stressed
>>>>>> again?
>>>>>>
>>>>>> I'm trying to understand the process before actually doing stuff (the
>>>>>> config reference is there on ceph.com but I don't fully understand
>>>>>> the process).
>>>>>>
>>>>>> Thanks,
>>>>>> Andrija
>>>>>>
>>>>>> On 3 March 2015 at 11:32, Irek Fasikhov <malm...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi.
>>>>>>>
>>>>>>> Use the "osd_recovery_delay_start" value.
>>>>>>> Example:
>>>>>>> [root@ceph08 ceph]# ceph --admin-daemon
>>>>>>> /var/run/ceph/ceph-osd.94.asok config show | grep
>>>>>>> osd_recovery_delay_start
>>>>>>> "osd_recovery_delay_start": "10"
>>>>>>>
>>>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it
>>>>>>>> caused over 37% of the data to rebalance - let's say this is fine
>>>>>>>> (this is when I removed it from the crush map).
>>>>>>>>
>>>>>>>> I'm wondering - I had previously set some throttling mechanisms,
>>>>>>>> but during the first 1h of rebalancing the recovery rate went up to
>>>>>>>> 1500 MB/s and the VMs were completely unusable; then during the last
>>>>>>>> 4h of the recovery the rate went down to, say, 100-200 MB/s, and VM
>>>>>>>> performance was still pretty impacted, but at least I could more or
>>>>>>>> less work.
>>>>>>>>
>>>>>>>> So my question: is this behaviour expected, and is the throttling
>>>>>>>> working as expected? During the first 1h almost no throttling seemed
>>>>>>>> to be applied, judging by the 1500 MB/s recovery rate and the impact
>>>>>>>> on the VMs, while the last 4h seemed pretty fine (although there was
>>>>>>>> still a lot of impact in general).
>>>>>>>>
>>>>>>>> I changed the throttling on the fly with:
>>>>>>>>
>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
>>>>>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
>>>>>>>>
>>>>>>>> My journals are on SSDs (12 OSDs per server, of which 6 journals
>>>>>>>> are on one SSD and 6 journals on another SSD) - I have 3 of these
>>>>>>>> hosts.
>>>>>>>>
>>>>>>>> Any thoughts are welcome.
>>>>>>>> --
>>>>>>>>
>>>>>>>> Andrija Panić
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards, Irek Fasikhov
>>>>>>> Mob.: +79229045757
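PS: just to confirm the throttling side of it, this is the full set of values I plan to keep injected while removing the old nodes (again only my own sketch - the delay value is taken from your example above, please correct me if it should be something else):

ceph tell osd.* injectargs '--osd_recovery_delay_start 10'
ceph tell osd.* injectargs '--osd_recovery_max_active 1'
ceph tell osd.* injectargs '--osd_max_backfills 1'
ceph tell osd.* injectargs '--osd_recovery_op_priority 1'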
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com