Hi Robert, I already have this stuff set. Ceph is 0.87.0 now...
Thanks, will schedule this for the weekend. 10G network and 36 OSDs, so it
should move the data in less than 8h; per my last experience it took around
8h, but some 1G OSDs were included... Thx!

On 4 March 2015 at 17:49, Robert LeBlanc <rob...@leblancnet.us> wrote:
> You will most likely have a very high relocation percentage. Backfills
> are always more impactful on smaller clusters, but "osd max backfills"
> should be what you need to help reduce the impact. The default is 10;
> you will want to use 1.
>
> I didn't catch which version of Ceph you are running, but I think
> there was some priority work done in Firefly to make backfills
> lower priority. I think it has gotten better in later versions.
>
> On Wed, Mar 4, 2015 at 1:35 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:
> > Thank you, Robert. I'm wondering, when I remove a total of 7 OSDs from
> > the crush map, whether that will cause more than 37% of the data to be
> > moved (80% or whatever).
> >
> > I'm also wondering whether the throttling that I applied is fine or not;
> > I will introduce the osd_recovery_delay_start of 10 sec as Irek said.
> >
> > I'm just wondering how much the performance impact will be, because:
> > - when stopping an OSD, the impact while backfilling was fine, more or
> >   less; I can live with this
> > - when I removed the OSD from the crush map, the impact was tremendous
> >   for the first 1h or so, and later on during the recovery process the
> >   impact was much less, but still noticeable...
> >
> > Thanks for the tip, of course!
> > Andrija
> >
> > On 3 March 2015 at 18:34, Robert LeBlanc <rob...@leblancnet.us> wrote:
> >>
> >> I would be inclined to shut down both OSDs in a node and let the
> >> cluster recover. Once it is recovered, shut down the next two and let
> >> it recover. Repeat until all the OSDs are taken out of the cluster.
> >> Then I would set nobackfill and norecover, remove the hosts/disks from
> >> the CRUSH map, and then unset nobackfill and norecover.
> >>
> >> That should give you a few small changes (when you shut down OSDs) and
> >> then one big one to get everything into its final place. If you are
> >> still adding new nodes, you can add them in while nobackfill and
> >> norecover are set, so that the one big relocation fills the new drives
> >> too.
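For reference, a minimal sketch of the drain procedure Robert describes,
assuming a sysvinit-era setup (Ceph 0.87); the OSD IDs (osd.0/osd.1) and the
host bucket name (node01) are placeholders to adapt to your cluster:

    # Per retiring node: stop its OSDs and let the cluster recover
    service ceph stop osd.0
    service ceph stop osd.1
    ceph -w                        # watch until PGs are active+clean again

    # Once every retiring OSD is down and the cluster is healthy, freeze
    # data movement so all the CRUSH edits collapse into a single remap
    ceph osd set nobackfill
    ceph osd set norecover

    # Remove the drained disks and hosts from the CRUSH map
    ceph osd crush rm osd.0
    ceph osd crush rm osd.1
    ceph osd crush rm node01       # hypothetical host bucket name
    ceph auth del osd.0
    ceph osd rm 0

    # Unset the flags: one big relocation instead of many small ones
    ceph osd unset nobackfill
    ceph osd unset norecover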
> >>
> >> On Tue, Mar 3, 2015 at 5:58 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:
> >> > Thx Irek. The number of replicas is 3.
> >> >
> >> > I have 3 servers with 2 OSDs each on a 1G switch (1 OSD already
> >> > decommissioned), which is further connected to a new 10G
> >> > switch/network with 3 servers on it with 12 OSDs each.
> >> > I'm decommissioning the old 3 nodes on the 1G network...
> >> >
> >> > So you suggest removing the whole node with 2 OSDs manually from the
> >> > crush map?
> >> > To my knowledge, Ceph never places 2 replicas on 1 node; all 3
> >> > replicas were originally distributed over all 3 nodes. So anyway, it
> >> > should be safe to remove 2 OSDs at once together with the node
> >> > itself, since the replica count is 3...?
> >> >
> >> > Thx again for your time
> >> >
> >> > On Mar 3, 2015 1:35 PM, "Irek Fasikhov" <malm...@gmail.com> wrote:
> >> >>
> >> >> Since you have only three nodes in the cluster,
> >> >> I recommend you add the new nodes to the cluster first, and then
> >> >> delete the old ones.
> >> >>
> >> >> 2015-03-03 15:28 GMT+03:00 Irek Fasikhov <malm...@gmail.com>:
> >> >>>
> >> >>> What replication count do you have?
> >> >>>
> >> >>> 2015-03-03 15:14 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >> >>>>
> >> >>>> Hi Irek,
> >> >>>>
> >> >>>> yes, stopping the OSD (or setting it to OUT) resulted in only 3%
> >> >>>> of the data being degraded and moved/recovered.
> >> >>>> It was when I afterwards removed it from the crush map with "ceph
> >> >>>> osd crush rm id" that the stuff with 37% happened.
> >> >>>>
> >> >>>> And thanks, Irek, for the help; could you kindly just let me know
> >> >>>> the preferred steps when removing a whole node?
> >> >>>> Do you mean I first stop all OSDs again, or just remove each OSD
> >> >>>> from the crush map, or perhaps just decompile the crush map,
> >> >>>> delete the node completely, compile it back in, and let it
> >> >>>> heal/recover?
> >> >>>>
> >> >>>> Do you think this would result in less data being misplaced and
> >> >>>> moved around?
> >> >>>>
> >> >>>> Sorry for bugging you, I really appreciate your help.
> >> >>>>
> >> >>>> Thanks
> >> >>>>
> >> >>>> On 3 March 2015 at 12:58, Irek Fasikhov <malm...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> A large percentage of the cluster map gets rebuilt (but the
> >> >>>>> degradation percentage stays low). If you had not run "ceph osd
> >> >>>>> crush rm id", the percentage would have been low.
> >> >>>>> In your case, the correct option is to remove the entire node,
> >> >>>>> rather than each disk individually.
> >> >>>>>
> >> >>>>> 2015-03-03 14:27 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >> >>>>>>
> >> >>>>>> Another question: I mentioned here 37% of objects being moved
> >> >>>>>> around; these are MISPLACED objects (degraded objects were
> >> >>>>>> 0.001%) after I removed 1 OSD from the crush map (out of 44 OSDs
> >> >>>>>> or so).
> >> >>>>>>
> >> >>>>>> Can anybody confirm this is normal behaviour, and are there any
> >> >>>>>> workarounds?
> >> >>>>>>
> >> >>>>>> I understand this is because of CEPH's object placement
> >> >>>>>> algorithm, but still, 37% of objects misplaced just by removing
> >> >>>>>> 1 OSD out of 44 from the crush map makes me wonder why this
> >> >>>>>> percentage is so large.
> >> >>>>>>
> >> >>>>>> It seems not good to me, and I have to remove another 7 OSDs (we
> >> >>>>>> are demoting some old hardware nodes). This means I could
> >> >>>>>> potentially end up with 7 x the same number of misplaced
> >> >>>>>> objects...?
> >> >>>>>>
> >> >>>>>> Any thoughts?
> >> >>>>>>
> >> >>>>>> Thanks
> >> >>>>>>
> >> >>>>>> On 3 March 2015 at 12:14, Andrija Panic <andrija.pa...@gmail.com> wrote:
> >> >>>>>>>
> >> >>>>>>> Thanks Irek.
> >> >>>>>>>
> >> >>>>>>> Does this mean that after peering for each PG there will be a
> >> >>>>>>> delay of 10 sec, meaning that every once in a while I will have
> >> >>>>>>> 10 sec of the cluster NOT being stressed/overloaded, then
> >> >>>>>>> recovery takes place for that PG, then for another 10 sec the
> >> >>>>>>> cluster is fine, and then it is stressed again?
> >> >>>>>>>
> >> >>>>>>> I'm trying to understand the process before actually doing this
> >> >>>>>>> stuff (the config reference is there on ceph.com, but I don't
> >> >>>>>>> fully understand the process).
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>> Andrija
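For the decompile-and-edit route Andrija asks about, the usual sequence
looks roughly like this (a sketch; the file names are arbitrary and the host
bucket you delete must match the one in your own CRUSH map):

    # Export and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # Edit crushmap.txt: delete the retiring host bucket and the
    # "item <host> weight ..." line that references it under its root/rack

    # Recompile and inject the edited map; one big remap follows
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new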
> >> >>>>>>>
> >> >>>>>>> On 3 March 2015 at 11:32, Irek Fasikhov <malm...@gmail.com> wrote:
> >> >>>>>>>>
> >> >>>>>>>> Hi.
> >> >>>>>>>>
> >> >>>>>>>> Use the "osd_recovery_delay_start" value, for example:
> >> >>>>>>>> [root@ceph08 ceph]# ceph --admin-daemon
> >> >>>>>>>> /var/run/ceph/ceph-osd.94.asok config show | grep
> >> >>>>>>>> osd_recovery_delay_start
> >> >>>>>>>>   "osd_recovery_delay_start": "10"
> >> >>>>>>>>
> >> >>>>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >> >>>>>>>>>
> >> >>>>>>>>> Hi guys,
> >> >>>>>>>>>
> >> >>>>>>>>> Yesterday I removed 1 OSD from the cluster (out of 42 OSDs),
> >> >>>>>>>>> and it caused over 37% of the data to rebalance; let's say
> >> >>>>>>>>> this is fine (this was when I removed it from the crush map).
> >> >>>>>>>>>
> >> >>>>>>>>> I'm wondering: I had previously set some throttling
> >> >>>>>>>>> mechanisms, but during the first 1h of rebalancing my
> >> >>>>>>>>> recovery rate was going up to 1500 MB/s, and the VMs were
> >> >>>>>>>>> completely unusable; then for the last 4h of the recovery the
> >> >>>>>>>>> rate went down to, say, 100-200 MB/s, during which VM
> >> >>>>>>>>> performance was still pretty impacted, but at least I could
> >> >>>>>>>>> work more or less.
> >> >>>>>>>>>
> >> >>>>>>>>> So my question: is this behaviour expected, and is the
> >> >>>>>>>>> throttling here working as expected? During the first 1h
> >> >>>>>>>>> almost no throttling seemed to be applied, judging by the
> >> >>>>>>>>> 1500 MB/s recovery rate and the impact on VMs, while the last
> >> >>>>>>>>> 4h seemed pretty fine (although still with a lot of impact in
> >> >>>>>>>>> general).
> >> >>>>>>>>>
> >> >>>>>>>>> I changed these throttles on the fly with:
> >> >>>>>>>>>
> >> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
> >> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
> >> >>>>>>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
> >> >>>>>>>>>
> >> >>>>>>>>> My journals are on SSDs (12 OSDs per server, with 6 journals
> >> >>>>>>>>> on one SSD and 6 journals on the other); I have 3 of these
> >> >>>>>>>>> hosts.
> >> >>>>>>>>>
> >> >>>>>>>>> Any thoughts are welcome.
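Putting Irek's suggestion together with the throttles already listed above,
the full set of runtime settings would look something like this (a sketch
using the values discussed in the thread; note that injectargs changes are
not persistent, so mirror them under [osd] in ceph.conf if they should
survive restarts):

    # Throttle recovery/backfill on all OSDs at runtime
    ceph tell osd.* injectargs '--osd_max_backfills 1'
    ceph tell osd.* injectargs '--osd_recovery_max_active 1'
    ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
    ceph tell osd.* injectargs '--osd_recovery_delay_start 10'

    # Verify the values on one OSD through its admin socket
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show \
        | grep -E 'osd_max_backfills|osd_recovery'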
--
Andrija Panić
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com