I've been in a state where reweight-by-utilization was deadlocked (not the daemons, but the remap scheduling). After successive ceph osd reweight commands, two OSDs each wanted to swap PGs with the other, but both of those backfills were toofull. I ended up temporarily increasing mon_osd_nearfull_ratio to 0.87. That removed the impediment, the remapping finished smoothly, and I changed the ratio back once it was done.
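For anyone who hits the same wedge, this is roughly the sequence I used. Treat it as a sketch rather than a recipe: 0.87 is just the value that worked for my utilization, injectargs changes are runtime-only, and depending on your release you may also need to raise the OSD-side osd_backfill_full_ratio before the backfill_toofull state clears.

    # see which OSDs are nearfull/toofull and how much headroom is left
    ceph health detail
    ceph df

    # temporarily raise the nearfull ratio on the monitors (runtime only)
    ceph tell mon.* injectargs '--mon_osd_nearfull_ratio 0.87'

    # some releases also gate backfill on the OSD-side ratio
    ceph tell osd.* injectargs '--osd_backfill_full_ratio 0.87'

    # watch the backfill_toofull PGs drain
    ceph -s

    # put everything back once remapping has finished
    ceph tell mon.* injectargs '--mon_osd_nearfull_ratio 0.85'
    ceph tell osd.* injectargs '--osd_backfill_full_ratio 0.85'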
Just be careful if you need to get close to mon_osd_full_ratio. Ceph compares these ratios with greater-than, not greater-than-or-equal, and you really don't want any disk to go over mon_osd_full_ratio, because all external IO will stop until you resolve that. (A rough sketch of the commands involved follows the quoted thread below.)

On Mon, Oct 20, 2014 at 10:18 AM, Leszek Master <keks...@gmail.com> wrote:
> You can set lower weight on full osds, or try changing the
> osd_near_full_ratio parameter in your cluster from 85 to for example 89.
> But i don't know what can go wrong when you do that.
>
> 2014-10-20 17:12 GMT+02:00 Wido den Hollander <w...@42on.com>:
>
>> On 10/20/2014 05:10 PM, Harald Rößler wrote:
>> > yes, tomorrow I will get the replacement of the failed disk, to get a
>> > new node with many disks will take a few days.
>> > No other idea?
>>
>> If the disks are all full, then, no.
>>
>> Sorry to say this, but it came down to poor capacity management. Never
>> let any disk in your cluster fill over 80% to prevent these situations.
>>
>> Wido
>>
>> > Harald Rößler
>> >
>> >> Am 20.10.2014 um 16:45 schrieb Wido den Hollander <w...@42on.com>:
>> >>
>> >> On 10/20/2014 04:43 PM, Harald Rößler wrote:
>> >>> Yes, I had some OSDs which were near full, after that I tried to fix
>> >>> the problem with "ceph osd reweight-by-utilization", but this does not
>> >>> help. After that I set the near full ratio to 88% with the idea that
>> >>> the remapping would fix the issue. Also a restart of the OSD doesn't
>> >>> help. At the same time I had a hardware failure of one disk. :-(.
>> >>> After that failure the recovery process starts at "degraded ~ 13%"
>> >>> and stops at 7%.
>> >>> Honestly I am scared in the moment that I am doing the wrong operation.
>> >>
>> >> Any chance of adding a new node with some fresh disks? Seems like you
>> >> are operating on the storage capacity limit of the nodes and that your
>> >> only remedy would be adding more spindles.
>> >>
>> >> Wido
>> >>
>> >>> Regards
>> >>> Harald Rößler
>> >>>
>> >>>> Am 20.10.2014 um 14:51 schrieb Wido den Hollander <w...@42on.com>:
>> >>>>
>> >>>> On 10/20/2014 02:45 PM, Harald Rößler wrote:
>> >>>>> Dear All
>> >>>>>
>> >>>>> I have at the moment an issue with my cluster. The recovery
>> >>>>> process stops.
>> >>>>
>> >>>> See this: 2 active+degraded+remapped+backfill_toofull
>> >>>>
>> >>>> 156 pgs backfill_toofull
>> >>>>
>> >>>> You have one or more OSDs which are too full and that causes
>> >>>> recovery to stop.
>> >>>>
>> >>>> If you add more capacity to the cluster recovery will continue
>> >>>> and finish.
>> >>>>
>> >>>>> ceph -s
>> >>>>>     health HEALTH_WARN 188 pgs backfill; 156 pgs backfill_toofull; 4 pgs backfilling; 55 pgs degraded; 49 pgs recovery_wait; 297 pgs stuck unclean; recovery 111487/1488290 degraded (7.491%)
>> >>>>>     monmap e2: 3 mons at {0=10.99.10.10:6789/0,12=10.99.10.22:6789/0,6=10.99.10.16:6789/0}, election epoch 332, quorum 0,1,2 0,12,6
>> >>>>>     osdmap e6748: 24 osds: 23 up, 23 in
>> >>>>>     pgmap v43314672: 3328 pgs: 3031 active+clean, 43 active+remapped+wait_backfill, 3 active+degraded+wait_backfill, 96 active+remapped+wait_backfill+backfill_toofull, 31 active+recovery_wait, 19 active+degraded+wait_backfill+backfill_toofull, 36 active+remapped, 3 active+remapped+backfilling, 18 active+remapped+backfill_toofull, 6 active+degraded+remapped+wait_backfill, 15 active+recovery_wait+remapped, 21 active+degraded+remapped+wait_backfill+backfill_toofull, 1 active+recovery_wait+degraded, 1 active+degraded+remapped+backfilling, 2 active+degraded+remapped+backfill_toofull, 2 active+recovery_wait+degraded+remapped; 1698 GB data, 5206 GB used, 971 GB / 6178 GB avail; 24382B/s rd, 12411KB/s wr, 320op/s; 111487/1488290 degraded (7.491%)
>> >>>>>
>> >>>>> I have tried to restart all OSD in the cluster, but does not help
>> >>>>> to finish the recovery of the cluster.
>> >>>>>
>> >>>>> Have someone any idea
>> >>>>>
>> >>>>> Kind Regards
>> >>>>> Harald Rößler
>> >>>>
>> >>>> --
>> >>>> Wido den Hollander
>> >>>> Ceph consultant and trainer
>> >>>> 42on B.V.
>> >>>>
>> >>>> Phone: +31 (0)20 700 9902
>> >>>> Skype: contact42on
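Coming back to the note above about mon_osd_full_ratio, and to Leszek's suggestion of lowering the weight of the full OSDs: here is a rough sketch of the knobs involved. The values are examples, not recommendations, and "ceph osd df" only exists on newer releases; on older ones "ceph pg dump osds" shows per-OSD usage.

    # per-OSD utilization, to see who is closest to the full ratio
    ceph osd df            # newer releases
    ceph pg dump osds      # older releases: kb_used / kb per OSD

    # push data off the fullest OSDs by lowering their override weight
    # (0.0 - 1.0; this is the same knob reweight-by-utilization adjusts)
    ceph osd reweight <osd-id> 0.9

    # or let Ceph pick the overloaded OSDs itself, e.g. everything more
    # than 20% above the average utilization
    ceph osd reweight-by-utilization 120

Keep an eye on the fullest single OSD rather than the cluster average: once any one OSD crosses mon_osd_full_ratio the cluster is flagged full and client writes are blocked until you free up space or raise the ratio.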