Any thoughts?

On Tue, Mar 14, 2017 at 10:22 PM, Alejandro Comisario <alejan...@nubeliu.com> wrote:
> Greg, thanks for the reply.
> True that I can't provide enough information to know what happened, since the pool is gone.
>
> But based on your experience, could I please take some of your time and ask for your top 5 guesses as to what could happen / what would be the reason for what happened to that pool (or any pool) that makes Ceph (maybe specifically in Hammer) behave like that?
>
> Information that I think will be of value: the cluster was 5 nodes large, running "0.94.6-1trusty". I added two nodes running the latest "0.94.9-1trusty", and replication onto those new disks never finished, since I saw weird errors on the new OSDs. I thought the packages needed to be the same, so I "apt-get upgraded" the 5 old nodes without restarting anything, and rebalancing then proceeded without errors (weird).
>
> After those two nodes reached 100% of the disks' weight, the cluster worked perfectly for about two weeks, until this happened.
> After the resolution from my first email, everything has been working perfectly.
>
> Thanks for the responses.
>
>
> On Fri, Mar 10, 2017 at 4:23 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>
>>
>>
>> On Tue, Mar 7, 2017 at 10:18 AM Alejandro Comisario <alejan...@nubeliu.com> wrote:
>>
>>> Gregory, thanks for the response. What you've said is by far the most enlightening thing I have learned about Ceph in a long time.
>>>
>>> What raises even greater doubt is that this "non-functional" pool was only 1.5 GB large, versus 50-150 GB for the other affected pools. The tiny pool was still being used, and just because that pool was blocking requests, the whole cluster was unresponsive.
>>>
>>> So, what do you mean by a "non-functional" pool? How can a pool become non-functional? And what assures me that tomorrow (given that I deleted the 1.5 GB pool to fix the whole problem) another pool won't become non-functional?
>>>
>>
>> Well, you said there were a bunch of slow requests. That can happen any number of ways, if you're overloading the OSDs or something.
>> When there are slow requests, those ops take up OSD memory and throttle, and so they don't let in new messages until the old ones are serviced. This can cascade across a cluster -- because everything is interconnected, clients and OSDs end up with all their requests targeted at the slow OSDs, which aren't letting in new IO quickly enough. It's one of the weaknesses of the standard deployment patterns, but it usually doesn't come up unless something else has gone pretty wrong first.
>> As for what actually went wrong here, you haven't provided nearly enough information, and probably can't now that the pool has been deleted. *shrug*
>> -Greg
>>
>>
>>
>>
>>> Ceph bug?
>>> Another bug?
>>> Something that can be avoided?
>>>
>>>
>>> On Tue, Mar 7, 2017 at 2:11 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> Some facts:
>>> The OSDs use a lot of gossip protocols to distribute information.
>>> The OSDs limit how many client messages they let into the system at a time.
>>> The OSDs do not distinguish between client ops for different pools (the blocking happens before they have any idea what the target is).
>>>
>>> So, yes: if you have a non-functional pool and clients keep trying to access it, those requests can fill up the OSD memory queues and block access to other pools as it cascades across the system.
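
(For what it's worth, below is a minimal sketch of how the per-OSD client-message throttle Greg describes can be inspected on a live OSD. It assumes the Hammer-era option and perf-counter names -- osd_client_message_cap, osd_client_message_size_cap, throttle-osd_client_messages -- and a reachable admin socket; osd.12 is only an example id, not one from this cluster.)

    # Which client requests are currently stuck on this OSD, and for how long?
    ceph daemon osd.12 dump_ops_in_flight
    ceph daemon osd.12 dump_historic_ops

    # The message throttle shows up as perf counters; "val" close to "max"
    # suggests this OSD has stopped letting in new client messages.
    ceph daemon osd.12 perf dump | grep -A 7 '"throttle-osd_client'

    # The limits themselves (per-OSD message count and total bytes in flight)
    ceph daemon osd.12 config get osd_client_message_cap
    ceph daemon osd.12 config get osd_client_message_size_cap
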
>>>
>>> On Sun, Mar 5, 2017 at 6:22 PM Alejandro Comisario <alejan...@nubeliu.com> wrote:
>>>
>>> Hi, we have a 7-node Ubuntu Ceph Hammer cluster (78 OSDs to be exact).
>>> This weekend we've experienced a huge outage for our customers' VMs (located on pool CUSTOMERS, replica size 3) when lots of OSDs started to slow-request/block PGs on pool PRIVATE (replica size 1). Basically, all the blocked PGs had just one OSD in the acting set, yet all the customers on the other pool got their VMs almost frozen.
>>>
>>> While trying basic troubleshooting, like setting noout and then bringing down the OSD that slowed/blocked the most, immediately another OSD slowed/locked IOPS on PGs from the same PRIVATE pool, so we rolled back that change and started to move data around with the same logic (reweighting down those OSDs), with exactly the same result.
>>>
>>> So we made a decision: we deleted the pool whose PGs were always the ones slowed/locked, regardless of the OSD.
>>>
>>> Not even 10 seconds after the pool deletion, not only were there no more degraded PGs, but ALL slow IOPS disappeared for good, and performance for hundreds of VMs came back to normal immediately.
>>>
>>> I must say I was kind of scared to see that happen, basically because only ONE pool's PGs were ever slowed, yet the performance hit landed on the other pool. Aren't the PGs that exist in one pool separate from those of the other?
>>> If my assertion is true, why did OSDs locking IOPS on one pool's PGs slow down all the other PGs from other pools?
>>>
>>> Again, I just deleted a pool that had almost no traffic, because its PGs were locked and were affecting PGs in another pool, and as soon as that happened, the whole cluster came back to normal (and of course, HEALTH_OK and no slow transactions whatsoever).
>>>
>>> Please, someone help me understand what I am missing, since this, as far as my Ceph knowledge goes, makes no sense.
>>>
>>> PS: I have found someone who, it looks like, went through the same thing here:
>>> https://forum.proxmox.com/threads/ceph-osd-failure-causing-proxmox-node-to-crash.20781/
>>> but I still don't understand what happened.
>>>
>>> Hoping to get help from the community.
>>>
>>> --
>>> Alejandrito.
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>>
>>>
>>> --
>>> *Alejandro Comisario*
>>> *CTO | NUBELIU*
>>> E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
>>> _
>>> www.nubeliu.com
>>>
>>
>
>
> --
> *Alejandro Comisario*
> *CTO | NUBELIU*
> E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
> _
> www.nubeliu.com
>

--
*Alejandro Comisario*
*CTO | NUBELIU*
E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
_
www.nubeliu.com
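
(For readers who find this thread later: the troubleshooting steps described in the quoted message above correspond roughly to the commands below. This is only a sketch against the Hammer-era CLI as I recall it; the pool name PRIVATE comes from the thread, while osd.42 and the 0.8 weight are purely illustrative, and deleting a pool is of course irreversible.)

    # Which requests are blocked, and which OSDs are implicated?
    ceph health detail

    # PGs of the suspect pool and the OSDs in their acting sets
    ceph pg ls-by-pool PRIVATE

    # Keep the cluster from marking OSDs out while experimenting,
    # and temporarily shift data away from a suspect OSD
    ceph osd set noout
    ceph osd reweight 42 0.8
    ceph osd unset noout

    # Last resort taken here: drop the pool whose PGs kept blocking
    # (the pool name must be given twice, plus the confirmation flag)
    ceph osd pool delete PRIVATE PRIVATE --yes-i-really-really-mean-it
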
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com