Definitely in our case the OSDs were not the guilty ones, since all the OSDs
that were blocking requests, always from the same pool, worked flawlessly (and
still do) after we deleted the pool where we always saw the blocked PGs.
Since the pool was accessed by just one client, and had almost no ops to it,
I think Greg (who appears to be a Ceph committer) basically said he was
interested in looking at it, if only you had the pool that failed this way.
Why not try to reproduce it, and make a log of your procedure so he can
reproduce it too? What caused the slow requests... copy on write from
snapshot
Any thoughts?
On Tue, Mar 14, 2017 at 10:22 PM, Alejandro Comisario wrote:
> Greg, thanks for the reply.
> True that I can't provide enough information to know what happened, since
> the pool is gone.
>
> But based on your experience, could I please take some of your time and
> have you give me the TOP 5 f
Greg, thanks for the reply.
True that I can't provide enough information to know what happened, since the
pool is gone.
But based on your experience, could I please take some of your time and have
you give me the TOP 5 of the possible causes of what happened to that pool
(or any pool
On Tue, Mar 7, 2017 at 10:18 AM Alejandro Comisario
wrote:
> Gregory, thanks for the response. What you've said is by far the most
> enlightening thing I have learned about Ceph in a long time.
>
> What raises even greater doubt is that this "non-functional" pool was
> only 1.5GB large, vs 50-150GB o
Any thoughts?
On Tue, Mar 7, 2017 at 3:17 PM, Alejandro Comisario
wrote:
> Gregory, thanks for the response. What you've said is by far the most
> enlightening thing I have learned about Ceph in a long time.
>
> What raises even greater doubt is that this "non-functional" pool was
> only 1.5GB larg
Gregory, thanks for the response. What you've said is by far the most
enlightening thing I have learned about Ceph in a long time.
What raises even greater doubt is that this "non-functional" pool was
only 1.5GB large, vs 50-150GB on the other affected pools, the tiny pool
was still being used, and ju
Some facts:
The OSDs use a lot of gossip protocols to distribute information.
The OSDs limit how many client messages they let in to the system at a time.
The OSDs do not distinguish between client ops for different pools (the
blocking happens before they have any idea what the target is).
So, yes
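The facts above suggest how stuck ops on one pool can stall clients of another. A toy sketch of that effect follows; it is not Ceph code. The names `simulate`, `max_in_flight`, and `stuck_pools` are hypothetical, and a single shared cap that is taken before the target pool is inspected is a simplifying assumption based only on the description above.

```python
from collections import deque


def simulate(ops, max_in_flight, stuck_pools):
    """Toy model of one shared admission cap for all client ops.

    ops: list of (op_id, pool_name) in arrival order.
    max_in_flight: hypothetical cap on ops admitted at the same time.
    stuck_pools: pools whose ops never finish (assumption for the demo).
    Returns (completed, never_admitted).
    """
    waiting = deque(ops)
    in_flight = []   # admitted ops that still hold a slot
    completed = []   # admitted ops that finished and released their slot

    # Admission only checks the shared cap; the pool is examined afterwards.
    while waiting and len(in_flight) < max_in_flight:
        op_id, pool = waiting.popleft()
        if pool in stuck_pools:
            in_flight.append((op_id, pool))   # holds its slot forever
        else:
            completed.append((op_id, pool))   # finishes at once, slot is free again

    return completed, list(waiting)


ops = [
    (0, "PRIVATE"),
    (1, "CUSTOMERS"),
    (2, "PRIVATE"),
    (3, "CUSTOMERS"),
    (4, "CUSTOMERS"),
]
completed, never_admitted = simulate(ops, max_in_flight=2, stuck_pools={"PRIVATE"})
print("completed:", completed)
print("never admitted:", never_admitted)
```

In this model, once the stuck PRIVATE ops hold every slot, the remaining CUSTOMERS ops are never admitted even though their own pool is healthy.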
Hi, we have a 7-node Ubuntu Ceph Hammer pool (78 OSDs to be exact).
This weekend we've experienced a huge outage affecting our customers' VMs
(located on pool CUSTOMERS, replica size 3) when lots of OSDs
started to report slow requests/blocked PGs on pool PRIVATE (replica size 1),
basically all PGs blocked wh