Re: [ceph-users] Odd object blocking IO on PG

2017-12-15 Thread Gregory Farnum
For those following along at home, already done:
http://tracker.ceph.com/issues/22440#change-103779


On Fri, Dec 15, 2017 at 1:57 AM Brad Hubbard <bhubb...@redhat.com> wrote:

> On Wed, Dec 13, 2017 at 11:39 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> > Boom!! Fixed it. Not sure if the behavior I stumbled across is correct,
> > but this has the potential to break a few things for people moving from
> > Jewel to Luminous if they had a few too many PG’s.
> >
> >
> >
> > Firstly, how I stumbled across it. I whacked the logging up to max on
> OSD 68
> > and saw this mentioned in the logs
> >
> >
> >
> > osd.68 106454 maybe_wait_for_max_pg withhold creation of pg 0.1cf: 403 >=
> > 400
> >
> >
> >
> > This made me search through the code for this warning string
> >
> >
> >
> > https://github.com/ceph/ceph/blob/master/src/osd/OSD.cc#L4221
> >
> >
> >
> > Which jogged my memory about the changes in Luminous regarding max PG’s
> > warning, and in particular these two config options
> >
> > mon_max_pg_per_osd
> >
> > osd_max_pg_per_osd_hard_ratio
> >
> >
> >
> > In my cluster I have just over 200 PG’s per OSD, but the node containing
> > OSD.68 has 8TB disks instead of the 3TB disks in the rest of the cluster.
> > This means its OSD’s were taking on a lot more PG’s than the average would
> > suggest. So in Luminous 200x2 gives a hard limit of 400, which is what
> > that error message in the log suggests is the limit. I set the
> > osd_max_pg_per_osd_hard_ratio option to 3, restarted the OSD, and hey
> > presto, everything fell into line.
> >
> >
> >
> > Now a question. I get the idea around these settings: to stop people
> > making too many pools, or pools with too many PG’s. But is it correct
> > that they can break an existing pool which happens to be creating a new
> > PG on an OSD because the CRUSH layout has been modified?
>
> It would be good to capture this in a tracker, Nick, so it can be
> explored in more depth.
>
> >
> >
> >
> > Nick
> >
> >
> >
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Nick Fisk
> > Sent: 13 December 2017 11:14
> > To: 'Gregory Farnum' <gfar...@redhat.com>
> > Cc: 'ceph-users' <ceph-users@lists.ceph.com>
> > Subject: Re: [ceph-users] Odd object blocking IO on PG
> >
> >
> >
> >
> >
> > On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk <n...@fisk.me.uk> wrote:
> >
> >
> >> That doesn't look like an RBD object -- any idea who is
> >> "client.34720596.1:212637720"?
> >
> > So I think these might be proxy ops from the cache tier, as there are
> also
> > block ops on one of the cache tier OSD's, but this time it actually lists
> > the object name. Block op on cache tier.
> >
> >"description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
> > 17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head
> [set-alloc-hint
> > object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
> > RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
> > "initiated_at": "2017-12-12 16:25:32.435718",
> > "age": 13996.681147,
> > "duration": 13996.681203,
> > "type_data": {
> > "flag_point": "reached pg",
> > "client_info": {
> > "client": "client.34720596",
> > "client_addr": "10.3.31.41:0/2600619462",
> > "tid": 212637720
> >
> > I'm a bit baffled at the moment about what's going on. The pg query
> > (attached) is not showing in the main status that it has been blocked
> > from peering or that there are any missing objects. I've tried restarting
> > all OSD's I can see relating to the PG in case they needed a bit of a
> > nudge.
> >
> >
> >
> > Did that fix anything? I don't see anything immediately obvious but I'm
> not
> > practiced in quickly reading that pg state output.
> >
> >
> >
> > What's the output of "ceph -s"?
> >
> >
> >
> > Hi Greg,
> >
> >
> >
> > No, restarting OSD’s didn’t seem to help. But I did make some progress
> late
> > last night. By stopping OSD.68 the cluster unlocks its

Re: [ceph-users] Odd object blocking IO on PG

2017-12-14 Thread Brad Hubbard
On Wed, Dec 13, 2017 at 11:39 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> Boom!! Fixed it. Not sure if the behavior I stumbled across is correct, but
> this has the potential to break a few things for people moving from Jewel to
> Luminous if they had a few too many PG’s.
>
>
>
> Firstly, how I stumbled across it. I whacked the logging up to max on OSD 68
> and saw this mentioned in the logs
>
>
>
> osd.68 106454 maybe_wait_for_max_pg withhold creation of pg 0.1cf: 403 >=
> 400
>
>
>
> This made me search through the code for this warning string
>
>
>
> https://github.com/ceph/ceph/blob/master/src/osd/OSD.cc#L4221
>
>
>
> Which jogged my memory about the changes in Luminous regarding max PG’s
> warning, and in particular these two config options
>
> mon_max_pg_per_osd
>
> osd_max_pg_per_osd_hard_ratio
>
>
>
> In my cluster I have just over 200 PG’s per OSD, but the node containing
> OSD.68 has 8TB disks instead of the 3TB disks in the rest of the cluster.
> This means its OSD’s were taking on a lot more PG’s than the average would
> suggest. So in Luminous 200x2 gives a hard limit of 400, which is what that
> error message in the log suggests is the limit. I set the
> osd_max_pg_per_osd_hard_ratio option to 3, restarted the OSD, and hey
> presto, everything fell into line.
>
>
>
> Now a question. I get the idea around these settings: to stop people making
> too many pools, or pools with too many PG’s. But is it correct that they can
> break an existing pool which happens to be creating a new PG on an OSD
> because the CRUSH layout has been modified?

It would be good to capture this in a tracker, Nick, so it can be
explored in more depth.

>
>
>
> Nick
>
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: 13 December 2017 11:14
> To: 'Gregory Farnum' <gfar...@redhat.com>
> Cc: 'ceph-users' <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] Odd object blocking IO on PG
>
>
>
>
>
> On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk <n...@fisk.me.uk> wrote:
>
>
>> That doesn't look like an RBD object -- any idea who is
>> "client.34720596.1:212637720"?
>
> So I think these might be proxy ops from the cache tier, as there are also
> block ops on one of the cache tier OSD's, but this time it actually lists
> the object name. Block op on cache tier.
>
>"description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
> 17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head [set-alloc-hint
> object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
> RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
> "initiated_at": "2017-12-12 16:25:32.435718",
> "age": 13996.681147,
> "duration": 13996.681203,
> "type_data": {
> "flag_point": "reached pg",
> "client_info": {
> "client": "client.34720596",
> "client_addr": "10.3.31.41:0/2600619462",
> "tid": 212637720
>
> I'm a bit baffled at the moment about what's going on. The pg query (attached) is not
> showing in the main status that it has been blocked from peering or that
> there are any missing objects. I've tried restarting all OSD's I can see
> relating to the PG in case they needed a bit of a nudge.
>
>
>
> Did that fix anything? I don't see anything immediately obvious but I'm not
> practiced in quickly reading that pg state output.
>
>
>
> What's the output of "ceph -s"?
>
>
>
> Hi Greg,
>
>
>
> No, restarting OSD’s didn’t seem to help. But I did make some progress late
> last night. By stopping OSD.68 the cluster unlocks itself and IO can
> progress. However as soon as it starts back up, 0.1cf and a couple of other
> PG’s again get stuck in an activating state. If I out the OSD, either with
> it up or down, then some other PG’s seem to get hit by the same problem as
> CRUSH moves PG mappings around to other OSD’s.
>
>
>
> So there definitely seems to be some sort of weird peering issue somewhere.
> I have seen a very similar issue before on this cluster where after running
> the crush reweight script to balance OSD utilization, the weight got set too
> low and PG’s were unable to peer. I’m not convinced this is what’s happening
> here as all the weights haven’t changed, but I’m intending to explore this
> further just in case.
>
>
>
> With 68 down
>
> pgs: 1071783/48650631 objects degraded (2.203%)

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
Boom!! Fixed it. Not sure if the behavior I stumbled across is correct, but
this has the potential to break a few things for people moving from Jewel to
Luminous if they had a few too many PG’s.

 

Firstly, how I stumbled across it. I whacked the logging up to max on OSD 68 
and saw this mentioned in the logs

 

osd.68 106454 maybe_wait_for_max_pg withhold creation of pg 0.1cf: 403 >= 400
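
For reference, turning the logging up and then finding that line can be done
roughly as below. This is only a sketch, assuming osd.68 and the default log
location; the exact debug level needed and the log path may differ.

ceph tell osd.68 injectargs '--debug_osd 20/20'   # raise OSD debug logging
grep maybe_wait_for_max_pg /var/log/ceph/ceph-osd.68.log
ceph tell osd.68 injectargs '--debug_osd 1/5'     # back to the usual default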

 

This made me search through the code for this warning string

 

https://github.com/ceph/ceph/blob/master/src/osd/OSD.cc#L4221

 

Which jogged my memory about the changes in Luminous regarding max PG’s 
warning, and in particular these two config options

mon_max_pg_per_osd

osd_max_pg_per_osd_hard_ratio

 

In my cluster I have just over 200 PG’s per OSD, but the node containing OSD.68
has 8TB disks instead of the 3TB disks in the rest of the cluster. This means
its OSD’s were taking on a lot more PG’s than the average would suggest. So in
Luminous 200x2 gives a hard limit of 400, which is what that error message in
the log suggests is the limit. I set the osd_max_pg_per_osd_hard_ratio option
to 3, restarted the OSD, and hey presto, everything fell into line.
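
For anyone hitting the same thing, a rough sketch of checking and changing
these values, assuming osd.68 as above (the "ceph daemon" commands have to be
run on the host carrying that OSD, and your preferred way of persisting config
may differ):

# see what the OSD is actually running with; the hard limit it enforces is
# mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio, which is where the 400
# above comes from
ceph daemon osd.68 config get mon_max_pg_per_osd
ceph daemon osd.68 config get osd_max_pg_per_osd_hard_ratio

# one way to raise the ratio: add it to ceph.conf under [osd], e.g.
#   osd_max_pg_per_osd_hard_ratio = 3
# and restart the OSD. Injecting it at runtime may also work:
ceph tell osd.68 injectargs '--osd_max_pg_per_osd_hard_ratio 3'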

 

Now a question. I get the idea around these settings: to stop people making
too many pools, or pools with too many PG’s. But is it correct that they can
break an existing pool which happens to be creating a new PG on an OSD because
the CRUSH layout has been modified?

 

Nick

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Nick 
Fisk
Sent: 13 December 2017 11:14
To: 'Gregory Farnum' <gfar...@redhat.com>
Cc: 'ceph-users' <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Odd object blocking IO on PG

 

 

On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk <n...@fisk.me.uk> wrote:


> That doesn't look like an RBD object -- any idea who is
> "client.34720596.1:212637720"?

So I think these might be proxy ops from the cache tier, as there are also
block ops on one of the cache tier OSD's, but this time it actually lists
the object name. Block op on cache tier.

   "description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head [set-alloc-hint
object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
"initiated_at": "2017-12-12 16:25:32.435718",
"age": 13996.681147,
"duration": 13996.681203,
"type_data": {
"flag_point": "reached pg",
"client_info": {
"client": "client.34720596",
"client_addr": "10.3.31.41:0/2600619462 
<http://10.3.31.41:0/2600619462> ",
"tid": 212637720

I'm a bit baffled at the moment about what's going on. The pg query (attached) is not
showing in the main status that it has been blocked from peering or that
there are any missing objects. I've tried restarting all OSD's I can see
relating to the PG in case they needed a bit of a nudge.

 

Did that fix anything? I don't see anything immediately obvious but I'm not 
practiced in quickly reading that pg state output.

 

What's the output of "ceph -s"?

 

Hi Greg,

 

No, restarting OSD’s didn’t seem to help. But I did make some progress late last
night. By stopping OSD.68 the cluster unlocks itself and IO can progress. 
However as soon as it starts back up, 0.1cf and a couple of other PG’s again 
get stuck in an activating state. If I out the OSD, either with it up or down, 
then some other PG’s seem to get hit by the same problem as CRUSH moves PG 
mappings around to other OSD’s.

 

So there definitely seems to be some sort of weird peering issue somewhere. I 
have seen a very similar issue before on this cluster where after running the 
crush reweight script to balance OSD utilization, the weight got set too low 
and PG’s were unable to peer. I’m not convinced this is what’s happening here 
as all the weights haven’t changed, but I’m intending to explore this further 
just in case.

 

With 68 down

pgs: 1071783/48650631 objects degraded (2.203%)

 5923 active+clean

 399  active+undersized+degraded

 7    active+clean+scrubbing+deep

 7    active+clean+remapped

 

With it up

pgs: 0.047% pgs not active

 67271/48651279 objects degraded (0.138%)

 15602/48651279 objects misplaced (0.032%)

 6051 active+clean

 273  active+recovery_wait+degraded

 4    active+clean+scrubbing+deep

 4    active+remapped+backfill_wait

 3    activating+remapped

 1    active+recovering+degraded

 

PG Dump

ceph pg dump | grep activatin

dumped all

2.

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
 

On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk wrote:


> That doesn't look like an RBD object -- any idea who is
> "client.34720596.1:212637720"?

So I think these might be proxy ops from the cache tier, as there are also
block ops on one of the cache tier OSD's, but this time it actually lists
the object name. Block op on cache tier.

   "description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head [set-alloc-hint
object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
"initiated_at": "2017-12-12 16:25:32.435718",
"age": 13996.681147,
"duration": 13996.681203,
"type_data": {
"flag_point": "reached pg",
"client_info": {
"client": "client.34720596",
"client_addr": "10.3.31.41:0/2600619462 
 ",
"tid": 212637720

I'm a bit baffled at the moment about what's going on. The pg query (attached) is not
showing in the main status that it has been blocked from peering or that
there are any missing objects. I've tried restarting all OSD's I can see
relating to the PG in case they needed a bit of a nudge.

 

Did that fix anything? I don't see anything immediately obvious but I'm not 
practiced in quickly reading that pg state output.

 

What's the output of "ceph -s"?

 

Hi Greg,

 

No, restarting OSD’s didn’t seem to help. But I did make some progress late last
night. By stopping OSD.68 the cluster unlocks itself and IO can progress. 
However as soon as it starts back up, 0.1cf and a couple of other PG’s again 
get stuck in an activating state. If I out the OSD, either with it up or down, 
then some other PG’s seem to get hit by the same problem as CRUSH moves PG 
mappings around to other OSD’s.

 

So there definitely seems to be some sort of weird peering issue somewhere. I 
have seen a very similar issue before on this cluster where after running the 
crush reweight script to balance OSD utilization, the weight got set too low 
and PG’s were unable to peer. I’m not convinced this is what’s happening here 
as all the weights haven’t changed, but I’m intending to explore this further 
just in case.

 

With 68 down

pgs: 1071783/48650631 objects degraded (2.203%)

 5923 active+clean

 399  active+undersized+degraded

 7    active+clean+scrubbing+deep

 7    active+clean+remapped

 

With it up

pgs: 0.047% pgs not active

 67271/48651279 objects degraded (0.138%)

 15602/48651279 objects misplaced (0.032%)

 6051 active+clean

 273  active+recovery_wait+degraded

 4    active+clean+scrubbing+deep

 4    active+remapped+backfill_wait

 3    activating+remapped

 1    active+recovering+degraded

 

PG Dump

ceph pg dump | grep activatin

dumped all

2.389     0  0  0  0  0            0  1500  1500  activating+remapped  2017-12-13 11:08:50.990526   76271'34230     106239:160310   [68,60,58,59,29,23]  68  [62,60,58,59,29,23]  62   76271'34230     2017-12-13 09:00:08.359690   76271'34230     2017-12-10 10:05:10.931366

0.1cf  3947  0  0  0  0  16472186880  1577  1577  activating+remapped  2017-12-13 11:08:50.641034  106236'7512915  106239:6176548  [34,68,8]            34  [34,8,53]            34  106138'7512682  2017-12-13 10:27:37.400613  106138'7512682  2017-12-13 10:27:37.400613

2.210     0  0  0  0  0            0  1500  1500  activating+remapped  2017-12-13 11:08:50.686193   76271'33304     106239:96797    [68,67,34,36,16,15]  68  [62,67,34,36,16,15]  62   76271'33304     2017-12-12 00:49:21.038437   76271'33304     2017-12-10 16:05:12.751425
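
As an aside, a quick way to see how close each OSD is to that per-OSD hard
limit is the PGS column in "ceph osd df", or asking the daemon itself; a
sketch, assuming osd.68:

ceph osd df tree            # per-OSD utilisation, with a PGS column in Luminous
ceph daemon osd.68 status   # run on the OSD's host; reports a "num_pgs" count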

 

 


>
> On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk wrote:
> > Does anyone know what this object (0.ae78c1cf) might be, it's not your
> > normal run of the mill RBD object and I can't seem to find it in the
> > pool using rados --all ls . It seems to be leaving the 0.1cf PG stuck
> > in an
> > activating+remapped state and blocking IO. Pool 0 is just a pure RBD
> > pool with a cache tier above it. There is no current mention of unfound
> > objects or any other obvious issues.
> >
> > There is some backfilling going on, on another OSD which was upgraded
> > to bluestore, which was when the issue started. But I can't see any
> > link in the PG dump with upgraded OSD. My only thought so far is to
> > wait for this backfilling to finish and then deep-scrub this PG and
> > see if that reveals anything?
> >
> > Thanks,
> > 

Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Gregory Farnum
On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk  wrote:

>
> > That doesn't look like an RBD object -- any idea who is
> > "client.34720596.1:212637720"?
>
> So I think these might be proxy ops from the cache tier, as there are also
> block ops on one of the cache tier OSD's, but this time it actually lists
> the object name. Block op on cache tier.
>
>"description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
> 17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head
> [set-alloc-hint
> object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
> RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
> "initiated_at": "2017-12-12 16:25:32.435718",
> "age": 13996.681147,
> "duration": 13996.681203,
> "type_data": {
> "flag_point": "reached pg",
> "client_info": {
> "client": "client.34720596",
> "client_addr": "10.3.31.41:0/2600619462",
> "tid": 212637720
>
> I'm a bit baffled at the moment about what's going on. The pg query (attached) is
> not
> showing in the main status that it has been blocked from peering or that
> there are any missing objects. I've tried restarting all OSD's I can see
> relating to the PG in case they needed a bit of a nudge.
>

Did that fix anything? I don't see anything immediately obvious but I'm not
practiced in quickly reading that pg state output.

What's the output of "ceph -s"?


>
> >
> > On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk  wrote:
> > > Does anyone know what this object (0.ae78c1cf) might be, it's not your
> > > normal run of the mill RBD object and I can't seem to find it in the
> > > pool using rados --all ls . It seems to be leaving the 0.1cf PG stuck
> > > in an
> > > activating+remapped state and blocking IO. Pool 0 is just a pure RBD
> > > pool with a cache tier above it. There is no current mention of unfound
> > > objects or any other obvious issues.
> > >
> > > There is some backfilling going on, on another OSD which was upgraded
> > > to bluestore, which was when the issue started. But I can't see any
> > > link in the PG dump with upgraded OSD. My only thought so far is to
> > > wait for this backfilling to finish and then deep-scrub this PG and
> > > see if that reveals anything?
> > >
> > > Thanks,
> > > Nick
> > >
> > >  "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
> > > (undecoded)
> > > ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> > > e105014)",
> > > "initiated_at": "2017-12-12 17:10:50.030660",
> > > "age": 335.948290,
> > > "duration": 335.948383,
> > > "type_data": {
> > > "flag_point": "delayed",
> > > "events": [
> > > {
> > > "time": "2017-12-12 17:10:50.030660",
> > > "event": "initiated"
> > > },
> > > {
> > > "time": "2017-12-12 17:10:50.030692",
> > > "event": "queued_for_pg"
> > > },
> > > {
> > > "time": "2017-12-12 17:10:50.030719",
> > > "event": "reached_pg"
> > > },
> > > {
> > > "time": "2017-12-12 17:10:50.030727",
> > > "event": "waiting for peered"
> > > },
> > > {
> > > "time": "2017-12-12 17:10:50.197353",
> > > "event": "reached_pg"
> > > },
> > > {
> > > "time": "2017-12-12 17:10:50.197355",
> > > "event": "waiting for peered"
> > >
> >
> >
> >
> > --
> > Jason
>


Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
 
> That doesn't look like an RBD object -- any idea who is
> "client.34720596.1:212637720"?

So I think these might be proxy ops from the cache tier, as there are also
block ops on one of the cache tier OSD's, but this time it actually lists
the object name. Block op on cache tier.

   "description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
17:f3831e75:::rbd_data.15a5e20238e1f29.000388ad:head [set-alloc-hint
object_size 4194304 write_size 4194304,write 2584576~16384] snapc 0=[]
RETRY=2 ondisk+retry+write+known_if_redirected e104841)",
"initiated_at": "2017-12-12 16:25:32.435718",
"age": 13996.681147,
"duration": 13996.681203,
"type_data": {
"flag_point": "reached pg",
"client_info": {
"client": "client.34720596",
"client_addr": "10.3.31.41:0/2600619462",
"tid": 212637720

I'm a bit baffled at the moment about what's going on. The pg query (attached) is not
showing in the main status that it has been blocked from peering or that
there are any missing objects. I've tried restarting all OSD's I can see
relating to the PG in case they needed a bit of a nudge.

> 
> On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk  wrote:
> > Does anyone know what this object (0.ae78c1cf) might be, it's not your
> > normal run of the mill RBD object and I can't seem to find it in the
> > pool using rados --all ls . It seems to be leaving the 0.1cf PG stuck
> > in an
> > activating+remapped state and blocking IO. Pool 0 is just a pure RBD
> > pool with a cache tier above it. There is no current mention of unfound
> > objects or any other obvious issues.
> >
> > There is some backfilling going on, on another OSD which was upgraded
> > to bluestore, which was when the issue started. But I can't see any
> > link in the PG dump with upgraded OSD. My only thought so far is to
> > wait for this backfilling to finish and then deep-scrub this PG and
> > see if that reveals anything?
> >
> > Thanks,
> > Nick
> >
> >  "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
> > (undecoded)
> > ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> > e105014)",
> > "initiated_at": "2017-12-12 17:10:50.030660",
> > "age": 335.948290,
> > "duration": 335.948383,
> > "type_data": {
> > "flag_point": "delayed",
> > "events": [
> > {
> > "time": "2017-12-12 17:10:50.030660",
> > "event": "initiated"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030692",
> > "event": "queued_for_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030719",
> > "event": "reached_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030727",
> > "event": "waiting for peered"
> > },
> > {
> > "time": "2017-12-12 17:10:50.197353",
> > "event": "reached_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.197355",
> > "event": "waiting for peered"
> >
> 
> 
> 
> --
> Jason
{
"state": "activating+remapped",
"snap_trimq": "[]",
"epoch": 105385,
"up": [
34,
68,
8
],
"acting": [
34,
8,
53
],
"backfill_targets": [
"68"
],
"actingbackfill": [
"8",
"34",
"53",
"68"
],
"info": {
"pgid": "0.1cf",
"last_update": "104744'7509486",
"last_complete": "104744'7509486",
"log_tail": "104532'7507938",
"last_user_version": 132304619,
"last_backfill": "MAX",
"last_backfill_bitwise": 0,
"purged_snaps": [
{
"start": "1",
"length": "d"
}
],
"history": {
"epoch_created": 1,
"epoch_pool_created": 1,
"last_epoch_started": 104744,
"last_interval_started": 104743,
"last_epoch_clean": 104744,
"last_interval_clean": 104743,
"last_epoch_split": 0,
"last_epoch_marked_full": 0,
"same_up_since": 105371,

Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Gregory Farnum
Jason was more diligent than me and dug enough to realize that we print out
the "raw pg", which we are printing out because we haven't gotten far
enough in the pipeline to decode the actual object name. You'll note that
it ends with the same characters as the PG does, and unlike a pgid, the raw
pg is a full-length hash and resistant to operations like pg split. :)
-Greg
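
(To make the mapping concrete with a sketch that is not taken from the thread:
when a pool's pg_num is a power of two, the pgid is simply the low bits of that
full-length hash. Assuming, purely for illustration, that pool 0 had
pg_num = 2048:

printf '0.%x\n' $(( 0xae78c1cf & (2048 - 1) ))   # prints 0.1cf

which is why 0.ae78c1cf and 0.1cf share their trailing characters.)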

On Tue, Dec 12, 2017 at 11:12 AM Jason Dillaman  wrote:

> That doesn't look like an RBD object -- any idea who is
> "client.34720596.1:212637720"?
>
> On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk  wrote:
> > Does anyone know what this object (0.ae78c1cf) might be, it's not your
> > normal run of the mill RBD object and I can't seem to find it in the pool
> > using rados --all ls . It seems to be leaving the 0.1cf PG stuck in an
> > activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool
> > with a cache tier above it. There is no current mention of unfound
> objects
> > or any other obvious issues.
> >
> > There is some backfilling going on, on another OSD which was upgraded to
> > bluestore, which was when the issue started. But I can't see any link in
> the
> > PG dump with upgraded OSD. My only thought so far is to wait for this
> > backfilling to finish and then deep-scrub this PG and see if that reveals
> > anything?
> >
> > Thanks,
> > Nick
> >
> >  "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
> > (undecoded)
> > ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> > e105014)",
> > "initiated_at": "2017-12-12 17:10:50.030660",
> > "age": 335.948290,
> > "duration": 335.948383,
> > "type_data": {
> > "flag_point": "delayed",
> > "events": [
> > {
> > "time": "2017-12-12 17:10:50.030660",
> > "event": "initiated"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030692",
> > "event": "queued_for_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030719",
> > "event": "reached_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.030727",
> > "event": "waiting for peered"
> > },
> > {
> > "time": "2017-12-12 17:10:50.197353",
> > "event": "reached_pg"
> > },
> > {
> > "time": "2017-12-12 17:10:50.197355",
> > "event": "waiting for peered"
> >
>
>
>
> --
> Jason
>


Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Jason Dillaman
That doesn't look like an RBD object -- any idea who is
"client.34720596.1:212637720"?

On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk  wrote:
> Does anyone know what this object (0.ae78c1cf) might be, it's not your
> normal run of the mill RBD object and I can't seem to find it in the pool
> using rados --all ls . It seems to be leaving the 0.1cf PG stuck in an
> activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool
> with a cache tier above it. There is no current mention of unfound objects
> or any other obvious issues.
>
> There is some backfilling going on, on another OSD which was upgraded to
> bluestore, which was when the issue started. But I can't see any link in the
> PG dump with upgraded OSD. My only thought so far is to wait for this
> backfilling to finish and then deep-scrub this PG and see if that reveals
> anything?
>
> Thanks,
> Nick
>
>  "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
> (undecoded)
> ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> e105014)",
> "initiated_at": "2017-12-12 17:10:50.030660",
> "age": 335.948290,
> "duration": 335.948383,
> "type_data": {
> "flag_point": "delayed",
> "events": [
> {
> "time": "2017-12-12 17:10:50.030660",
> "event": "initiated"
> },
> {
> "time": "2017-12-12 17:10:50.030692",
> "event": "queued_for_pg"
> },
> {
> "time": "2017-12-12 17:10:50.030719",
> "event": "reached_pg"
> },
> {
> "time": "2017-12-12 17:10:50.030727",
> "event": "waiting for peered"
> },
> {
> "time": "2017-12-12 17:10:50.197353",
> "event": "reached_pg"
> },
> {
> "time": "2017-12-12 17:10:50.197355",
> "event": "waiting for peered"
>



-- 
Jason


Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Gregory Farnum
On Tue, Dec 12, 2017 at 9:37 AM Nick Fisk  wrote:

> Does anyone know what this object (0.ae78c1cf) might be, it's not your
> normal run of the mill RBD object and I can't seem to find it in the pool
> using rados --all ls . It seems to be leaving the 0.1cf PG stuck in an
> activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool
> with a cache tier above it. There is no current mention of unfound objects
> or any other obvious issues.
>
> There is some backfilling going on, on another OSD which was upgraded to
> bluestore, which was when the issue started. But I can't see any link in
> the
> PG dump with upgraded OSD. My only thought so far is to wait for this
> backfilling to finish and then deep-scrub this PG and see if that reveals
> anything?
>
> Thanks,
> Nick
>
>  "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
> (undecoded)
> ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> e105014)",
> "initiated_at": "2017-12-12 17:10:50.030660",
> "age": 335.948290,
> "duration": 335.948383,
> "type_data": {
> "flag_point": "delayed",
> "events": [
> {
> "time": "2017-12-12 17:10:50.030660",
> "event": "initiated"
> },
> {
> "time": "2017-12-12 17:10:50.030692",
> "event": "queued_for_pg"
> },
> {
> "time": "2017-12-12 17:10:50.030719",
> "event": "reached_pg"
> },
> {
> "time": "2017-12-12 17:10:50.030727",
> "event": "waiting for peered"
> },
> {
> "time": "2017-12-12 17:10:50.197353",
> "event": "reached_pg"
> },
> {
> "time": "2017-12-12 17:10:50.197355",
> "event": "waiting for peered"
>
>
Is there some other evidence this object is the one causing the PG to be
stuck? This trace is just what you get when a PG isn't peering and has
nothing to do with the object involved. You'll need to figure out what is
keeping the PG from peering.

(PG listing operations also require an active PG, so I expect "rados ls" is
just skipping that PG — though I'm surprised it doesn't throw a warning or
error.)
-Greg
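
A reasonable way to dig into that, sketched here rather than spelled out in
the thread, is to query the stuck PG and read its recovery_state section,
which normally says which stage peering is stuck in and which OSDs it is
waiting on:

ceph pg 0.1cf query > pg.0.1cf.json   # look at "recovery_state" in the output
ceph health detail | grep 0.1cf       # often names the OSDs blocking peering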


[ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
Does anyone know what this object (0.ae78c1cf) might be? It's not your
normal run-of-the-mill RBD object and I can't seem to find it in the pool
using rados --all ls. It seems to be leaving the 0.1cf PG stuck in an
activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool
with a cache tier above it. There is no current mention of unfound objects
or any other obvious issues.  

There is some backfilling going on, on another OSD which was upgraded to
bluestore, which was when the issue started. But I can't see any link in the
PG dump with upgraded OSD. My only thought so far is to wait for this
backfilling to finish and then deep-scrub this PG and see if that reveals
anything?

Thanks,
Nick

 "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf
(undecoded)
ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
e105014)",
"initiated_at": "2017-12-12 17:10:50.030660",
"age": 335.948290,
"duration": 335.948383,
"type_data": {
"flag_point": "delayed",
"events": [
{
"time": "2017-12-12 17:10:50.030660",
"event": "initiated"
},
{
"time": "2017-12-12 17:10:50.030692",
"event": "queued_for_pg"
},
{
"time": "2017-12-12 17:10:50.030719",
"event": "reached_pg"
},
{
"time": "2017-12-12 17:10:50.030727",
"event": "waiting for peered"
},
{
"time": "2017-12-12 17:10:50.197353",
"event": "reached_pg"
},
{
"time": "2017-12-12 17:10:50.197355",
"event": "waiting for peered"
