Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-11 Thread Chad Seys
Thanks Craig,

I'll jiggle the OSDs around to see if that helps.

Otherwise, I'm almost certain removing the pool will work. :/
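(By "removing the pool" I mean the nuclear option, which as far as I can tell
is roughly:

# ceph osd pool delete <poolname> <poolname> --yes-i-really-really-mean-it

with <poolname> being the pool holding the stuck PGs.  Of course that throws
away whatever is left in the pool, so only as a last resort.)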

Have a good one,
Chad.

> I had the same experience with force_create_pg too.
> 
> I ran it, and the PGs sat there in creating state.  I left the cluster
> overnight, and sometime in the middle of the night, they created.  The
> actual transition from creating to active+clean happened during the
> recovery after a single OSD was kicked out.  I don't recall if that single
> OSD was responsible for the creating PGs.  I really can't say what
> un-jammed my creating.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-10 Thread Craig Lewis
I had the same experience with force_create_pg too.

I ran it, and the PGs sat there in creating state.  I left the cluster
overnight, and sometime in the middle of the night, they created.  The
actual transition from creating to active+clean happened during the
recovery after a single OSD was kicked out.  I don't recall if that single
OSD was responsible for the creating PGs.  I really can't say what
un-jammed my creating.


On Mon, Nov 10, 2014 at 12:33 PM, Chad Seys  wrote:

> Hi Craig,
>
> > If all of your PGs now have an empty down_osds_we_would_probe, I'd run
> > through this discussion again.
>
> Yep, looks to be true.
>
> So I ran:
>
> # ceph pg force_create_pg 2.5
>
> and it has been creating for about 3 hours now. :/
>
>
> # ceph health detail | grep creating
> pg 2.5 is stuck inactive since forever, current state creating, last
> acting []
> pg 2.5 is stuck unclean since forever, current state creating, last acting
> []
>
> Then I restart all OSDs.  The "creating" label disappears and I'm back with
> the same number of incomplete PGs.  :(
>
> Is 'force_create_pg' the right command?  'mark_unfound_lost' complains
> that 'pg has no unfound objects'.
>
> I shall start 'force_create_pg' again and wait longer, unless there is a
> different command to use?
>
> Thanks!
> Chad.
>
>


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-10 Thread Chad Seys
Hi Craig,

> If all of your PGs now have an empty down_osds_we_would_probe, I'd run
> through this discussion again.

Yep, looks to be true.

So I ran:

# ceph pg force_create_pg 2.5

and it has been creating for about 3 hours now. :/


# ceph health detail | grep creating
pg 2.5 is stuck inactive since forever, current state creating, last acting []
pg 2.5 is stuck unclean since forever, current state creating, last acting []

Then I restart all OSDs.  The "creating" label disappears and I'm back with the
same number of incomplete PGs.  :(

Is 'force_create_pg' the right command?  'mark_unfound_lost' complains that
'pg has no unfound objects'.

I shall start 'force_create_pg' again and wait longer, unless there is a
different command to use?

Thanks!
Chad.



Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-10 Thread Craig Lewis
If all of your PGs now have an empty down_osds_we_would_probe, I'd run
through this discussion again.  The commands to tell Ceph to give up on
lost data should have an effect now.

That's my experience anyway.  Nothing progressed until I took care of
down_osds_we_would_probe.
After that was empty, I was able to repair.  It wasn't immediate though.
It still took ~24 hours, and a few OSD restarts, for the cluster to get
itself healthy.  You might try sequentially restarting OSDs.  It shouldn't
be necessary, but it shouldn't make anything worse.
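Something like this (untested sketch, assuming sysvinit-style init scripts on
the OSD hosts -- adjust for your init system) is what I mean by a sequential
restart:

for id in $(ceph osd ls); do
    ceph osd find $id          # shows which host owns osd.$id
    # on that host:  service ceph restart osd.$id
    # then wait for peering/recovery to settle before touching the next one:
    while ceph health | grep -Eq 'peering|recovering|degraded'; do
        sleep 30
    done
done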



On Mon, Nov 10, 2014 at 7:17 AM, Chad Seys  wrote:

> Hi Craig and list,
>
> > > > If you create a real osd.20, you might want to leave it OUT until you
> > > > get things healthy again.
>
> I created a real osd.20 (and it turns out I needed an osd.21 also).
>
> ceph pg x.xx query no longer lists down osds for probing:
> "down_osds_we_would_probe": [],
>
> But I cannot find the magic command line which will remove these incomplete
> PGs.
>
> Anyone know how to remove incomplete PGs ?
>
> Thanks!
> Chad.
>


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-10 Thread Chad Seys
Hi Craig and list,

> > > If you create a real osd.20, you might want to leave it OUT until you
> > > get things healthy again.

I created a real osd.20 (and it turns out I needed an osd.21 also).  

ceph pg x.xx query no longer lists down osds for probing:
"down_osds_we_would_probe": [],

But I cannot find the magic command line which will remove these incomplete 
PGs.

Anyone know how to remove incomplete PGs ?

Thanks!
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-07 Thread Craig Lewis
ceph-disk-prepare will give you the next unused number.  So this will work
only if the osd you remove is greater than 20.
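If I remember right, the number ultimately comes from 'ceph osd create', which
just hands out the lowest free id, so you can sanity-check before reformatting
anything, roughly:

# ceph osd ls | tail -1       (highest id currently in the map)
# ceph osd create             (allocates and prints the next free id; if it
                               isn't 20, undo it with 'ceph osd rm <id>')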

On Thu, Nov 6, 2014 at 12:12 PM, Chad Seys  wrote:

> Hi Craig,
>
> > You'll have trouble until osd.20 exists again.
> >
> > Ceph really does not want to lose data.  Even if you tell it the osd is
> > gone, ceph won't believe you.  Once ceph can probe any osd that claims to
> > be 20, it might let you proceed with your recovery.  Then you'll probably
> > need to use ceph pg  mark_unfound_lost.
> >
> > If you don't have a free bay to create a real osd.20, it's possible to
> fake
> > it with some small loop-back filesystems.  Bring it up and mark it OUT.
> It
> > will probably cause some remapping.  I would keep it around until you get
> > things healthy.
> >
> > If you create a real osd.20, you might want to leave it OUT until you get
> > things healthy again.
>
> Thanks for the recovery tip!
>
> I would guess I could safely remove an OSD (mark OUT, wait for migration to
> stop, then crush osd rm) and then add back in as osd.20 would work?
>
> New switch:
> --yes-i-really-REALLY-mean-it
>
> ;)
> Chad.
>


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Chad Seys
Hi Craig,

> You'll have trouble until osd.20 exists again.
> 
> Ceph really does not want to lose data.  Even if you tell it the osd is
> gone, ceph won't believe you.  Once ceph can probe any osd that claims to
> be 20, it might let you proceed with your recovery.  Then you'll probably
> need to use ceph pg  mark_unfound_lost.
> 
> If you don't have a free bay to create a real osd.20, it's possible to fake
> it with some small loop-back filesystems.  Bring it up and mark it OUT.  It
> will probably cause some remapping.  I would keep it around until you get
> things healthy.
> 
> If you create a real osd.20, you might want to leave it OUT until you get
> things healthy again.

Thanks for the recovery tip!

I would guess I could safely remove an OSD (mark OUT, wait for migration to 
stop, then crush osd rm) and then add back in as osd.20 would work?

New switch:
--yes-i-really-REALLY-mean-it

;)
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Craig Lewis
On Thu, Nov 6, 2014 at 11:27 AM, Chad Seys  wrote:
>
>
> > Also, are you certain that osd 20 is not up?
> > -Sam
>
> Yep.
>
> # ceph osd metadata 20
> Error ENOENT: osd.20 does not exist
>
> So part of ceph thinks osd.20 doesn't exist, but another part (the
> down_osds_we_would_probe) thinks the osd exists and is down?
>

You'll have trouble until osd.20 exists again.

Ceph really does not want to lose data.  Even if you tell it the osd is
gone, ceph won't believe you.  Once ceph can probe any osd that claims to
be 20, it might let you proceed with your recovery.  Then you'll probably
need to use ceph pg  mark_unfound_lost.

If you don't have a free bay to create a real osd.20, it's possible to fake
it with some small loop-back filesystems.  Bring it up and mark it OUT.  It
will probably cause some remapping.  I would keep it around until you get
things healthy.

If you create a real osd.20, you might want to leave it OUT until you get
things healthy again.
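Roughly, the loop-back version would look something like this (untested sketch;
sizes, paths, and the host name are just examples -- follow the manual "add an
OSD" steps for your release):

dd if=/dev/zero of=/srv/osd.20.img bs=1M count=10240   # 10 GB backing file
losetup /dev/loop0 /srv/osd.20.img
mkfs.xfs /dev/loop0
mkdir -p /var/lib/ceph/osd/ceph-20
mount /dev/loop0 /var/lib/ceph/osd/ceph-20
ceph osd create                  # should hand back 20 if it's the lowest free id
ceph-osd -i 20 --mkfs --mkkey
ceph auth add osd.20 osd 'allow *' mon 'allow rwx' \
    -i /var/lib/ceph/osd/ceph-20/keyring
ceph osd crush add osd.20 0 host=<somehost>   # weight 0 so it takes no data
service ceph start osd.20        # or /etc/init.d/ceph start osd.20
ceph osd out 20                  # and keep it OUT, as above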


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Chad Seys
Hi Sam,

> > Amusingly, that's what I'm working on this week.
> > 
> > http://tracker.ceph.com/issues/7862

Well, thanks for any bugfixes in advance!  :)

> Also, are you certain that osd 20 is not up?
> -Sam

Yep.

# ceph osd metadata 20
Error ENOENT: osd.20 does not exist

So part of ceph thinks osd.20 doesn't exist, but another part (the 
down_osds_we_would_probe) thinks the osd exists and is down?

In other news, my min_size was set to 1, so the same fix might not apply to 
me.  Instead I set the pool size from 2 to 1, then back again.  Looks like the 
end result is merely going to be that the down+incomplete get converted to 
incomplete.  :/  I'll let you (and future googlers) know.
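(For the record, that was just, per affected pool, roughly:

# ceph osd pool set <pool> size 1
  ... wait for peering to settle ...
# ceph osd pool set <pool> size 2

with <pool> being the pool holding the incomplete PGs.)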

Thanks!
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Samuel Just
Also, are you certain that osd 20 is not up?
-Sam

On Thu, Nov 6, 2014 at 10:52 AM, Samuel Just  wrote:
> Amusingly, that's what I'm working on this week.
>
> http://tracker.ceph.com/issues/7862
>
> There are pretty good reasons for why it works the way it does right
> now, but it certainly is unexpected.
> -Sam
>
> On Thu, Nov 6, 2014 at 7:18 AM, Chad William Seys
>  wrote:
>> Hi Sam,
>>
>>> Sounds like you needed osd 20.  You can mark osd 20 lost.
>>> -Sam
>>
>> Does not work:
>>
>> # ceph osd lost 20 --yes-i-really-mean-it
>> osd.20 is not down or doesn't exist
>>
>>
>> Also, here is an interesting post which I will follow from October:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-October/044059.html
>>
>> "
>> Hello, all. I got some advice from the IRC channel (thanks bloodice!) that I
>> temporarily reduce the min_size of my cluster (size = 2) from 2 down to 1.
>> That immediately caused all of my incomplete PGs to start recovering and
>> everything seemed to come back OK. I was serving out an RBD from here and
>> xfs_repair reported no problems. So... happy ending?
>>
>> What started this all was that I was altering my CRUSH map causing 
>> significant
>> rebalancing on my cluster which had size = 2. During this process I lost an
>> OSD (osd.10) and eventually ended up with incomplete PGs. Knowing that I only
>> lost 1 osd I was pretty sure that I hadn't lost any data I just couldn't get
>> the PGs to recover without changing the min_size.
>> "
>>
>> It is good that this worked for him, but it also seems like a bug that it
>> worked!  (I.e. ceph should have been able to recover on its own without weird
>> workarounds.)
>>
>> I'll let you know if this works for me!
>>
>> Thanks,
>> Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Samuel Just
Amusingly, that's what I'm working on this week.

http://tracker.ceph.com/issues/7862

There are pretty good reasons for why it works the way it does right
now, but it certainly is unexpected.
-Sam

On Thu, Nov 6, 2014 at 7:18 AM, Chad William Seys
 wrote:
> Hi Sam,
>
>> Sounds like you needed osd 20.  You can mark osd 20 lost.
>> -Sam
>
> Does not work:
>
> # ceph osd lost 20 --yes-i-really-mean-it
> osd.20 is not down or doesn't exist
>
>
> Also, here is an interesting post which I will follow from October:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-October/044059.html
>
> "
> Hello, all. I got some advice from the IRC channel (thanks bloodice!) that I
> temporarily reduce the min_size of my cluster (size = 2) from 2 down to 1.
> That immediately caused all of my incomplete PGs to start recovering and
> everything seemed to come back OK. I was serving out an RBD from here and
> xfs_repair reported no problems. So... happy ending?
>
> What started this all was that I was altering my CRUSH map causing significant
> rebalancing on my cluster which had size = 2. During this process I lost an
> OSD (osd.10) and eventually ended up with incomplete PGs. Knowing that I only
> lost 1 osd I was pretty sure that I hadn't lost any data I just couldn't get
> the PGs to recover without changing the min_size.
> "
>
> It is good that this worked for him, but it also seems like a bug that it
> worked!  (I.e. ceph should have been able to recover on its own without weird
> workarounds.)
>
> I'll let you know if this works for me!
>
> Thanks,
> Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-06 Thread Chad William Seys
Hi Sam,

> Sounds like you needed osd 20.  You can mark osd 20 lost.
> -Sam

Does not work:

# ceph osd lost 20 --yes-i-really-mean-it   

osd.20 is not down or doesn't exist


Also, here is an interesting post which I will follow from October:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-October/044059.html

"
Hello, all. I got some advice from the IRC channel (thanks bloodice!) that I 
temporarily reduce the min_size of my cluster (size = 2) from 2 down to 1. 
That immediately caused all of my incomplete PGs to start recovering and 
everything seemed to come back OK. I was serving out an RBD from here and
xfs_repair reported no problems. So... happy ending?

What started this all was that I was altering my CRUSH map causing significant 
rebalancing on my cluster which had size = 2. During this process I lost an 
OSD (osd.10) and eventually ended up with incomplete PGs. Knowing that I only 
lost 1 osd I was pretty sure that I hadn't lost any data I just couldn't get 
the PGs to recover without changing the min_size.
"

It is good that this worked for him, but it also seems like a bug that it 
worked!  (I.e. ceph should have been able to recover on its own without weird 
workarounds.)
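(The knob from that post is per pool, i.e. something along the lines of:

# ceph osd pool set <pool> min_size 1
  ... let the PGs recover ...
# ceph osd pool set <pool> min_size 2

with <pool> being whichever pool has the incomplete PGs.)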

I'll let you know if this works for me!

Thanks,
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-05 Thread Samuel Just
Sounds like you needed osd 20.  You can mark osd 20 lost.
-Sam

On Wed, Nov 5, 2014 at 9:41 AM, Gregory Farnum  wrote:
> On Wed, Nov 5, 2014 at 7:24 AM, Chad Seys  wrote:
>> Hi Sam,
>>
>>> Incomplete usually means the pgs do not have any complete copies.  Did
>>> you previously have more osds?
>>
>> No.  But could OSDs quitting after hitting assert(0 == "we got a bad
>> state machine event"), or interactions with kernel 3.14 clients, have caused
>> the incomplete copies?
>>
>> How can I probe the fate of one of the incomplete PGs? e.g.
>> pg 4.152 is incomplete, acting [1,11]
>>
>> Also, how can I investigate why one osd has a blocked request?  The hardware
>> appears normal and the OSD is performing other requests like scrubs without
>> problems.  From its log:
>>
>> 2014-11-05 00:57:26.870867 7f7686331700  0 log [WRN] : 1 slow requests, 1
>> included below; oldest blocked for > 61440.449534 secs
>> 2014-11-05 00:57:26.870873 7f7686331700  0 log [WRN] : slow request
>> 61440.449534 seconds old, received at 2014-11-04 07:53:26.421301:
>> osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
>> 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
>> 2014-11-05 00:57:31.816534 7f7665e4a700  0 -- 192.168.164.187:6800/7831 >>
>> 192.168.164.191:6806/30336 pipe(0x44a98780 sd=89 :6800 s=0 pgs=0 c
>> s=0 l=0 c=0x42f482c0).accept connect_seq 14 vs existing 13 state standby
>> 2014-11-05 00:59:10.749429 7f7666e5a700  0 -- 192.168.164.187:6800/7831 >>
>> 192.168.164.191:6800/20375 pipe(0x44a99900 sd=169 :6800 s=2 pgs=44
>> 3 cs=29 l=0 c=0x42528b00).fault with nothing to send, going to standby
>> 2014-11-05 01:02:09.746857 7f7664d39700  0 -- 192.168.164.187:6800/7831 >>
>> 192.168.164.192:6802/9779 pipe(0x44a98280 sd=63 :6800 s=0 pgs=0 cs
>> =0 l=0 c=0x42f48c60).accept connect_seq 26 vs existing 25 state standby
>>
>> Greg, I attempted to copy/paste the 'ceph scrub' output for you.  Did I get
>> the relevant bits?
>
> Looks like you provided the monitor log, which is actually distinct
> from the central log. I don't think it matters, though — I was looking
> for a very specific type of corruption that would have put them into a
> HEALTH_WARN or HEALTH_FAIL state if they detected it. At this point
> Sam is going to be a lot more help than I am. :)
> -Greg


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-05 Thread Gregory Farnum
On Wed, Nov 5, 2014 at 7:24 AM, Chad Seys  wrote:
> Hi Sam,
>
>> Incomplete usually means the pgs do not have any complete copies.  Did
>> you previously have more osds?
>
> No.  But could OSDs quitting after hitting assert(0 == "we got a bad
> state machine event"), or interactions with kernel 3.14 clients, have caused
> the incomplete copies?
>
> How can I probe the fate of one of the incomplete PGs? e.g.
> pg 4.152 is incomplete, acting [1,11]
>
> Also, how can I investigate why one osd has a blocked request?  The hardware
> appears normal and the OSD is performing other requests like scrubs without
> problems.  From its log:
>
> 2014-11-05 00:57:26.870867 7f7686331700  0 log [WRN] : 1 slow requests, 1
> included below; oldest blocked for > 61440.449534 secs
> 2014-11-05 00:57:26.870873 7f7686331700  0 log [WRN] : slow request
> 61440.449534 seconds old, received at 2014-11-04 07:53:26.421301:
> osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
> 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
> 2014-11-05 00:57:31.816534 7f7665e4a700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.191:6806/30336 pipe(0x44a98780 sd=89 :6800 s=0 pgs=0 c
> s=0 l=0 c=0x42f482c0).accept connect_seq 14 vs existing 13 state standby
> 2014-11-05 00:59:10.749429 7f7666e5a700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.191:6800/20375 pipe(0x44a99900 sd=169 :6800 s=2 pgs=44
> 3 cs=29 l=0 c=0x42528b00).fault with nothing to send, going to standby
> 2014-11-05 01:02:09.746857 7f7664d39700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.192:6802/9779 pipe(0x44a98280 sd=63 :6800 s=0 pgs=0 cs
> =0 l=0 c=0x42f48c60).accept connect_seq 26 vs existing 25 state standby
>
> Greg, I attempted to copy/paste the 'ceph scrub' output for you.  Did I get
> the relevant bits?

Looks like you provided the monitor log, which is actually distinct
from the central log. I don't think it matters, though — I was looking
for a very specific type of corruption that would have put them into a
HEALTH_WARN or HEALTH_FAIL state if they detected it. At this point
Sam is going to be a lot more help than I am. :)
-Greg


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-05 Thread Samuel Just
The incomplete pgs are not processing requests.  That's where the
blocked requests are coming from.  You can query the pg state using
'ceph pg  query'.  Full osds can also block requests.
-Sam

On Wed, Nov 5, 2014 at 7:24 AM, Chad Seys  wrote:
> Hi Sam,
>
>> Incomplete usually means the pgs do not have any complete copies.  Did
>> you previously have more osds?
>
> No.  But could OSDs quitting after hitting assert(0 == "we got a bad
> state machine event"), or interactions with kernel 3.14 clients, have caused
> the incomplete copies?
>
> How can I probe the fate of one of the incomplete PGs? e.g.
> pg 4.152 is incomplete, acting [1,11]
>
> Also, how can I investigate why one osd has a blocked request?  The hardware
> appears normal and the OSD is performing other requests like scrubs without
> problems.  From its log:
>
> 2014-11-05 00:57:26.870867 7f7686331700  0 log [WRN] : 1 slow requests, 1
> included below; oldest blocked for > 61440.449534 secs
> 2014-11-05 00:57:26.870873 7f7686331700  0 log [WRN] : slow request
> 61440.449534 seconds old, received at 2014-11-04 07:53:26.421301:
> osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
> 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
> 2014-11-05 00:57:31.816534 7f7665e4a700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.191:6806/30336 pipe(0x44a98780 sd=89 :6800 s=0 pgs=0 c
> s=0 l=0 c=0x42f482c0).accept connect_seq 14 vs existing 13 state standby
> 2014-11-05 00:59:10.749429 7f7666e5a700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.191:6800/20375 pipe(0x44a99900 sd=169 :6800 s=2 pgs=44
> 3 cs=29 l=0 c=0x42528b00).fault with nothing to send, going to standby
> 2014-11-05 01:02:09.746857 7f7664d39700  0 -- 192.168.164.187:6800/7831 >>
> 192.168.164.192:6802/9779 pipe(0x44a98280 sd=63 :6800 s=0 pgs=0 cs
> =0 l=0 c=0x42f48c60).accept connect_seq 26 vs existing 25 state standby
>
> Greg, I attempted to copy/paste the 'ceph scrub' output for you.  Did I get
> the relevant bits?
>
> Thanks,
> Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-05 Thread Chad Seys
Hi Sam,
> 'ceph pg  query'.

Thanks.

Looks like ceph is looking for an osd.20 which no longer exists:

 "probing_osds": [
"1",
"7",
"15",
"16"],
  "down_osds_we_would_probe": [
20],

So perhaps during my attempts to rehabilitate the cluster after the upgrade I
removed this OSD before it was fully drained?
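In case it helps anyone else, a rough sketch for checking every stuck PG for
this in one go:

for pg in $(ceph pg dump_stuck inactive 2>/dev/null | awk '/^[0-9]/ {print $1}'); do
    echo "== $pg =="
    ceph pg $pg query | grep -A 3 down_osds_we_would_probe
done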

What way forward?
Should I
ceph osd lost {id} [--yes-i-really-mean-it]
and move on? 

Thanks for your help!
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-05 Thread Chad Seys
Hi Sam,

> Incomplete usually means the pgs do not have any complete copies.  Did
> you previously have more osds?

No.  But could OSDs quitting after hitting assert(0 == "we got a bad
state machine event"), or interactions with kernel 3.14 clients, have caused
the incomplete copies?

How can I probe the fate of one of the incomplete PGs? e.g.
pg 4.152 is incomplete, acting [1,11]

Also, how can I investigate why one osd has a blocked request?  The hardware 
appears normal and the OSD is performing other requests like scrubs without 
problems.  From its log:

2014-11-05 00:57:26.870867 7f7686331700  0 log [WRN] : 1 slow requests, 1 
included below; oldest blocked for > 61440.449534 secs
2014-11-05 00:57:26.870873 7f7686331700  0 log [WRN] : slow request 
61440.449534 seconds old, received at 2014-11-04 07:53:26.421301: 
osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512] 
4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
2014-11-05 00:57:31.816534 7f7665e4a700  0 -- 192.168.164.187:6800/7831 >> 
192.168.164.191:6806/30336 pipe(0x44a98780 sd=89 :6800 s=0 pgs=0 c
s=0 l=0 c=0x42f482c0).accept connect_seq 14 vs existing 13 state standby
2014-11-05 00:59:10.749429 7f7666e5a700  0 -- 192.168.164.187:6800/7831 >> 
192.168.164.191:6800/20375 pipe(0x44a99900 sd=169 :6800 s=2 pgs=44
3 cs=29 l=0 c=0x42528b00).fault with nothing to send, going to standby
2014-11-05 01:02:09.746857 7f7664d39700  0 -- 192.168.164.187:6800/7831 >> 
192.168.164.192:6802/9779 pipe(0x44a98280 sd=63 :6800 s=0 pgs=0 cs
=0 l=0 c=0x42f48c60).accept connect_seq 26 vs existing 25 state standby

Greg, I attempted to copy/paste the 'ceph scrub' output for you.  Did I get
the relevant bits?

Thanks,
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-04 Thread Samuel Just
Incomplete usually means the pgs do not have any complete copies.  Did
you previously have more osds?
-Sam

On Tue, Nov 4, 2014 at 7:37 AM, Chad Seys  wrote:
> On Monday, November 03, 2014 17:34:06 you wrote:
>> If you have osds that are close to full, you may be hitting 9626.  I
>> pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
>> -Sam
>
> Thanks Sam.  I may have been hitting that as well.  I certainly hit too_full
> conditions often.  I am able to squeeze PGs off of the too_full OSD by
> reweighting and then eventually all PGs get to where they want to be.  Kind of
> silly that I have to do this manually though.  Could Ceph order the PG
> movements better? (Is this what your bug fix does in effect?)
>
>
> So, at the moment there are no PG moving around the cluster, but all are not
> in active+clean. Also, there is one OSD which has blocked requests.  The OSD
> seems idle and restarting the OSD just results in a younger blocked request.
>
> ~# ceph -s
> cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
>  health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck
> inactive; 210 pgs stuck unclean; 1 requests are blocked > 32 sec
>  monmap e3: 3 mons at
> {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:67
> 89/0}, election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
>  osdmap e115306: 24 osds: 24 up, 24 in
>   pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
> 12747 GB used, 7848 GB / 20596 GB avail
>2 inactive
> 8494 active+clean
>  173 incomplete
>   35 down+incomplete
>
> # ceph health detail
> ...
> 1 ops are blocked > 8388.61 sec
> 1 ops are blocked > 8388.61 sec on osd.15
> 1 osds have slow requests
>
> from the log of the osd with the blocked request (osd.15):
> 2014-11-04 08:57:26.851583 7f7686331700  0 log [WRN] : 1 slow requests, 1
> included below; oldest blocked for > 3840.430247 secs
> 2014-11-04 08:57:26.851593 7f7686331700  0 log [WRN] : slow request
> 3840.430247 seconds old, received at 2014-11-04 07:53:26.421301:
> osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
> 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
>
>
> Other requests (like PG scrubs) are happening without taking a long time on
> this OSD.
> Also, this was one of the OSDs which I completely drained, removed from ceph,
> reformatted, and created again using ceph-deploy.  So it is completely created
> by firefly 0.80.7 code.
>
>
> As Greg requested, output of ceph scrub:
>
> 2014-11-04 09:25:58.761602 7f6c0e20b700  0 mon.mon01@0(leader) e3
> handle_command mon_command({"prefix": "scrub"} v 0) v1
> 2014-11-04 09:26:21.320043 7f6c0ea0c700  1 mon.mon01@0(leader).paxos(paxos
> updating c 11563072..11563575) accept timeout, calling fresh elect
> ion
> 2014-11-04 09:26:31.264873 7f6c0ea0c700  0
> mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572
> used 3891232 avail 268
> 1328
> 2014-11-04 09:26:33.529403 7f6c0e20b700  0 log [INF] : mon.mon01 calling new
> monitor election
> 2014-11-04 09:26:33.538286 7f6c0e20b700  1 mon.mon01@0(electing).elector(2996)
> init, last seen epoch 2996
> 2014-11-04 09:26:38.809212 7f6c0ea0c700  0 log [INF] : mon.mon01@0 won leader
> election with quorum 0,2
> 2014-11-04 09:26:40.215095 7f6c0e20b700  0 log [INF] : monmap e3: 3 mons at
> {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=
> 144.92.180.139:6789/0}
> 2014-11-04 09:26:40.215754 7f6c0e20b700  0 log [INF] : pgmap v6630201: 8704
> pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
> plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:40.215913 7f6c0e20b700  0 log [INF] : mdsmap e1: 0/0/1 up
> 2014-11-04 09:26:40.216621 7f6c0e20b700  0 log [INF] : osdmap e115306: 24
> osds: 24 up, 24 in
> 2014-11-04 09:26:41.227010 7f6c0e20b700  0 log [INF] : pgmap v6630202: 8704
> pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
> plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:41.367373 7f6c0e20b700  1 mon.mon01@0(leader).osd e115307
> e115307: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.437706 7f6c0e20b700  0 log [INF] : osdmap e115307: 24
> osds: 24 up, 24 in
> 2014-11-04 09:26:41.471558 7f6c0e20b700  0 log [INF] : pgmap v6630203: 8704
> pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
> plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:41.497318 7f6c0e20b700  1 mon.mon01@0(leader).osd e115308
> e115308: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.533965 7f6c0e20b700  0 log [INF] : osdmap e115308: 24
> osds: 24 up, 24 in
> 2014-11-04 09:26:41.553161 7f6c0e20b700  0 log [INF] : pgmap v6630204: 8704
> pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
> plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:42.701720 7f6c0e20b700  1 mon.mon0

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-04 Thread Chad Seys
On Monday, November 03, 2014 17:34:06 you wrote:
> If you have osds that are close to full, you may be hitting 9626.  I
> pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
> -Sam

Thanks Sam.  I may have been hitting that as well.  I certainly hit too_full
conditions often.  I am able to squeeze PGs off of the too_full OSD by 
reweighting and then eventually all PGs get to where they want to be.  Kind of 
silly that I have to do this manually though.  Could Ceph order the PG 
movements better? (Is this what your bug fix does in effect?)
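(The reweighting is nothing fancy, just e.g.

# ceph osd reweight 15 0.9

to temporarily lower the override weight of whichever OSD is too_full, setting
it back to 1 once things have drained -- osd.15 and 0.9 are only examples.)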


So, at the moment there are no PG moving around the cluster, but all are not 
in active+clean. Also, there is one OSD which has blocked requests.  The OSD 
seems idle and restarting the OSD just results in a younger blocked request.

~# ceph -s
cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
 health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck 
inactive; 210 pgs stuck unclean; 1 requests are blocked > 32 sec
 monmap e3: 3 mons at 
{mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:67
89/0}, election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
 osdmap e115306: 24 osds: 24 up, 24 in
  pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
12747 GB used, 7848 GB / 20596 GB avail
   2 inactive
8494 active+clean
 173 incomplete
  35 down+incomplete

# ceph health detail
...
1 ops are blocked > 8388.61 sec
1 ops are blocked > 8388.61 sec on osd.15
1 osds have slow requests

from the log of the osd with the blocked request (osd.15):
2014-11-04 08:57:26.851583 7f7686331700  0 log [WRN] : 1 slow requests, 1 
included below; oldest blocked for > 3840.430247 secs
2014-11-04 08:57:26.851593 7f7686331700  0 log [WRN] : slow request 
3840.430247 seconds old, received at 2014-11-04 07:53:26.421301: 
osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512] 
4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg


Other requests (like PG scrubs) are happening without taking a long time on 
this OSD.
Also, this was one of the OSDs which I completely drained, removed from ceph, 
reformatted, and created again using ceph-deploy.  So it is completely created 
by firefly 0.80.7 code.


As Greg requested, output of ceph scrub:

2014-11-04 09:25:58.761602 7f6c0e20b700  0 mon.mon01@0(leader) e3 
handle_command mon_command({"prefix": "scrub"} v 0) v1
2014-11-04 09:26:21.320043 7f6c0ea0c700  1 mon.mon01@0(leader).paxos(paxos 
updating c 11563072..11563575) accept timeout, calling fresh elect
ion
2014-11-04 09:26:31.264873 7f6c0ea0c700  0 
mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572 
used 3891232 avail 268
1328
2014-11-04 09:26:33.529403 7f6c0e20b700  0 log [INF] : mon.mon01 calling new 
monitor election
2014-11-04 09:26:33.538286 7f6c0e20b700  1 mon.mon01@0(electing).elector(2996) 
init, last seen epoch 2996
2014-11-04 09:26:38.809212 7f6c0ea0c700  0 log [INF] : mon.mon01@0 won leader 
election with quorum 0,2
2014-11-04 09:26:40.215095 7f6c0e20b700  0 log [INF] : monmap e3: 3 mons at 
{mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=
144.92.180.139:6789/0}
2014-11-04 09:26:40.215754 7f6c0e20b700  0 log [INF] : pgmap v6630201: 8704 
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:40.215913 7f6c0e20b700  0 log [INF] : mdsmap e1: 0/0/1 up
2014-11-04 09:26:40.216621 7f6c0e20b700  0 log [INF] : osdmap e115306: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.227010 7f6c0e20b700  0 log [INF] : pgmap v6630202: 8704 
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.367373 7f6c0e20b700  1 mon.mon01@0(leader).osd e115307 
e115307: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.437706 7f6c0e20b700  0 log [INF] : osdmap e115307: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.471558 7f6c0e20b700  0 log [INF] : pgmap v6630203: 8704 
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.497318 7f6c0e20b700  1 mon.mon01@0(leader).osd e115308 
e115308: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.533965 7f6c0e20b700  0 log [INF] : osdmap e115308: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.553161 7f6c0e20b700  0 log [INF] : pgmap v6630204: 8704 
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
plete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:42.701720 7f6c0e20b700  1 mon.mon01@0(leader).osd e115309 
e115309: 24 osds: 24 up, 24 in
2014-11-04 09:26:42.953977 7f6c0e20b700  0 log [INF] : osdmap e115309: 24 
osds: 24 up, 24 in
2014-11-04 09:26:45.776411 7f6c0e20b700  0 log [INF] : pgmap v6630205: 8704 
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incom
plete; 6344 G

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Samuel Just
If you have osds that are close to full, you may be hitting 9626.  I
pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
-Sam

On Mon, Nov 3, 2014 at 2:09 PM, Chad Seys  wrote:
>>
>> No, it is a change, I just want to make sure I understand the
>> scenario. So you're reducing CRUSH weights on full OSDs, and then
>> *other* OSDs are crashing on these bad state machine events?
>
> That is right.  The other OSDs shutdown sometime later.  (Not immediately.)
>
> I really haven't tested to see if the OSDs will stay up with if there are no
> manipulations.  Need to wait with the PGs to settle for awhile, which I
> haven't done yet.
>
>>
>> >> I don't think it should matter, although I confess I'm not sure how
>> >> much monitor load the scrubbing adds. (It's a monitor check; doesn't
>> >> hit the OSDs at all.)
>> >
>> > $ ceph scrub
>> > No output.
>>
>> Oh, yeah, I think that output goes to the central log at a later time.
>> (Will show up in ceph -w if you're watching, or can be accessed from
>> the monitor nodes; in their data directory I think?)
>
> OK.  Will doing ceph scrub again result in the same output? If so, I'll run it
> again and look for output in ceph -w when the migrations have stopped.
>
> Thanks!
> Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
> 
> No, it is a change, I just want to make sure I understand the
> scenario. So you're reducing CRUSH weights on full OSDs, and then
> *other* OSDs are crashing on these bad state machine events?

That is right.  The other OSDs shutdown sometime later.  (Not immediately.)

I really haven't tested to see if the OSDs will stay up with if there are no 
manipulations.  Need to wait with the PGs to settle for awhile, which I 
haven't done yet.

> 
> >> I don't think it should matter, although I confess I'm not sure how
> >> much monitor load the scrubbing adds. (It's a monitor check; doesn't
> >> hit the OSDs at all.)
> > 
> > $ ceph scrub
> > No output.
> 
> Oh, yeah, I think that output goes to the central log at a later time.
> (Will show up in ceph -w if you're watching, or can be accessed from
> the monitor nodes; in their data directory I think?)

OK.  Will doing ceph scrub again result in the same output? If so, I'll run it 
again and look for output in ceph -w when the migrations have stopped.

Thanks!
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 12:28 PM, Chad Seys  wrote:
> On Monday, November 03, 2014 13:50:05 you wrote:
>> On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys  wrote:
>> > On Monday, November 03, 2014 13:22:47 you wrote:
>> >> Okay, assuming this is semi-predictable, can you start up one of the
>> >> OSDs that is going to fail with "debug osd = 20", "debug filestore =
>> >> 20", and "debug ms = 1" in the config file and then put the OSD log
>> >> somewhere accessible after it's crashed?
>> >
>> > Alas, I have not yet noticed a pattern.  Only thing I think is true is
>> > that they go down when I first make CRUSH changes.  Then after
>> > restarting, they run without going down again.
>> > All the OSDs are running at the moment.
>>
>> Oh, interesting. What CRUSH changes exactly are you making that are
>> spawning errors?
>
> Maybe I miswrote:  I've been marking OUT OSDs with blocked requests.  Then if
> an OSD becomes too_full I use 'ceph osd reweight' to squeeze blocks off of the
> too_full OSD.  (Maybe that is not technically a CRUSH map change?)

No, it is a change, I just want to make sure I understand the
scenario. So you're reducing CRUSH weights on full OSDs, and then
*other* OSDs are crashing on these bad state machine events?

>
>
>> I don't think it should matter, although I confess I'm not sure how
>> much monitor load the scrubbing adds. (It's a monitor check; doesn't
>> hit the OSDs at all.)
>
> $ ceph scrub
> No output.

Oh, yeah, I think that output goes to the central log at a later time.
(Will show up in ceph -w if you're watching, or can be accessed from
the monitor nodes; in their data directory I think?)


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
On Monday, November 03, 2014 13:50:05 you wrote:
> On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys  wrote:
> > On Monday, November 03, 2014 13:22:47 you wrote:
> >> Okay, assuming this is semi-predictable, can you start up one of the
> >> OSDs that is going to fail with "debug osd = 20", "debug filestore =
> >> 20", and "debug ms = 1" in the config file and then put the OSD log
> >> somewhere accessible after it's crashed?
> > 
> > Alas, I have not yet noticed a pattern.  Only thing I think is true is
> > that they go down when I first make CRUSH changes.  Then after
> > restarting, they run without going down again.
> > All the OSDs are running at the moment.
> 
> Oh, interesting. What CRUSH changes exactly are you making that are
> spawning errors?

Maybe I miswrote:  I've been marking OUT OSDs with blocked requests.  Then if 
an OSD becomes too_full I use 'ceph osd reweight' to squeeze blocks off of the
too_full OSD.  (Maybe that is not technically a CRUSH map change?)


> I don't think it should matter, although I confess I'm not sure how
> much monitor load the scrubbing adds. (It's a monitor check; doesn't
> hit the OSDs at all.)

$ ceph scrub
No output.

Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys  wrote:
> On Monday, November 03, 2014 13:22:47 you wrote:
>> Okay, assuming this is semi-predictable, can you start up one of the
>> OSDs that is going to fail with "debug osd = 20", "debug filestore =
>> 20", and "debug ms = 1" in the config file and then put the OSD log
>> somewhere accessible after it's crashed?
>
> Alas, I have not yet noticed a pattern.  Only thing I think is true is that
> they go down when I first make CRUSH changes.  Then after restarting, they run
> without going down again.
> All the OSDs are running at the moment.

Oh, interesting. What CRUSH changes exactly are you making that are
spawning errors?

> What I've been doing is marking OUT the OSDs on which a request is blocked,
> letting the PGs recover, (drain the OSD of PGs completely), then remove and
> readd the OSD.
>
> So far OSDs treated this way no longer have blocked requests.
>
> Also, it seems as though that slowly decreases the number of incomplete and
> down+incomplete PGs.
>
>>
>> Can you also verify that all of your monitors are running firefly, and
>> then issue the command "ceph scrub" and report the output?
>
> Sure, should I wait until the current rebalancing is finished?

I don't think it should matter, although I confess I'm not sure how
much monitor load the scrubbing adds. (It's a monitor check; doesn't
hit the OSDs at all.)


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
On Monday, November 03, 2014 13:22:47 you wrote:
> Okay, assuming this is semi-predictable, can you start up one of the
> OSDs that is going to fail with "debug osd = 20", "debug filestore =
> 20", and "debug ms = 1" in the config file and then put the OSD log
> somewhere accessible after it's crashed?

Alas, I have not yet noticed a pattern.  Only thing I think is true is that 
they go down when I first make CRUSH changes.  Then after restarting, they run 
without going down again.
All the OSDs are running at the moment.

What I've been doing is marking OUT the OSDs on which a request is blocked, 
letting the PGs recover, (drain the OSD of PGs completely), then remove and 
readd the OSD.

So far OSDs treated this way no longer have blocked requests.

Also, it seems as though that slowly decreases the number of incomplete and
down+incomplete PGs.
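Concretely, the cycle per OSD is roughly (osd.16 just as an example):

# ceph osd out 16
  ... wait until 'ceph -s' shows recovery done and the OSD holds no PGs ...
# service ceph stop osd.16        (on its host)
# ceph osd crush remove osd.16
# ceph auth del osd.16
# ceph osd rm 16
  ... then re-add it, e.g. with ceph-deploy ...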

> 
> Can you also verify that all of your monitors are running firefly, and
> then issue the command "ceph scrub" and report the output?

Sure, should I wait until the current rebalancing is finished?

Thanks,
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
Okay, assuming this is semi-predictable, can you start up one of the
OSDs that is going to fail with "debug osd = 20", "debug filestore =
20", and "debug ms = 1" in the config file and then put the OSD log
somewhere accessible after it's crashed?
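That is, something like this in ceph.conf on the host of the OSD you expect to
fail (the id here is only an example; an [osd] section would turn it on for all
of them):

[osd.12]
    debug osd = 20
    debug filestore = 20
    debug ms = 1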

Can you also verify that all of your monitors are running firefly, and
then issue the command "ceph scrub" and report the output?
-Greg

On Mon, Nov 3, 2014 at 11:07 AM, Chad Seys  wrote:
>
>> There's a "ceph osd metadata" command, but i don't recall if it's in
>> Firefly or only giant. :)
>
> It's in firefly.  Thanks, very handy.
>
> All the OSDs are running 0.80.7 at the moment.
>
> What next?
>
> Thanks again,
> Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys

> There's a "ceph osd metadata" command, but I don't recall if it's in
> Firefly or only Giant. :)

It's in firefly.  Thanks, very handy.

All the OSDs are running 0.80.7 at the moment.

What next?

Thanks again,
Chad.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
[ Re-adding the list. ]

On Mon, Nov 3, 2014 at 10:49 AM, Chad Seys  wrote:
>
>> >   Next I executed
>> >
>> > 'ceph osd crush tunables optimal'
>> >
>> >   to upgrade CRUSH mapping.
>>
>> Okay...you know that's a data movement command, right?
>
> Yes.
>
>> So you should expect it to impact operations.
>
>
>> These failures are usually the result of adjusting tunables without
>> having upgraded all the machines in the cluster — although they should
>> also be fixed in v0.80.7. Are you still seeing crashes, or just the PG
>> state issues?
>
> Still getting crashes. I believe all nodes are running 0.80.7 .  Does ceph
> have a command to check this?  (Otherwise I'll do an ssh-many to check.)

There's a "ceph osd metadata" command, but I don't recall if it's in
Firefly or only Giant. :)
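If it's there in your build, something like this should cover all of them:

for i in $(ceph osd ls); do
    echo -n "osd.$i: "
    ceph osd metadata $i | grep ceph_version
done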

>
> Thanks!
> C.


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 7:46 AM, Chad Seys  wrote:
> Hi All,
>I upgraded from emperor to firefly.  Initial upgrade went smoothly and all
> placement groups were active+clean .
>   Next I executed
> 'ceph osd crush tunables optimal'
>   to upgrade CRUSH mapping.

Okay...you know that's a data movement command, right? So you should
expect it to impact operations. (Although not the crashes you're
witnessing.)
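(For reference, assuming your version has the subcommands, you can check what
you're currently running with

# ceph osd crush show-tunables

and drop back with 'ceph osd crush tunables legacy' -- at the cost of moving
the data around again.)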

>   Now I keep having OSDs go down or have requests blocked for long periods of
> time.
>   I start back up the down OSDs and recovery eventually stops, but with 100s
> of "incomplete" and "down+incomplete" pgs remaining.
>   The ceph web page says "If you see this state [incomplete], report a bug,
> and try to start any failed OSDs that may contain the needed information."
> Well, all the OSDs are up, though some have blocked requests.
>
> Also, the logs of the OSDs which go down have this message:
> 2014-11-02 21:46:33.615829 7ffcf0421700  0 -- 192.168.164.192:6810/31314 >>
> 192.168.164.186:6804/20934 pipe(0x2faa0280 sd=261 :6810 s=2 pgs=9
> 19 cs=25 l=0 c=0x2ed022c0).fault with nothing to send, going to standby
> 2014-11-02 21:49:11.440142 7ffce4cf3700  0 -- 192.168.164.192:6810/31314 >>
> 192.168.164.186:6804/20934 pipe(0xe512a00 sd=249 :6810 s=0 pgs=0
> cs=0 l=0 c=0x2a308b00).accept connect_seq 26 vs existing 25 state standby
> 2014-11-02 21:51:20.085676 7ffcf6e3e700 -1 osd/PG.cc: In function
> 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryState::Crashed, PG::RecoveryState::RecoveryMachine>::my_context)' thread
> 7ffcf6e3e700 time 2014-11-02 21:51:20.052242
> osd/PG.cc: 5424: FAILED assert(0 == "we got a bad state machine event")

These failures are usually the result of adjusting tunables without
having upgraded all the machines in the cluster — although they should
also be fixed in v0.80.7. Are you still seeing crashes, or just the PG
state issues?
-Greg


Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
P.S.  The OSDs interacted with some 3.14 krbd clients before I realized that 
kernel version was too old for the firefly CRUSH map.

Chad.