Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1

That's this code.

  switch (alg) {
  case CRUSH_BUCKET_UNIFORM:
    size = sizeof(crush_bucket_uniform);
    break;
  case CRUSH_BUCKET_LIST:
    size = sizeof(crush_bucket_list);
    break;
  case CRUSH_BUCKET_TREE:
    size = sizeof(crush_bucket_tree);
    break;
  case CRUSH_BUCKET_STRAW:
    size = sizeof(crush_bucket_straw);
    break;
  case CRUSH_BUCKET_STRAW2:
    size = sizeof(crush_bucket_straw2);
    break;
  default:
    {
      char str[128];
      snprintf(str, sizeof(str), "unsupported bucket algorithm: %d", alg);
      throw buffer::malformed_input(str);
    }
  }

CRUSH_BUCKET_UNIFORM = 1
CRUSH_BUCKET_LIST = 2
CRUSH_BUCKET_TREE = 3
CRUSH_BUCKET_STRAW = 4
CRUSH_BUCKET_STRAW2 = 5

So valid values for bucket algorithms are 1 through 5 but, for
whatever reason, at least one of yours is being interpreted as "-1".

This doesn't seem like something that would just happen spontaneously
with no changes to the cluster.

What recent changes have you made to the osdmap? What recent changes
have you made to the crushmap? Have you recently upgraded?
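
For comparison, it might help to look at the bucket algorithms in a map that
does decode, e.g. the one you can pull from the live cluster. A rough sketch,
assuming osdmaptool and crushtool from the same Mimic packages are installed
(file names here are just placeholders):

  ceph osd getmap -o osdmap.live                     # current map from the mons
  osdmaptool osdmap.live --export-crush crush.live   # extract the compiled crushmap
  crushtool -d crush.live -o crush.live.txt          # decompile it to text
  grep 'alg ' crush.live.txt                         # every bucket should say uniform/list/tree/straw/straw2

If you do manage to extract one of the on-disk maps later, the same steps
should show which bucket is carrying the bogus value.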

> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >



-- 
Cheers,
Brad

Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Troy Ablan



On 8/18/19 6:43 PM, Brad Hubbard wrote:

That's this code.

  switch (alg) {
  case CRUSH_BUCKET_UNIFORM:
    size = sizeof(crush_bucket_uniform);
    break;
  case CRUSH_BUCKET_LIST:
    size = sizeof(crush_bucket_list);
    break;
  case CRUSH_BUCKET_TREE:
    size = sizeof(crush_bucket_tree);
    break;
  case CRUSH_BUCKET_STRAW:
    size = sizeof(crush_bucket_straw);
    break;
  case CRUSH_BUCKET_STRAW2:
    size = sizeof(crush_bucket_straw2);
    break;
  default:
    {
      char str[128];
      snprintf(str, sizeof(str), "unsupported bucket algorithm: %d", alg);
      throw buffer::malformed_input(str);
    }
  }

CRUSH_BUCKET_UNIFORM = 1
CRUSH_BUCKET_LIST = 2
CRUSH_BUCKET_TREE = 3
CRUSH_BUCKET_STRAW = 4
CRUSH_BUCKET_STRAW2 = 5

So valid values for bucket algorithms are 1 through 5 but, for
whatever reason, at least one of yours is being interpreted as "-1".

This doesn't seem like something that would just happen spontaneously
with no changes to the cluster.

What recent changes have you made to the osdmap? What recent changes
have you made to the crushmap? Have you recently upgraded?



Brad,

There were no recent changes to the cluster or OSD config to my knowledge.
The only person who would have made any such changes is me.  A few weeks
ago, we added 90 new HDD OSDs all at once, and the cluster was still
backfilling onto those, but none of the pools on the now-affected OSDs
were involved in that.


It seems that all of the SSDs are likely to be in this same state, but I 
haven't checked every single one.
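
A loop along these lines would probably do for spot-checking the rest (just a
sketch; the osd ids are examples, and those OSDs are already stopped anyway):

  for id in 45 46 47; do    # substitute the actual SSD osd ids
    ceph-objectstore-tool --op get-osdmap \
      --data-path /var/lib/ceph/osd/ceph-$id --file /tmp/osdmap.$id \
      || echo "osd.$id: osdmap decode failed"
  done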


I sent a complete image of one of the 1TB OSDs (compressed to about
41GB) via ceph-post-file.  I put the id in the tracker issue I opened
for this: https://tracker.ceph.com/issues/41240
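
Roughly what that looks like, in case it's useful to anyone else (the device,
file name, and description here are illustrative, not exactly what I ran):

  dd if=/dev/sdX bs=4M | gzip > /srv/osd45.img.gz
  ceph-post-file -d 'tracker 41240: image of failed SSD OSD' /srv/osd45.img.gz

ceph-post-file prints an upload id, and that id is what I put in the tracker.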


I don't know if you or any other devs could use that for further 
insight, but I'm hopeful.


Thanks,

-Troy


Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.

  auto ch = store->open_collection(coll_t::meta());
  const ghobject_t full_oid = OSD::get_osdmap_pobject_name(e);
  if (!store->exists(ch, full_oid)) {
    cerr << "osdmap (" << full_oid << ") does not exist." << std::endl;
    if (!force) {
      return -ENOENT;
    }
    cout << "Creating a new epoch." << std::endl;
  }

Adding "--force" should get you past that error.
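
A minimal sketch of the whole sequence, using the map you already pulled from
the mons and the data paths from your output above (the OSDs are down, so the
tool can open the stores):

  ceph osd getmap -o osdmap        # epoch 81298 in your case
  ceph-objectstore-tool --op set-osdmap --force \
    --data-path /var/lib/ceph/osd/ceph-45/ --file osdmap
  ceph-objectstore-tool --op set-osdmap --force \
    --data-path /var/lib/ceph/osd/ceph-46/ --file osdmap

With --force the tool should print "Creating a new epoch." instead of bailing
out with ENOENT, as in the code above.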

>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >



-- 
Cheers,
Brad


Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brett Chancellor
This sounds familiar. Do any of the pools on those SSDs have fairly dense
placement groups, i.e. more than 500k objects per pg? (ceph pg ls)
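
A quick way to check, once the pgs are reporting again, is something like this
(just a sketch; on Mimic the object count should be the second column of "ceph
pg ls", but double-check the header on your version):

  ceph pg ls | awk 'NR > 1 && $2 + 0 > 500000 { print $1, $2 }'   # pgid and object count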

On Sun, Aug 18, 2019, 10:12 PM Brad Hubbard  wrote:

> On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
> >
> > Paul,
> >
> > Thanks for the reply.  All of these seemed to fail except for pulling
> > the osdmap from the live cluster.
> >
> > -Troy
> >
> > -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> > /var/lib/ceph/osd/ceph-45/ --file osdmap45
> > terminate called after throwing an instance of
> > 'ceph::buffer::malformed_input'
> >what():  buffer::malformed_input: unsupported bucket algorithm: -1
> > *** Caught signal (Aborted) **
> >   in thread 7f945ee04f00 thread_name:ceph-objectstor
> >   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> > (stable)
> >   1: (()+0xf5d0) [0x7f94531935d0]
> >   2: (gsignal()+0x37) [0x7f9451d80207]
> >   3: (abort()+0x148) [0x7f9451d818f8]
> >   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
> >   5: (()+0x5e746) [0x7f945268d746]
> >   6: (()+0x5e773) [0x7f945268d773]
> >   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
> >   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> > [0x7f94553218d8]
> >   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad)
> [0x7f94550ff4ad]
> >   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
> >   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> > ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
> >   12: (main()+0x5340) [0x55de1f8c8870]
> >   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
> >   14: (()+0x3adc10) [0x55de1f9a1c10]
> > Aborted
> >
> > -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> > /var/lib/ceph/osd/ceph-46/ --file osdmap46
> > terminate called after throwing an instance of
> > 'ceph::buffer::malformed_input'
> >what():  buffer::malformed_input: unsupported bucket algorithm: -1
> > *** Caught signal (Aborted) **
> >   in thread 7f9ce4135f00 thread_name:ceph-objectstor
> >   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> > (stable)
> >   1: (()+0xf5d0) [0x7f9cd84c45d0]
> >   2: (gsignal()+0x37) [0x7f9cd70b1207]
> >   3: (abort()+0x148) [0x7f9cd70b28f8]
> >   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
> >   5: (()+0x5e746) [0x7f9cd79be746]
> >   6: (()+0x5e773) [0x7f9cd79be773]
> >   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
> >   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> > [0x7f9cda6528d8]
> >   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad)
> [0x7f9cda4304ad]
> >   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
> >   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> > ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
> >   12: (main()+0x5340) [0x55cea25ea870]
> >   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
> >   14: (()+0x3adc10) [0x55cea26c3c10]
> > Aborted
> >
> > -[~:#]- ceph osd getmap -o osdmap
> > got osdmap epoch 81298
> >
> > -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> > /var/lib/ceph/osd/ceph-46/ --file osdmap
> > osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
> >
> > -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> > /var/lib/ceph/osd/ceph-45/ --file osdmap
> > osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> auto ch = store->open_collection(coll_t::meta());
> const ghobject_t full_oid = OSD::get_osdmap_pobject_name(e);
> if (!store->exists(ch, full_oid)) {
>   cerr << "osdmap (" << full_oid << ") does not exist." << std::endl;
>   if (!force) {
>     return -ENOENT;
>   }
>   cout << "Creating a new epoch." << std::endl;
> }
>
> Adding "--force" should get you past that error.
>
> >
> >
> >
> > On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > > Starting point to debug/fix this would be to extract the osdmap from
> > > one of the dead OSDs:
> > >
> > > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> > >
> > > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > > set some --debug options (don't know which one off the top of my
> > > head).
> > > Does it also crash? How does it differ from the map retrieved with
> > > "ceph osd getmap"?
> > >
> > > You can also set the osdmap with "--op set-osdmap", does it help to
> > > set the osdmap retrieved by "ceph osd getmap"?
> > >
> > > Paul
> > >
>
>
>
> --
> Cheers,
> Brad
>

Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Troy Ablan
Yes, it's possible that they do, but since all of the affected OSDs are 
still down and the monitors have been restarted since, all of those 
pools have pgs that are in unknown state and don't return anything in 
ceph pg ls.


There weren't that many placement groups for the SSDs, but also I don't 
know that there were that many objects.  There were of course a ton of 
omap key/values.


-Troy

On 8/18/19 10:57 PM, Brett Chancellor wrote:
This sounds familiar. Do any of these pools on the SSD have fairly dense 
placement group to object ratios? Like more than 500k objects per pg? 
(ceph pg ls)





Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brett Chancellor
For me, it was the .rgw.meta pool that had very dense placement groups. The
OSDs would fail to start and would then commit suicide while trying to scan
the PGs. We had to remove all references to those placement groups just to
get the OSDs to start. It wasn't pretty.
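
Per affected OSD and PG it's something along these lines (destructive, so the
export is the only safety net; the pgid and paths are placeholders, not what
we actually ran):

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NN \
    --pgid 17.1a --op export --file /backup/17.1a.export
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NN \
    --pgid 17.1a --op remove --force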


On Mon, Aug 19, 2019, 2:09 AM Troy Ablan  wrote:

> Yes, it's possible that they do, but since all of the affected OSDs are
> still down and the monitors have been restarted since, all of those
> pools have pgs that are in unknown state and don't return anything in
> ceph pg ls.
>
> There weren't that many placement groups for the SSDs, but also I don't
> know that there were that many objects.  There were of course a ton of
> omap key/values.
>
> -Troy
>
> On 8/18/19 10:57 PM, Brett Chancellor wrote:
> > This sounds familiar. Do any of these pools on the SSD have fairly dense
> > placement group to object ratios? Like more than 500k objects per pg?
> > (ceph pg ls)
> >
>