[ceph-users] Cephalocon 2018 APAC
Hey Cephers,

As many of you know, we've just announced Cephalocon APAC 2018, which happens in Beijing on March 22 and 23 at the JW Marriott Hotel, as well as the Call for Proposals [1], which is accepting talks until January 31, 2018.

This is a project we've been working on for a long time now, and one we unfortunately had to cancel in 2017 for several reasons. Fortunately this community is made up of people driven by a great passion for the project, and a brave group of volunteers in China decided to take up the challenge and organize the very first Cephalocon the world will see!

Having Cephalocon happen in China will allow us to reach the vibrant community of contributors, users and companies which have been helping this project grow fast, and to develop a closer relationship with them. This will bring a lot of benefits to the Ceph project and also provide a great experience to everyone joining us at the conference.

At the moment we have a lot of work to be done, including helping the organization team in Beijing with tasks like finding sponsors (around 70% of all sponsorship packages have been sold so far, but we need to keep working in order to meet the goal and have the conference expenses covered), providing travel information (visas, flights, hotels etc.) and, obviously, promoting the event so we can reach more people and have a great attendance.

The team in Beijing is working with a local company called DoIT, which created a website [2] containing details about the conference. This information is being progressively synced to Ceph.com, so we will have a lot of changes and announcements in the upcoming weeks.

If you have any questions about the conference, feel free to contact me through this email, IRC (Lvaz on OFTC and Freenode) or any social media channel used by Ceph. I will be happy to answer all questions.
Kindest regards,
Leo

[1] https://ceph.com/cephalocon
[2] http://cephalocon.doit.com.cn/index_en.html

--
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous RGW Metadata Search
The errors you're seeing there don't look related to elasticsearch. It's a generic radosgw error that says it failed to reach the rados (ceph) backend. You can try bumping up the messenger log (debug ms = 1) and see if there's any hint in there.

Yehuda

On Fri, Jan 12, 2018 at 12:54 PM, Youzhong Yang wrote:
> So I did the exact same thing using Kraken and the same set of VMs, no
> issue. What is the magic to make it work in Luminous? Anyone lucky enough to
> have this RGW ElasticSearch working using Luminous?
>
> On Mon, Jan 8, 2018 at 10:26 AM, Youzhong Yang wrote:
>>
>> Hi Yehuda,
>>
>> Thanks for replying.
>>
>> > radosgw failed to connect to your ceph cluster. Does the rados command
>> > with the same connection params work?
>>
>> I am not quite sure what to do by running the rados command to test.
>>
>> So I tried again, could you please take a look and check what could have
>> gone wrong?
>>
>> Here is what I did:
>>
>> On the ceph admin node, I removed the installation on ceph-rgw1 and
>> ceph-rgw2, reinstalled rgw on ceph-rgw1, stopped the rgw service, and removed all rgw
>> pools. Elasticsearch is running on the ceph-rgw2 node on port 9200.
>>
>> ceph-deploy purge ceph-rgw1
>> ceph-deploy purge ceph-rgw2
>> ceph-deploy purgedata ceph-rgw2
>> ceph-deploy purgedata ceph-rgw1
>> ceph-deploy install --release luminous ceph-rgw1
>> ceph-deploy admin ceph-rgw1
>> ceph-deploy rgw create ceph-rgw1
>> ssh ceph-rgw1 sudo systemctl stop ceph-rado...@rgw.ceph-rgw1
>> rados rmpool default.rgw.log default.rgw.log --yes-i-really-really-mean-it
>> rados rmpool default.rgw.meta default.rgw.meta --yes-i-really-really-mean-it
>> rados rmpool default.rgw.control default.rgw.control --yes-i-really-really-mean-it
>> rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it
>>
>> On the ceph-rgw1 node:
>>
>> export RGWHOST="ceph-rgw1"
>> export ELASTICHOST="ceph-rgw2"
>> export REALM="demo"
>> export ZONEGRP="zone1"
>> export ZONE1="zone1-a"
>> export ZONE2="zone1-b"
>> export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 1 )"
>> export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40 | head -n 1 )"
>>
>> radosgw-admin realm create --rgw-realm=${REALM} --default
>> radosgw-admin zonegroup create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --endpoints=http://${RGWHOST}:8000 --master --default
>> radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE1} --endpoints=http://${RGWHOST}:8000 --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --master --default
>> radosgw-admin user create --uid=sync --display-name="zone sync" --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --system
>> radosgw-admin period update --commit
>> sudo systemctl start ceph-radosgw@rgw.${RGWHOST}
>>
>> radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE2} --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --endpoints=http://${RGWHOST}:8002
>> radosgw-admin zone modify --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE2} --tier-type=elasticsearch
>> --tier-config=endpoint=http://${ELASTICHOST}:9200,num_replicas=1,num_shards=10
>> radosgw-admin period update --commit
>>
>> sudo systemctl restart ceph-radosgw@rgw.${RGWHOST}
>> sudo radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f --rgw-zone=${ZONE2} --rgw-frontends="civetweb port=8002"
>> 2018-01-08 00:21:54.389432 7f0fe9cd2e80 -1 Couldn't init storage provider (RADOS)
>>
>> As you can see, starting rgw on port 8002 failed, but rgw on port
>> 8000 was started successfully.
>> Here is some more info which may be useful for diagnosis:
>>
>> $ cat /etc/ceph/ceph.conf
>> [global]
>> fsid = 3e5a32d4-e45e-48dd-a3c5-f6f28fef8edf
>> mon_initial_members = ceph-mon1, ceph-osd1, ceph-osd2, ceph-osd3
>> mon_host = 172.30.212.226,172.30.212.227,172.30.212.228,172.30.212.250
>> auth_cluster_required = cephx
>> auth_service_required = cephx
>> auth_client_required = cephx
>> osd_pool_default_size = 2
>> osd_pool_default_min_size = 2
>> osd_pool_default_pg_num = 100
>> osd_pool_default_pgp_num = 100
>> bluestore_compression_algorithm = zlib
>> bluestore_compression_mode = force
>> rgw_max_put_size = 21474836480
>> [osd]
>> osd_max_object_size = 1073741824
>> [mon]
>> mon_allow_pool_delete = true
>> [client.rgw.ceph-rgw1]
>> host = ceph-rgw1
>> rgw frontends = civetweb port=8000
>>
>> $ wget -O - -q http://ceph-rgw2:9200/
>> {
>>   "name" : "Hippolyta",
>>   "cluster_name" : "elasticsearch",
>>   "version" : {
>>     "number" : "2.3.1",
>>     "build_hash" : "bd980929010aef404e7cb0843e61d0665269fc39",
>>     "build_timestamp" : "2016-04-04T12:25:05Z",
>>     "build_snapshot" : false,
>>     "lucene_version" : "5.5.0"
>>   },
>>   "tagline" : "You Know, for Search"
>> }
>>
>> $ ceph df
>> GLOBAL:
>>     SIZE     AVAIL     RAW USED     %RAW USED
>>     719G     705G      14473M
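Yehuda's "debug ms = 1" suggestion above can be applied in ceph.conf for the affected radosgw instance before restarting it. A minimal sketch, assuming the [client.rgw.ceph-rgw1] section from this thread (revert the setting once the hint is captured, since messenger logging is verbose):

```ini
[client.rgw.ceph-rgw1]
host = ceph-rgw1
rgw frontends = civetweb port=8000
; temporary, for diagnosing the "Couldn't init storage provider (RADOS)" error
debug ms = 1
```

The same effect can be had for a manually started daemon by adding --debug-ms=1 to the radosgw command line.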
[ceph-users] Bluestore - possible to grow PV/LV and utilize additional space?
Hello,

I'm wondering if it's possible to grow a volume (such as in a cloud/VM environment) and use pvresize/lvextend to utilize the extra space in my pool. I am testing with the following environment:

* Running on a cloud provider (Google Cloud)
* 3 nodes, 1 OSD each
* 1 storage pool with "size" of 3 (data replicated on all nodes)
* Initial disk size of 100 GB on each node, initialized as bluestore OSDs

I grew all three volumes (100 GB -> 150 GB) being used as OSDs in the Google console, then used pvresize/lvextend on all devices and rebooted all nodes one by one. In the end, the nodes are somewhat recognizing the additional space, but it's showing up as being utilized.

Before resize (there's ~1 GB of data in my pool):

$ ceph -s
  cluster:
    id: 553ca7bd-925a-4dc5-a928-563b520842de
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph01=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in
  data:
    pools: 2 pools, 200 pgs
    objects: 281 objects, 1024 MB
    usage: 6316 MB used, 293 GB / 299 GB avail
    pgs: 200 active+clean

After resize:

$ ceph -s
  cluster:
    id: 553ca7bd-925a-4dc5-a928-563b520842de
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph02=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in
  data:
    pools: 2 pools, 200 pgs
    objects: 283 objects, 1024 MB
    usage: 156 GB used, 293 GB / 449 GB avail
    pgs: 200 active+clean

So, after "growing" all OSDs by 50 GB (with the amount of data remaining the same), the new 50 GB of additional space per OSD shows up as used space. Also, the pool max available size stays the same.
$ ceph df
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    449G     293G      156G         34.70
POOLS:
    NAME                ID     USED      %USED     MAX AVAIL     OBJECTS
    cephfs_data         2      1024M     1.09      92610M        261
    cephfs_metadata     3      681k      0         92610M        22

I've tried searching around on the Internet and looked through the documentation to see if/how growing bluestore volume OSDs is possible, and haven't come up with anything. I'd greatly appreciate any help in this area if anyone has experience. Thanks.
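For what it's worth, the numbers above are consistent with the grown capacity being counted as used rather than free. A rough sanity check on the figures from the ceph -s output (a sketch of the arithmetic only, not of any Ceph behavior):

```shell
# Three OSDs, each grown by 50 GB in the cloud console.
grown_gb=$((3 * 50))                       # 150 GB of new raw capacity
used_after_gb=156                          # RAW USED reported after the resize

# Subtracting the grown capacity recovers roughly the pre-resize usage
# (6316 MB, i.e. about 6 GB), so the entire new space is being reported as used.
orig_used_gb=$((used_after_gb - grown_gb))
echo "${orig_used_gb}"                     # prints 6
```

In other words, AVAIL stayed at 293 GB while SIZE grew by 150 GB, and the difference landed entirely in RAW USED.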
Re: [ceph-users] Luminous RGW Metadata Search
So I did the exact same thing using Kraken and the same set of VMs, no issue. What is the magic to make it work in Luminous? Anyone lucky enough to have this RGW ElasticSearch working using Luminous?

On Mon, Jan 8, 2018 at 10:26 AM, Youzhong Yang wrote:
> Hi Yehuda,
>
> Thanks for replying.
>
> > radosgw failed to connect to your ceph cluster. Does the rados command
> > with the same connection params work?
>
> I am not quite sure what to do by running the rados command to test.
>
> So I tried again, could you please take a look and check what could have
> gone wrong?
>
> Here is what I did:
>
> On the ceph admin node, I removed the installation on ceph-rgw1 and
> ceph-rgw2, reinstalled rgw on ceph-rgw1, stopped the rgw service, removed all
> rgw pools. Elasticsearch is running on the ceph-rgw2 node on port 9200.
>
> ceph-deploy purge ceph-rgw1
> ceph-deploy purge ceph-rgw2
> ceph-deploy purgedata ceph-rgw2
> ceph-deploy purgedata ceph-rgw1
> ceph-deploy install --release luminous ceph-rgw1
> ceph-deploy admin ceph-rgw1
> ceph-deploy rgw create ceph-rgw1
> ssh ceph-rgw1 sudo systemctl stop ceph-rado...@rgw.ceph-rgw1
> rados rmpool default.rgw.log default.rgw.log --yes-i-really-really-mean-it
> rados rmpool default.rgw.meta default.rgw.meta --yes-i-really-really-mean-it
> rados rmpool default.rgw.control default.rgw.control --yes-i-really-really-mean-it
> rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it
>
> On the ceph-rgw1 node:
>
> export RGWHOST="ceph-rgw1"
> export ELASTICHOST="ceph-rgw2"
> export REALM="demo"
> export ZONEGRP="zone1"
> export ZONE1="zone1-a"
> export ZONE2="zone1-b"
> export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 1 )"
> export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40 | head -n 1 )"
>
> radosgw-admin realm create --rgw-realm=${REALM} --default
> radosgw-admin zonegroup create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP}
> --endpoints=http://${RGWHOST}:8000 --master --default
> radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE1} --endpoints=http://${RGWHOST}:8000 --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --master --default
> radosgw-admin user create --uid=sync --display-name="zone sync" --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --system
> radosgw-admin period update --commit
>
> sudo systemctl start ceph-radosgw@rgw.${RGWHOST}
>
> radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE2} --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --endpoints=http://${RGWHOST}:8002
> radosgw-admin zone modify --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP} --rgw-zone=${ZONE2} --tier-type=elasticsearch --tier-config=endpoint=http://${ELASTICHOST}:9200,num_replicas=1,num_shards=10
> radosgw-admin period update --commit
>
> sudo systemctl restart ceph-radosgw@rgw.${RGWHOST}
>
> sudo radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f --rgw-zone=${ZONE2} --rgw-frontends="civetweb port=8002"
> 2018-01-08 00:21:54.389432 7f0fe9cd2e80 -1 Couldn't init storage provider (RADOS)
>
> As you can see, starting rgw on port 8002 failed, but rgw on port
> 8000 was started successfully.
> Here is some more info which may be useful for diagnosis:
>
> $ cat /etc/ceph/ceph.conf
> [global]
> fsid = 3e5a32d4-e45e-48dd-a3c5-f6f28fef8edf
> mon_initial_members = ceph-mon1, ceph-osd1, ceph-osd2, ceph-osd3
> mon_host = 172.30.212.226,172.30.212.227,172.30.212.228,172.30.212.250
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
> osd_pool_default_min_size = 2
> osd_pool_default_pg_num = 100
> osd_pool_default_pgp_num = 100
> bluestore_compression_algorithm = zlib
> bluestore_compression_mode = force
> rgw_max_put_size = 21474836480
> [osd]
> osd_max_object_size = 1073741824
> [mon]
> mon_allow_pool_delete = true
> [client.rgw.ceph-rgw1]
> host = ceph-rgw1
> rgw frontends = civetweb port=8000
>
> $ wget -O - -q http://ceph-rgw2:9200/
> {
>   "name" : "Hippolyta",
>   "cluster_name" : "elasticsearch",
>   "version" : {
>     "number" : "2.3.1",
>     "build_hash" : "bd980929010aef404e7cb0843e61d0665269fc39",
>     "build_timestamp" : "2016-04-04T12:25:05Z",
>     "build_snapshot" : false,
>     "lucene_version" : "5.5.0"
>   },
>   "tagline" : "You Know, for Search"
> }
>
> $ ceph df
> GLOBAL:
>     SIZE     AVAIL     RAW USED     %RAW USED
>     719G     705G      14473M       1.96
> POOLS:
>     NAME                    ID     USED     %USED     MAX AVAIL     OBJECTS
>     .rgw.root               17     6035     0         333G          19
>     zone1-a.rgw.control     18     0        0         333G          8
>     zone1-a.rgw.meta        19     350      0         333G          2
>     zone1-a.rgw.log         20     50       0         333G
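The SYNC_AKEY/SYNC_SKEY exports in the transcript above use a /dev/urandom pipeline; the same pipeline can be factored into a small helper. A sketch (gen_key is a name invented here, not part of any Ceph tooling; LC_ALL=C is added so tr handles raw bytes predictably):

```shell
# Generate a random alphanumeric key of a given length, as in the
# SYNC_AKEY/SYNC_SKEY exports above.
gen_key() {
    LC_ALL=C tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w "$1" | head -n 1
}

SYNC_AKEY="$(gen_key 20)"
SYNC_SKEY="$(gen_key 40)"
echo "${#SYNC_AKEY} ${#SYNC_SKEY}"   # prints "20 40"
```

The `cat /dev/urandom |` in the original is a useless use of cat; redirecting tr's stdin directly is equivalent.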
Re: [ceph-users] Trying to increase number of PGs throws "Error E2BIG" though PGs/OSD < mon_max_pg_per_osd
Thank you for the explanation, Brad. I will change that setting and see how it goes.

Subhachandra

On Thu, Jan 11, 2018 at 10:38 PM, Brad Hubbard wrote:
> On Fri, Jan 12, 2018 at 11:27 AM, Subhachandra Chandra wrote:
> > Hello,
> >
> > We are running experiments on a Ceph cluster before we move data on it.
> > While trying to increase the number of PGs on one of the pools it threw the
> > following error:
> >
> > root@ctrl1:/# ceph osd pool set data pg_num 65536
> > Error E2BIG: specified pg_num 65536 is too large (creating 32768 new PGs on
> > ~540 OSDs exceeds per-OSD max of 32)
>
> That comes from here:
>
> https://github.com/ceph/ceph/blob/5d7813f612aea59239c8375aaa00919ae32f952f/src/mon/OSDMonitor.cc#L6027
>
> So the warning is triggered because new_pgs (65536) >
> g_conf->mon_osd_max_split_count (32) * expected_osds (540)
>
> > There are 2 pools named "data" and "metadata". "data" is an erasure coded
> > pool (6,3) and "metadata" is a replicated pool with a replication factor of
> > 3.
> >
> > root@ctrl1:/# ceph osd lspools
> > 1 metadata,2 data,
> > root@ctrl1:/# ceph osd pool get metadata pg_num
> > pg_num: 512
> > root@ctrl1:/# ceph osd pool get data pg_num
> > pg_num: 32768
> >
> > osd: 540 osds: 540 up, 540 in
> >      flags noout,noscrub,nodeep-scrub
> >
> > data:
> >   pools: 2 pools, 33280 pgs
> >   objects: 7090k objects, 1662 TB
> >   usage: 2501 TB used, 1428 TB / 3929 TB avail
> >   pgs: 33280 active+clean
> >
> > The current PG/OSD ratio according to my calculation should be 549
>
> (32768 * 9 + 512 * 3) / 540.0
> 548.97778
>
> > Increasing the number of PGs in the "data" pool should increase the PG/OSD
> > ratio to about 1095
>
> (65536 * 9 + 512 * 3) / 540.0
> 1095.
>
> > In the config, settings related to the PG/OSD ratio look like
> > mon_max_pg_per_osd = 1500
> > osd_max_pg_per_osd_hard_ratio = 1.0
> >
> > Trying to increase the number of PGs to 65536 throws the previously
> > mentioned error.
> > The new PG/OSD ratio is still under the configured limit.
> > Why do we see the error? Further, there seems to be a bug in the error
> > message where it says "exceeds per-OSD max of 32": where does
> > "32" come from?
>
> Maybe the wording could be better. Perhaps "exceeds per-OSD max with
> mon_osd_max_split_count of 32". I'll submit this and see how it goes.
>
> > P.S. I understand that the PG/OSD ratio configured on this cluster far
> > exceeds the recommended values. The experiment is to find scaling limits and
> > try out expansion scenarios.
> >
> > Thanks
> > Subhachandra
>
> --
> Cheers,
> Brad
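The limit Brad describes can be reproduced with a few lines of arithmetic, using the values from this thread. This is a simplified sketch of the condition (mirroring the error message's "creating 32768 new PGs on ~540 OSDs exceeds per-OSD max of 32"), not the actual OSDMonitor code:

```shell
pg_num_old=32768
pg_num_new=65536
expected_osds=540
mon_osd_max_split_count=32

# The split check caps how many *new* PGs one pg_num change may create,
# independently of the mon_max_pg_per_osd ratio the poster was watching.
new_pgs=$((pg_num_new - pg_num_old))                 # 32768
limit=$((mon_osd_max_split_count * expected_osds))   # 17280
if [ "$new_pgs" -gt "$limit" ]; then
    echo "E2BIG: creating $new_pgs new PGs exceeds limit $limit"
fi

# The PG/OSD ratio the poster computed, for comparison
# (EC 6+3 pools count 9 PG instances per PG, replicated size 3 counts 3):
ratio=$(( (pg_num_new * 9 + 512 * 3) / expected_osds ))
echo "$ratio"                                        # prints 1095
```

This is why the error fires even though the resulting ratio (~1095) is well under mon_max_pg_per_osd = 1500: the split-count check and the per-OSD PG cap are independent limits.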
Re: [ceph-users] 4 incomplete PGs causing RGW to go offline?
Rgw.buckets (which is where the data is being sent). I am just surprised that a few incomplete PGs would grind three gateways to a halt. Granted, the incomplete PGs are part of a large hardware failure situation we had, and having a min_size setting of 1 didn't help the situation. We are not completely innocent, but I would hope that the system as a whole would work together to skip those incomplete PGs. Fixing them doesn't appear to be an easy task at this point, hence why we haven't fixed them yet (I wish that were easier, but I understand the counter argument).

-Brent

From: David Turner [mailto:drakonst...@gmail.com]
Sent: Thursday, January 11, 2018 8:22 PM
To: Brent Kennedy
Cc: Ceph Users
Subject: Re: [ceph-users] 4 incomplete PGs causing RGW to go offline?

Which pools are the incomplete PGs a part of? I would say it's very likely that if some of the RGW metadata was incomplete, the daemons wouldn't be happy.

On Thu, Jan 11, 2018, 6:17 PM Brent Kennedy wrote:

We have 3 RadosGW servers running behind HAProxy to enable clients to connect to the ceph cluster like an Amazon bucket. After all the failures and upgrade issues were resolved, I cannot get the RadosGW servers to stay online. They were upgraded to Luminous; I even upgraded the OS to Ubuntu 16 on them (before upgrading to Luminous). They used to have apache on them, as they ran Hammer and before that Firefly. I removed apache before upgrading to Luminous. They start up and run for about 4-6 hours before all three start to go offline. Client traffic is light right now as we are just testing file read/write before we reactivate them (they switched back to Amazon while we fix them). Could the 4 incomplete PGs be causing them to go offline? The last time I saw an issue like this was when recovery wasn't working 100%, so it seems related, since they haven't been stable since we upgraded (but that was also after the failures we had, which is why I am not trying to specifically blame the upgrade).
When I look at the radosgw log, this is what I see (the first 2 lines show up plenty before this, they are health checks by the haproxy server; the next two are file requests that 404 fail, I am guessing; then the last one is me restarting the service):

2018-01-11 20:14:36.640577 7f5826aa3700 1 == req done req=0x7f5826a9d1f0 op status=0 http_status=200 ==
2018-01-11 20:14:36.640602 7f5826aa3700 1 civetweb: 0x56202c567000: 192.168.120.21 - - [11/Jan/2018:20:14:36 +] "HEAD / HTTP/1.0" 1 0 - -
2018-01-11 20:14:36.640835 7f5816282700 1 == req done req=0x7f581627c1f0 op status=0 http_status=200 ==
2018-01-11 20:14:36.640859 7f5816282700 1 civetweb: 0x56202c61: 192.168.120.22 - - [11/Jan/2018:20:14:36 +] "HEAD / HTTP/1.0" 1 0 - -
2018-01-11 20:14:36.761917 7f5835ac1700 1 == starting new request req=0x7f5835abb1f0 =
2018-01-11 20:14:36.763936 7f5835ac1700 1 == req done req=0x7f5835abb1f0 op status=0 http_status=404 ==
2018-01-11 20:14:36.763983 7f5835ac1700 1 civetweb: 0x56202c4ce000: 192.168.120.21 - - [11/Jan/2018:20:14:36 +] "HEAD /Jobimages/vendor05/10/3962896/3962896_cover.pdf HTTP/1.1" 1 0 - aws-sdk-dotnet-35/2.0.2.2 .NET Runtime/4.0 .NET Framework/4.0 OS/6.2.9200.0 FileIO
2018-01-11 20:14:36.772611 7f5808266700 1 == starting new request req=0x7f58082601f0 =
2018-01-11 20:14:36.773733 7f5808266700 1 == req done req=0x7f58082601f0 op status=0 http_status=404 ==
2018-01-11 20:14:36.773769 7f5808266700 1 civetweb: 0x56202c6aa000: 192.168.120.21 - - [11/Jan/2018:20:14:36 +] "HEAD /Jobimages/vendor05/10/3962896/3962896_cover.pdf HTTP/1.1" 1 0 - aws-sdk-dotnet-35/2.0.2.2 .NET Runtime/4.0 .NET Framework/4.0 OS/6.2.9200.0 FileIO
2018-01-11 20:14:38.163617 7f5836ac3700 1 == starting new request req=0x7f5836abd1f0 =
2018-01-11 20:14:38.165352 7f5836ac3700 1 == req done req=0x7f5836abd1f0 op status=0 http_status=404 ==
2018-01-11 20:14:38.165401 7f5836ac3700 1 civetweb: 0x56202c4e2000: 192.168.120.21 - - [11/Jan/2018:20:14:38 +] "HEAD
/Jobimages/vendor05/10/3445645/3445645_cover.pdf HTTP/1.1" 1 0 - aws-sdk-dotnet-35/2.0.2.2 .NET Runtime/4.0 .NET Framework/4.0 OS/6.2.9200.0 FileIO
2018-01-11 20:14:38.170551 7f5807a65700 1 == starting new request req=0x7f5807a5f1f0 =
2018-01-11 20:14:40.322236 7f58352c0700 1 == starting new request req=0x7f58352ba1f0 =
2018-01-11 20:14:40.323468 7f5834abf700 1 == starting new request req=0x7f5834ab91f0 =
2018-01-11 20:14:41.643365 7f58342be700 1 == starting new request req=0x7f58342b81f0 =
2018-01-11 20:14:41.643358 7f58312b8700 1 == starting new request req=0x7f58312b21f0 =
2018-01-11 20:14:50.324196 7f5829aa9700 1 == starting new request req=0x7f5829aa31f0 =
2018-01-11
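When sifting through logs like the ones quoted above, it can help to count request completions by HTTP status rather than reading line by line. A small sketch over an inline sample (the sample lines mirror the "req done ... http_status=NNN" format shown in this thread; count_status is a helper invented here):

```shell
# Count "req done" lines by HTTP status in a radosgw log.
count_status() {
    grep -c "http_status=$1" "$2"
}

# Inline sample in the same format as the log excerpt above.
cat > /tmp/rgw-sample.log <<'EOF'
2018-01-11 20:14:36.640577 7f5826aa3700 1 == req done req=0x7f5826a9d1f0 op status=0 http_status=200 ==
2018-01-11 20:14:36.763936 7f5835ac1700 1 == req done req=0x7f5835abb1f0 op status=0 http_status=404 ==
2018-01-11 20:14:36.773733 7f5808266700 1 == req done req=0x7f58082601f0 op status=0 http_status=404 ==
EOF

count_status 404 /tmp/rgw-sample.log   # prints 2
count_status 200 /tmp/rgw-sample.log   # prints 1
```

Run against the real log (e.g. /var/log/ceph/ceph-rgw-*.log), a sudden spike of 5xx statuses around the time the gateways go offline would point at backend trouble rather than the 404s, which are just missing objects.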
Re: [ceph-users] replace failed disk in Luminous v12.2.2
Hi,

can someone comment on/confirm my planned OSD replacement procedure? It would be very helpful for me.

Dietmar

Am 11. Januar 2018 17:47:50 MEZ schrieb Dietmar Rieder:
>Hi Alfredo,
>
>thanks for your comments, see my answers inline.
>
>On 01/11/2018 01:47 PM, Alfredo Deza wrote:
>> On Thu, Jan 11, 2018 at 4:30 AM, Dietmar Rieder wrote:
>>> Hello,
>>>
>>> we have a failed OSD disk in our Luminous v12.2.2 cluster that needs to
>>> get replaced.
>>>
>>> The cluster was initially deployed using ceph-deploy on Luminous
>>> v12.2.0. The OSDs were created using
>>>
>>> ceph-deploy osd create --bluestore cephosd-${osd}:/dev/sd${disk}
>>> --block-wal /dev/nvme0n1 --block-db /dev/nvme0n1
>>>
>>> Note we separated the bluestore data, wal and db.
>>>
>>> We updated to Luminous v12.2.1 and further to Luminous v12.2.2.
>>>
>>> With the last update we also let ceph-volume take over the OSDs using
>>> "ceph-volume simple scan /var/lib/ceph/osd/$osd" and "ceph-volume
>>> simple activate ${osd} ${id}". All of this went smoothly.
>>
>> That is good to hear!
>>
>>> Now I wonder what is the correct way to replace a failed OSD block disk?
>>>
>>> The docs for luminous [1] say:
>>>
>>> REPLACING AN OSD
>>>
>>> 1. Destroy the OSD first:
>>>
>>> ceph osd destroy {id} --yes-i-really-mean-it
>>>
>>> 2. Zap a disk for the new OSD, if the disk was used before for other
>>> purposes. It's not necessary for a new disk:
>>>
>>> ceph-disk zap /dev/sdX
>>>
>>> 3. Prepare the disk for replacement by using the previously destroyed
>>> OSD id:
>>>
>>> ceph-disk prepare --bluestore /dev/sdX --osd-id {id} --osd-uuid `uuidgen`
>>>
>>> 4. And activate the OSD:
>>>
>>> ceph-disk activate /dev/sdX1
>>>
>>> Initially this seems to be straightforward, but
>>>
>>> 1. I'm not sure if there is something to do with the still existing
>>> bluefs db and wal partitions on the nvme device for the failed OSD. Do
>>> they have to be zapped? If yes, what is the best way?
>>> There is nothing
>>> mentioned in the docs.
>>
>> What is your concern here if the activation seems to work?
>
>I guess on the nvme partitions for bluefs db and bluefs wal there is
>still data related to the failed OSD block device. I was thinking that
>this data might "interfere" with the new replacement OSD block device,
>which is empty.
>
>So you are saying that this is no concern, right?
>Are they automatically reused and assigned to the replacement OSD block
>device, or do I have to specify them when running ceph-disk prepare?
>If I need to specify the wal and db partition, how is this done?
>
>I'm asking this since from the logs of the initial cluster deployment I
>got the following warning:
>
>[cephosd-02][WARNING] prepare_device: OSD will not be hot-swappable if
>block.db is not the same device as the osd data
>[...]
>[cephosd-02][WARNING] prepare_device: OSD will not be hot-swappable if
>block.wal is not the same device as the osd data
>
>>> 2. Since we already let "ceph-volume simple" take over our OSDs I'm not
>>> sure if we should now use ceph-volume or again ceph-disk (followed by
>>> "ceph-volume simple" takeover) to prepare and activate the OSD?
>>
>> The `simple` sub-command is meant to help with the activation of OSDs
>> at boot time, supporting ceph-disk (or manually) created OSDs.
>
>OK, got this...
>
>> There is no requirement to use `ceph-volume lvm`, which is intended for
>> new OSDs using LVM as devices.
>
>Fine...
>
>>> 3. If we should use ceph-volume, then by looking at the luminous
>>> ceph-volume docs [2] I find for both,
>>>
>>> ceph-volume lvm prepare
>>> ceph-volume lvm activate
>>>
>>> that the bluestore option is either NOT implemented or NOT supported
>>>
>>> activate: [--bluestore] filestore (IS THIS A TYPO???) objectstore (not
>>> yet implemented)
>>> prepare: [--bluestore] Use the bluestore objectstore (not currently
>>> supported)
>>
>> These might be a typo on the man page, will get that addressed.
>> Ticket opened at http://tracker.ceph.com/issues/22663
>
>Thanks
>
>> bluestore as of 12.2.2 is fully supported and it is the default. The
>> --help output in ceph-volume does have the flags updated and correctly
>> showing this.
>
>OK
>
>>> So, now I'm completely lost. How does all of this fit together in
>>> order to replace a failed OSD?
>>
>> You would need to keep using ceph-disk. Unless you want ceph-volume to
>> take over, in which case you would need to follow the steps to deploy
>> a new OSD with ceph-volume.
>
>OK
>
>> Note that although --osd-id is supported, there is an issue with that
>> on 12.2.2 that would prevent you from correctly deploying it:
>> http://tracker.ceph.com/issues/22642
>>
>> The recommendation, if you want to use ceph-volume, would be to omit
>> --osd-id and let the cluster give you the ID.
>
>>> 4. More: after reading some recent threads on this list, additional
>>> questions are coming up:
[ceph-users] mons segmentation faults New 12.2.2 cluster
Hi all,

I installed a new Luminous 12.2.2 cluster. The monitors were up at first, but quickly started failing, segfaulting. I only installed some mons, mgr, mds with ceph-deploy, and osds with ceph-volume. No pools or fs were created yet. When I start all mons again, there is a short window in which I can see the cluster state:

[root@ceph001 ~]# ceph status
  cluster:
    id: 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
    health: HEALTH_WARN
            1/3 mons down, quorum ceph002,ceph003
  services:
    mon: 3 daemons, quorum ceph002,ceph003, out of quorum: ceph001
    mgr: ceph001(active), standbys: ceph002, ceph003
    osd: 7 osds: 4 up, 4 in
  data:
    pools: 0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage: 4223 MB used, 14899 GB / 14904 GB avail
    pgs:

But this is only until I lose quorum again. What could be the problem here?

Thanks!!

Kenneth

2018-01-12 13:08:36.912832 7f794f513e80 0 set uid:gid to 167:167 (ceph:ceph)
2018-01-12 13:08:36.912859 7f794f513e80 0 ceph version 12.2.2 (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 28726
2018-01-12 13:08:36.913016 7f794f513e80 0 pidfile_write: ignore empty --pid-file
2018-01-12 13:08:36.951556 7f794f513e80 0 load: jerasure load: lrc load: isa
2018-01-12 13:08:36.951703 7f794f513e80 0 set rocksdb option compression = kNoCompression
2018-01-12 13:08:36.951716 7f794f513e80 0 set rocksdb option write_buffer_size = 33554432
2018-01-12 13:08:36.951742 7f794f513e80 0 set rocksdb option compression = kNoCompression
2018-01-12 13:08:36.951749 7f794f513e80 0 set rocksdb option write_buffer_size = 33554432
2018-01-12 13:08:36.951936 7f794f513e80 4 rocksdb: RocksDB version: 5.4.0
2018-01-12 13:08:36.951947 7f794f513e80 4 rocksdb: Git sha rocksdb_build_git_sha:@0@
2018-01-12 13:08:36.951951 7f794f513e80 4 rocksdb: Compile date Nov 30 2017
2018-01-12 13:08:36.951954 7f794f513e80 4 rocksdb: DB SUMMARY
2018-01-12 13:08:36.952011 7f794f513e80 4 rocksdb: CURRENT file: CURRENT
2018-01-12 13:08:36.952016 7f794f513e80 4 rocksdb: IDENTITY file: IDENTITY
2018-01-12 13:08:36.952020 7f794f513e80 4 rocksdb: MANIFEST file: MANIFEST-64 size: 219 Bytes
2018-01-12 13:08:36.952023 7f794f513e80 4 rocksdb: SST files in /var/lib/ceph/mon/ceph-ceph001/store.db dir, Total Num: 3, files: 48.sst 50.sst 60.sst
2018-01-12 13:08:36.952025 7f794f513e80 4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-ceph001/store.db: 65.log size: 0 ;
2018-01-12 13:08:36.952028 7f794f513e80 4 rocksdb: Options.error_if_exists: 0
2018-01-12 13:08:36.952029 7f794f513e80 4 rocksdb: Options.create_if_missing: 0
2018-01-12 13:08:36.952031 7f794f513e80 4 rocksdb: Options.paranoid_checks: 1
2018-01-12 13:08:36.952032 7f794f513e80 4 rocksdb: Options.env: 0x5617a10fa040
2018-01-12 13:08:36.952033 7f794f513e80 4 rocksdb: Options.info_log: 0x5617a24ce1c0
2018-01-12 13:08:36.952034 7f794f513e80 4 rocksdb: Options.max_open_files: -1
2018-01-12 13:08:36.952035 7f794f513e80 4 rocksdb: Options.max_file_opening_threads: 16
2018-01-12 13:08:36.952035 7f794f513e80 4 rocksdb: Options.use_fsync: 0
2018-01-12 13:08:36.952037 7f794f513e80 4 rocksdb: Options.max_log_file_size: 0
2018-01-12 13:08:36.952038 7f794f513e80 4 rocksdb: Options.max_manifest_file_size: 18446744073709551615
2018-01-12 13:08:36.952039 7f794f513e80 4 rocksdb: Options.log_file_time_to_roll: 0
2018-01-12 13:08:36.952040 7f794f513e80 4 rocksdb: Options.keep_log_file_num: 1000
2018-01-12 13:08:36.952041 7f794f513e80 4 rocksdb: Options.recycle_log_file_num: 0
2018-01-12 13:08:36.952042 7f794f513e80 4 rocksdb: Options.allow_fallocate: 1
2018-01-12 13:08:36.952043 7f794f513e80 4 rocksdb: Options.allow_mmap_reads: 0
2018-01-12 13:08:36.952044 7f794f513e80 4 rocksdb: Options.allow_mmap_writes: 0
2018-01-12 13:08:36.952045 7f794f513e80 4 rocksdb: Options.use_direct_reads: 0
2018-01-12 13:08:36.952046 7f794f513e80 4 rocksdb: Options.use_direct_io_for_flush_and_compaction: 0
2018-01-12 13:08:36.952047 7f794f513e80 4 rocksdb: Options.create_missing_column_families: 0
2018-01-12 13:08:36.952048 7f794f513e80
4 rocksdb: Options.db_log_dir:
2018-01-12 13:08:36.952049 7f794f513e80 4 rocksdb: Options.wal_dir: /var/lib/ceph/mon/ceph-ceph001/store.db
2018-01-12 13:08:36.952050 7f794f513e80 4 rocksdb: Options.table_cache_numshardbits: 6
2018-01-12 13:08:36.952050 7f794f513e80 4 rocksdb: Options.max_subcompactions: 1
2018-01-12 13:08:36.952062
Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?
Hi David,

To follow up on this: I had a 4th drive fail (out of 12) and have opted to order the below disks as a replacement. I have an ongoing case with Intel via the supplier and will report back anything useful, but I am going to avoid the Intel S4600 2TB SSDs for the moment.

1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND

Regards
Sean Redmond

On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond wrote:
> Hi David,
>
> Thanks for your email. They are connected inside a Dell R730XD (2.5 inch 24
> disk model) in non-RAID mode via a PERC RAID card.
>
> The version of ceph is Jewel with kernel 4.13.x and Ubuntu 16.04.
>
> Thanks for your feedback on the HGST disks.
>
> Thanks
>
> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman wrote:
>>
>> Hi Sean,
>>
>> No, Intel's feedback has been... pathetic. I have yet to receive anything
>> more than a request to 'sign' a non-disclosure agreement, to obtain beta
>> firmware. No official answer as to whether or not one can logically unlock
>> the drives, no answer to my question whether or not Intel publish serial
>> numbers anywhere pertaining to recalled batches, and no information
>> pertaining to whether or not firmware updates would address any known
>> issues.
>>
>> This with us being an accredited Intel Gold partner...
>>
>> We've returned the lot and ended up with 9/12 of the drives failing in
>> the same manner. The replaced drives, which had different serial number
>> ranges, also failed. Very frustrating is that the drives fail in a way that
>> results in unbootable servers, unless one adds 'rootdelay=240' to the kernel.
>>
>> I would be interested to know what platform your drives were in and
>> whether or not they were connected to a RAID module/card.
>>
>> PS: After much searching we've decided to order the NVMe conversion kit
>> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
>> rating.
>> >> >> >> >> >> Regards >> >> David Herselman >> >> >> >> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com] >> *Sent:* Thursday, 11 January 2018 12:45 AM >> *To:* David Herselman >> *Cc:* Christian Balzer ; ceph-users@lists.ceph.com >> >> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I >> activate pgs? >> >> >> >> Hi, >> >> >> >> I have a case where 3 out of 12 of these Intel S4600 2TB models failed >> within a matter of days after being burn-in tested then placed into >> production. >> >> >> >> I am interested to know, did you ever get any further feedback from the >> vendor on your issue? >> >> >> >> Thanks >> >> >> >> On Thu, Dec 21, 2017 at 1:38 PM, David Herselman wrote: >> >> Hi, >> >> I assume this can only be a physical manufacturing flaw or a firmware >> bug? Do Intel publish advisories on recalled equipment? Should others be >> concerned about using Intel DC S4600 SSD drives? Could this be an >> electrical issue on the Hot Swap Backplane or a BMC firmware issue? Either >> way, all pure Intel... >> >> The hole is only 1.3 GB (4 MB x 339 objects) but perfectly striped >> through images; file systems are subsequently severely damaged. >> >> Is it possible to get Ceph to read in partial data shards? It would >> provide between 25-75% more yield... >> >> >> Is there anything wrong with how we've proceeded thus far? Would be nice >> to reference examples of using ceph-objectstore-tool but documentation is >> virtually non-existent. >> >> We used another SSD drive to simulate bringing all the SSDs back online. 
>> We carved up the drive to provide equal partitions to essentially simulate >> the original SSDs: >> # Partition a drive to provide 12 x 150GB partitions, eg: >> sdd 8:48 0 1.8T 0 disk >> |-sdd1 8:49 0 140G 0 part >> |-sdd2 8:50 0 140G 0 part >> |-sdd3 8:51 0 140G 0 part >> |-sdd4 8:52 0 140G 0 part >> |-sdd5 8:53 0 140G 0 part >> |-sdd6 8:54 0 140G 0 part >> |-sdd7 8:55 0 140G 0 part >> |-sdd8 8:56 0 140G 0 part >> |-sdd9 8:57 0 140G 0 part >> |-sdd10 8:58 0 140G 0 part >> |-sdd11 8:59 0 140G 0 part >> +-sdd12 8:60 0 140G 0 part >> >> >> Pre-requisites: >> ceph osd set noout; >> apt-get install uuid-runtime; >> >> >> for ID in `seq 24 35`; do >> UUID=`uuidgen`; >> OSD_SECRET=`ceph-authtool --gen-print-key`; >> DEVICE='/dev/sdd'$[$ID-23]; # 24-23 = /dev/sdd1, 35-23 = /dev/sdd12 >> echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | ceph osd new $UUID $ID >> -i - -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring; >> mkdir /var/lib/ceph/osd/ceph-$ID; >> mkfs.xfs $DEVICE; >> mount $DEVICE /var/lib/ceph/osd/ceph-$ID; >> ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring >> --name osd.$ID --add-key $OSD_SECRET; >> ceph-osd -i $ID --mkfs
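A side note on the recreation loop above: the `$[$ID-23]` arithmetic pairs OSD IDs 24-35 with /dev/sdd1-/dev/sdd12. A dry run of that mapping (printing each pairing instead of running mkfs/ceph-osd) is a cheap sanity check before touching the disk:

```shell
# Dry run of the OSD-ID -> partition mapping used in the loop above:
# prints each pairing instead of creating filesystems or OSDs.
for ID in $(seq 24 35); do
    DEVICE="/dev/sdd$((ID - 23))"   # 24 -> /dev/sdd1 ... 35 -> /dev/sdd12
    echo "osd.$ID -> $DEVICE"
done
```

If the printed pairings line up with the lsblk listing, the real loop can be run as-is.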
Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?
Well, if a stranger has access to my whole Ceph data (that is, all my VMs' & rgw's data), I don't mind if he gets root access too :) On 01/12/2018 10:18 AM, Van Leeuwen, Robert wrote: Ceph runs on dedicated hardware, there is nothing there except Ceph, and the ceph daemons already have all power over ceph's data. And there is no random-code execution allowed on this node. Thus, spectre & meltdown are meaningless for a Ceph node, and mitigations should be disabled. Is this wrong ? In principle, I would say yes: this means that if someone has half a foot in the door for whatever reason, you will have to assume they will be able to escalate to root. Meltdown and spectre are already a good indication of the creativity involved in gaining (more) access. So I would not assume people are unable to ever gain access to your network, or that the ceph/ssh/etc daemons have no bugs to exploit. I would rather phrase it as: is the performance decrease big enough that you are willing to risk running a less secure server? The answer to that depends on a lot of things, like: the performance impact of the patch; the cost of extra hardware to mitigate the performance impact; the impact of a possible breach (e.g. GDPR fines or reputation damage can be extremely expensive); who/what is allowed on your network; how likely you are to be a hacker target; how well you will sleep knowing there is a potential hole in security :) Etc. Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?
> Ceph runs on dedicated hardware, there is nothing there except Ceph, >and the ceph daemons already have all power over ceph's data. >And there is no random-code execution allowed on this node. > >Thus, spectre & meltdown are meaningless for a Ceph node, and >mitigations should be disabled > >Is this wrong ? In principle, I would say yes: this means that if someone has half a foot in the door for whatever reason, you will have to assume they will be able to escalate to root. Meltdown and spectre are already a good indication of the creativity involved in gaining (more) access. So I would not assume people are unable to ever gain access to your network, or that the ceph/ssh/etc daemons have no bugs to exploit. I would rather phrase it as: is the performance decrease big enough that you are willing to risk running a less secure server? The answer to that depends on a lot of things, like: the performance impact of the patch; the cost of extra hardware to mitigate the performance impact; the impact of a possible breach (e.g. GDPR fines or reputation damage can be extremely expensive); who/what is allowed on your network; how likely you are to be a hacker target; how well you will sleep knowing there is a potential hole in security :) Etc. Cheers, Robert van Leeuwen
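For anyone weighing this trade-off on their own nodes: kernels carrying the KPTI patches expose the mitigation state under sysfs, so the current status can be checked before deciding anything. A minimal sketch (the sysfs files ship with the mitigation patches, so on older kernels they simply won't exist):

```shell
# Print Meltdown/Spectre mitigation status where the kernel exposes it.
# On kernels without the KPTI patches the glob matches nothing and the
# loop prints nothing.
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    if [ -e "$f" ]; then
        echo "$(basename "$f"): $(cat "$f")"
    fi
done
# KPTI itself can be disabled at boot (accepting the risk discussed above)
# by adding 'nopti' to the kernel command line, e.g. in /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="... nopti"
echo "mitigation status check done"
```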
Re: [ceph-users] issue adding OSDs
"ceph versions" returned all daemons as running 12.2.1. On Fri, Jan 12, 2018 at 8:00 AM, Janne Johansson wrote: > Running "ceph mon versions" and "ceph osd versions" and so on as you do the > upgrades would have helped I guess. > > > 2018-01-11 17:28 GMT+01:00 Luis Periquito : >> >> This was a bit weird, but is now working... Writing for future >> reference if someone faces the same issue. >> >> This cluster was upgraded from jewel to luminous following the >> recommended process. When it was finished I just set the require_osd >> to luminous. However I hadn't restarted the daemons since. So just >> restarting all the OSDs made the problem go away. >> >> How to check if that was the case? The OSDs now have a "class" associated. >> >> >> >> On Wed, Jan 10, 2018 at 7:16 PM, Luis Periquito >> wrote: >> > Hi, >> > >> > I'm running a cluster with 12.2.1 and adding more OSDs to it. >> > Everything is running version 12.2.1 and require_osd is set to >> > luminous. >> > >> > One of the pools is replicated with size 2 min_size 1, and is >> > seemingly blocking IO while recovering. I have no slow requests; >> > looking at the output of "ceph osd perf" it seems brilliant (all >> > numbers are lower than 10). >> > >> > Clients are RBD (OpenStack VM in KVM) and using (mostly) 10.2.7. I've >> > tagged those OSDs as out and the RBD just came back to life. I did >> > have some objects degraded: >> > >> > 2018-01-10 18:23:52.081957 mon.mon0 mon.0 x.x.x.x:6789/0 410414 : >> > cluster [WRN] Health check update: 9926354/49526500 objects misplaced >> > (20.043%) (OBJECT_MISPLACED) >> > 2018-01-10 18:23:52.081969 mon.mon0 mon.0 x.x.x.x:6789/0 410415 : >> > cluster [WRN] Health check update: Degraded data redundancy: >> > 5027/49526500 objects degraded (0.010%), 1761 pgs unclean, 27 pgs >> > degraded (PG_DEGRADED) >> > >> > any thoughts as to what might be happening? I've run such operations >> > many a time... 
>> > >> > thanks for all the help, as I'm grasping to figure out what's >> > happening... >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > -- > May the most significant bit of your life be positive.
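A follow-up note for anyone hitting the same thing: besides looking for the "class" column Luis mentions, daemons left on an old binary can be spotted by comparing the versions reported by `ceph versions`. A small sketch of that check (the version string below is mocked for illustration, not taken from Luis's cluster):

```shell
# Flag a cluster where daemons report more than one Ceph version, which
# (as in this thread) usually means some daemons still need a restart.
# On a real cluster the mocked string would come from: ceph versions
versions_output='ceph version 12.2.1: 3 mons, 24 osds
ceph version 10.2.7: 2 osds'
distinct=$(printf '%s\n' "$versions_output" | awk '{print $3}' | sort -u | wc -l)
if [ "$distinct" -gt 1 ]; then
    echo "mixed versions detected: some daemons likely need a restart"
else
    echo "all daemons report the same version"
fi
```

On a live Luminous cluster, the real `ceph versions` output (and `ceph osd tree` for the device class) would replace the mocked string.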
[ceph-users] Rocksdb Segmentation fault during compaction (on OSD)
Hi, While trying to get an OSD back in the test cluster, which had been dropped out for unknown reason, we see a RocksDB Segmentation fault during "compaction". I increased debugging to 20/20 for OSD / RocksDB, see part of the logfile below: ... 49477, 49476, 49475, 49474, 49473, 49472, 49471, 49470, 49469, 49468, 49467], "files_L1": [49465], "score": 1138.25, "input_data_size": 82872298} -1> 2018-01-12 08:48:23.915753 7f91eaf89e40 1 freelist init 0> 2018-01-12 08:48:45.630418 7f91eaf89e40 -1 *** Caught signal (Segmentation fault) ** in thread 7f91eaf89e40 thread_name:ceph-osd ceph version 12.2.2 (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable) 1: (()+0xa65824) [0x55a124693824] 2: (()+0x11390) [0x7f91e9238390] 3: (()+0x1f8af) [0x7f91eab658af] 4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry*, rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned long, bool, rocksdb::Cache::Priority)+0x1d9) [0x55a124a64e49] 5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::Slice, rocksdb::BlockBasedTable::CachableEntry*, bool)+0x3b7) [0x55a124a66827] 6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::BlockIter*, bool, rocksdb::Status)+0x2ac) [0x55a124a66b6c] 7: (rocksdb::BlockBasedTable::BlockEntryIteratorState::NewSecondaryIterator(rocksdb::Slice const&)+0x97) [0x55a124a6f2e7] 8: (()+0xe6c48e) [0x55a124a9a48e] 9: (()+0xe6ca06) [0x55a124a9aa06] 10: (rocksdb::MergingIterator::Seek(rocksdb::Slice const&)+0x126) [0x55a124a7bc86] 11: (rocksdb::DBIter::Seek(rocksdb::Slice const&)+0x20a) [0x55a124b1bdaa] 12: 
(RocksDBStore::RocksDBWholeSpaceIteratorImpl::lower_bound(std::__cxx11::basic_stringconst&, std::__cxx11::basic_string const&)+0x46) [0x55a1245d4676] 13: (BitmapFreelistManager::init(unsigned long)+0x2dc) [0x55a12463976c] 14: (BlueStore::_open_fm(bool)+0xc00) [0x55a124526c50] 15: (BlueStore::_mount(bool)+0x3dc) [0x55a12459aa1c] 16: (OSD::init()+0x3e2) [0x55a1241064e2] 17: (main()+0x2f07) [0x55a1240181d7] 18: (__libc_start_main()+0xf0) [0x7f91e81be830] 19: (_start()+0x29) [0x55a1240a37f9] NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this. The disk in question is very old (powered on ~ 8 years), so it might be that part of the data is corrupt. Would RocksDB throw an error like this in that case? Gr. Stefan P.S. We're trying to learn as much as possible when things do not go according to plan. There is way more debug info available in case anyone is interested. -- | BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351 | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
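Two things that might help narrow Stefan's question down (suggestions, not something the thread confirms): `smartctl -a` on the device and `ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>` are the usual first checks for a suspect disk, and the frame names in a backtrace like the one above can be extracted for searching the Ceph tracker. A small sketch of the latter, run here against a mocked excerpt of the log:

```shell
# Extract the function names from Ceph backtrace frames of the form
#   N: (function(args)+0xOFFSET) [0xADDR]
# The excerpt below is mocked from the log in this mail.
trace=' 4: (rocksdb::BlockBasedTable::PutDataBlockToCache(...)+0x1d9) [0x55a124a64e49]
13: (BitmapFreelistManager::init(unsigned long)+0x2dc) [0x55a12463976c]
14: (BlueStore::_open_fm(bool)+0xc00) [0x55a124526c50]'
printf '%s\n' "$trace" | sed -n 's/^ *[0-9]*: (\([^+]*\)+0x[0-9a-f]*).*/\1/p'
```

The resulting list of function names (BlueStore::_open_fm, BitmapFreelistManager::init, ...) is much easier to feed into a tracker search than raw addresses.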
Re: [ceph-users] issue adding OSDs
Running "ceph mon versions" and "ceph osd versions" and so on as you do the upgrades would have helped I guess. 2018-01-11 17:28 GMT+01:00 Luis Periquito: > This was a bit weird, but is now working... Writing for future > reference if someone faces the same issue. > > This cluster was upgraded from jewel to luminous following the > recommended process. When it was finished I just set the require_osd > to luminous. However I hadn't restarted the daemons since. So just > restarting all the OSDs made the problem go away. > > How to check if that was the case? The OSDs now have a "class" associated. > > > > On Wed, Jan 10, 2018 at 7:16 PM, Luis Periquito > wrote: > > Hi, > > > > I'm running a cluster with 12.2.1 and adding more OSDs to it. > > Everything is running version 12.2.1 and require_osd is set to > > luminous. > > > > One of the pools is replicated with size 2 min_size 1, and is > > seemingly blocking IO while recovering. I have no slow requests; > > looking at the output of "ceph osd perf" it seems brilliant (all > > numbers are lower than 10). > > > > Clients are RBD (OpenStack VM in KVM) and using (mostly) 10.2.7. I've > > tagged those OSDs as out and the RBD just came back to life. I did > > have some objects degraded: > > > > 2018-01-10 18:23:52.081957 mon.mon0 mon.0 x.x.x.x:6789/0 410414 : > > cluster [WRN] Health check update: 9926354/49526500 objects misplaced > > (20.043%) (OBJECT_MISPLACED) > > 2018-01-10 18:23:52.081969 mon.mon0 mon.0 x.x.x.x:6789/0 410415 : > > cluster [WRN] Health check update: Degraded data redundancy: > > 5027/49526500 objects degraded (0.010%), 1761 pgs unclean, 27 pgs > > degraded (PG_DEGRADED) > > > > any thoughts as to what might be happening? I've run such operations > > many a time... > > > > thanks for all the help, as I'm grasping to figure out what's happening... 
> ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- May the most significant bit of your life be positive.