[ceph-users] Re: Ceph Pacific mon is not starting after host reboot
Wanted to respond to the original thread I saw archived on this topic, but I wasn't subscribed to the mailing list yet, so I don't have the thread in my inbox to reply to. Hopefully those involved in that thread still see this. This issue looks the same as https://tracker.ceph.com/issues/51027, which is being worked on. Essentially, it seems that hosts being rebooted were temporarily marked as offline, and cephadm had an issue where it would try to remove all daemons (other than OSDs, I believe) from offline hosts. The pre-remove step for monitors is to remove the mon from the monmap, so that would happen, but the daemon itself would then not be removed since the host was temporarily inaccessible due to the reboot. When the host came back up, the mon was restarted, but it had already been removed from the monmap, so it gets stuck in a "stopped" state. A fix that stops cephadm from trying to remove daemons from offline hosts is in the works. A temporary workaround right now, as mentioned by Harry on that tracker, is to get cephadm to actually remove the mon daemon by changing the placement spec so that it no longer includes the host with the broken mon. Then wait until the mon daemon has been removed, and finally put the placement spec back to how it was so the mon gets redeployed (and now hopefully runs normally).
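For reference, a rough sketch of that placement-spec workaround, assuming the mons normally live on node01, node02 and node03 and the broken one is on node03 (the hostnames are only examples):

# temporarily exclude the broken host so cephadm removes the stale mon
ceph orch apply mon --placement="node01,node02"
# wait until mon.node03 disappears from the daemon list
ceph orch ps --daemon-type mon
# then restore the original placement so the mon gets redeployed
ceph orch apply mon --placement="node01,node02,node03"

If the mons are placed by label instead of an explicit host list, the same idea applies: remove the label from the affected host, wait for the daemon to be removed, then add the label back.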
[ceph-users] very low RBD and Cephfs performance
Hello,

I have a 4-node Ceph cluster on Azure. Each node is an E32s_v4 VM, which has 32 vCPUs and 256 GB memory. The network between nodes is 15 Gbit/sec measured with iperf. The OS is CentOS 8.2. The Ceph version is Pacific and it was deployed with ceph-ansible. Three nodes have the OSDs and the fourth node acts as the rbd client. In total there are 12 OSDs, four per node, with each disk having 5000 IOPS for 4K writes. I have one pool with 512 PGs and one rbd image. I am running the following fio command and I get only 1433 IOPS:

fio --filename=/dev/rbd0 --direct=1 --fsync=1 --rw=write --bs=4k --numjobs=16 --iodepth=8 --runtime=360 --time_based --group_reporting --name=4k-sync-write

4k-sync-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=8
...
fio-3.19
Starting 16 processes
Jobs: 16 (f=16): [W(16)][100.0%][w=5734KiB/s][w=1433 IOPS][eta 00m:00s]
4k-sync-write: (groupid=0, jobs=16): err= 0: pid=12427: Mon Aug 9 16:18:38 2021
  write: IOPS=1327, BW=5309KiB/s (5436kB/s)(1866MiB/360011msec); 0 zone resets
    clat (msec): min=2, max=365, avg=12.04, stdev= 7.79
     lat (msec): min=2, max=365, avg=12.04, stdev= 7.79
    clat percentiles (usec):
     |  1.00th=[ 3556],  5.00th=[ 4686], 10.00th=[ 5669], 20.00th=[ 6849],
     | 30.00th=[ 7767], 40.00th=[ 8717], 50.00th=[ 9896], 60.00th=[11338],
     | 70.00th=[13173], 80.00th=[15795], 90.00th=[20841], 95.00th=[26608],
     | 99.00th=[41157], 99.50th=[47449], 99.90th=[66323], 99.95th=[76022],
     | 99.99th=[96994]
   bw (  KiB/s): min= 1855, max=10240, per=100.00%, avg=5313.12, stdev=97.24, samples=11488
   iops        : min=  463, max= 2560, avg=1324.30, stdev=24.33, samples=11488
  lat (msec)   : 4=2.42%, 10=48.56%, 20=37.90%, 50=10.73%, 100=0.38%
  lat (msec)   : 250=0.01%, 500=0.01%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=1100, max=114600, avg=5610.37, stdev=3387.10
    sync percentiles (nsec):
     |  1.00th=[ 2192],  5.00th=[ 3312], 10.00th=[ 3408], 20.00th=[ 3408],
     | 30.00th=[ 3504], 40.00th=[ 3600], 50.00th=[ 3888], 60.00th=[ 6816],
     | 70.00th=[ 7712], 80.00th=[ 7776], 90.00th=[ 7904], 95.00th=[ 9408],
     | 99.00th=[18304], 99.50th=[23936], 99.90th=[41216], 99.95th=[45824],
     | 99.99th=[61696]
  cpu          : usr=0.30%, sys=0.53%, ctx=477856, majf=0, minf=203
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,477811,0,477795 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
  WRITE: bw=5309KiB/s (5436kB/s), 5309KiB/s-5309KiB/s (5436kB/s-5436kB/s), io=1866MiB (1957MB), run=360011-360011msec

Disk stats (read/write):
  rbd0: ios=0/469238, merge=0/4868, ticks=0/5598109, in_queue=5363153, util=38.89%
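For comparison, one quick way to measure the same 4K write path without the RBD layer is rados bench against the same pool (a sketch; it assumes the pool is named 'rbd' and that it is acceptable to write benchmark objects into it):

# 60 seconds of 4 KiB writes with 16 concurrent ops, then remove the benchmark objects
rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
rados -p rbd cleanup

If rados bench shows similar IOPS, the limit is in the cluster's synchronous 4K write path (network round trips plus OSD commit latency) rather than in RBD or the client.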
[ceph-users] Re: Ceph Pacific mon is not starting after host reboot
Hi,

We are seeing very similar behavior on 16.2.5, and also have noticed that an undeploy/deploy cycle fixes things. Before we go rummaging through the source code trying to determine the root cause, has anybody else figured this out? It seems odd that a repeatable issue (I've seen other mailing list posts about this same issue) impacting 16.2.4/16.2.5, at least, on reboots hasn't been addressed yet, so wanted to check. Here's one of the other thread titles that appears related: "[ceph-users] mons assigned via orch label 'committing suicide' upon reboot."

Respectfully,
David

On Sun, May 23, 2021 at 3:40 AM Adrian Nicolae wrote:
>
> Hi guys,
>
> I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access.
>
> I noticed a weird issue with my mons :
>
> - if I reboot a mon host, the ceph-mon container is not starting after reboot
>
> - I can see with 'ceph orch ps' the following output :
>
> mon.node01    node01    running (20h)     4m ago    20h     16.2.4    8d91d370c2b8    0a2e86af94b2
> mon.node02    node02    running (115m)    12s ago   115m    16.2.4    8d91d370c2b8    51f4885a1b06
> mon.node03    node03    stopped           4m ago    19h
>
> (where node03 is the host which was rebooted).
>
> - I tried to start the mon container manually on node03 with '/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.run' and I've got the following output :
>
> debug 2021-05-23T08:24:25.192+ 7f9a9e358700 0 mon.node03@-1(???).osd e408 crush map has features 3314933069573799936, adjusting msgr requires
> debug 2021-05-23T08:24:25.192+ 7f9a9e358700 0 mon.node03@-1(???).osd e408 crush map has features 43262930805112, adjusting msgr requires
> debug 2021-05-23T08:24:25.192+ 7f9a9e358700 0 mon.node03@-1(???).osd e408 crush map has features 43262930805112, adjusting msgr requires
> debug 2021-05-23T08:24:25.192+ 7f9a9e358700 0 mon.node03@-1(???).osd e408 crush map has features 43262930805112, adjusting msgr requires
> cluster 2021-05-23T08:07:12.189243+ mgr.node01.ksitls (mgr.14164) 36380 : cluster [DBG] pgmap v36392: 417 pgs: 417 active+clean; 33 KiB data, 605 MiB used, 651 GiB / 652 GiB avail; 9.6 KiB/s rd, 0 B/s wr, 15 op/s
> debug 2021-05-23T08:24:25.196+ 7f9a9e358700 1 mon.node03@-1(???).paxosservice(auth 1..51) refresh upgraded, format 0 -> 3
> debug 2021-05-23T08:24:25.208+ 7f9a88176700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f9a88176700' had timed out after 0.0s
> debug 2021-05-23T08:24:25.208+ 7f9a9e358700 0 mon.node03@-1(probing) e5 my rank is now 1 (was -1)
> debug 2021-05-23T08:24:25.212+ 7f9a87975700 0 mon.node03@1(probing) e6 removed from monmap, suicide.
>
> root@node03:/home/adrian# systemctl status ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@mon.node03.service
> ● ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@mon.node03.service - Ceph mon.node03 for c2d41ac4-baf5-11eb-865d-2dc838a337a3
>    Loaded: loaded (/etc/systemd/system/ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@.service; enabled; vendor preset: enabled)
>    Active: inactive (dead) since Sun 2021-05-23 08:10:00 UTC; 16min ago
>   Process: 1176 ExecStart=/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.run (code=exited, status=0/SUCCESS)
>   Process: 1855 ExecStop=/usr/bin/docker stop ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3-mon.node03 (code=exited, status=1/FAILURE)
>   Process: 1861 ExecStopPost=/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.poststop (code=exited, status=0/SUCCESS)
>  Main PID: 1176 (code=exited, status=0/SUCCESS)
>
> The only fix I could find was to redeploy the mon with :
>
> ceph orch daemon rm mon.node03 --force
> ceph orch daemon add mon node03
>
> However, even if it's working after redeploy, it's not giving me a lot of trust to use it in a production environment having an issue like that. I could reproduce it with 2 different mons so it's not just an exception.
>
> My setup is based on Ubuntu 20.04 and docker instead of podman :
>
> root@node01:~# docker -v
> Docker version 20.10.6, build 370c289
>
> Do you know a workaround for this issue or is this a known bug ? I noticed that there are some other complaints with the same behaviour in Octopus as well and the solution at that time was to delete the /var/lib/ceph/mon folder .
>
> Thanks.
[ceph-users] Re: rbd object mapping
Thank you Konstantin!

Tony

From: Konstantin Shalygin
Sent: August 9, 2021 01:20 AM
To: Tony Liu
Cc: ceph-users; d...@ceph.io
Subject: Re: [ceph-users] rbd object mapping

On 8 Aug 2021, at 20:10, Tony Liu <tonyliu0...@hotmail.com> wrote:

That's what I thought. I am confused by this.

# ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)

It calls the RBD image "object" and it shows the whole image mapping to a single PG, while the image is actually split into many objects, each of which maps to a PG. How am I supposed to understand the output of this command?

You can execute `ceph osd map vm nonexist` and you will see a mapping for the 'nonexist' object: the command simply computes the (future) mapping for whatever object name you give it; it does not check that the object exists. To get the mappings for each object of your image, you need to find all of the image's objects by its rbd header and iterate over that list.

k
[ceph-users] Multiple cephfs MDS crashes with same assert_condition: state == LOCK_XLOCK || state == LOCK_XLOCKDONE
Hi

Today we suddenly experienced multiple MDS crashes during the day with an error we have not seen earlier. We run Octopus 15.2.13 with 4 ranks, 4 standby-replay MDSes and 1 passive standby. Any input on how to troubleshoot or resolve this would be most welcome.

---
root@hk-cephnode-54:~# ceph crash ls
2021-08-09T08:06:41.573899Z_306a9a10-b9d7-4a68-83a9-f5bd3d700fd7 mds.hk-cephnode-58
2021-08-09T08:09:03.132838Z_9a62b1fc-6069-4576-974d-2e0464169bb5 mds.hk-cephnode-62
2021-08-09T11:20:23.776776Z_5a665d00-9862-4d8f-99b5-323cdf441966 mds.hk-cephnode-54
2021-08-09T11:25:14.213601Z_f47fa398-5582-4da6-8e18-9252bbb52805 mds.hk-cephnode-62
2021-08-09T12:44:34.190128Z_1e163bf2-6ddf-45ef-a80f-0bf42158da31 mds.hk-cephnode-60
---

*All the crash logs have the same assert_condition/file/msg*

root@hk-cephnode-54:~# ceph crash info 2021-08-09T12:44:34.190128Z_1e163bf2-6ddf-45ef-a80f-0bf42158da31
{
    "archived": "2021-08-09 12:53:01.429088",
    "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
    "assert_file": "/build/ceph/ceph-15.2.13/src/mds/ScatterLock.h",
    "assert_func": "void ScatterLock::set_xlock_snap_sync(MDSContext*)",
    "assert_line": 59,
    "assert_msg": "/build/ceph/ceph-15.2.13/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7f0f76853700 time 2021-08-09T14:44:34.185861+0200\n/build/ceph/ceph-15.2.13/src/mds/ScatterLock.h: 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)\n",
    "assert_thread_name": "MR_Finisher",
    "backtrace": [
        "(()+0x12730) [0x7f0f8153d730]",
        "(gsignal()+0x10b) [0x7f0f80e027bb]",
        "(abort()+0x121) [0x7f0f80ded535]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a5) [0x7f0f81f1d0f5]",
        "(()+0x28127c) [0x7f0f81f1d27c]",
        "(MDCache::truncate_inode(CInode*, LogSegment*)+0x305) [0x55ed3b243aa5]",
        "(C_MDS_inode_update_finish::finish(int)+0x14c) [0x55ed3b219dec]",
        "(MDSContext::complete(int)+0x52) [0x55ed3b4156d2]",
        "(MDSIOContextBase::complete(int)+0x9f) [0x55ed3b4158af]",
        "(MDSLogContextBase::complete(int)+0x40) [0x55ed3b415c30]",
        "(Finisher::finisher_thread_entry()+0x19d) [0x7f0f81fab73d]",
        "(()+0x7fa3) [0x7f0f81532fa3]",
        "(clone()+0x3f) [0x7f0f80ec44cf]"
    ],
    "ceph_version": "15.2.13",
    "crash_id": "2021-08-09T12:44:34.190128Z_1e163bf2-6ddf-45ef-a80f-0bf42158da31",
    "entity_name": "mds.hk-cephnode-60",
    "os_id": "10",
    "os_name": "Debian GNU/Linux 10 (buster)",
    "os_version": "10 (buster)",
    "os_version_id": "10",
    "process_name": "ceph-mds",
    "stack_sig": "5f310d14ffe4b2600195c874fba3761c268218711ee4a449413862bb5553fb4c",
    "timestamp": "2021-08-09T12:44:34.190128Z",
    "utsname_hostname": "hk-cephnode-60",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.114-1-pve",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PVE 5.4.114-1 (Sun, 09 May 2021 17:13:05 +0200)"
}
---

root@hk-cephnode-54:~# ceph health detail
HEALTH_WARN 1 daemons have recently crashed
[WRN] RECENT_CRASH: 1 daemons have recently crashed
    mds.hk-cephnode-54 crashed on host hk-cephnode-54 at 2021-08-09T11:20:23.776776Z

root@hk-cephnode-54:~# ceph status
  cluster:
    id:
    health: HEALTH_WARN
            1 daemons have recently crashed

  services:
    mon: 3 daemons, quorum hk-cephnode-60,hk-cephnode-61,hk-cephnode-62 (age 4w)
    mgr: hk-cephnode-53(active, since 4h), standbys: hk-cephnode-51, hk-cephnode-52
    mds: cephfs:4 {0=hk-cephnode-60=up:active,1=hk-cephnode-61=up:active,2=hk-cephnode-55=up:active,3=hk-cephnode-57=up:active} 4 up:standby-replay 1 up:standby
    osd: 180 osds: 180 up (since 5d), 180 in (since 2w)

  data:
    pools:   9 pools, 2433 pgs
    objects: 118.22M objects, 331 TiB
    usage:   935 TiB used, 990 TiB / 1.9 PiB avail
    pgs:     2433 active+clean

  io:
    client:   231 MiB/s rd, 146 MiB/s wr, 900 op/s rd, 4.07k op/s wr
[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim
On Mon, Aug 9, 2021 at 5:14 PM Robert W. Eckert wrote: > > I have had the same issue with the windows client. > I had to issue > ceph config set mon auth_expose_insecure_global_id_reclaim false > Which allows the other clients to connect. > I think you need to restart the monitors as well, because the first few times > I tried this, I still couldn't connect. For archive's sake, I'd like to mention that disabling auth_expose_insecure_global_id_reclaim isn't right and it wasn't intended for this. Enabling auth_allow_insecure_global_id_reclaim should be enough to allow all (however old) clients to connect. The fact that it wasn't enough for the available Windows build suggests that there is some subtle breakage in it because all "expose" does is it forces the client to connect twice instead of just once. It doesn't actually refuse old unpatched clients. (The breakage isn't surprising given that the available build is more or less a random development snapshot with some pending at the time Windows-specific patches applied. I'll try to escalate issue and get the linked MSI bundle updated.) Thanks, Ilya > > -Original Message- > From: Richard Bade > Sent: Sunday, August 8, 2021 8:27 PM > To: Daniel Persson > Cc: Ceph Users > Subject: [ceph-users] Re: BUG #51821 - client is using insecure global_id > reclaim > > Hi Daniel, > I had a similar issue last week after upgrading my test cluster from > 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in .20. My > issue was a rados gw that I was re-deploying on the latest version. The > problem seemed to be related with cephx authentication. > It kept displaying the error message you have and the service wouldn't start. > I ended up stopping and removing the old rgw service, deleting all the keys > in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and re-deploying the > radosgw. This used the new rgw bootstrap keys and new key for this radosgw. > So, I would suggest you double and triple check which keys your clients are > using and that cephx is enabled correctly on your cluster. > Check your admin key in /etc/ceph as well, as that's what's being used for > ceph status. > > Regards, > Rich > > On Sun, 8 Aug 2021 at 05:01, Daniel Persson wrote: > > > > Hi everyone. > > > > I suggested asking for help here instead of in the bug tracker so that > > I will try it. > > > > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_i > > d=51824 > > > > I have a problem that I can't seem to figure out how to resolve the issue. > > > > AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id > > reclaim > > AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure > > global_id reclaim > > > > > > Both of these have to do with reclaiming ID and securing that no > > client could steal or reuse another client's ID. I understand the > > reason for this and want to resolve the issue. > > > > Currently, I have three different clients. > > > > * One Windows client using the latest Ceph-Dokan build. (ceph version > > 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea) > > pacific > > (rc)) > > * One Linux Debian build using the built packages for that kernel. ( > > 4.19.0-17-amd64) > > * And one client that I've built from source for a raspberry PI as > > there is no arm build for the Pacific release. (5.11.0-1015-raspi) > > > > If I switch over to not allow global id reclaim, none of these clients > > could connect, and using the command "ceph status" on one of my nodes > > will also fail. 
> > > > All of them giving the same error message: > > > > monclient(hunting): handle_auth_bad_method server allowed_methods [2] > > but i only support [2] > > > > > > Has anyone encountered this problem and have any suggestions? > > > > PS. The reason I have 3 different hosts is that this is a test > > environment where I try to resolve and look at issues before we > > upgrade our production environment to pacific. DS. > > > > Best regards > > Daniel > > ___ > > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an > > email to ceph-users-le...@ceph.io > ___ > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to > ceph-users-le...@ceph.io > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
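For reference, a short sketch of the settings Ilya refers to (names as in the CVE-2021-20288 documentation):

# clients still using insecure global_id reclaim show up in the health detail output
ceph health detail
# keep old, unpatched clients connectable until they are upgraded
ceph config set mon auth_allow_insecure_global_id_reclaim true
# once all clients are patched, enforce the secure behaviour
ceph config set mon auth_allow_insecure_global_id_reclaim false

auth_expose_insecure_global_id_reclaim should normally stay at its default (true); it only controls whether unpatched clients are detected and reported, not whether they are allowed to connect.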
[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim
I have had the same issue with the windows client. I had to issue ceph config set mon auth_expose_insecure_global_id_reclaim false Which allows the other clients to connect. I think you need to restart the monitors as well, because the first few times I tried this, I still couldn't connect. -Original Message- From: Richard Bade Sent: Sunday, August 8, 2021 8:27 PM To: Daniel Persson Cc: Ceph Users Subject: [ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim Hi Daniel, I had a similar issue last week after upgrading my test cluster from 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in .20. My issue was a rados gw that I was re-deploying on the latest version. The problem seemed to be related with cephx authentication. It kept displaying the error message you have and the service wouldn't start. I ended up stopping and removing the old rgw service, deleting all the keys in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and re-deploying the radosgw. This used the new rgw bootstrap keys and new key for this radosgw. So, I would suggest you double and triple check which keys your clients are using and that cephx is enabled correctly on your cluster. Check your admin key in /etc/ceph as well, as that's what's being used for ceph status. Regards, Rich On Sun, 8 Aug 2021 at 05:01, Daniel Persson wrote: > > Hi everyone. > > I suggested asking for help here instead of in the bug tracker so that > I will try it. > > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_i > d=51824 > > I have a problem that I can't seem to figure out how to resolve the issue. > > AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id > reclaim > AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure > global_id reclaim > > > Both of these have to do with reclaiming ID and securing that no > client could steal or reuse another client's ID. I understand the > reason for this and want to resolve the issue. > > Currently, I have three different clients. > > * One Windows client using the latest Ceph-Dokan build. (ceph version > 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea) > pacific > (rc)) > * One Linux Debian build using the built packages for that kernel. ( > 4.19.0-17-amd64) > * And one client that I've built from source for a raspberry PI as > there is no arm build for the Pacific release. (5.11.0-1015-raspi) > > If I switch over to not allow global id reclaim, none of these clients > could connect, and using the command "ceph status" on one of my nodes > will also fail. > > All of them giving the same error message: > > monclient(hunting): handle_auth_bad_method server allowed_methods [2] > but i only support [2] > > > Has anyone encountered this problem and have any suggestions? > > PS. The reason I have 3 different hosts is that this is a test > environment where I try to resolve and look at issues before we > upgrade our production environment to pacific. DS. > > Best regards > Daniel > ___ > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an > email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Balanced use of HDD and SSD
Hello all,

a year ago we started with a 3-node cluster for Ceph with 21 HDDs and 3 SSDs, which we installed with cephadm, configuring the disks with `ceph orch apply osd --all-available-devices`. Over time the usage grew quite significantly: now we have another 5 nodes with 8-12 HDDs and 1-2 SSDs each; the integration worked without any problems with `ceph orch add host`. Now we wonder whether the HDDs and SSDs are used as recommended, so that access is fast, but without

My questions: how can I check what the data_devices and db_devices are? Can we still apply a setup such as, for example, the second one in this documentation? https://docs.ceph.com/en/latest/cephadm/osd/#the-simple-case

Some technical details: Xeons with plenty of RAM and cores, Ceph 16.2.5 with mostly default configuration, Ubuntu 20.04, separate cluster and public networks (both 10 Gb), usage as RBD (QEMU), CephFS, and the Ceph Object Gateway. (The latter is surprisingly slow, but I want to sort out the problem with the underlying configuration first.)

Thanks for any helpful responses,
Erich
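A few commands that may help with the first question (a sketch; exact metadata field names can vary slightly between releases):

# show the OSD service spec cephadm is currently applying (data_devices/db_devices, if any)
ceph orch ls osd --export
# per-OSD view: does osd.0 have a dedicated DB device, and on which disks does it live?
ceph osd metadata 0 | grep -e bluefs_dedicated_db -e bluefs_db_devices -e devices
# on an OSD host: list block and db devices per OSD
cephadm ceph-volume lvm list

For the second question, a spec along the lines of the "simple case" in the linked documentation (HDDs as data devices, SSDs as DB devices) would look roughly like this, applied with `ceph orch apply -i osd-spec.yaml`:

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0

Note that applying a new spec does not rebuild OSDs that already exist; OSDs created with --all-available-devices keep their current layout unless they are redeployed.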
[ceph-users] Re: Size of cluster
Hello, this is my osd tree:

ID   CLASS  WEIGHT     TYPE NAME
 -1         312.14557  root default
 -3          68.97755      host pveceph01
  3    hdd   10.91409          osd.3
 14    hdd   16.37109          osd.14
 15    hdd   16.37109          osd.15
 20    hdd   10.91409          osd.20
 23    hdd   10.91409          osd.23
  0    ssd    3.49309          osd.0
 -5          68.97755      host pveceph02
  4    hdd   10.91409          osd.4
 13    hdd   16.37109          osd.13
 16    hdd   16.37109          osd.16
 21    hdd   10.91409          osd.21
 24    hdd   10.91409          osd.24
  1    ssd    3.49309          osd.1
 -7          68.97755      host pveceph03
  6    hdd   10.91409          osd.6
 12    hdd   16.37109          osd.12
 17    hdd   16.37109          osd.17
 22    hdd   10.91409          osd.22
 25    hdd   10.91409          osd.25
  2    ssd    3.49309          osd.2
-13          52.60646      host pveceph04
  9    hdd   10.91409          osd.9
 11    hdd   16.37109          osd.11
 18    hdd   10.91409          osd.18
 26    hdd   10.91409          osd.26
  5    ssd    3.49309          osd.5
-16          52.60646      host pveceph05
  8    hdd   10.91409          osd.8
 10    hdd   16.37109          osd.10
 19    hdd   10.91409          osd.19
 27    hdd   10.91409          osd.27
  7    ssd    3.49309          osd.7

Sorry, but how do I check the failure domain? I seem to remember that my failure domain is host.

Regards.

From: Robert Sander
Sent: Monday, 9 August 2021 13:40
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Size of cluster

Hi,

On 09.08.21 at 12:56, Jorge JP wrote:
> 15 x 12TB = 180TB
> 8 x 18TB = 144TB

How are these distributed across your nodes and what is the failure domain? I.e. how will Ceph distribute data among them?

> The raw size of this cluster (HDD) should be 295TB after format but the size of my "primary" pool (2/1) in this moment is:

A pool with a size of 2 and a min_size of 1 will lead to data loss.

> 53.50% (65.49 TiB of 122.41 TiB)
>
> 122,41TiB multiplied by replication of 2 is 244TiB, not 295TiB.
>
> How can use all size of the class?

If you have 3 nodes with each 5x 12TB (60TB) and 2 nodes with each 4x 18TB (72TB), the maximum usable capacity will not be the sum of all disks. Remember that Ceph tries to distribute the data evenly.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
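To check the failure domain, one way (a sketch, using the pool name from this thread) is to look at the CRUSH rule the pool uses:

# which CRUSH rule does the pool use?
ceph osd pool get POOL-HDD crush_rule
# dump that rule; the "type" in the chooseleaf step is the failure domain (e.g. "host")
ceph osd crush rule dump replicated_rule

The rule name returned by the first command is what to pass to the second; 'replicated_rule' is just the usual default name.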
[ceph-users] Re: Size of cluster
Hi,

On 09.08.21 at 12:56, Jorge JP wrote:
> 15 x 12TB = 180TB
> 8 x 18TB = 144TB

How are these distributed across your nodes and what is the failure domain? I.e. how will Ceph distribute data among them?

> The raw size of this cluster (HDD) should be 295TB after format but the size of my "primary" pool (2/1) in this moment is:

A pool with a size of 2 and a min_size of 1 will lead to data loss.

> 53.50% (65.49 TiB of 122.41 TiB)
>
> 122,41TiB multiplied by replication of 2 is 244TiB, not 295TiB.
>
> How can use all size of the class?

If you have 3 nodes with each 5x 12TB (60TB) and 2 nodes with each 4x 18TB (72TB), the maximum usable capacity will not be the sum of all disks. Remember that Ceph tries to distribute the data evenly.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
[ceph-users] Re: "ceph orch ls", "ceph orch daemon rm" fail with exception "'KeyError: 'not'" on 15.2.10
Hi,

Might anyone have any insight into this issue? I have been unable to resolve it so far, and it prevents many "ceph orch" commands and breaks many aspects of the web user interface.

--
Erkki Seppälä
http://www.inside.org/~flux/
[ceph-users] Size of cluster
Hello,

I have a Ceph cluster with 5 nodes. I have 23 OSDs with the hdd class distributed among them. The disk sizes are:

15 x 12TB = 180TB
8 x 18TB = 144TB

Result of executing the "ceph df" command:

--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd      295 TiB  163 TiB  131 TiB  131 TiB   44.55
ssd      17 TiB   17 TiB   316 GiB  324 GiB   1.81
TOTAL    312 TiB  181 TiB  131 TiB  132 TiB   42.16

--- POOLS ---
POOL                            ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics           1   1    13 MiB   5        39 MiB   0      40 TiB
.rgw.root                       4   4    1.5 KiB  4        768 KiB  0      38 TiB
default.rgw.meta                6   4    4.7 KiB  12       1.9 MiB  0      38 TiB
rbd                             8   512  1.4 KiB  4        384 KiB  0      38 TiB
default.rgw.buckets.data        12  32   10 GiB   2.61k    31 GiB   0.03   38 TiB
default.rgw.log                 13  128  35 KiB   2076 MiB          0      38 TiB
default.rgw.control             14  4    0 B      8        0 B      0      38 TiB
default.rgw.buckets.non-ec      15  128  27 B     1        192 KiB  0      38 TiB
default.rgw.buckets.index       18  4    1.1 MiB  2        3.3 MiB  0      5.4 TiB
default.rgw.buckets.ssd.index   21  8    0 B      0        0 B      0      5.4 TiB
default.rgw.buckets.ssd.data    22  8    0 B      0        0 B      0      5.4 TiB
default.rgw.buckets.ssd.non-ec  23  8    0 B      0        0 B      0      5.4 TiB
POOL-HDD                        32  512  65 TiB   17.28M   131 TiB  53.51  57 TiB
POOL_SSD_2_1                    34  32   157 GiB  296.94k  316 GiB  1.86   8.1 TiB

The raw size of this cluster (HDD) should be 295TB after formatting, but the size of my "primary" pool (2/1) at this moment is:

53.50% (65.49 TiB of 122.41 TiB)

122.41 TiB multiplied by a replication factor of 2 is 244 TiB, not 295 TiB.

How can I use the full size of the class?

Thanks a lot.
[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim
Hi Tobias and Richard. Thank you for answering my questions. I got the link suggested by Tobias on the issue report, which led me to further investigation. It was hard to see what version the kernel version on the system was using, but looking at the result of "ceph health detail" and ldd librados2.so could give me some information. It seemed that one of my Linux environments used the old buster kernel model, which was 12.2.* and not compatible with the new global ID reclaim. Another issue I got was that the windows client available for download uses a strange version 15.0.0 Pacific, which is just not correct. After reading and searching on GitHub, I realized that the windows executables could be built in a Linux environment using the ceph source code. So I've now built new binaries to windows that work just fine except for a libwnbd.dll which were never built. But adding it from the old installation, I got it to work. Now ceph-dokan reports a version of 16.2.5, which was the version I built. Building this was not straightforward, and something I think could be interesting for the community. So I'm planning to create an instruction video on the subject that I will publish next week. Again thank you for your help. Best regards Daniel On Mon, Aug 9, 2021 at 11:46 AM Tobias Urdin wrote: > Hello, > > Did you follow the fix/recommendation when applying patches as per > the documentation in the CVE security post [1] ? > > Best regards > > [1] https://docs.ceph.com/en/latest/security/CVE-2021-20288/ > > > On 9 Aug 2021, at 02:26, Richard Bade wrote: > > > > Hi Daniel, > > I had a similar issue last week after upgrading my test cluster from > > 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in > > .20. My issue was a rados gw that I was re-deploying on the latest > > version. The problem seemed to be related with cephx authentication. > > It kept displaying the error message you have and the service wouldn't > > start. > > I ended up stopping and removing the old rgw service, deleting all the > > keys in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and > > re-deploying the radosgw. This used the new rgw bootstrap keys and new > > key for this radosgw. > > So, I would suggest you double and triple check which keys your > > clients are using and that cephx is enabled correctly on your cluster. > > Check your admin key in /etc/ceph as well, as that's what's being used > > for ceph status. > > > > Regards, > > Rich > > > > On Sun, 8 Aug 2021 at 05:01, Daniel Persson > wrote: > >> > >> Hi everyone. > >> > >> I suggested asking for help here instead of in the bug tracker so that I > >> will try it. > >> > >> > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_id=51824 > >> > >> I have a problem that I can't seem to figure out how to resolve the > issue. > >> > >> AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id > reclaim > >> AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure > >> global_id reclaim > >> > >> > >> Both of these have to do with reclaiming ID and securing that no client > >> could steal or reuse another client's ID. I understand the reason for > this > >> and want to resolve the issue. > >> > >> Currently, I have three different clients. > >> > >> * One Windows client using the latest Ceph-Dokan build. (ceph version > >> 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea) > pacific > >> (rc)) > >> * One Linux Debian build using the built packages for that kernel. 
( > >> 4.19.0-17-amd64) > >> * And one client that I've built from source for a raspberry PI as > there is > >> no arm build for the Pacific release. (5.11.0-1015-raspi) > >> > >> If I switch over to not allow global id reclaim, none of these clients > >> could connect, and using the command "ceph status" on one of my nodes > will > >> also fail. > >> > >> All of them giving the same error message: > >> > >> monclient(hunting): handle_auth_bad_method server allowed_methods [2] > >> but i only support [2] > >> > >> > >> Has anyone encountered this problem and have any suggestions? > >> > >> PS. The reason I have 3 different hosts is that this is a test > environment > >> where I try to resolve and look at issues before we upgrade our > production > >> environment to pacific. DS. > >> > >> Best regards > >> Daniel > >> ___ > >> ceph-users mailing list -- ceph-users@ceph.io > >> To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim
Hello, Did you follow the fix/recommendation when applying patches as per the documentation in the CVE security post [1] ? Best regards [1] https://docs.ceph.com/en/latest/security/CVE-2021-20288/ > On 9 Aug 2021, at 02:26, Richard Bade wrote: > > Hi Daniel, > I had a similar issue last week after upgrading my test cluster from > 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in > .20. My issue was a rados gw that I was re-deploying on the latest > version. The problem seemed to be related with cephx authentication. > It kept displaying the error message you have and the service wouldn't > start. > I ended up stopping and removing the old rgw service, deleting all the > keys in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and > re-deploying the radosgw. This used the new rgw bootstrap keys and new > key for this radosgw. > So, I would suggest you double and triple check which keys your > clients are using and that cephx is enabled correctly on your cluster. > Check your admin key in /etc/ceph as well, as that's what's being used > for ceph status. > > Regards, > Rich > > On Sun, 8 Aug 2021 at 05:01, Daniel Persson wrote: >> >> Hi everyone. >> >> I suggested asking for help here instead of in the bug tracker so that I >> will try it. >> >> https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_id=51824 >> >> I have a problem that I can't seem to figure out how to resolve the issue. >> >> AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id reclaim >> AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure >> global_id reclaim >> >> >> Both of these have to do with reclaiming ID and securing that no client >> could steal or reuse another client's ID. I understand the reason for this >> and want to resolve the issue. >> >> Currently, I have three different clients. >> >> * One Windows client using the latest Ceph-Dokan build. (ceph version >> 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea) pacific >> (rc)) >> * One Linux Debian build using the built packages for that kernel. ( >> 4.19.0-17-amd64) >> * And one client that I've built from source for a raspberry PI as there is >> no arm build for the Pacific release. (5.11.0-1015-raspi) >> >> If I switch over to not allow global id reclaim, none of these clients >> could connect, and using the command "ceph status" on one of my nodes will >> also fail. >> >> All of them giving the same error message: >> >> monclient(hunting): handle_auth_bad_method server allowed_methods [2] >> but i only support [2] >> >> >> Has anyone encountered this problem and have any suggestions? >> >> PS. The reason I have 3 different hosts is that this is a test environment >> where I try to resolve and look at issues before we upgrade our production >> environment to pacific. DS. >> >> Best regards >> Daniel >> ___ >> ceph-users mailing list -- ceph-users@ceph.io >> To unsubscribe send an email to ceph-users-le...@ceph.io > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rbd object mapping
> On 8 Aug 2021, at 20:10, Tony Liu wrote:
>
> That's what I thought. I am confused by this.
>
> # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
> osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)
>
> It calls RBD image "object" and it shows the whole image maps to a single PG, while the image is actually split into many objects each of which maps to a PG.
> How am I supposed to understand the output of this command?

You can execute `ceph osd map vm nonexist` and you will see a mapping for the 'nonexist' object: the command simply computes the (future) mapping for whatever object name you give it; it does not check that the object exists. To get the mappings for each object of your image, you need to find all of the image's objects by its rbd header and iterate over that list.

k
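As an illustration of that iteration, a rough sketch using the image from the example above (the block_name_prefix value is a placeholder; take the real one from `rbd info`):

# find the object prefix of the image
rbd info vm/fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk | grep block_name_prefix
# map every existing data object of the image to its PG and OSDs
rados -p vm ls | grep '^rbd_data\.<prefix>' | while read obj; do ceph osd map vm "$obj"; done

Note that listing all objects in a large pool can take a while, and only objects that have actually been written exist in RADOS; untouched parts of a thin-provisioned image have no objects yet.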