[ceph-users] Ceph Octopus 15.2.11 - rbd diff --from-snap lists all objects
Hi,

Has something changed with 'rbd diff' in Octopus or have I hit a bug? I am no longer able to obtain the list of objects that have changed between two snapshots of an image; it always lists all allocated regions of the RBD image. This behaviour however only occurs when I add the '--whole-object' switch.

Using a KRBD client with kernel 5.11.7 and Ceph Octopus 15.2.11 as part of Proxmox PVE 6.4, which is based on Debian 10. Images have the following features and I've performed offline object map checks and rebuilds (no errors reported).

To reproduce my issue I first create a new RBD image (default features are 63), map it using KRBD, write some data, create the first snapshot, write a single object (4 MiB), create a second snapshot and then list the differences:

[admin@kvm1a ~]# rbd create rbd_hdd/test --size 40G
[admin@kvm1a ~]# rbd info rbd_hdd/test
rbd image 'test':
        size 40 GiB in 10240 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 73363f8443987b
        block_name_prefix: rbd_data.73363f8443987b
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Wed May 12 23:01:11 2021
        access_timestamp: Wed May 12 23:01:11 2021
        modify_timestamp: Wed May 12 23:01:11 2021
[admin@kvm1a ~]# rbd map rbd_hdd/test
/dev/rbd18
[admin@kvm1a ~]# dd if=/dev/zero of=/dev/rbd18 bs=64M count=1
1+0 records in
1+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.668701 s, 100 MB/s
[admin@kvm1a ~]# sync
[admin@kvm1a ~]# rbd snap create rbd_hdd/test@snap1
[admin@kvm1a ~]# dd if=/dev/zero of=/dev/rbd18 bs=4M count=1
1+0 records in
1+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.265691 s, 15.8 MB/s
[admin@kvm1a ~]# sync
[admin@kvm1a ~]# rbd snap create rbd_hdd/test@snap2
[admin@kvm1a ~]# rbd diff --from-snap snap1 rbd_hdd/test@snap2 --format=json
[{"offset":0,"length":4194304,"exists":"true"}]
[admin@kvm1b ~]# rbd diff --from-snap snap1 rbd_hdd/test@snap2 --format=json --whole-object
[{"offset":0,"length":4194304,"exists":"true"},{"offset":4194304,"length":4194304,"exists":"true"},{"offset":8388608,"length":4194304,"exists":"true"},{"offset":12582912,"length":4194304,"exists":"true"},{"offset":16777216,"length":4194304,"exists":"true"},{"offset":20971520,"length":4194304,"exists":"true"},{"offset":25165824,"length":4194304,"exists":"true"},{"offset":29360128,"length":4194304,"exists":"true"},{"offset":33554432,"length":4194304,"exists":"true"},{"offset":37748736,"length":4194304,"exists":"true"},{"offset":41943040,"length":4194304,"exists":"true"},{"offset":46137344,"length":4194304,"exists":"true"},{"offset":50331648,"length":4194304,"exists":"true"},{"offset":54525952,"length":4194304,"exists":"true"},{"offset":58720256,"length":4194304,"exists":"true"},{"offset":62914560,"length":4194304,"exists":"true"}]

[admin@kvm1a ~]# rbd du rbd_hdd/test
NAME        PROVISIONED  USED
test@snap1  40 GiB       64 MiB
test@snap2  40 GiB       64 MiB
test        40 GiB       4 MiB
<TOTAL>     40 GiB       132 MiB

My tests appear to confirm that adding the '--whole-object' option to rbd diff results in it listing every allocated extent instead of only the changes...

Regards
David Herselman

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
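For anyone comparing outputs like the ones above, totalling the extent lengths makes the discrepancy easy to quantify. A small sketch (the first JSON literal is taken verbatim from the output above; the second is reconstructed from the sixteen 4 MiB extents listed):

```python
import json

# Output of `rbd diff --from-snap snap1 ... --format=json` without
# --whole-object: only the single rewritten 4 MiB object is reported.
diff_without = json.loads('[{"offset":0,"length":4194304,"exists":"true"}]')

# With --whole-object the output lists all sixteen allocated 4 MiB
# objects of the 64 MiB that was ever written (offsets 0..60 MiB).
diff_whole = [{"offset": i * 4194304, "length": 4194304, "exists": "true"}
              for i in range(16)]

def total_bytes(extents):
    # Sum the lengths of all reported extents.
    return sum(e["length"] for e in extents)

print(total_bytes(diff_without))  # 4194304  (4 MiB: the actual delta)
print(total_bytes(diff_whole))    # 67108864 (64 MiB: every allocated extent)
```

The 64 MiB total matches the USED column of the rbd du output, which supports the reading that '--whole-object' is returning the full allocation rather than the snap-to-snap delta.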
[ceph-users] Re: Manager carries wrong information until killing it
Reed Dier writes:

> I don't have a solution to offer, but I've seen this for years with no solution.
> Any time a MGR bounces, be it for upgrades, or a new daemon coming online, etc, I'll see a scale spike like is reported below.

Interesting to read that we are not the only ones.

> Just out of curiosity, which MGR plugins are you using?

[22:11:05] black2.place6:~# ceph mgr module ls
{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator_cli",
        "progress",
        "rbd_support",
        "status",
        "volumes"
    ],
    "enabled_modules": [
        "iostat",
        "pg_autoscaler",
        "prometheus",
        "restful"
    ],

> I have historically used the influx plugin for stats exports, and it shows up in those values as well, throwing everything off.

So the problem is unlikely related to the prometheus plugin, but more to a statistics error somewhere else.

> I don't see it in my Zabbix stats, albeit those are scraped at a
> longer interval that may not catch this.

For prometheus, we scrape every 10 or 15 seconds. But I wonder if this really flattens out or whether the logic is actually different.

Out of curiosity from my side: the manager is a binary, but the plugins are actually python modules. I had a quick look at /usr/share/ceph/mgr/prometheus/module.py which seems to get the data from a monitor - so I wonder if the problem lies more in the architecture of ceph rather than the actual data export.

Cheers,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster
I was able to figure out the solution with this rule:

step take default
step choose indep 0 type host
step chooseleaf indep 1 type osd
step emit
step take default
step choose indep 0 type host
step chooseleaf indep 1 type osd
step emit

Now the data is spread how I want it to be:

# for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
> echo $pg
> for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
> ceph osd find $osd | jq -r '.host'
> done | sort | uniq -c | sort -n -k1
> done
8.0
      1 excalibur
      1 harrahs
      1 mandalaybay
      1 mirage
      2 aladdin
      2 paris
8.1
      1 aladdin
      1 excalibur
      1 harrahs
      1 mirage
      2 mandalaybay
      2 paris
8.2
      1 aladdin
      1 excalibur
      1 harrahs
      1 mirage
      2 mandalaybay
      2 paris
...

Hopefully someone else will find this useful.

Bryan

> On May 12, 2021, at 9:58 AM, Bryan Stillwell wrote:
>
> I'm trying to figure out a CRUSH rule that will spread data out across my cluster as much as possible, but not more than 2 chunks per host.
>
> If I use the default rule with an osd failure domain like this:
>
> step take default
> step choose indep 0 type osd
> step emit
>
> I get clustering of 3-4 chunks on some of the hosts:
>
> # for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>> echo $pg
>> for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
>>    ceph osd find $osd | jq -r '.host'
>> done | sort | uniq -c | sort -n -k1
> 8.0
>       1 harrahs
>       3 paris
>       4 aladdin
> 8.1
>       1 aladdin
>       1 excalibur
>       2 mandalaybay
>       4 paris
> 8.2
>       1 harrahs
>       2 aladdin
>       2 mirage
>       3 paris
> ...
>
> However, if I change the rule to use:
>
> step take default
> step choose indep 0 type host
> step chooseleaf indep 2 type osd
> step emit
>
> I get the data spread across 4 hosts with 2 chunks per host:
>
> # for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>> echo $pg
>> for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
>>    ceph osd find $osd | jq -r '.host'
>> done | sort | uniq -c | sort -n -k1
>> done
> 8.0
>       2 aladdin
>       2 harrahs
>       2 mandalaybay
>       2 paris
> 8.1
>       2 aladdin
>       2 harrahs
>       2 mandalaybay
>       2 paris
> 8.2
>       2 harrahs
>       2 mandalaybay
>       2 mirage
>       2 paris
> ...
>
> Is it possible to get the data to spread out over more hosts? I plan on expanding the cluster in the near future and would like to see more hosts get 1 chunk instead of 2.
>
> Also, before you recommend adding two more hosts and switching to a host-based failure domain, the cluster is on a variety of hardware with between 2-6 drives per host and drives that are 4TB-12TB in size (it's part of my home lab).
>
> Thanks,
> Bryan

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
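For readers wanting to try a rule like Bryan's before injecting it, the steps wrap into a complete rule in the decompiled CRUSH map along these lines (rule name, id, and the tries settings are illustrative, not from the original post; `crushtool --test -i compiled.map --rule <id> --num-rep 8 --show-mappings` will preview the placements without touching the cluster):

```
rule ec62_two_per_host {
	id 2
	type erasure
	min_size 8
	max_size 8
	step set_chooseleaf_tries 5
	step set_choose_tries 100
	step take default
	step choose indep 0 type host
	step chooseleaf indep 1 type osd
	step emit
	step take default
	step choose indep 0 type host
	step chooseleaf indep 1 type osd
	step emit
}
```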
[ceph-users] CRUSH rule for EC 6+2 on 6-node cluster
I'm trying to figure out a CRUSH rule that will spread data out across my cluster as much as possible, but not more than 2 chunks per host.

If I use the default rule with an osd failure domain like this:

step take default
step choose indep 0 type osd
step emit

I get clustering of 3-4 chunks on some of the hosts:

# for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
> echo $pg
> for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
> ceph osd find $osd | jq -r '.host'
> done | sort | uniq -c | sort -n -k1
8.0
      1 harrahs
      3 paris
      4 aladdin
8.1
      1 aladdin
      1 excalibur
      2 mandalaybay
      4 paris
8.2
      1 harrahs
      2 aladdin
      2 mirage
      3 paris
...

However, if I change the rule to use:

step take default
step choose indep 0 type host
step chooseleaf indep 2 type osd
step emit

I get the data spread across 4 hosts with 2 chunks per host:

# for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
> echo $pg
> for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
> ceph osd find $osd | jq -r '.host'
> done | sort | uniq -c | sort -n -k1
> done
8.0
      2 aladdin
      2 harrahs
      2 mandalaybay
      2 paris
8.1
      2 aladdin
      2 harrahs
      2 mandalaybay
      2 paris
8.2
      2 harrahs
      2 mandalaybay
      2 mirage
      2 paris
...

Is it possible to get the data to spread out over more hosts? I plan on expanding the cluster in the near future and would like to see more hosts get 1 chunk instead of 2.

Also, before you recommend adding two more hosts and switching to a host-based failure domain, the cluster is on a variety of hardware with between 2-6 drives per host and drives that are 4TB-12TB in size (it's part of my home lab).

Thanks,
Bryan
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] May 10 Upstream Lab Outage
Hi all,

I wanted to provide an RCA for the outage you may have been affected by yesterday. Some services that went down:

- All CI/testing
- quay.ceph.io
- telemetry.ceph.com (your cluster may have gone into HEALTH_WARN if you report telemetry data)
- lists.ceph.io (so all mailing lists)

All of our critical infra is running in a Red Hat Virtualization (RHV) instance backed by Red Hat Gluster Storage (RHGS) as the storage. Before you go, "wait.. Gluster?" Yes, this cluster was set up before Ceph was supported as backend storage for RHV/RHEV.

The root cause for the outage is the Gluster volumes got 100% full. Once no writes were possible, RHV paused all the VMs.

Why didn't monitoring catch this? I honestly don't know.

# grep ssdstore01 nagios-05-*2021* | grep Disk
nagios-05-01-2021-00.log:[1619740800] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-02-2021-00.log:[1619827200] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-03-2021-00.log:[1619913600] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-04-2021-00.log:[162000] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-05-2021-00.log:[1620086400] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-06-2021-00.log:[1620172800] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-07-2021-00.log:[1620259200] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-08-2021-00.log:[1620345600] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-09-2021-00.log:[1620432000] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-10-2021-00.log:[1620518400] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now
nagios-05-11-2021-00.log:[1620604800] CURRENT SERVICE STATE: ssdstore01;Disk Space;OK;HARD;1;Disks are OK now

Yet RHV knew we were
running out of space. I don't have e-mail notifications set up in RHV, however.

# zgrep "disk space" engine*202105*.gz | cut -d ',' -f4 | head -n 10
Low disk space. hosted_storage domain has 24 GB of free space.
Low disk space. hosted_storage domain has 24 GB of free space.
Low disk space. hosted_storage domain has 23 GB of free space.
Low disk space. hosted_storage domain has 23 GB of free space.
Low disk space. hosted_storage domain has 23 GB of free space.
Low disk space. hosted_storage domain has 23 GB of free space.
Low disk space. hosted_storage domain has 23 GB of free space.
Low disk space. hosted_storage domain has 21 GB of free space.
Low disk space. hosted_storage domain has 20 GB of free space.
Low disk space. hosted_storage domain has 11 GB of free space.

Our nagios instance runs this to check disk space:
https://github.com/ceph/ceph-cm-ansible/blob/master/roles/common/files/libexec/diskusage.pl

You can ignore the comment about it only working for EXT2.

[root@ssdstore01 ~]# /usr/libexec/diskusage.pl 90 95
Disks are OK now

I ran this manually on one of the storage hosts and intentionally set the WARN level to a number lower than the current usage percentage.

[root@ssdstore01 ~]# df -h | grep 'Size\|gluster'
Filesystem      Size  Used  Avail  Use%  Mounted on
/dev/md124      8.8T  6.7T  2.1T   77%   /gluster
[root@ssdstore01 ~]# /usr/libexec/diskusage.pl 95 70
/gluster is at 77%
[root@ssdstore01 ~]# echo $?
2

When I logged in to the storage hosts yesterday morning, the /gluster mount was at 100%. So nagios should have known.

How'd it get fixed? I happened to have some large capacity drives that fit the storage nodes lying around. They're being installed in a different project soon. However, I was able to add these drives, add "bricks" to the Gluster storage, then rebalance the data. Once that was done, I was able to restart all the VMs and delete old VMs and snapshots I no longer needed.

How do we keep this from happening again? Well, as you may have been able to deduce...
we were running out of space at a rate of 1-10 GB/day. As you can see now, the Gluster volume has 2.1TB of space left. So even if we grew by 10GB/day again, we'd be okay for 200ish days. I aim to have some (if not all) of these services moved off this platform and into an Openshift cluster backed by Ceph this year. Sadly, I just don't think I have enough logging enabled to nail down exactly what happened. -- David Galloway Senior Systems Administrator Ceph Engineering ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
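Since the thread hinges on a threshold check that silently failed, here is a minimal model of the logic diskusage.pl appears to implement, judging only from the transcripts above (`diskusage.pl WARN CRIT`, exit 2 once a mount crosses a threshold). This is an assumption-laden sketch, not the actual script:

```python
def check_disk(usage_pct, warn, crit):
    # Return (nagios_exit_code, message) for a single mount point,
    # mimicking the behaviour seen in the transcripts above.
    if usage_pct >= crit:
        return 2, f"/gluster is at {usage_pct}%"   # CRITICAL
    if usage_pct >= warn:
        return 1, f"/gluster is at {usage_pct}%"   # WARNING
    return 0, "Disks are OK now"

# Normal invocation (warn=90, crit=95) at 77% usage reports OK:
print(check_disk(77, 90, 95))   # (0, 'Disks are OK now')
# The manual test with a lowered threshold correctly goes critical:
print(check_disk(77, 95, 70))   # (2, '/gluster is at 77%')
# ...so at 100% the check should also have gone critical:
print(check_disk(100, 90, 95))  # (2, '/gluster is at 100%')
```

Which is exactly the puzzle: the threshold math is sound, so the failure was presumably in scheduling or notification rather than the check itself.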
[ceph-users] Re: monitor connection error
> -----Original Message-----
> From: Eugen Block [mailto:ebl...@nde.ag]
> Sent: Tuesday, May 11, 2021 11:39 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: monitor connection error
>
> Hi,
>
> > What is this error trying to tell me? TIA
>
> it tells you that the cluster is not reachable to the client, this can have various reasons.
>
> Can you show the output of your conf file?
>
> cat /etc/ceph/es-c1.conf

[centos@cnode-01 ~]$ cat /etc/ceph/es-c1.conf
[global]
fsid = 3c5da069-2a03-4a5a-8396-53776286c858
mon_initial_members = cnode-01,cnode-02,cnode-03
mon_host = 192.168.122.39
public_network = 192.168.122.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_journal_size = 1024
osd_pool_default_size = 3
osd_pool_default_min_size = 2
osd_pool_default_pg_num = 333
osd_pool_default_pgp_num = 333
osd_crush_chooseleaf_type = 1
[centos@cnode-01 ~]$

> Is the monitor service up running? I take it you don't use cephadm yet so it's not a containerized environment?

Correct, this is bare metal and not a containerized environment. And I believe it is running:

[centos@cnode-01 ~]$ sudo systemctl --all | grep ceph
ceph-crash.service        loaded  active  running  Ceph crash dump collector
ceph-mon@cnode-01.service loaded  active  running  Ceph cluster monitor daemon
system-ceph\x2dmon.slice  loaded  active  active   system-ceph\x2dmon.slice
ceph-mon.target           loaded  active  active   ceph target allowing to start/stop all ceph-mon@.service instances at once
ceph.target               loaded  active  active   ceph target allowing to start/stop all ceph*@.service instances at once
[centos@cnode-01 ~]$

> Regards,
> Eugen
>
> Zitat von "Tuffli, Chuck" :
>
> > Hi
> >
> > I'm new to ceph and have been following the Manual Deployment document
> > [1].
> > The process seems to work correctly until step 18 ("Verify that the monitor is running"):
> >
> > [centos@cnode-01 ~]$ uname -a
> > Linux cnode-01 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > [centos@cnode-01 ~]$ ceph -v
> > ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)
> > [centos@cnode-01 ~]$ sudo ceph --cluster es-c1 -s
> > [errno 2] RADOS object not found (error connecting to the cluster)
> > [centos@cnode-01 ~]$
> >
> > What is this error trying to tell me? TIA
> >
> > [1] INVALID URI REMOVED (the Ceph Manual Deployment guide)
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Manager carries wrong information until killing it
I don't have a solution to offer, but I've seen this for years with no solution. Any time a MGR bounces, be it for upgrades, or a new daemon coming online, etc, I'll see a scale spike like is reported below. Just out of curiosity, which MGR plugins are you using? I have historically used the influx plugin for stats exports, and it shows up in those values as well, throwing everything off. I don't see it in my Zabbix stats, albeit those are scraped at a longer interval that may not catch this. Just looking for any common threads. Reed > On May 4, 2021, at 3:46 AM, Nico Schottelius > wrote: > > > Hello, > > we have a recurring, funky problem with managers on Nautilus (and > probably also earlier versions): the manager displays incorrect > information. > > This is a recurring pattern and it also breaks the prometheus graphs, as > the I/O is described insanely incorrectly: "recovery: 43 TiB/s, 3.62k > keys/s, 11.40M objects/s" - which basically changes the scale of any > related graph to unusable. 
>
> The latest example from today shows slow ops for an OSD that has been down for 17h:
>
> [09:50:31] black2.place6:~# ceph -s
>   cluster:
>     id: 1ccd84f6-e362-4c50-9ffe-59436745e445
>     health: HEALTH_WARN
>             18 slow ops, oldest one blocked for 975 sec, osd.53 has slow ops
>
>   services:
>     mon: 5 daemons, quorum server9,server2,server8,server6,server4 (age 2w)
>     mgr: server2(active, since 2w), standbys: server8, server4, server9, server6, ciara3
>     osd: 108 osds: 107 up (since 17h), 107 in (since 17h)
>
>   data:
>     pools: 4 pools, 2624 pgs
>     objects: 42.52M objects, 162 TiB
>     usage: 486 TiB used, 298 TiB / 784 TiB avail
>     pgs: 2616 active+clean
>          8 active+clean+scrubbing+deep
>
>   io:
>     client: 522 MiB/s rd, 22 MiB/s wr, 8.18k op/s rd, 689 op/s wr
>
> Killing the manager on server2 changes the status to another temporary incorrect status, because the rebalance finished hours ago, paired with the incorrect rebalance speed that we see from time to time:
>
> [09:51:59] black2.place6:~# ceph -s
>   cluster:
>     id: 1ccd84f6-e362-4c50-9ffe-59436745e445
>     health: HEALTH_OK
>
>   services:
>     mon: 5 daemons, quorum server9,server2,server8,server6,server4 (age 2w)
>     mgr: server8(active, since 11s), standbys: server4, server9, server6, ciara3
>     osd: 108 osds: 107 up (since 17h), 107 in (since 17h)
>
>   data:
>     pools: 4 pools, 2624 pgs
>     objects: 42.52M objects, 162 TiB
>     usage: 486 TiB used, 298 TiB / 784 TiB avail
>     pgs: 2616 active+clean
>          8 active+clean+scrubbing+deep
>
>   io:
>     client: 214 TiB/s rd, 54 TiB/s wr, 4.86G op/s rd, 1.06G op/s wr
>     recovery: 43 TiB/s, 3.62k keys/s, 11.40M objects/s
>
>   progress:
>     Rebalancing after osd.53 marked out
>     [..]
>
> Then a bit later, the status on the newly started manager is correct:
>
> [09:52:18] black2.place6:~# ceph -s
>   cluster:
>     id: 1ccd84f6-e362-4c50-9ffe-59436745e445
>     health: HEALTH_OK
>
>   services:
>     mon: 5 daemons, quorum server9,server2,server8,server6,server4 (age 2w)
>     mgr: server8(active, since 47s), standbys: server4, server9, server6, server2, ciara3
>     osd: 108 osds: 107 up (since 17h), 107 in (since 17h)
>
>   data:
>     pools: 4 pools, 2624 pgs
>     objects: 42.52M objects, 162 TiB
>     usage: 486 TiB used, 298 TiB / 784 TiB avail
>     pgs: 2616 active+clean
>          8 active+clean+scrubbing+deep
>
>   io:
>     client: 422 MiB/s rd, 39 MiB/s wr, 7.91k op/s rd, 752 op/s wr
>
> Question: is this a known bug, is anyone else seeing it or are we doing something wrong?
>
> Best regards,
>
> Nico
>
> --
> Sustainable and modern Infrastructures by ungleich.ch
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Write Ops on CephFS Increasing exponentially
Hi Patrick,

Thanks for getting back to me. Looks like I found the issue. It's due to the fact that I thought I had increased the max_file_size on ceph to 20TB; turns out I missed a zero and set it to 1.89 TB. I had originally tried to fallocate the space for the 8TB volume, which kept erroring. I then tried dd, and dd wrote the entire space needed without errors.

What I don't understand is what happens to cephFS when you do this. The files I'm writing into the pre-allocated volume in ceph are still there "luckily", but I thought that ceph would stop you from writing to cephFS if it hit the upper limit of max_file_size.

Kind regards,
Kyle

From: Patrick Donnelly
Sent: 11 May 2021 03:14
To: Kyle Dean
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Write Ops on CephFS Increasing exponentially

Hi Kyle,

On Thu, May 6, 2021 at 7:56 AM Kyle Dean wrote:
>
> Hi, hoping someone could help me get to the bottom of this particular issue I'm having.
>
> I have ceph octopus installed using ceph-ansible.
>
> Currently, I have 3 MDS servers running, and one client connected to the active MDS. I'm currently storing a very large encrypted container on the CephFS file system, 8TB worth, and I'm writing data into it from the client host.
>
> recently I have noticed a severe impact on performance, and the time taken to do processing on files within the container has increased from 1 minute to 11 minutes.
>
> in the ceph dashboard, when I take a look at the performance tab on the file system page, the Write Ops are increasing exponentially over time.
>
> At the end of April around the 22nd I had 49 write Ops on the performance page for the MDS daemons. This is now at 266467 Write Ops and increasing.
>
> Also the client requests have gone from 14 to 67 to 117 and is now at 283
>
> would someone be able to help me make sense of why the performance has decreased and what is going on with the client requests and write operations.
I suggest you look at the "perf dump" statistics from the MDS (via ceph tell or admin socket) over a period of time to get an idea what operations it's performing. It's probable your workload changed somehow and that is the cause. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
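Since the root cause here turned out to be a mistyped max_file_size (which is set in bytes), a quick arithmetic check shows how a single dropped zero produces exactly this failure mode. The figures below are illustrative, assuming the limit was set with the usual `ceph fs set <fsname> max_file_size <bytes>` command:

```python
TB = 10**12

intended_limit = 20 * TB   # the 20 TB limit Kyle meant to set
actual_limit   = 2 * TB    # one zero short: roughly the ~1.89 TB observed
volume_size    = 8 * TB    # the 8 TB encrypted container

# fallocate of the full container must fail: it exceeds max_file_size...
print(volume_size > actual_limit)    # True
# ...while the intended limit would have accommodated it fine.
print(volume_size > intended_limit)  # False
```

If the pattern fits, the fix is simply raising the limit with `ceph fs set <fsname> max_file_size <bytes>`; the setting caps how large a file may grow, which may be why file contents written below the cap survived intact.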
[ceph-users] Re: Ceph Month June 2021 Event
Hi everyone,

Today is the last day to get your proposal in for the Ceph Month June event! The types of talks include:

* Lightning talk - 5 minutes
* Presentation - 20 minutes with q/a
* Unconference (BoF) - 40 minutes

We will be confirming with speakers for the date/time by May 16th.

https://ceph.io/events/ceph-month-june-2021/cfp

On Wed, Apr 21, 2021 at 6:30 AM Mike Perez wrote:
>
> Hi everyone,
>
> We're looking for presentations, lightning talks, and BoFs to schedule for Ceph Month in June 2021. Please submit your proposals before May 12th:
>
> https://ceph.io/events/ceph-month-june-2021/cfp
>
> On Wed, Apr 14, 2021 at 12:35 PM Mike Perez wrote:
> >
> > Hi everyone,
> >
> > In June 2021, we're hosting a month of Ceph presentations, lightning talks, and unconference sessions such as BoFs. There is no registration or cost to attend this event.
> >
> > The CFP is now open until May 12th.
> >
> > https://ceph.io/events/ceph-month-june-2021/cfp
> >
> > Speakers will receive confirmation that their presentation is accepted and further instructions for scheduling by May 16th.
> >
> > The schedule will be available on May 19th.
> >
> > Join the Ceph community as we discuss how Ceph, the massively scalable, open-source, software-defined storage system, can radically improve the economics and management of data storage for your enterprise.
> >
> > --
> > Mike Perez

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: RGW federated user cannot access created bucket
The federated user will be allowed to perform only those s3 actions that are explicitly allowed by the role's permission policy. The permission policy is there for someone to exercise finer grained control over what s3 action is allowed and what is not, hence it differs from what regular users are allowed to do. Thanks, Pritha On Wed, May 12, 2021 at 4:04 PM Daniel Iwan wrote: > Hi all > > Scenario is as follows > Federated user assumes a role via AssumeRoleWithWebIdentity, which gives > permission to create a bucket. > User creates a bucket and becomes an owner (this is visible in Ceph's web > ui as Owner $oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b). > User cannot list the content of the bucket however, because role's policy > does not give access to the bucket. > Later on when user re-authenticates and assumes the same role again. > At this point user cannot access a bucket it owns for the reason as above > I'm assuming. > Bucket's ACL after creation > > radosgw-admin policy --bucket my-bucket > { > "acl": { > "acl_user_map": [ > { > "user": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b", > "acl": 15 > } > ], > "acl_group_map": [], > "grant_map": [ > { > "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b", > "grant": { > "type": { > "type": 0 > }, > "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b", > "email": "", > "permission": { > "flags": 15 > }, > "name": "", > "group": 0, > "url_spec": "" > } > } > ] > }, > "owner": { > "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b", > "display_name": "" > } > } > > This seems inconsistent with buckets created by regular users > Is this expected behaviour? > > Regards > Daniel > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
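To make Pritha's point concrete, a role permission policy that would let the federated user list and read/write the bucket it created might look like the following sketch (bucket name is hypothetical; the document is attached to the role with `radosgw-admin role-policy put` or the iam PutRolePolicy call):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:CreateBucket", "s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
    }
  ]
}
```

Without an explicit `s3:ListBucket` grant like this, the federated user cannot list the bucket even though it owns it, which is the behaviour Daniel observed.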
[ceph-users] Re: Using ID of a federated user in a bucket policy in RGW
Hi,

Can you try with the following ARN:

arn:aws:iam:::user/oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b

The format of the user id is <tenant>$<user-namespace>$<sub>, and in $oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b the '$' before oidc is a separator for a tenant which is empty here. The ARN for a user is of the format arn:aws:iam:::user/<user id>, and hence the ARN here will be arn:aws:iam:::user/oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b

Thanks,
Pritha

On Wed, May 12, 2021 at 4:02 PM Daniel Iwan wrote:

> Hi all
>
> I'm working on the following scenario:
> User is authenticated with OIDC and tries to access a bucket which it does not own.
> How to specify user ID etc. to give access to such a user?
>
> By trial and error I found out that the principal can be specified as
> "Principal": {"Federated":["arn:aws:sts:::assumed-role/MySession"]},
>
> but I want to use the shadow user ID or something similar as the principal.
>
> Docs
> https://docs.ceph.com/en/latest/radosgw/STS/
> state:
> 'A shadow user is created corresponding to every federated user. The user id is derived from the 'sub' field of the incoming web token. The user is created in a separate namespace - 'oidc' such that the user id doesn't clash with any other user ids in rgw. The format of the user id is - <tenant>$<user-namespace>$<sub> where user-namespace is 'oidc' for users that authenticate with oidc providers.'
>
> I see a shadow user in the Web UI as e.g. 7f71c7c5-c24f-418e-87ac-aa8fe271289b
> but I cannot work out the syntax of a user id. I was expecting something like
>
> "arn:aws:iam:::user/$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b"
>
> but when trying to list the content of a bucket I get AccessDenied.
> If the bucket policy has Principal "*" then my authenticated user can access the bucket.
>
> Is this possible?
> Regards
> Daniel
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
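Assembled into a full bucket policy, Pritha's suggested ARN would be used something like this (bucket name hypothetical; an untested sketch based purely on the ARN format described above):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b"]},
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
    }
  ]
}
```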
[ceph-users] Re: Ceph stretch mode enabling
Hi,

I just deployed a test cluster to try that out, too. I only deployed three MONs, but this should also apply.

> I tried to create the third datacenter and put the tiebreaker there but got the following error:
>
> root@ceph-node-01:/home/clouduser# ceph mon enable_stretch_mode ceph-node-05 stretch_rule datacenter
> Error EINVAL: there are 3 datacenter's in the cluster but stretch mode currently only works with 2!

You don't create a third datacenter within the osd tree, you just tell ceph that your tie-breaker is in a different dc. For me it worked, I have two DCs and put the third (tie-breaker) into (virtual) dc3:

pacific1:~ # ceph mon set_location pacific3 datacenter=dc3
pacific1:~ # ceph mon enable_stretch_mode pacific3 stretch_rule datacenter

This automatically triggered pool size 4 and distributed the PGs evenly across both DCs.

Regards,
Eugen

Zitat von Felix O:

Hello,

I'm trying to deploy my test ceph cluster and enable stretch mode (https://docs.ceph.com/en/latest/rados/operations/stretch-mode/). My problem is enabling the stretch mode.

$ ceph mon enable_stretch_mode ceph-node-05 stretch_rule datacenter
Error EINVAL: Could not find location entry for datacenter on monitor ceph-node-05

ceph-node-5 is the tiebreaker monitor.

I tried to create the third datacenter and put the tiebreaker there but got the following error:

root@ceph-node-01:/home/clouduser# ceph mon enable_stretch_mode ceph-node-05 stretch_rule datacenter
Error EINVAL: there are 3 datacenter's in the cluster but stretch mode currently only works with 2!
Additional info:

Setup method: cephadm (https://docs.ceph.com/en/latest/cephadm/install/)

# ceph osd tree
 ID  CLASS  WEIGHT   TYPE NAME              STATUS  REWEIGHT  PRI-AFF
 -1         0.03998  root default
-11         0.01999      datacenter site1
 -5         0.00999          host ceph-node-01
  0    hdd  0.00999              osd.0          up   1.0      1.0
 -3         0.00999          host ceph-node-02
  1    hdd  0.00999              osd.1          up   1.0      1.0
-12         0.01999      datacenter site2
 -9         0.00999          host ceph-node-03
  3    hdd  0.00999              osd.3          up   1.0      1.0
 -7         0.00999          host ceph-node-04
  2    hdd  0.00999              osd.2          up   1.0      1.0

stretch_rule is added to the crush map.

# ceph mon set_location ceph-node-01 datacenter=site1
# ceph mon set_location ceph-node-02 datacenter=site1
# ceph mon set_location ceph-node-03 datacenter=site2
# ceph mon set_location ceph-node-04 datacenter=site2

# ceph versions
{
    "mon": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 5
    },
    "mgr": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 4
    },
    "mds": {},
    "overall": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 11
    }
}

Thank you for your support.

--
Best regards,
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
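The stretch_rule passed to enable_stretch_mode above is not shown in the thread; the stretch-mode documentation uses a CRUSH rule of roughly this shape (rule name and id illustrative), which places two copies per datacenter:

```
rule stretch_rule {
	id 1
	type replicated
	min_size 1
	max_size 10
	step take default
	step choose firstn 0 type datacenter
	step chooseleaf firstn 2 type host
	step emit
}
```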
[ceph-users] RGW segmentation fault on Pacific 16.2.1 with multipart upload
Hi,

I have started to see segfaults during multipart upload to one of the buckets. The file is about 60 MB in size. Uploading the same file to a brand-new bucket works OK.

Command used:

aws --profile=tester --endpoint=$HOST_S3_API --region="" s3 cp ./pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack s3://tester-bucket/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack

For some reason the log shows an upload to tester-bucket-2 ??? Bucket tester-bucket-2 is owned by the same user TESTER.

I'm using Ceph 16.2.1 (recently upgraded from Octopus), installed with cephadm in Docker. OS is Ubuntu 18.04.5 LTS.

Logs show as below:

May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.891+ 7ffb0e25e700 1 == starting new request req=0x7ffa8e15d620 =
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.907+ 7ffb0b258700 1 == req done req=0x7ffa8e15d620 op status=0 http_status=200 latency=0.011999841s ==
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.907+ 7ffb0b258700 1 beast: 0x7ffa8e15d620: 11.1.150.14 - TESTER [11/May/2021:11:00:46.891 +] "POST /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploads HTTP/1.1" 200 296 - "aws-cli/2.1.23 Python/3.7.3 Linux/4.19.128-microsoft-standard exe/x86_64.ubuntu.18 p
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.055+ 7ffb09254700 1 == starting new request req=0x7ffa8e15d620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+ 7ffb51ae5700 1 == starting new request req=0x7ffa8e0dc620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+ 7ffb4eadf700 1 == starting new request req=0x7ffa8e05b620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+ 7ffb46acf700 1 == starting new request req=0x7ffa8df59620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+ 7ffb44acb700 1 == starting new request req=0x7ffa8ded8620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+ 7ffb3dabd700 1 == starting new request req=0x7ffa8dfda620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.359+ 7ffb1d27c700 1 == starting new request req=0x7ffa8de57620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.359+ 7ffb22a87700 1 == starting new request req=0x7ffa8ddd6620 =
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:48.275+ 7ffb2d29c700 1 == req done req=0x7ffa8e15d620 op status=0 http_status=200 latency=1.219983697s ==
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:48.275+ 7ffb2d29c700 1 beast: 0x7ffa8e15d620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.055 +] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=8 HTTP/1.1" 200 2485288 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:54.695+ 7ffad89f3700 1 == req done req=0x7ffa8ddd6620 op status=0 http_status=200 latency=7.335902214s ==
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:54.695+ 7ffad89f3700 1 beast: 0x7ffa8ddd6620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=6 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:56.871+ 7ffb11a65700 1 == req done req=0x7ffa8e0dc620 op status=0 http_status=200 latency=9.515872955s ==
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:56.871+ 7ffb11a65700 1 beast: 0x7ffa8e0dc620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=7 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:59.491+ 7ffac89d3700 1 == req done req=0x7ffa8dfda620 op status=0 http_status=200 latency=12.135838509s ==
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:59.491+ 7ffac89d3700 1 beast: 0x7ffa8dfda620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=2 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:02.891+ 7ffb68312700 1 == req done req=0x7ffa8e05b620 op status=0 http_status=200 latency=15.535793304s ==
May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:02.891+ 7ffb68312700 1
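For what it's worth, the part sizes in the log are consistent with aws-cli v2's default 8 MiB multipart chunk size (the `multipart_chunksize` setting); a quick sanity check, with the sizes taken from the log lines above rather than from the cluster:

```python
# Sanity check of the multipart layout seen in the RGW logs above.
# Sizes come from the logged PUT responses; the 8 MiB chunk size is
# aws-cli's documented default, assumed here rather than confirmed.
PART_SIZE = 8 * 1024 * 1024  # 8388608 bytes, as logged for parts 2, 6, 7
LAST_PART = 2485288          # bytes logged for partNumber=8, the final part
full_parts = 7               # parts 1..7 at the full chunk size

total = full_parts * PART_SIZE + LAST_PART
print(total)  # 61205544 bytes, i.e. the ~60 MB pack file
```

So the upload itself looks like a normal 8-part multipart upload; the odd thing remains that it targets tester-bucket-2 instead of tester-bucket.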
[ceph-users] RGW federated user cannot access created bucket
Hi all,

The scenario is as follows. A federated user assumes a role via AssumeRoleWithWebIdentity, which grants permission to create a bucket. The user creates a bucket and becomes its owner (visible in Ceph's web UI as Owner $oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b). However, the user cannot list the contents of the bucket, because the role's policy does not grant access to that bucket. Later, the user re-authenticates and assumes the same role again; at this point the user still cannot access the bucket it owns, for the same reason I'm assuming.

Bucket's ACL after creation (radosgw-admin policy --bucket my-bucket):

{
    "acl": {
        "acl_user_map": [
            {
                "user": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b",
                "acl": 15
            }
        ],
        "acl_group_map": [],
        "grant_map": [
            {
                "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b",
                "grant": {
                    "type": {
                        "type": 0
                    },
                    "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b",
                    "email": "",
                    "permission": {
                        "flags": 15
                    },
                    "name": "",
                    "group": 0,
                    "url_spec": ""
                }
            }
        ]
    },
    "owner": {
        "id": "$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b",
        "display_name": ""
    }
}

This seems inconsistent with buckets created by regular users. Is this expected behaviour?

Regards,
Daniel

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
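For reference, the "acl": 15 / "flags": 15 in the dump decodes to full control, so the shadow user does hold a FULL_CONTROL grant on the bucket it created. A minimal sketch of the decoding, assuming the permission bit values from RGW's rgw_acl.h (READ=1, WRITE=2, READ_ACP=4, WRITE_ACP=8; treat the exact constants as an assumption if your version differs):

```python
# Decode an RGW ACL permission bitmask into named permissions.
# Bit values assumed from rgw_acl.h: READ=1, WRITE=2, READ_ACP=4, WRITE_ACP=8;
# FULL_CONTROL is the OR of all four, i.e. 15.
RGW_PERMS = {1: "READ", 2: "WRITE", 4: "READ_ACP", 8: "WRITE_ACP"}

def decode_acl(flags):
    """Return the list of permission names whose bit is set in flags."""
    return [name for bit, name in RGW_PERMS.items() if flags & bit]

print(decode_acl(15))  # ['READ', 'WRITE', 'READ_ACP', 'WRITE_ACP']
```

Which makes the AccessDenied surprising: the ACL grants everything, so the denial presumably comes from the role's session policy being evaluated instead of (or in addition to) the ACL.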
[ceph-users] Using ID of a federated user in a bucket policy in RGW
Hi all,

I'm working on the following scenario: a user is authenticated with OIDC and tries to access a bucket which it does not own. How do I specify the user ID etc. to give access to such a user? By trial and error I found out that the principal can be specified as "Principal": {"Federated":["arn:aws:sts:::assumed-role/MySession"]}, but I want to use the shadow user ID or something similar as the principal.

The docs at https://docs.ceph.com/en/latest/radosgw/STS/ state: 'A shadow user is created corresponding to every federated user. The user id is derived from the ‘sub’ field of the incoming web token. The user is created in a separate namespace - ‘oidc’ such that the user id doesn’t clash with any other user ids in rgw. The format of the user id is - $<user-namespace>$<sub> where user-namespace is ‘oidc’ for users that authenticate with oidc providers.'

I see a shadow user in the web UI as e.g. 7f71c7c5-c24f-418e-87ac-aa8fe271289b, but I cannot work out the syntax of the user ID. I was expecting something like "arn:aws:iam:::user/$oidc$7f71c7c5-c24f-418e-87ac-aa8fe271289b", but when trying to list the contents of the bucket I get AccessDenied. If the bucket policy has Principal "*" then my authenticated user can access the bucket.

Is this possible?

Regards,
Daniel
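For anyone comparing notes, this is a sketch of the bucket-policy shape that did work for me with the Federated assumed-role principal; "MySession" and "my-bucket" are placeholders for the real session and bucket names, and the action list is just an example:

```python
import json

# Sketch of a bucket policy using the Federated assumed-role principal
# that worked by trial and error; the shadow-user principal form
# ("arn:aws:iam:::user/$oidc$<sub>") still returns AccessDenied for me.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": ["arn:aws:sts:::assumed-role/MySession"]},
        "Action": ["s3:ListBucket", "s3:GetObject"],
        "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
    }],
}
print(json.dumps(policy, indent=2))
```

The resulting JSON can be applied with e.g. aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json against the RGW endpoint.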