Re: [ceph-users] Glance client and RBD export checksum mismatch
On Thu, Apr 11, 2019, 8:53 AM Jason Dillaman wrote: > On Thu, Apr 11, 2019 at 8:49 AM Erik McCormick > wrote: > > > > > > > > On Thu, Apr 11, 2019, 8:39 AM Erik McCormick > wrote: > >> > >> > >> > >> On Thu, Apr 11, 2019, 12:07 AM Brayan Perera > wrote: > >>> > >>> Dear Jason, > >>> > >>> > >>> Thanks for the reply. > >>> > >>> We are using python 2.7.5 > >>> > >>> Yes. script is based on openstack code. > >>> > >>> As suggested, we have tried chunk_size 32 and 64, and both giving same > >>> incorrect checksum value. > >> > >> > >> The value of rbd_store_chunk_size in glance is expressed in MB and then > converted to mb. I think the default is 8, so you would want 8192 if you're > trying to match what the image was uploaded with. > > > > > > Sorry, that should have been "...converted to KB." > > Wouldn't it be converted to bytes since all rbd API methods are in bytes? > [1] > Well yeah in the end that's true. Old versions I recall just passed a KB number, but now it's self.chunk_size = CONF.rbd_store_chunk_size * 1024 * 1024 My main point though was just that glance defaults to 8 MB chunks, which is much larger than the OP was using. > >> > >>> > >>> We tried to copy same image in different pool and resulted same > >>> incorrect checksum. > >>> > >>> > >>> Thanks & Regards, > >>> Brayan > >>> > >>> On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman > wrote: > >>> > > >>> > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera < > brayan.per...@gmail.com> wrote: > >>> > > > >>> > > Dear All, > >>> > > > >>> > > Ceph Version : 12.2.5-2.ge988fb6.el7 > >>> > > > >>> > > We are facing an issue on glance which have backend set to ceph, > when > >>> > > we try to create an instance or volume out of an image, it throws > >>> > > checksum error. > >>> > > When we use rbd export and use md5sum, value is matching with > glance checksum. > >>> > > > >>> > > When we use following script, it provides same error checksum as > glance. > >>> > > >>> > What version of Python are you using?
> >>> > > >>> > > We have used below images for testing. > >>> > > 1. Failing image (checksum mismatch): > ffed4088-74e1-4f22-86cb-35e7e97c377c > >>> > > 2. Passing image (checksum identical): > c048f0f9-973d-4285-9397-939251c80a84 > >>> > > > >>> > > Output from storage node: > >>> > > > >>> > > 1. Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c > >>> > > checksum from glance database: 34da2198ec7941174349712c6d2096d8 > >>> > > [root@storage01moc ~]# python test_rbd_format.py > >>> > > ffed4088-74e1-4f22-86cb-35e7e97c377c admin > >>> > > Image size: 681181184 > >>> > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596 > >>> > > Remarks: checksum is different > >>> > > > >>> > > 2. Passing image: c048f0f9-973d-4285-9397-939251c80a84 > >>> > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed > >>> > > [root@storage01moc ~]# python test_rbd_format.py > >>> > > c048f0f9-973d-4285-9397-939251c80a84 admin > >>> > > Image size: 1411121152 > >>> > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed > >>> > > Remarks: checksum is identical > >>> > > > >>> > > Wondering whether this issue is from ceph python libs or from ceph > itself. > >>> > > > >>> > > Please note that we do not have ceph pool tiering configured. > >>> > > > >>> > > Please let us know whether anyone faced similar issue and any > fixes for this. > >>> > > > >>> > > test_rbd_format.py > >>> > > === > >>> > > import rados, sys, rbd > >>> > > > >>> > > image_id = sys.argv[1] > >>> > > try: > >>> > > rados_id = sys.argv[2] > >>> > > except: > >>> > > rados_id = 'openstack' > >>> > > > >>> > > > >>> > > class ImageIterator(object): > >>> > > """ > >>> > > Reads data from an RBD image, one chunk at a time. > >>> > > """ > >>> > > > >>> > > def __init__(self, conn, pool, name, snapshot, store, > chunk_size='8'): > >>> > > >>> > Am I correct in assuming this was adapted from OpenStack code? That > >>> > 8-byte "chunk" is going to be terribly inefficient to compute a CRC. 
> >>> > Not that it should matter, but does it still fail if you increase > this > >>> > to 32KiB or 64KiB? > >>> > > >>> > > self.pool = pool > >>> > > self.conn = conn > >>> > > self.name = name > >>> > > self.snapshot = snapshot > >>> > > self.chunk_size = chunk_size > >>> > > self.store = store > >>> > > > >>> > > def __iter__(self): > >>> > > try: > >>> > > with conn.open_ioctx(self.pool) as ioctx: > >>> > > with rbd.Image(ioctx, self.name, > >>> > >snapshot=self.snapshot) as image: > >>> > > img_info = image.stat() > >>> > > size = img_info['size'] > >>> > > bytes_left = size > >>> > > while bytes_left > 0: > >>> > > length = min(self.chunk_size, bytes_left) > >>> > > data = imag
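For reference, the chunk-size conversion Jason and Erik are describing, and the iterator's read loop, can be sketched as below. This is only an illustration of the chunking arithmetic: `chunked_md5` is a made-up helper, and an in-memory buffer stands in for `rbd.Image`, so no cluster is needed to run it.

```python
import hashlib
import io

# Glance converts rbd_store_chunk_size (in MiB) to bytes, per the thread:
#   self.chunk_size = CONF.rbd_store_chunk_size * 1024 * 1024
rbd_store_chunk_size = 8                           # glance default, in MiB
chunk_size = rbd_store_chunk_size * 1024 * 1024    # 8388608 bytes

def chunked_md5(readable, size, chunk_size):
    """Mimic the ImageIterator read loop over any file-like object.

    Note chunk_size must be an int: the original script passed the
    string '8', and under Python 2 min('8', bytes_left) silently
    compares str against int instead of raising a TypeError.
    """
    md5 = hashlib.md5()
    bytes_left = size
    while bytes_left > 0:
        length = min(chunk_size, bytes_left)
        data = readable.read(length)
        md5.update(data)
        bytes_left -= len(data)
    return md5.hexdigest()

# Stand-in for an RBD image: 20 MiB of zeros in memory.
payload = b"\x00" * (20 * 1024 * 1024)
digest = chunked_md5(io.BytesIO(payload), len(payload), chunk_size)
# The digest must not depend on the chunk size; only read offsets change.
assert digest == hashlib.md5(payload).hexdigest()
```

If the md5 over RBD ever differs from `rbd export | md5sum` for the same snapshot, the chunk size itself cannot be the cause, since any chunking covers the same byte stream.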
Re: [ceph-users] How to reduce HDD OSD flapping due to rocksdb compacting event?
Hi Christian and Wido, I was using the daily digest and lost the ability to reply without heavily editing the replies. I will change my subscription to individual messages later. > How big is the disk? RocksDB will need to compact at some point and it > seems that the HDD can't keep up. > I've seen this with many customers and in those cases we offloaded the > WAL+DB to an SSD. > How big is the data drive and the DB? > Wido The disks are 6TB each. The data drive is around 50% of the disk and the DB varies from 40 to 67GB. Splitting the WAL+DB to SSD is not an option at this time because rebuilding the OSDs one by one will take forever. It's ceph-bluestore-tool. Is there any official documentation on how to online migrate the WAL+DB to SSD? I guess this feature is not backported to Luminous, right? Kind regards, Charles Alva Sent from Gmail Mobile On Fri, Apr 12, 2019 at 10:24 AM Christian Balzer wrote: > > Hello Charles, > > On Wed, 10 Apr 2019 14:07:58 +0700 Charles Alva wrote: > > > Hi Ceph Users, > > > > Is there a way around to minimize rocksdb compacting event so that it > won't > > use all the spinning disk IO utilization and avoid it being marked as > down > > due to fail to send heartbeat to others? > > > > Right now we have frequent high IO disk utilization for every 20-25 > minutes > > where the rocksdb reaches level 4 with 67GB data to compact. > > > > > Could you please follow up on the questions Wido asked? > > As in sizes of disk, DB, number and size of objects (I think you're using > object store), how busy those disks and CPUs are, etc. > > That kind of information will be invaluable for others here and likely the > developers as well. > > Regards, > > Christian > > > Kind regards, > > > > Charles Alva > > Sent from Gmail Mobile > > > -- > Christian Balzer    Network/Systems Engineer > ch...@gol.com Rakuten Communications > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
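[For what it's worth, on Nautilus the move Charles asks about can be done offline per OSD with ceph-bluestore-tool, without rebuilding the OSD. A rough sketch follows; the OSD id, VG name, and DB size here are hypothetical, and the exact subcommands should be checked against `ceph-bluestore-tool --help` on your release — this is not available on Luminous.]

```shell
# Hypothetical OSD id, volume group, and size -- adjust to your layout.
systemctl stop ceph-osd@2

# Carve a DB volume out of the SSD.
lvcreate -L 64G -n osd2db ssd-vg

# Attach it as a new block.db, then move the existing RocksDB data
# off the slow device onto it.
ceph-bluestore-tool bluefs-bdev-new-db \
    --path /var/lib/ceph/osd/ceph-2 --dev-target /dev/ssd-vg/osd2db
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-2 \
    --devs-source /var/lib/ceph/osd/ceph-2/block \
    --dev-target /var/lib/ceph/osd/ceph-2/block.db

systemctl start ceph-osd@2
```

Done one OSD at a time with `noout` set, this avoids a full backfill of each OSD.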
Re: [ceph-users] How to reduce HDD OSD flapping due to rocksdb compacting event?
Hello Charles, On Wed, 10 Apr 2019 14:07:58 +0700 Charles Alva wrote: > Hi Ceph Users, > > Is there a way around to minimize rocksdb compacting event so that it won't > use all the spinning disk IO utilization and avoid it being marked as down > due to fail to send heartbeat to others? > > Right now we have frequent high IO disk utilization for every 20-25 minutes > where the rocksdb reaches level 4 with 67GB data to compact. > > Could you please follow up on the questions Wido asked? As in sizes of disk, DB, number and size of objects (I think you're using object store), how busy those disks and CPUs are, etc. That kind of information will be invaluable for others here and likely the developers as well. Regards, Christian > Kind regards, > > Charles Alva > Sent from Gmail Mobile -- Christian Balzer    Network/Systems Engineer ch...@gol.com Rakuten Communications
Re: [ceph-users] BADAUTHORIZER in Nautilus
Cluster is back and clean again. So I started adding plugins and such back to the mix. After adding the 'balancer' back, I got crashes in the mgr log. ceph-post-file: 0feb1562-cdc5-4a99-86ee-91006eaf6056 Turned balancer back off for now. On Tue, Apr 9, 2019 at 9:38 AM Shawn Edwards wrote: > Update: > > I think we have a work-around, but no root cause yet. > > What is working is removing the 'v2' bits from the ceph.conf file across > the cluster, and turning off all cephx authentication. Now everything > seems to be talking correctly other than some odd metrics around the edges. > > Here's my current ceph.conf, running on all ceph hosts and clients: > > [global] > fsid = 3f390b5e-2b1d-4a2f-ba00- > mon_host = [v1:10.36.9.43:6789/0] [v1:10.36.9.44:6789/0] [v1: > 10.36.9.45:6789/0] > auth_client_required = none > auth_cluster_required = none > auth_service_required = none > > If we get better information as to what's going on, I'll post here for > future reference > > > On Thu, Apr 4, 2019 at 9:16 AM Sage Weil wrote: > >> On Thu, 4 Apr 2019, Shawn Edwards wrote: >> > It was disabled in a fit of genetic debugging. I've now tried to revert >> > all config settings related to auth and signing to defaults. >> > >> > I can't seem to change the auth_*_required settings. If I try to remove >> > them, they stay set. 
If I try to change them, I get both the old and >> new >> > settings: >> > >> > root@tyr-ceph-mon0:~# ceph config dump | grep -E '(auth|cephx)' >> > globaladvanced auth_client_required cephx >> > * >> > globaladvanced auth_cluster_required cephx >> > * >> > globaladvanced auth_service_required cephx >> > * >> > root@tyr-ceph-mon0:~# ceph config rm global auth_service_required >> > root@tyr-ceph-mon0:~# ceph config dump | grep -E '(auth|cephx)' >> > globaladvanced auth_client_required cephx >> > * >> > globaladvanced auth_cluster_required cephx >> > * >> > globaladvanced auth_service_required cephx >> > * >> > root@tyr-ceph-mon0:~# ceph config set global auth_service_required none >> > root@tyr-ceph-mon0:~# ceph config dump | grep -E '(auth|cephx)' >> > globaladvanced auth_client_required cephx >> > * >> > globaladvanced auth_cluster_required cephx >> > * >> > globaladvanced auth_service_required none >> >* >> > globaladvanced auth_service_required cephx >> > * >> > >> > I know these are set to RO, but according to your blog posts, this means >> > they don't get updated until a daemon restart. Does this look correct >> to >> > you? I'm assuming I need to restart all daemons on all hosts. Is this >> > correct? >> >> Yeah, that is definitely not behaving properly. Can you try "ceph >> config-key dump | grep config/" to look at how those keys are stored? >> You >> should see something like >> >> "config/auth_cluster_required": "cephx", >> "config/auth_service_required": "cephx", >> "config/auth_service_ticket_ttl": "3600.00", >> >> but maybe those names are formed differently, maybe with ".../global/..." >> in there? My guess is a subtle naming behavior change between mimic or >> something. You can remove the keys via the config-key interface and then >> restart the mons (or adjust any random config option) to make the >> mons refresh. After that config dump should show the right thing. 
>> >> Maybe a disagreement/confusion about the actual value of >> auth_service_ticket_ttl is the cause of this. You might try doing 'ceph >> config show osd.0' and/or a mon to see what value for the auth options >> the >> daemons are actually using and reporting... >> >> sage >> >> >> > >> > On Thu, Apr 4, 2019 at 5:54 AM Sage Weil wrote: >> > >> > > That log shows >> > > >> > > 2019-04-03 15:39:53.299 7f3733f18700 10 monclient: tick >> > > 2019-04-03 15:39:53.299 7f3733f18700 10 cephx: validate_tickets want >> 53 >> > > have 53 need 0 >> > > 2019-04-03 15:39:53.299 7f3733f18700 20 cephx client: need_tickets: >> > > want=53 have=53 need=0 >> > > 2019-04-03 15:39:53.299 7f3733f18700 10 monclient: >> _check_auth_rotating >> > > have uptodate secrets (they expire after 2019-04-03 15:39:23.301595) >> > > 2019-04-03 15:39:53.299 7f3733f18700 10 auth: dump_rotating: >> > > 2019-04-03 15:39:53.299 7f3733f18700 10 auth: id 41691 A4Q== expires >> > > 2019-04-03 14:43:07.042860 >> > > 2019-04-03 15:39:53.299
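[Sage's suggestion boils down to something like the following. The key names shown are illustrative — take the real ones from your own `config-key dump` output before removing anything.]

```shell
# Inspect how the options are actually stored in the mon config-key store.
ceph config-key dump | grep 'config/'

# If the same option appears under two key names (e.g. with and without
# a section like .../global/...), remove the stale one directly.
# Hypothetical key names:
ceph config-key rm 'config/auth_service_required'

# The mons only refresh on a config change or restart.
systemctl restart ceph-mon.target

# Verify the duplicate entry is gone and see what the daemons really use.
ceph config dump | grep -E '(auth|cephx)'
ceph config show osd.0 | grep -E '(auth|cephx)'
```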
Re: [ceph-users] bluefs-bdev-expand experience
Hi Igor! I have upgraded from Luminous to Nautilus and now slow device expansion works indeed. The steps are shown below to round up the topic.

node2# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA   OMAP    META    AVAIL   %USE  VAR  PGS STATUS
 0   hdd 0.22739  1.0     233 GiB  91 GiB 90 GiB 208 MiB 816 MiB 142 GiB 38.92 1.04 128 up
 1   hdd 0.22739  1.0     233 GiB  91 GiB 90 GiB 200 MiB 824 MiB 142 GiB 38.92 1.04 128 up
 3   hdd 0.22739  0         0 B     0 B    0 B     0 B     0 B     0 B    0    0      0 down
 2   hdd 0.22739  1.0     481 GiB 172 GiB 90 GiB 201 MiB 823 MiB 309 GiB 35.70 0.96 128 up
                 TOTAL    947 GiB 353 GiB 269 GiB 610 MiB 2.4 GiB 594 GiB 37.28
MIN/MAX VAR: 0.96/1.04  STDDEV: 1.62

node2# lvextend -L+50G /dev/vg0/osd2
  Size of logical volume vg0/osd2 changed from 400.00 GiB (102400 extents) to 450.00 GiB (115200 extents).
  Logical volume vg0/osd2 successfully resized.

node2# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-2/
inferring bluefs devices from bluestore path
2019-04-11 22:28:00.240 7f2e24e190c0 -1 bluestore(/var/lib/ceph/osd/ceph-2) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-2/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable
...
*** Caught signal (Aborted) **
[two pages of stack dump stripped]

My mistake in the first place: I tried to expand a non-stopped osd again.

node2# systemctl stop ceph-osd.target
node2# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-2/
inferring bluefs devices from bluestore path
0 : device size 0x4000 : own 0x[1000~3000] = 0x3000 : using 0x8ff000
1 : device size 0x144000 : own 0x[2000~143fffe000] = 0x143fffe000 : using 0x24dfe000
2 : device size 0x708000 : own 0x[30~4] = 0x4 : using 0x0
Expanding...
2 : expanding from 0x64 to 0x708000
2 : size label updated to 483183820800

node2# ceph-bluestore-tool show-label --dev /dev/vg0/osd2 | grep size
    "size": 483183820800,

node2# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA   OMAP    META    AVAIL   %USE  VAR  PGS STATUS
 0   hdd 0.22739  1.0     233 GiB  91 GiB 90 GiB 208 MiB 816 MiB 142 GiB 38.92 1.10 128 up
 1   hdd 0.22739  1.0     233 GiB  91 GiB 90 GiB 200 MiB 824 MiB 142 GiB 38.92 1.10 128 up
 3   hdd 0.22739  0         0 B     0 B    0 B     0 B     0 B     0 B    0    0      0 down
 2   hdd 0.22739  1.0     531 GiB 172 GiB 90 GiB 185 MiB 839 MiB 359 GiB 32.33 0.91 128 up
                 TOTAL    997 GiB 353 GiB 269 GiB 593 MiB 2.4 GiB 644 GiB 35.41
MIN/MAX VAR: 0.91/1.10  STDDEV: 3.37

It worked: AVAIL = 594+50 = 644. Great! Thanks a lot for your help. And one more question regarding your last remark is inline below.

On Wed, Apr 10, 2019 at 09:54:35PM +0300, Igor Fedotov wrote:
>
> On 4/9/2019 1:59 PM, Yury Shevchuk wrote:
> > Igor, thank you, Round 2 is explained now.
> >
> > Main aka block aka slow device cannot be expanded in Luminus, this
> > functionality will be available after upgrade to Nautilus.
> > Wal and db devices can be expanded in Luminous.
> > > > Now I have recreated osd2 once again to get rid of the paradoxical > > cepf osd df output and tried to test db expansion, 40G -> 60G: > > > > node2:/# ceph-volume lvm zap --destroy --osd-id 2 > > node2:/# ceph osd lost 2 --yes-i-really-mean-it > > node2:/# ceph osd destroy 2 --yes-i-really-mean-it > > node2:/# lvcreate -L1G -n osd2wal vg0 > > node2:/# lvcreate -L40G -n osd2db vg0 > > node2:/# lvcreate -L400G -n osd2 vg0 > > node2:/# ceph-volume lvm create --osd-id 2 --bluestore --data vg0/osd2 > > --block.db vg0/osd2db --block.wal vg0/osd2wal > > > > node2:/# ceph osd df > > ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS > > 0 hdd 0.22739 1.0 233GiB 9.49GiB 223GiB 4.08 1.24 128 > > 1 hdd 0.22739 1.0 233GiB 9.49GiB 223GiB 4.08 1.24 128 > > 3 hdd 0.227390 0B 0B 0B00 0 > > 2 hdd 0.22739 1.0 400GiB 9.49GiB 391GiB 2.37 0.72 128 > > TOTAL 866GiB 28.5GiB 837GiB 3.29 > > MIN/MAX VAR: 0.72/1.24 STDDEV: 0.83 > > > > node2:/# lvextend -L+20G /dev/vg0/osd2db > >Size of logical volume vg0/osd2db changed from 40.00 GiB (10240 extents) > > to 60.00 GiB (15360 extents). > >Logical volume vg0/osd2db successfully resized. > > > > node2:/# ceph-bluestore-tool bluefs-bdev-expand --path > > /var/lib/ceph/osd/ceph-2/ > > inferring bluefs devices from bluestore path > > slot 0 /var/lib/ceph/osd/ceph-2//block.wal > > slot 1 /var/lib/ceph/osd/ceph-2//block.db > > slot 2 /var/lib/ceph/osd/ceph-2//block > > 0 : size 0x4000 : own 0x[1000~3000] > > 1 : size 0xf : own 0x[2000~9e000] > > 2 : size 0x64 : own 0x[30~4] > > Expanding... > > 1 : expanding from 0xa to 0xf > > 1 : size label updated to 64424509440 > > > > node2:/# ceph-bluestore-tool show-l
Re: [ceph-users] Topology query
Thanks a lot, Marc - this looks similar to the post I found: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-July/003369.html It seems to suggest that this wouldn't be an issue in more recent kernels but would be great to get confirmation on that. I'll keep researching. On Thu, 11 Apr 2019 at 19:50, Marc Roos wrote: > > > AFAIK you at least risk with cephfs on osd nodes this 'kernel deadlock'? > I have it also, but with enough memory. Search mailing list for this. > I am looking at similar setup, but with mesos and strugling with some > cni plugin we have to develop. > > > -Original Message- > From: Bob Farrell [mailto:b...@homeflow.co.uk] > Sent: donderdag 11 april 2019 20:45 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] Topology query > > Hello. I am running Ceph Nautilus v14.2.0 on Ubuntu Bionic 18.04 LTS. > > I would like to ask if anybody could advise if there will be any > potential problems with my setup as I am running a lot of services on > each node. > > I have 8 large dedicated servers, each with two physical disks. All > servers run Docker Swarm and host numerous web applications. > > I have also installed Ceph on each node (not in Docker). The secondary > disk on each server hosts an LVM volume which is dedicated to Ceph. Each > node runs one of each: osd, mon, mgr, mdss. I use CephFS to mount the > data into each node's filesystem, which is then accessed by numerous > containers via Docker bindmounts. > > So far everything is working great but we haven't put anything under > heavy load. I googled around to see if there are any potential problems > with what I'm doing but couldn't find too much. There was one forum post > I read [but can't find now] which warned against this unless using very > latest glibc due to kernel fsync issues (IIRC) but this post was from > 2014 so I hope I'm safe ? > > Thanks for the great project - I got this far just from reading the docs > and writing my own Ansible script (wanted to learn Ceph properly). 
It's > really good stuff. : ) > > Cheers,
Re: [ceph-users] Topology query
AFAIK you at least risk with cephfs on osd nodes this 'kernel deadlock'? I have it also, but with enough memory. Search mailing list for this. I am looking at similar setup, but with mesos and struggling with some cni plugin we have to develop. -Original Message- From: Bob Farrell [mailto:b...@homeflow.co.uk] Sent: donderdag 11 april 2019 20:45 To: ceph-users@lists.ceph.com Subject: [ceph-users] Topology query Hello. I am running Ceph Nautilus v14.2.0 on Ubuntu Bionic 18.04 LTS. I would like to ask if anybody could advise if there will be any potential problems with my setup as I am running a lot of services on each node. I have 8 large dedicated servers, each with two physical disks. All servers run Docker Swarm and host numerous web applications. I have also installed Ceph on each node (not in Docker). The secondary disk on each server hosts an LVM volume which is dedicated to Ceph. Each node runs one of each: osd, mon, mgr, mdss. I use CephFS to mount the data into each node's filesystem, which is then accessed by numerous containers via Docker bindmounts. So far everything is working great but we haven't put anything under heavy load. I googled around to see if there are any potential problems with what I'm doing but couldn't find too much. There was one forum post I read [but can't find now] which warned against this unless using very latest glibc due to kernel fsync issues (IIRC) but this post was from 2014 so I hope I'm safe ? Thanks for the great project - I got this far just from reading the docs and writing my own Ansible script (wanted to learn Ceph properly). It's really good stuff. : ) Cheers,
[ceph-users] Topology query
Hello. I am running Ceph Nautilus v14.2.0 on Ubuntu Bionic 18.04 LTS. I would like to ask if anybody could advise if there will be any potential problems with my setup as I am running a lot of services on each node. I have 8 large dedicated servers, each with two physical disks. All servers run Docker Swarm and host numerous web applications. I have also installed Ceph on each node (not in Docker). The secondary disk on each server hosts an LVM volume which is dedicated to Ceph. Each node runs one of each: osd, mon, mgr, mdss. I use CephFS to mount the data into each node's filesystem, which is then accessed by numerous containers via Docker bindmounts. So far everything is working great but we haven't put anything under heavy load. I googled around to see if there are any potential problems with what I'm doing but couldn't find too much. There was one forum post I read [but can't find now] which warned against this unless using very latest glibc due to kernel fsync issues (IIRC) but this post was from 2014 so I hope I'm safe ? Thanks for the great project - I got this far just from reading the docs and writing my own Ansible script (wanted to learn Ceph properly). It's really good stuff. : ) Cheers,
Re: [ceph-users] reshard list
Thank you I found my problem. On Wed, Apr 10, 2019 at 10:00 PM Konstantin Shalygin wrote: > Hello, > > I am have been managing a ceph cluster running 12.2.11. This was running > 12.2.5 until the recent upgrade three months ago. We build another cluster > running 13.2.5 and synced the data between clusters and now would like to > run primarily off the 13.2.5 cluster. The data is all S3 buckets. There > are 15 buckets with more than 1 million objects in them. I attempted to > start sharding on the bucket indexes by using the following process from > the documentation. > > Pulling the zonegroup > > #radosgw-admin zonegroup get > zonegroup.json > > Changing bucket_index_max_shards to a number other than 0 and then > > #radosgw-admin zonegroup set < zonegroup.json > > Update the period > > This had no effect on existing buckets. What is the methodology to enable > sharding on existing buckets. Also I am not able to see the reshard list I > get the follwoing error. > > 2019-04-10 10:33:05.074 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.00 > 2019-04-10 10:33:05.078 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.01 > 2019-04-10 10:33:05.082 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.02 > 2019-04-10 10:33:05.082 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.03 > 2019-04-10 10:33:05.114 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.04 > 2019-04-10 10:33:05.118 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.05 > 2019-04-10 10:33:05.118 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.06 > 2019-04-10 10:33:05.122 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.07 > 2019-04-10 10:33:05.122 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.08 > 2019-04-10 10:33:05.122 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.09 > 
2019-04-10 10:33:05.122 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.10 > 2019-04-10 10:33:05.126 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.11 > 2019-04-10 10:33:05.126 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.12 > 2019-04-10 10:33:05.126 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.13 > 2019-04-10 10:33:05.126 7fbd534cb300 -1 ERROR: failed to list reshard log > entries, oid=reshard.14 > > Any suggestions > > Andrew, RGW dynamic resharding is enabling via `rgw_dynamic_resharding` > and ruled by `rgw_max_objs_per_shard`. > > Or you may reshard bucket by hand via `radosgw-admin reshard add ...`. > > > > k
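[To make Konstantin's two options concrete, a sketch — the bucket name and shard count below are illustrative, and shard counts are usually sized against `rgw_max_objs_per_shard` (default 100k objects per shard):]

```shell
# Manual reshard of one existing bucket.
radosgw-admin reshard add --bucket=my-bucket --num-shards=17
radosgw-admin reshard list      # show the queued reshard jobs
radosgw-admin reshard process   # execute them now

# Or let RGW reshard automatically (ceph.conf, rgw section):
#   rgw_dynamic_resharding = true
#   rgw_max_objs_per_shard = 100000
```

Note that changing `bucket_index_max_shards` in the zonegroup only affects newly created buckets, which matches what Andrew observed.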
[ceph-users] mimic stability finally achieved
I think I finally have a stable containerized mimic cluster... jeez! It was hard enough! I'm currently repopulating cephfs and cruising along at ... client: 147 MiB/s wr, 0 op/s rd, 38 op/s wr First, last month I had four Seagate Barracuda drive failures at the same time with around 18,000 power_on_hours. I used to adore Seagate. Now I am shocked at how terrible they are. And they have been advertising NVMe's to me on Facebook very heavily. There is no way... Then three newegg and Amazon suppliers failed to get me correct HGST disks. I've been dealing with this garbage for like five weeks now! Once the hardware issues were settled my osds were flapping like gophers on the Caddyshack golf course. Ultimately I had to copy data to standalone drives, destroy everything ceph related and start from scratch. That didn't solve the problem! osds continued to flap! For new clusters, and maybe old clusters also, this setting is key!!! ceph osd crush tunables optimal Without that setting I surmise that client writes would consume all available cluster bandwidth. MDS would report slow IOs. OSDs would not be able to replicate objects or answer heartbeats, then slowly they would get knocked out of the cluster. That's all from me. Like my poor NC sinuses recovering from the crazy pollen, my ceph life is also slowly recovering. /Chris C
Re: [ceph-users] Glance client and RBD export checksum mismatch
On Thu, Apr 11, 2019 at 8:49 AM Erik McCormick wrote: > > > > On Thu, Apr 11, 2019, 8:39 AM Erik McCormick > wrote: >> >> >> >> On Thu, Apr 11, 2019, 12:07 AM Brayan Perera wrote: >>> >>> Dear Jason, >>> >>> >>> Thanks for the reply. >>> >>> We are using python 2.7.5 >>> >>> Yes. script is based on openstack code. >>> >>> As suggested, we have tried chunk_size 32 and 64, and both giving same >>> incorrect checksum value. >> >> >> The value of rbd_store_chunk_size in glance is expressed in MB and then >> converted to mb. I think the default is 8, so you would want 8192 if you're >> trying to match what the image was uploaded with. > > > Sorry, that should have been "...converted to KB." Wouldn't it be converted to bytes since all rbd API methods are in bytes? [1] >> >>> >>> We tried to copy same image in different pool and resulted same >>> incorrect checksum. >>> >>> >>> Thanks & Regards, >>> Brayan >>> >>> On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman wrote: >>> > >>> > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera >>> > wrote: >>> > > >>> > > Dear All, >>> > > >>> > > Ceph Version : 12.2.5-2.ge988fb6.el7 >>> > > >>> > > We are facing an issue on glance which have backend set to ceph, when >>> > > we try to create an instance or volume out of an image, it throws >>> > > checksum error. >>> > > When we use rbd export and use md5sum, value is matching with glance >>> > > checksum. >>> > > >>> > > When we use following script, it provides same error checksum as glance. >>> > >>> > What version of Python are you using? >>> > >>> > > We have used below images for testing. >>> > > 1. Failing image (checksum mismatch): >>> > > ffed4088-74e1-4f22-86cb-35e7e97c377c >>> > > 2. Passing image (checksum identical): >>> > > c048f0f9-973d-4285-9397-939251c80a84 >>> > > >>> > > Output from storage node: >>> > > >>> > > 1. 
Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c >>> > > checksum from glance database: 34da2198ec7941174349712c6d2096d8 >>> > > [root@storage01moc ~]# python test_rbd_format.py >>> > > ffed4088-74e1-4f22-86cb-35e7e97c377c admin >>> > > Image size: 681181184 >>> > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596 >>> > > Remarks: checksum is different >>> > > >>> > > 2. Passing image: c048f0f9-973d-4285-9397-939251c80a84 >>> > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed >>> > > [root@storage01moc ~]# python test_rbd_format.py >>> > > c048f0f9-973d-4285-9397-939251c80a84 admin >>> > > Image size: 1411121152 >>> > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed >>> > > Remarks: checksum is identical >>> > > >>> > > Wondering whether this issue is from ceph python libs or from ceph >>> > > itself. >>> > > >>> > > Please note that we do not have ceph pool tiering configured. >>> > > >>> > > Please let us know whether anyone faced similar issue and any fixes for >>> > > this. >>> > > >>> > > test_rbd_format.py >>> > > === >>> > > import rados, sys, rbd >>> > > >>> > > image_id = sys.argv[1] >>> > > try: >>> > > rados_id = sys.argv[2] >>> > > except: >>> > > rados_id = 'openstack' >>> > > >>> > > >>> > > class ImageIterator(object): >>> > > """ >>> > > Reads data from an RBD image, one chunk at a time. >>> > > """ >>> > > >>> > > def __init__(self, conn, pool, name, snapshot, store, >>> > > chunk_size='8'): >>> > >>> > Am I correct in assuming this was adapted from OpenStack code? That >>> > 8-byte "chunk" is going to be terribly inefficient to compute a CRC. >>> > Not that it should matter, but does it still fail if you increase this >>> > to 32KiB or 64KiB? 
>>> > >>> > > self.pool = pool >>> > > self.conn = conn >>> > > self.name = name >>> > > self.snapshot = snapshot >>> > > self.chunk_size = chunk_size >>> > > self.store = store >>> > > >>> > > def __iter__(self): >>> > > try: >>> > > with conn.open_ioctx(self.pool) as ioctx: >>> > > with rbd.Image(ioctx, self.name, >>> > >snapshot=self.snapshot) as image: >>> > > img_info = image.stat() >>> > > size = img_info['size'] >>> > > bytes_left = size >>> > > while bytes_left > 0: >>> > > length = min(self.chunk_size, bytes_left) >>> > > data = image.read(size - bytes_left, length) >>> > > bytes_left -= len(data) >>> > > yield data >>> > > raise StopIteration() >>> > > except rbd.ImageNotFound: >>> > > raise exceptions.NotFound( >>> > > _('RBD image %s does not exist') % self.name) >>> > > >>> > > conn = rados.Rados(conffile='/etc/ceph/ceph.conf',rados_id=rados_id) >>> > > conn.connect() >>> > > >>> > > >>> > > with conn.open_ioctx('images') as ioctx: >>> > > try: >>> > > with rb
Re: [ceph-users] Glance client and RBD export checksum mismatch
On Thu, Apr 11, 2019, 8:39 AM Erik McCormick wrote: > > > On Thu, Apr 11, 2019, 12:07 AM Brayan Perera > wrote: > >> Dear Jason, >> >> >> Thanks for the reply. >> >> We are using python 2.7.5 >> >> Yes. script is based on openstack code. >> >> As suggested, we have tried chunk_size 32 and 64, and both giving same >> incorrect checksum value. >> > > The value of rbd_store_chunk_size in glance is expressed in MB and then > converted to mb. I think the default is 8, so you would want 8192 if you're > trying to match what the image was uploaded with. > Sorry, that should have been "...converted to KB." > >> We tried to copy same image in different pool and resulted same >> incorrect checksum. >> >> >> Thanks & Regards, >> Brayan >> >> On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman >> wrote: >> > >> > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera >> wrote: >> > > >> > > Dear All, >> > > >> > > Ceph Version : 12.2.5-2.ge988fb6.el7 >> > > >> > > We are facing an issue on glance which have backend set to ceph, when >> > > we try to create an instance or volume out of an image, it throws >> > > checksum error. >> > > When we use rbd export and use md5sum, value is matching with glance >> checksum. >> > > >> > > When we use following script, it provides same error checksum as >> glance. >> > >> > What version of Python are you using? >> > >> > > We have used below images for testing. >> > > 1. Failing image (checksum mismatch): >> ffed4088-74e1-4f22-86cb-35e7e97c377c >> > > 2. Passing image (checksum identical): >> c048f0f9-973d-4285-9397-939251c80a84 >> > > >> > > Output from storage node: >> > > >> > > 1. Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c >> > > checksum from glance database: 34da2198ec7941174349712c6d2096d8 >> > > [root@storage01moc ~]# python test_rbd_format.py >> > > ffed4088-74e1-4f22-86cb-35e7e97c377c admin >> > > Image size: 681181184 >> > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596 >> > > Remarks: checksum is different >> > > >> > > 2. 
Passing image: c048f0f9-973d-4285-9397-939251c80a84 >> > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed >> > > [root@storage01moc ~]# python test_rbd_format.py >> > > c048f0f9-973d-4285-9397-939251c80a84 admin >> > > Image size: 1411121152 >> > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed >> > > Remarks: checksum is identical >> > > >> > > Wondering whether this issue is from ceph python libs or from ceph >> itself. >> > > >> > > Please note that we do not have ceph pool tiering configured. >> > > >> > > Please let us know whether anyone faced similar issue and any fixes >> for this. >> > > >> > > test_rbd_format.py >> > > === >> > > import rados, sys, rbd >> > > >> > > image_id = sys.argv[1] >> > > try: >> > > rados_id = sys.argv[2] >> > > except: >> > > rados_id = 'openstack' >> > > >> > > >> > > class ImageIterator(object): >> > > """ >> > > Reads data from an RBD image, one chunk at a time. >> > > """ >> > > >> > > def __init__(self, conn, pool, name, snapshot, store, >> chunk_size='8'): >> > >> > Am I correct in assuming this was adapted from OpenStack code? That >> > 8-byte "chunk" is going to be terribly inefficient to compute a CRC. >> > Not that it should matter, but does it still fail if you increase this >> > to 32KiB or 64KiB? 
>> > >> > > self.pool = pool >> > > self.conn = conn >> > > self.name = name >> > > self.snapshot = snapshot >> > > self.chunk_size = chunk_size >> > > self.store = store >> > > >> > > def __iter__(self): >> > > try: >> > > with conn.open_ioctx(self.pool) as ioctx: >> > > with rbd.Image(ioctx, self.name, >> > >snapshot=self.snapshot) as image: >> > > img_info = image.stat() >> > > size = img_info['size'] >> > > bytes_left = size >> > > while bytes_left > 0: >> > > length = min(self.chunk_size, bytes_left) >> > > data = image.read(size - bytes_left, length) >> > > bytes_left -= len(data) >> > > yield data >> > > raise StopIteration() >> > > except rbd.ImageNotFound: >> > > raise exceptions.NotFound( >> > > _('RBD image %s does not exist') % self.name) >> > > >> > > conn = rados.Rados(conffile='/etc/ceph/ceph.conf',rados_id=rados_id) >> > > conn.connect() >> > > >> > > >> > > with conn.open_ioctx('images') as ioctx: >> > > try: >> > > with rbd.Image(ioctx, image_id, >> > >snapshot='snap') as image: >> > > img_info = image.stat() >> > > print "Image size: %s " % img_info['size'] >> > > iter, size = (ImageIterator(conn, 'images', image_id, >> > > 'snap', 'rbd'), img_info['size']) >> > >
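A minimal sketch of the conversion being discussed, assuming the glance_store behavior Erik describes (rbd_store_chunk_size is configured in MiB and multiplied out to bytes); the function name is illustrative, not glance's own:

```python
# Hypothetical illustration of the glance chunk-size conversion discussed
# above: rbd_store_chunk_size is expressed in MiB and converted to bytes.

DEFAULT_RBD_STORE_CHUNK_SIZE_MB = 8  # glance default, per the thread


def chunk_size_bytes(rbd_store_chunk_size_mb=DEFAULT_RBD_STORE_CHUNK_SIZE_MB):
    """Convert the glance rbd_store_chunk_size (MiB) to bytes."""
    return rbd_store_chunk_size_mb * 1024 * 1024


print(chunk_size_bytes())          # 8388608 bytes = 8 MiB
print(chunk_size_bytes() // 1024)  # 8192 KiB, the figure quoted above
```

So a script trying to match the chunking glance uploaded with would pass 8 * 1024 * 1024, not 8.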
Re: [ceph-users] Glance client and RBD export checksum mismatch
On Thu, Apr 11, 2019 at 12:07 AM Brayan Perera wrote: > > Dear Jason, > > > Thanks for the reply. > > We are using python 2.7.5 > > Yes. script is based on openstack code. > > As suggested, we have tried chunk_size 32 and 64, and both giving same > incorrect checksum value. > > We tried to copy same image in different pool and resulted same > incorrect checksum. My best guess is that there is some odd encoding format issues between the raw byte stream and Python strings. Can you tweak your Python code to generate a md5sum for each chunk (let's say 4MiB per chunk to match the object size) and compare that to a 4MiB chunked "md5sum" CLI results from the associated "rbd export" data file (split -b 4194304 --filter=md5sum). That will allow you to isolate the issue down to a specific section of the image. > > Thanks & Regards, > Brayan > > On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman wrote: > > > > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera > > wrote: > > > > > > Dear All, > > > > > > Ceph Version : 12.2.5-2.ge988fb6.el7 > > > > > > We are facing an issue on glance which have backend set to ceph, when > > > we try to create an instance or volume out of an image, it throws > > > checksum error. > > > When we use rbd export and use md5sum, value is matching with glance > > > checksum. > > > > > > When we use following script, it provides same error checksum as glance. > > > > What version of Python are you using? > > > > > We have used below images for testing. > > > 1. Failing image (checksum mismatch): ffed4088-74e1-4f22-86cb-35e7e97c377c > > > 2. Passing image (checksum identical): > > > c048f0f9-973d-4285-9397-939251c80a84 > > > > > > Output from storage node: > > > > > > 1. 
Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c > > > checksum from glance database: 34da2198ec7941174349712c6d2096d8 > > > [root@storage01moc ~]# python test_rbd_format.py > > > ffed4088-74e1-4f22-86cb-35e7e97c377c admin > > > Image size: 681181184 > > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596 > > > Remarks: checksum is different > > > > > > 2. Passing image: c048f0f9-973d-4285-9397-939251c80a84 > > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed > > > [root@storage01moc ~]# python test_rbd_format.py > > > c048f0f9-973d-4285-9397-939251c80a84 admin > > > Image size: 1411121152 > > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed > > > Remarks: checksum is identical > > > > > > Wondering whether this issue is from ceph python libs or from ceph itself. > > > > > > Please note that we do not have ceph pool tiering configured. > > > > > > Please let us know whether anyone faced similar issue and any fixes for > > > this. > > > > > > test_rbd_format.py > > > === > > > import rados, sys, rbd > > > > > > image_id = sys.argv[1] > > > try: > > > rados_id = sys.argv[2] > > > except: > > > rados_id = 'openstack' > > > > > > > > > class ImageIterator(object): > > > """ > > > Reads data from an RBD image, one chunk at a time. > > > """ > > > > > > def __init__(self, conn, pool, name, snapshot, store, chunk_size='8'): > > > > Am I correct in assuming this was adapted from OpenStack code? That > > 8-byte "chunk" is going to be terribly inefficient to compute a CRC. > > Not that it should matter, but does it still fail if you increase this > > to 32KiB or 64KiB? 
> > > > > self.pool = pool > > > self.conn = conn > > > self.name = name > > > self.snapshot = snapshot > > > self.chunk_size = chunk_size > > > self.store = store > > > > > > def __iter__(self): > > > try: > > > with conn.open_ioctx(self.pool) as ioctx: > > > with rbd.Image(ioctx, self.name, > > >snapshot=self.snapshot) as image: > > > img_info = image.stat() > > > size = img_info['size'] > > > bytes_left = size > > > while bytes_left > 0: > > > length = min(self.chunk_size, bytes_left) > > > data = image.read(size - bytes_left, length) > > > bytes_left -= len(data) > > > yield data > > > raise StopIteration() > > > except rbd.ImageNotFound: > > > raise exceptions.NotFound( > > > _('RBD image %s does not exist') % self.name) > > > > > > conn = rados.Rados(conffile='/etc/ceph/ceph.conf',rados_id=rados_id) > > > conn.connect() > > > > > > > > > with conn.open_ioctx('images') as ioctx: > > > try: > > > with rbd.Image(ioctx, image_id, > > >snapshot='snap') as image: > > > img_info = image.stat() > > > print "Image size: %s " % img_info['size'] > > > iter, size = (ImageIterator(conn, 'images', image_id, > > > 'snap', 'rbd'), img_info['size']) > > >
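Jason's bisection recipe can be sketched as follows: hash the `rbd export` data file in 4 MiB slices and compare each digest with the output of `split -b 4194304 --filter=md5sum export.raw`. A self-contained sketch of the Python side (the file path would be whatever the export was written to):

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # 4 MiB, matching the default RADOS object size


def chunk_md5s(path, chunk_size=CHUNK):
    """Yield (offset, hexdigest) for each chunk_size-byte slice of a file."""
    with open(path, 'rb') as f:
        offset = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield offset, hashlib.md5(data).hexdigest()
            offset += len(data)
```

Feeding the same offsets and lengths through the RBD script and diffing the two digest lists isolates the mismatch to a specific 4 MiB region of the image.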
Re: [ceph-users] Glance client and RBD export checksum mismatch
On Thu, Apr 11, 2019, 12:07 AM Brayan Perera wrote: > Dear Jason, > > > Thanks for the reply. > > We are using python 2.7.5 > > Yes. script is based on openstack code. > > As suggested, we have tried chunk_size 32 and 64, and both giving same > incorrect checksum value. > The value of rbd_store_chunk_size in glance is expressed in MB and then converted to mb. I think the default is 8, so you would want 8192 if you're trying to match what the image was uploaded with. > We tried to copy same image in different pool and resulted same > incorrect checksum. > > > Thanks & Regards, > Brayan > > On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman > wrote: > > > > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera > wrote: > > > > > > Dear All, > > > > > > Ceph Version : 12.2.5-2.ge988fb6.el7 > > > > > > We are facing an issue on glance which have backend set to ceph, when > > > we try to create an instance or volume out of an image, it throws > > > checksum error. > > > When we use rbd export and use md5sum, value is matching with glance > checksum. > > > > > > When we use following script, it provides same error checksum as > glance. > > > > What version of Python are you using? > > > > > We have used below images for testing. > > > 1. Failing image (checksum mismatch): > ffed4088-74e1-4f22-86cb-35e7e97c377c > > > 2. Passing image (checksum identical): > c048f0f9-973d-4285-9397-939251c80a84 > > > > > > Output from storage node: > > > > > > 1. Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c > > > checksum from glance database: 34da2198ec7941174349712c6d2096d8 > > > [root@storage01moc ~]# python test_rbd_format.py > > > ffed4088-74e1-4f22-86cb-35e7e97c377c admin > > > Image size: 681181184 > > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596 > > > Remarks: checksum is different > > > > > > 2. 
Passing image: c048f0f9-973d-4285-9397-939251c80a84 > > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed > > > [root@storage01moc ~]# python test_rbd_format.py > > > c048f0f9-973d-4285-9397-939251c80a84 admin > > > Image size: 1411121152 > > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed > > > Remarks: checksum is identical > > > > > > Wondering whether this issue is from ceph python libs or from ceph > itself. > > > > > > Please note that we do not have ceph pool tiering configured. > > > > > > Please let us know whether anyone faced similar issue and any fixes > for this. > > > > > > test_rbd_format.py > > > === > > > import rados, sys, rbd > > > > > > image_id = sys.argv[1] > > > try: > > > rados_id = sys.argv[2] > > > except: > > > rados_id = 'openstack' > > > > > > > > > class ImageIterator(object): > > > """ > > > Reads data from an RBD image, one chunk at a time. > > > """ > > > > > > def __init__(self, conn, pool, name, snapshot, store, > chunk_size='8'): > > > > Am I correct in assuming this was adapted from OpenStack code? That > > 8-byte "chunk" is going to be terribly inefficient to compute a CRC. > > Not that it should matter, but does it still fail if you increase this > > to 32KiB or 64KiB? 
> > > > > self.pool = pool > > > self.conn = conn > > > self.name = name > > > self.snapshot = snapshot > > > self.chunk_size = chunk_size > > > self.store = store > > > > > > def __iter__(self): > > > try: > > > with conn.open_ioctx(self.pool) as ioctx: > > > with rbd.Image(ioctx, self.name, > > >snapshot=self.snapshot) as image: > > > img_info = image.stat() > > > size = img_info['size'] > > > bytes_left = size > > > while bytes_left > 0: > > > length = min(self.chunk_size, bytes_left) > > > data = image.read(size - bytes_left, length) > > > bytes_left -= len(data) > > > yield data > > > raise StopIteration() > > > except rbd.ImageNotFound: > > > raise exceptions.NotFound( > > > _('RBD image %s does not exist') % self.name) > > > > > > conn = rados.Rados(conffile='/etc/ceph/ceph.conf',rados_id=rados_id) > > > conn.connect() > > > > > > > > > with conn.open_ioctx('images') as ioctx: > > > try: > > > with rbd.Image(ioctx, image_id, > > >snapshot='snap') as image: > > > img_info = image.stat() > > > print "Image size: %s " % img_info['size'] > > > iter, size = (ImageIterator(conn, 'images', image_id, > > > 'snap', 'rbd'), img_info['size']) > > > import six, hashlib > > > md5sum = hashlib.md5() > > > for chunk in iter: > > > if isinstance(chunk, six.string_types): > > > chunk = six.b(chunk) > > > md5sum.update(chunk) > > >
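One genuine bug in the script above, whatever the cause of the mismatch turns out to be: chunk_size defaults to the string '8', so min(self.chunk_size, bytes_left) compares a str against an int, which Python 2 permits silently (numbers sort before strings), meaning min() never picks the configured chunk size at all. A minimal sketch of the same checksum loop with an int chunk size, decoupled from librbd so it can be exercised on any byte source (read_fn stands in for image.read):

```python
import hashlib


def md5_of_image(read_fn, size, chunk_size=8 * 1024 * 1024):
    """Compute an md5 over `size` bytes served by read_fn(offset, length).

    chunk_size must be an int: the original script passed the string '8',
    which silently breaks the min() comparison under Python 2.
    """
    md5sum = hashlib.md5()
    offset = 0
    while offset < size:
        length = min(chunk_size, size - offset)
        data = read_fn(offset, length)
        md5sum.update(data)
        offset += len(data)
    return md5sum.hexdigest()
```

With librbd this would be called as md5_of_image(image.read, image.stat()['size']) inside the open rbd.Image context.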
[ceph-users] multi-site between luminous and mimic broke etag
Hi Ceph Users, In our lab on VirtualBox we have installed two CentOS 7 VMs, one with ceph v12.2.11 and the other with ceph v13.2.5. We connected them using multi-site (it does not matter which one hosts the master zone). On the master zone we created a user, made a bucket, uploaded a file, and listed the bucket. Then on the slave zone we listed the bucket and uploaded a file. After that, listing the bucket on mimic works fine, but on luminous s3cmd reports: "ERROR: Error parsing xml: not well-formed (invalid token): line 1, column 574", which points to this part of the XML: "71c416d096979c7f3657d2927e29dfd3iU". Both clusters have these lines added to their ceph.conf:

osd pool default size = 1
osd pool default min size = 1
osd crush chooseleaf type = 0
mon allow pool delete = true
rgw zone = test-dc1 # the second cluster has test-dc2

Has anyone had a similar issue before? Thanks. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
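A quick way to confirm that the listing really carries a malformed ETag (rather than s3cmd mis-parsing valid XML) is to check each ETag against the plain-MD5 shape, 32 hex digits optionally quoted; the trailing "iU" in the value above fails this check. The regex and function name below are illustrative, not from any Ceph tool, and deliberately exclude the legitimate "-<parts>" suffix of multipart-upload ETags:

```python
import re

# A plain (non-multipart) S3 ETag is 32 lowercase hex digits,
# optionally wrapped in double quotes.
ETAG_RE = re.compile(r'^"?[0-9a-f]{32}"?$')


def is_plain_md5_etag(etag):
    return ETAG_RE.match(etag) is not None


print(is_plain_md5_etag('71c416d096979c7f3657d2927e29dfd3'))    # True
print(is_plain_md5_etag('71c416d096979c7f3657d2927e29dfd3iU'))  # False
```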
[ceph-users] Kraken - Pool storage MAX AVAIL drops by 30TB after disk failure
Hi, We have a 5-node EC 4+1 cluster with 335 OSDs running Kraken BlueStore 11.2.0. There was a disk failure on one of the OSDs and the disk was replaced, after which we noticed a ~30 TB drop in the MAX AVAIL value for the pool in the output of 'ceph df'. Even though the disk was replaced and the OSD is now running properly, this value did not recover to the original. Also, the failed disk was only a 4 TB disk, so a drop of ~30 TB from MAX AVAIL doesn't seem right. Has anyone had a similar issue before? Thanks.
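For anyone puzzling over the numbers: 'ceph df' MAX AVAIL is driven not by total free space but by the OSD that would fill first under the pool's CRUSH rule, scaled down by the EC overhead, so one OSD that is out, under-weighted, or unusually full can drag the figure down by far more than its own capacity. A deliberately simplified model, an approximation for intuition only, not Ceph's actual code:

```python
def pool_max_avail(osds, k=4, m=1):
    """Rough MAX AVAIL model for an EC k+m pool.

    Data spreads in proportion to CRUSH weight, so the pool "fills" when
    the tightest OSD (least available bytes per unit weight) fills, and
    EC k+m stores (k+m)/k raw bytes per usable byte.
    osds: list of (crush_weight, avail_bytes) tuples.
    """
    total_weight = sum(w for w, _ in osds)
    tightest = min(avail / float(w) for w, avail in osds)
    raw_avail = tightest * total_weight
    return raw_avail * k / (k + m)


TB = 10 ** 12
# Ten evenly weighted OSDs with 4 TB free each: ~32 TB usable at 4+1.
even = [(1.0, 4 * TB)] * 10
# The same cluster after one OSD ends up with only 1 TB free:
# MAX AVAIL collapses to ~8 TB, a far bigger drop than that one disk.
skew = [(1.0, 4 * TB)] * 9 + [(1.0, 1 * TB)]
```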