Re: [ceph-users] Monitor Recovery
Are you using ceph-deploy? In that case you could run:

    ceph-deploy mon destroy {host-name [host-name]...}

and:

    ceph-deploy mon create {host-name [host-name]...}

to recreate it.

- Original Message -
From: "John Petrini"
To: "ceph-users"
Sent: Tuesday, October 23, 2018 8:22:44 PM
Subject: [ceph-users] Monitor Recovery

Hi List, I've got a monitor that won't stay up. It comes up and joins the cluster but crashes within a couple of minutes with no info in the logs. At this point I'd prefer to just give up on it and assume it's in a bad state and recover it from the working monitors. What's the best way to go about this?
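For example, assuming the failing monitor runs on a hypothetical host named mon03, the destroy/recreate cycle described above would look like:

    # remove the broken monitor and wipe its local store (host name is a placeholder)
    ceph-deploy mon destroy mon03
    # recreate it; it resynchronises its store from the surviving quorum
    ceph-deploy mon create mon03
    # verify that it rejoined the quorum
    ceph quorum_status --format json-pretty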
Re: [ceph-users] Crushmap and failure domains at rack level (ideally data-center level in the future)
Something must be wrong: since you have min_size 3, the pool should go read-only once you take out the first rack, probably even when you take out the first host. What is the output of:

    ceph osd pool get <pool> min_size

I guess it will be 2, since you did not hit a problem while taking out one rack. As soon as you take out an extra host after that, some PGs will have a size < 2, so the pool goes read-only and recovery should start. Once all PGs are back to size 2, the pool would be RW again.

From: "Waterbly, Dan"
To: "ceph-users"
Sent: Tuesday, October 23, 2018 6:44:52 PM
Subject: [ceph-users] Crushmap and failure domains at rack level (ideally data-center level in the future)

Hello, I want to create a crushmap rule where I can lose two racks of hosts and still be able to operate. I have tried the rule below, but it only allows me to operate (rados gateway) with one rack down and two racks up. If I lose any host in the two remaining racks my rados gateway stops responding. Here is my crushmap and rule. If anyone can point out what I am doing wrong it would be greatly appreciated. I'm very new to ceph so please forgive any incorrect terminology I have used. [...]
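A quick way to check this on the affected pool (the pool name below is a placeholder):

    # replication size and the minimum replicas required for I/O
    ceph osd pool get default.rgw.buckets.data size
    ceph osd pool get default.rgw.buckets.data min_size
    # or list all pools with their size/min_size in one go
    ceph osd pool ls detail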
[ceph-users] Monitor Recovery
Hi List, I've got a monitor that won't stay up. It comes up and joins the cluster but crashes within a couple of minutes with no info in the logs. At this point I'd prefer to just give up on it and assume it's in a bad state and recover it from the working monitors. What's the best way to go about this?
[ceph-users] Crushmap and failure domains at rack level (ideally data-center level in the future)
Hello,

I want to create a crushmap rule where I can lose two racks of hosts and still be able to operate. I have tried the rule below, but it only allows me to operate (rados gateway) with one rack down and two racks up. If I lose any host in the two remaining racks my rados gateway stops responding.

Here is my crushmap and rule. If anyone can point out what I am doing wrong it would be greatly appreciated. I'm very new to ceph so please forgive any incorrect terminology I have used.

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
device 25 osd.25 class hdd
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class hdd
device 29 osd.29 class hdd
device 30 osd.30 class hdd
device 31 osd.31 class hdd
device 32 osd.32 class hdd
device 33 osd.33 class hdd
device 34 osd.34 class hdd
device 35 osd.35 class hdd
device 36 osd.36 class hdd
device 37 osd.37 class hdd
device 38 osd.38 class hdd
device 39 osd.39 class hdd
device 40 osd.40 class hdd
device 41 osd.41 class hdd
device 42 osd.42 class hdd
device 43 osd.43 class hdd
device 44 osd.44 class hdd
device 45 osd.45 class hdd
device 46 osd.46 class hdd
device 47 osd.47 class hdd
device 48 osd.48 class hdd
device 49 osd.49 class hdd
device 50 osd.50 class hdd
device 51 osd.51 class hdd
device 52 osd.52 class hdd
device 53 osd.53 class hdd
device 54 osd.54 class hdd
device 55 osd.55 class hdd
device 56 osd.56 class hdd
device 57 osd.57 class hdd
device 58 osd.58 class hdd
device 59 osd.59 class hdd
device 60 osd.60 class hdd
device 61 osd.61 class hdd
device 62 osd.62 class hdd
device 63 osd.63 class hdd
device 64 osd.64 class hdd
device 65 osd.65 class hdd
device 66 osd.66 class hdd
device 67 osd.67 class hdd
device 68 osd.68 class hdd
device 69 osd.69 class hdd
device 70 osd.70 class hdd
device 71 osd.71 class hdd
device 72 osd.72 class hdd
device 73 osd.73 class hdd
device 74 osd.74 class hdd
device 75 osd.75 class hdd
device 76 osd.76 class hdd
device 77 osd.77 class hdd
device 78 osd.78 class hdd
device 79 osd.79 class hdd
device 80 osd.80 class hdd
device 81 osd.81 class hdd
device 82 osd.82 class hdd
device 83 osd.83 class hdd
device 84 osd.84 class hdd
device 85 osd.85 class hdd
device 86 osd.86 class hdd
device 87 osd.87 class hdd
device 88 osd.88 class hdd
device 89 osd.89 class hdd
device 90 osd.90 class hdd
device 91 osd.91 class hdd
device 92 osd.92 class hdd
device 93 osd.93 class hdd
device 94 osd.94 class hdd
device 95 osd.95 class hdd
device 96 osd.96 class hdd
device 97 osd.97 class hdd
device 98 osd.98 class hdd
device 99 osd.99 class hdd
device 100 osd.100 class hdd
device 101 osd.101 class hdd
device 102 osd.102 class hdd
device 103 osd.103 class hdd
device 104 osd.104 class hdd
device 105 osd.105 class hdd
device 106 osd.106 class hdd
device 107 osd.107 class hdd
device 108 osd.108 class hdd
device 109 osd.109 class hdd
device 110 osd.110 class hdd
device 111 osd.111 class hdd
device 112 osd.112 class hdd
device 113 osd.113 class hdd
device 114 osd.114 class hdd
device 115 osd.115 class hdd
device 116 osd.116 class hdd
device 117 osd.117 class hdd
device 118 osd.118 class hdd
device 119 osd.119 class hdd
device 120 osd.120 class hdd
device 121 osd.121 class hdd
device 122 osd.122 class hdd
device 123 osd.123 class hdd
device 124 osd.124 class hdd
device 125 osd.125 class hdd
device 126 osd.126 class hdd
device 127 osd.127 class hdd
device 128 osd.128 class hdd
device 129 osd.129 class hdd
device 130 osd.130 class hdd
device 131 osd.131 class hdd
device 132 osd.132 class hdd
device 133 osd.133 class hdd
device 134 osd.134 class hdd
device 135 osd.135 class hdd
device 136 osd.136 class hdd
device 137 osd.137 class hdd
device 138 osd.138 class hdd
device 139 osd.139 class hdd
device 140 osd.140 class hdd
device 141 osd.141 class hdd
device 142 osd.142 class hdd
device 143 osd.143 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host cephstorage02 {
    id -3   # do not change unnecessarily
    id -4 class hdd #
Re: [ceph-users] bluestore compression enabled but no data compressed
Hi Frank,

On 10/23/2018 2:56 PM, Frank Schilder wrote:
> Dear David and Igor,
>
> thank you very much for your help. I have one more question about chunk
> sizes and data granularity on bluestore and will summarize the information
> I got on bluestore compression at the end.
>
> 1) Compression ratio
> ---
>
> Following Igor's explanation, I tried to understand the numbers for
> compressed_allocated and compressed_original and am somewhat stuck with
> figuring out how bluestore arithmetic works. I created a 32GB file of zeros
> using dd with write size bs=8M on a cephfs with
>
>     ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=con-fs-data-test"
>
> The data pool is an 8+2 erasure coded pool with properties
>
>     pool 37 'con-fs-data-test' erasure size 10 min_size 9 crush_rule 11 object_hash rjenkins pg_num 900 pgp_num 900 last_change 9970 flags hashpspool,ec_overwrites stripe_width 32768 compression_mode aggressive application cephfs
>
> As I understand EC pools, a 4M object is split into 8x0.5M data shards that
> are stored together with 2x0.5M coding shards on one OSD each. So, I would
> expect a full object write to put a 512K chunk on each OSD in the PG.
> Looking at some config options of one of the OSDs, I see:
>
>     "bluestore_compression_max_blob_size_hdd": "524288",
>     "bluestore_compression_min_blob_size_hdd": "131072",
>     "bluestore_max_blob_size_hdd": "524288",
>     "bluestore_min_alloc_size_hdd": "65536",
>
> From this, I would conclude that the largest chunk size is 512K, which also
> equals compression_max_blob_size. The minimum allocation size is 64K for
> any object. What I would expect now is that the full object writes to
> cephfs create chunk sizes of 512K per OSD in the PG, meaning that with an
> all-zero file I should observe a compressed_allocated ratio of
> 64K/512K=0.125 instead of the 0.5 reported below. It looks like chunks of
> 128K are written instead of 512K. I'm happy with the 64K granularity, but
> the observed maximum chunk size seems a factor of 4 too small. Where am I
> going wrong, what am I overlooking?

Please note how the selection between compression_max_blob_size and compression_min_blob_size is performed. The max blob size threshold is mainly for objects tagged with flags indicating non-random access, e.g. sequential read and/or write, immutable, append-only etc. Here is how it is determined in the code:

    if ((alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_SEQUENTIAL_READ) &&
        (alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_RANDOM_READ) == 0 &&
        (alloc_hints & (CEPH_OSD_ALLOC_HINT_FLAG_IMMUTABLE |
                        CEPH_OSD_ALLOC_HINT_FLAG_APPEND_ONLY)) &&
        (alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_RANDOM_WRITE) == 0) {
      dout(20) << __func__ << " will prefer large blob and csum sizes" << dendl;

This is done to minimize the overhead during future random access, since that would require full blob decompression. Hence min blob size is used for regular random I/O, which is probably your case as well. You can check the bluestore log (once its level is raised to 20) to confirm this, e.g. by looking for the following line in the output:

    dout(20) << __func__ << " prefer csum_order " << wctx->csum_order
             << " target_blob_size 0x" << std::hex << wctx->target_blob_size
             << std::dec << dendl;

So you can simply increase bluestore_compression_min_blob_size_hdd if you want longer compressed chunks, with the above-mentioned penalty on subsequent access though.

> 2) Bluestore compression configuration
> ---
>
> If I understand David correctly, pool and OSD settings do *not* override
> each other, but are rather *combined* into a resulting setting as follows.
> Let
>
>     0 - (n)one
>     1 - (p)assive
>     2 - (a)ggressive
>     3 - (f)orce
>     ? - (u)nset
>
> be the 4+1 possible settings of compression modes with numeric values
> assigned as shown. Then, the resulting numeric compression mode for data
> in a pool on a specific OSD is
>
>     res_compr_mode = min(mode OSD, mode pool)
>
> or in form of a table:
>
>          pool | n p a f u
>         ------+----------
>           n   | n n n n n
>        O  p   | n p p p ?
>        S  a   | n p a a ?
>        D  f   | n p a f ?
>           u   | n ? ? ? u
>
> which would allow for the flexible configuration as mentioned by David
> below. I'm actually not sure if I can confirm this. I have some pools
> where compression_mode is not set and which reside on separate OSDs with
> compression enabled, yet there is compressed data on these OSDs. I wonder
> if I polluted my test with the "ceph config set bluestore_compression_mode
> aggressive" that I executed earlier, or if my above interpretation is
> still wrong. Does the setting issued with "ceph config set
> bluestore_compression_mode aggressive" apply to pools with
> 'compression_mode' not set on the pool (see the question marks in the
> table above; what is the resulting mode)? What I would like to do is
> enable compression on all OSDs, enable compression on all data pools and
> disable compression on all meta data pools.
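A minimal sketch of that explicit configuration, assuming a Mimic-or-later mon config store as used earlier in the thread (the metadata pool name con-fs-meta is a placeholder):

    # enable aggressive compression on all OSDs
    ceph config set osd bluestore_compression_mode aggressive
    # opt a data pool in and a metadata pool out explicitly
    ceph osd pool set con-fs-data-test compression_mode aggressive
    ceph osd pool set con-fs-meta compression_mode none
    # raise the min blob size if longer compressed chunks are wanted
    ceph config set osd bluestore_compression_min_blob_size_hdd 524288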
Re: [ceph-users] [ceph-ansible]Purging cluster using ceph-ansible stable 3.1/3.2
Hi Mark,

Thank you for pointing out the issue. The problem is solved after I added

    library = ~/.ansible/plugins/modules:/usr/share/ansible/plugins/modules:/root/ceph-ansible/library

to the /root/ceph-ansible/ansible.cfg file. The "library" key wasn't there in the first place, and the result of running "ansible --version" from the /root/ceph-ansible directory initially showed the first two paths only.

Thank you very much.

Best regards,
Cody

On Tue, Oct 23, 2018 at 9:51 AM Mark Johnston wrote:
>
> On Mon, 2018-10-22 at 20:05 -0400, Cody wrote:
> > I tried to purge a ceph cluster using infrastructure-playbooks/purge-
> > cluster.yml from stable 3.1 and stable 3.2 branches, but kept getting the
> > following error immediately:
> >
> > ERROR! no action detected in task. This often indicates a misspelled module
> > name, or incorrect module path.
> >
> > The error appears to have been in '/root/ceph-ansible/infrastructure-
> > playbooks/purge-cluster.yml': line 353, column 5, but may
> > be elsewhere in the file depending on the exact syntax problem.
> >
> > The offending line appears to be:
> >
> > - name: zap and destroy osds created by ceph-volume with lvm_volumes
> >   ^ here
>
> That's Ansible's way of saying "the module referenced in this task doesn't
> exist". In this case it can't find the ceph_volume module, which is packaged
> with the ceph-ansible distribution. It should find it if you're running
> Ansible from the /root/ceph-ansible directory.
>
> Try running "ansible --version" and check what's shown for "config file" and
> "configured module search path". You should have /root/ceph-ansible/library
> on the module search path.
>
> Mark
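For reference, a sketch of the resulting config file; the library key belongs under the [defaults] section (the rest of the file is assumed unchanged):

    # /root/ceph-ansible/ansible.cfg
    [defaults]
    library = ~/.ansible/plugins/modules:/usr/share/ansible/plugins/modules:/root/ceph-ansible/library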
Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock
Quoting Patrick Donnelly (pdonn...@redhat.com):
> Thanks for the detailed notes. It looks like the MDS is stuck
> somewhere it's not even outputting any log messages. If possible, it'd
> be helpful to get a coredump (e.g. by sending SIGQUIT to the MDS) or,
> if you're comfortable with gdb, a backtrace of any threads that look
> suspicious (e.g. not waiting on a futex) including `info threads`.

It took a while before the same issue reappeared, but we managed to capture gdb backtraces and strace output; see the pastebin links below. Note: we had difficulty getting the MDSs working again, so we had to restart them a couple of times, capturing debug output as much as we could. Hopefully you can squeeze some useful information out of this data.

MDS1:
https://8n1.org/13869/bc3b - a few minutes after it first started acting up
https://8n1.org/13870/caf4 - probably made when I tried to stop the process and it took too long (the process had already received SIGKILL)
https://8n1.org/13871/2f22 - after restarting, the same issue returned
https://8n1.org/13872/2246 - after restarting, the same issue returned

MDS2:
https://8n1.org/13873/f861 - after it went craycray when it became active
https://8n1.org/13874/c567 - after restarting, the same issue returned
https://8n1.org/13875/133a - after restarting, the same issue returned

STRACES:
MDS1: https://8n1.org/mds1-strace.zip
MDS2: https://8n1.org/mds2-strace.zip

Gr. Stefan

--
| BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
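One way to capture such backtraces non-interactively (a sketch; the PID lookup assumes a single ceph-mds process on the host):

    # dump the thread list and all thread backtraces from the running MDS
    gdb -p "$(pidof ceph-mds)" -batch \
        -ex 'info threads' \
        -ex 'thread apply all bt'
    # or force a coredump, as suggested above
    kill -QUIT "$(pidof ceph-mds)"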
Re: [ceph-users] [ceph-ansible]Purging cluster using ceph-ansible stable 3.1/3.2
On Mon, 2018-10-22 at 20:05 -0400, Cody wrote:
> I tried to purge a ceph cluster using infrastructure-playbooks/purge-
> cluster.yml from stable 3.1 and stable 3.2 branches, but kept getting the
> following error immediately:
>
> ERROR! no action detected in task. This often indicates a misspelled module
> name, or incorrect module path.
>
> The error appears to have been in '/root/ceph-ansible/infrastructure-
> playbooks/purge-cluster.yml': line 353, column 5, but may
> be elsewhere in the file depending on the exact syntax problem.
>
> The offending line appears to be:
>
> - name: zap and destroy osds created by ceph-volume with lvm_volumes
>   ^ here

That's Ansible's way of saying "the module referenced in this task doesn't exist". In this case it can't find the ceph_volume module, which is packaged with the ceph-ansible distribution. It should find it if you're running Ansible from the /root/ceph-ansible directory.

Try running "ansible --version" and check what's shown for "config file" and "configured module search path". You should have /root/ceph-ansible/library on the module search path.

Mark
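A sketch of that check (the two field names are the ones mentioned above; exact paths will differ per install):

    cd /root/ceph-ansible
    ansible --version | grep -E 'config file|module search path'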
Re: [ceph-users] bluestore compression enabled but no data compressed
Dear David and Igor,

thank you very much for your help. I have one more question about chunk sizes and data granularity on bluestore and will summarize the information I got on bluestore compression at the end.

1) Compression ratio
---

Following Igor's explanation, I tried to understand the numbers for compressed_allocated and compressed_original and am somewhat stuck with figuring out how bluestore arithmetic works. I created a 32GB file of zeros using dd with write size bs=8M on a cephfs with

    ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=con-fs-data-test"

The data pool is an 8+2 erasure coded pool with properties

    pool 37 'con-fs-data-test' erasure size 10 min_size 9 crush_rule 11 object_hash rjenkins pg_num 900 pgp_num 900 last_change 9970 flags hashpspool,ec_overwrites stripe_width 32768 compression_mode aggressive application cephfs

As I understand EC pools, a 4M object is split into 8x0.5M data shards that are stored together with 2x0.5M coding shards on one OSD each. So, I would expect a full object write to put a 512K chunk on each OSD in the PG. Looking at some config options of one of the OSDs, I see:

    "bluestore_compression_max_blob_size_hdd": "524288",
    "bluestore_compression_min_blob_size_hdd": "131072",
    "bluestore_max_blob_size_hdd": "524288",
    "bluestore_min_alloc_size_hdd": "65536",

From this, I would conclude that the largest chunk size is 512K, which also equals compression_max_blob_size. The minimum allocation size is 64K for any object. What I would expect now is that the full object writes to cephfs create chunk sizes of 512K per OSD in the PG, meaning that with an all-zero file I should observe a compressed_allocated ratio of 64K/512K=0.125 instead of the 0.5 reported below. It looks like chunks of 128K are written instead of 512K. I'm happy with the 64K granularity, but the observed maximum chunk size seems a factor of 4 too small. Where am I going wrong, what am I overlooking?

2) Bluestore compression configuration
---

If I understand David correctly, pool and OSD settings do *not* override each other, but are rather *combined* into a resulting setting as follows. Let

    0 - (n)one
    1 - (p)assive
    2 - (a)ggressive
    3 - (f)orce
    ? - (u)nset

be the 4+1 possible settings of compression modes with numeric values assigned as shown. Then, the resulting numeric compression mode for data in a pool on a specific OSD is

    res_compr_mode = min(mode OSD, mode pool)

or in form of a table:

         pool | n p a f u
        ------+----------
          n   | n n n n n
       O  p   | n p p p ?
       S  a   | n p a a ?
       D  f   | n p a f ?
          u   | n ? ? ? u

which would allow for the flexible configuration as mentioned by David below. I'm actually not sure if I can confirm this. I have some pools where compression_mode is not set and which reside on separate OSDs with compression enabled, yet there is compressed data on these OSDs. I wonder if I polluted my test with the "ceph config set bluestore_compression_mode aggressive" that I executed earlier, or if my above interpretation is still wrong. Does the setting issued with "ceph config set bluestore_compression_mode aggressive" apply to pools with 'compression_mode' not set on the pool (see the question marks in the table above; what is the resulting mode)?

What I would like to do is enable compression on all OSDs, enable compression on all data pools and disable compression on all meta data pools. Data and meta data pools might share OSDs in the future. The above table says I should be able to do just that by being explicit.

Many thanks again and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: Igor Fedotov
Sent: 19 October 2018 23:41
To: Frank Schilder; David Turner
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] bluestore compression enabled but no data compressed

Hi Frank,

On 10/19/2018 2:19 PM, Frank Schilder wrote:
> Hi David,
>
> sorry for the slow response, we had a hell of a week at work.
>
> OK, so I had compression mode set to aggressive on some pools, but the global
> option was not changed, because I interpreted the documentation as "pool
> settings take precedence". To check your advice, I executed
>
>     ceph tell "osd.*" config set bluestore_compression_mode aggressive
>
> and dumped a new file consisting of null-bytes. Indeed, this time I observe
> compressed objects:
>
> [root@ceph-08 ~]# ceph daemon osd.80 perf dump | grep blue
>     "bluefs": {
>     "bluestore": {
>         "bluestore_allocated": 2967207936,
>         "bluestore_stored": 3161981179,
>         "bluestore_compressed": 24549408,
>         "bluestore_compressed_allocated": 261095424,
>         "bluestore_compressed_original": 522190848,
>
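For reference, a sketch of the test described in this message (the mount point and paths are assumptions; the layout string and the perf-dump command are taken from the thread):

    # pin a test directory to the EC data pool with 4M objects (assumed mount point)
    mkdir -p /mnt/cephfs/comptest
    setfattr -n ceph.dir.layout \
        -v "stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=con-fs-data-test" \
        /mnt/cephfs/comptest
    # write 32 GiB of zeros with 8M writes
    dd if=/dev/zero of=/mnt/cephfs/comptest/zeros bs=8M count=4096
    # inspect the compression counters on one of the OSDs
    ceph daemon osd.80 perf dump | grep bluestore_compressed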
Re: [ceph-users] scrub errors
There is an osd_scrub_auto_repair setting, which defaults to 'false'.

> On 23.10.2018, at 12:12, Dominque Roux wrote:
>
> Hi all,
>
> We lately faced several scrub errors. All of them were more or less easily
> fixed with the ceph pg repair X.Y command.
>
> We're using ceph version 12.2.7 and have SSD and HDD pools.
>
> Is there a way to prevent our datastore from these kinds of errors, or is
> there a way to automate the fix? (It would be rather easy to create a
> bash script.)
>
> Thank you very much for your help!
>
> Best regards,
>
> Dominique
>
> --
> Your Swiss, Open Source and IPv6 Virtual Machine. Now on
> www.datacenterlight.ch
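A sketch of how that setting could be applied (injectargs changes the running OSDs; the ceph.conf entry makes it persistent; enabling auto-repair trades safety checks for convenience):

    # enable auto-repair on all running OSDs
    ceph tell 'osd.*' injectargs '--osd_scrub_auto_repair=true'

    # and persist it, e.g. in ceph.conf:
    [osd]
    osd scrub auto repair = true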
[ceph-users] scrub errors
Hi all,

We have lately faced several scrub errors. All of them were more or less easily fixed with the ceph pg repair X.Y command.

We're using ceph version 12.2.7 and have SSD and HDD pools.

Is there a way to prevent our datastore from these kinds of errors, or is there a way to automate the fix? (It would be rather easy to create a bash script.)

Thank you very much for your help!

Best regards,

Dominique

--
Your Swiss, Open Source and IPv6 Virtual Machine. Now on www.datacenterlight.ch
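A sketch of the bash automation suggested above (it assumes the "pg X.Y is ... inconsistent" wording of ceph health detail; repairing blindly can mask underlying disk problems, so treat this as a convenience, not a fix):

    #!/bin/bash
    # repair every PG currently flagged inconsistent
    for pg in $(ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}'); do
        echo "repairing $pg"
        ceph pg repair "$pg"
    done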
Re: [ceph-users] RGW stale buckets
When you run rgw it creates a ton of pools, so one of the other pools was holding the bucket indexes, and the actual data is what got stored in default.rgw.data (or whatever name it had). That cleanup was not complete, and this is what causes your issues, I'd say. How to move on from here depends on how much work/data you have put into the badly-cleaned pools, and whether you can redo the last part again after a good clean restart.

On Tue, 23 Oct 2018 at 00:27, Robert Stanford wrote:
>
> Someone deleted our rgw data pool to clean up. They recreated it afterward.
> This is fine in one respect, we don't need the data. But listing with
> radosgw-admin still shows all the buckets. How can we clean things up and
> get rgw to understand what actually exists, and what doesn't?

--
May the most significant bit of your life be positive.
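A sketch for inspecting and removing the stale entries (the radosgw-admin subcommands are standard; the bucket name is a placeholder, and with the data pool gone some of these calls may fail or need "radosgw-admin metadata rm bucket:<name>" instead):

    # list the bucket entries rgw still knows about
    radosgw-admin metadata list bucket
    # inspect one stale entry
    radosgw-admin bucket stats --bucket=old-bucket
    # remove a stale bucket entry
    radosgw-admin bucket rm --bucket=old-bucket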
[ceph-users] slow requests and degraded cluster, but not really ?
Hello all,

We have an issue with our ceph cluster where 'ceph -s' shows that several requests are blocked, however querying further with 'ceph health detail' indicates that the PGs affected are either active+clean or do not currently exist. OSD 32 appears to be working fine, and the cluster is performing as expected with no clients seemingly affected.

Note - we had just upgraded to Luminous - and despite having "mon max pg per osd = 400" set in ceph.conf, we still have the message "too many PGs per OSD (278 > max 200)".

In order to improve the situation above, I removed several pools that were not used anymore. I assume the PGs that ceph cannot find now are related to this pool deletion.

Does anyone have any ideas on how to get out of this state? Details below - and full 'ceph health detail' attached to this email.

Kind regards,

Ben Morrice

[root@ceph03 ~]# ceph -s
  cluster:
    id:     6c21c4ba-9c4d-46ef-93a3-441b8055cdc6
    health: HEALTH_WARN
            Degraded data redundancy: 443765/14311983 objects degraded (3.101%), 162 pgs degraded, 241 pgs undersized
            75 slow requests are blocked > 32 sec. Implicated osds 32
            too many PGs per OSD (278 > max 200)

  services:
    mon: 5 daemons, quorum bbpocn01,bbpocn02,bbpocn03,bbpocn04,bbpocn07
    mgr: bbpocn03(active, starting)
    osd: 36 osds: 36 up, 36 in
    rgw: 1 daemon active

  data:
    pools:   24 pools, 3440 pgs
    objects: 4.77M objects, 7.69TiB
    usage:   23.1TiB used, 104TiB / 127TiB avail
    pgs:     443765/14311983 objects degraded (3.101%)
             3107 active+clean
             170  active+undersized
             109  active+undersized+degraded
             43   active+recovery_wait+degraded
             10   active+recovering+degraded
             1    active+recovery_wait

[root@ceph03 ~]# for i in `ceph health detail | grep stuck | awk '{print $2}'`; do echo -n "$i: "; ceph pg $i query -f plain | cut -d: -f2 | cut -d\" -f2; done
150.270: active+clean
150.2a0: active+clean
150.2b6: active+clean
150.2c2: active+clean
150.2cc: active+clean
150.2d5: active+clean
150.2d6: active+clean
150.2e1: active+clean
150.2ef: active+clean
150.2f5: active+clean
150.2f7: active+clean
150.2fc: active+clean
150.315: active+clean
150.318: active+clean
150.31a: active+clean
150.320: active+clean
150.326: active+clean
150.36e: active+clean
150.380: active+clean
150.389: active+clean
150.3a4: active+clean
150.3ad: active+clean
150.3b4: active+clean
150.3bb: active+clean
150.3ce: active+clean
150.3d0: active+clean
150.3d8: active+clean
150.3e0: active+clean
150.3f6: active+clean
165.24c: Error ENOENT: problem getting command descriptions from pg.165.24c
165.28f: Error ENOENT: problem getting command descriptions from pg.165.28f
165.2b3: Error ENOENT: problem getting command descriptions from pg.165.2b3
165.2b4: Error ENOENT: problem getting command descriptions from pg.165.2b4
165.2d6: Error ENOENT: problem getting command descriptions from pg.165.2d6
165.2f4: Error ENOENT: problem getting command descriptions from pg.165.2f4
165.2fd: Error ENOENT: problem getting command descriptions from pg.165.2fd
165.30f: Error ENOENT: problem getting command descriptions from pg.165.30f
165.322: Error ENOENT: problem getting command descriptions from pg.165.322
165.325: Error ENOENT: problem getting command descriptions from pg.165.325
165.334: Error ENOENT: problem getting command descriptions from pg.165.334
165.36e: Error ENOENT: problem getting command descriptions from pg.165.36e
165.37c: Error ENOENT: problem getting command descriptions from pg.165.37c
165.382: Error ENOENT: problem getting command descriptions from pg.165.382
165.387: Error ENOENT: problem getting command descriptions from pg.165.387
165.3af: Error ENOENT: problem getting command descriptions from pg.165.3af
165.3da: Error ENOENT: problem getting command descriptions from pg.165.3da
165.3e0: Error ENOENT: problem getting command descriptions from pg.165.3e0
165.3e2: Error ENOENT: problem getting command descriptions from pg.165.3e2
165.3e9: Error ENOENT: problem getting command descriptions from pg.165.3e9
165.3fb: Error ENOENT: problem getting command descriptions from pg.165.3fb

[root@ceph03 ~]# ceph pg 165.24c query
Error ENOENT: problem getting command descriptions from pg.165.24c
[root@ceph03 ~]# ceph pg 165.24c delete
Error ENOENT: problem getting command descriptions from pg.165.24c

--
Kind regards,

Ben Morrice

__
Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
EPFL / BBP
Biotech Campus
Chemin des Mines 9
1202 Geneva
Switzerland

HEALTH_WARN Degraded data redundancy: 443765/14311983 objects degraded (3.101%), 162 pgs degraded, 241 pgs undersized; 75 slow requests are blocked > 32 sec. Implicated osds 32; too many PGs per OSD (278 > max 200)
pg 150.270 is stuck undersized for 1871.987162, current state active+undersized, last acting [17,30]
pg 150.2a0 is