[ceph-users] ask about "recovery optimazation: recovery what is really modified"
yaoning, haomai, Json,

What about the "recovery what is really modified" feature? I didn't see any recent updates on GitHub; will it be developed further? https://github.com/ceph/ceph/pull/3837 (PG:: recovery optimazation: recovery what is really modified)

Thanks a lot.

donglifec...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] CRC mismatch detection on read (XFS OSD)
On Fri, Jul 28, 2017 at 8:16 AM Дмитрий Глушенок wrote:
> Hi!
>
> Just found a strange thing while testing deep-scrub on 10.2.7.
> 1. Stop OSD
> 2. Change primary copy's contents (using vi)
> 3. Start OSD
>
> Then 'rados get' returns "No such file or directory". No error messages
> seen in OSD log, cluster status "HEALTH_OK".
>
> 4. ceph pg repair
>
> Then 'rados get' works as expected, "corrupted" data repaired.
>
> One time (I was unable to reproduce this) the error was detected on the fly
> (without OSD restart):
>
> 2017-07-28 17:34:22.362968 7ff8bfa27700 -1 log_channel(cluster) log [ERR]
> : 16.d full-object read crc 0x78fcc738 != expected 0x5fd86d3e on
> 16:b36845b2:::testobject1:head
>
> Did I miss that CRC storing/verifying started to work on XFS? If so,
> where are they stored? xattr? I thought it was only implemented in BlueStore.

FileStore maintains CRC checksums opportunistically, such as when you do a full-object write. So in some circumstances it can detect objects with the wrong data and do repairs on its own. (And the checksum is stored in the object_info, which is written down in an xattr, yes.)

I'm not certain why overwriting the file with vi made it return ENOENT, but probably because it lost the xattrs storing metadata. (...though I'd expect that to return an error on the primary that either prompts it to repair, or else incorrectly returns that raw error to the client. Can you create a ticket with exactly what steps you followed and what outcome you saw?)

-Greg
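The opportunistic full-object CRC that Greg describes can be sketched roughly as below. This is a minimal illustration, not FileStore's actual code: Ceph uses crc32c, while zlib.crc32 stands in here purely because it is in the Python standard library, and the in-memory `store` dict stands in for the object store plus its object_info xattr.

```python
import zlib

# Sketch of an opportunistic full-object CRC (illustrative only; Ceph
# uses crc32c, zlib.crc32 is a stand-in). On a full-object write the
# checksum is recorded alongside the object; on a full-object read it is
# recomputed and compared, which is how a wrong copy can be detected.

def full_object_write(store, name, data):
    store[name] = {"data": bytearray(data), "crc": zlib.crc32(data)}

def full_object_read(store, name):
    obj = store[name]
    actual = zlib.crc32(bytes(obj["data"]))
    if actual != obj["crc"]:
        raise IOError("full-object read crc 0x%08x != expected 0x%08x"
                      % (actual, obj["crc"]))
    return bytes(obj["data"])

store = {}
full_object_write(store, "testobject1", b"hello ceph")
assert full_object_read(store, "testobject1") == b"hello ceph"

# Simulate out-of-band corruption (like editing the primary copy with vi):
store["testobject1"]["data"][0] ^= 0xFF
try:
    full_object_read(store, "testobject1")
except IOError as e:
    print("detected:", e)
```

The key point from the thread survives the simplification: the check only fires when a stored checksum exists and a full-object read happens, which is why detection looked intermittent on FileStore.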
Re: [ceph-users] ceph osd safe to remove
Hello Dan,

Based on what I know and what people told me on IRC, this basically means the condition that the OSD is neither acting nor up for any PG. One person (fusl on IRC) said there was an unfound-objects bug when he had size = 1; he also said that if reweight (and I assume crush weight too) is 0 it will surely be safe, but that it possibly won't be otherwise.

So I took my bc-ceph-reweight-by-utilization.py script, which already parses `ceph pg dump --format=json` (for up, acting, bytes, and count of PGs) and `ceph osd df --format=json` (for weight and reweight), gutted out the unneeded parts, and changed the report to show the condition I described as True or False per OSD. So the ceph auth needs to allow `ceph pg dump` and `ceph osd df`. The script is attached.

The script doesn't assume you're OK with acting lower than size, or care about min_size; it just assumes you want the OSD completely empty.

Sample output:

Real cluster:
> root@cephtest:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight   reweight  pgs_old  bytes_old      pgs_new  bytes_new      empty
>  0      4.00099  0.61998   38       1221853911536  38       1221853911536  False
>  1      4.00099  0.59834   43       1168531341347  43       1168531341347  False
>  2      4.00099  0.79213   44       1155260814435  44       1155260814435  False
> 27      4.00099  0.69459   39       1210145117377  39       1210145117377  False
> 30      6.00099  0.73933   56       1691992924542  56       1691992924542  False
> 31      6.00099  0.81180   64       1810503842054  64       1810503842054  False
> ...
Test cluster with some -nan and 0's in the crush map:
> root@tceph1:~ # ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
>  4 1.00000        0      0      0      0 -nan -nan   0
>  1 0.06439  1.00000 61409M 98860k 61313M 0.16 0.93  47
>  0 0.06438  1.00000 61409M   134M 61275M 0.22 1.29  59
>  2 0.06439  1.00000 61409M 82300k 61329M 0.13 0.77  46
>  3       0        0      0      0      0 -nan -nan   0
>              TOTAL   179G   311M   179G 0.17
> MIN/MAX VAR: 0.77/1.29  STDDEV: 0.04
> root@tceph1:~ # ./bc-ceph-empty-osds.py
> osd_id  weight   reweight  pgs_old  bytes_old  pgs_new  bytes_new  empty
>  3      0.0      0.0       0        0          0        0          True
>  4      1.0      0.0       0        0          0        0          True
> root@tceph1:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight   reweight  pgs_old  bytes_old  pgs_new  bytes_new  empty
>  0      0.06438  1.0       59       46006167   59       46006167   False
>  1      0.06439  1.0       47       28792306   47       28792306   False
>  2      0.06439  1.0       46       17623485   46       17623485   False
>  3      0.0      0.0       0        0          0        0          True
>  4      1.0      0.0       0        0          0        0          True

The "old" vs "new" suffixes refer to the position of data now and after recovery is complete, respectively (this is the magic that made my reweight script efficient compared to the official reweight script).

I have not used such a method in the past... my cluster is small, so I have always just let recovery completely finish instead. I hope you find it useful and that it develops from there.

Peter

On 07/28/17 15:36, Dan van der Ster wrote:
> Hi all,
>
> We are trying to outsource the disk replacement process for our ceph
> clusters to some non-expert sysadmins.
> We could really use a tool that reports if a Ceph OSD *would* or
> *would not* be safe to stop, e.g.
>
> # ceph-osd-safe-to-stop osd.X
> Yes it would be OK to stop osd.X
>
> (which of course means that no PGs would go inactive if osd.X were to
> be stopped).
>
> Does anyone have such a script that they'd like to share?
>
> Thanks!
> Dan

#!/usr/bin/env python3
#
# tells you if an osd is empty (no pgs up or acting, and no weight)
# (most of the code here was copied from bc-ceph-reweight-by-utilization.py)
#
# Author: Peter Maloney
# Licensed GNU GPLv2; if you did not receive a copy of the license, get one at http://www.gnu.org/licenses/gpl-2.0.html

import sys
import subprocess
import re
import argparse
import time
import logging
import json

#
# global variables
#

osds = {}
health = ""
json_nan_regex = None

#
# logging
#

logging.VERBOSE = 15

def log_verbose(self, message, *args, **kws):
    if self.isEnabledFor(logging.VERBOSE):
        self.log(logging.VERBOSE, message, *args, **kws)

logging.addLevelName(logging.VERBOSE, "VERBOSE")
logging.Logger.verbose = log_verbose

formatter = logging.Formatter(
    fmt='%(asctime)-15s.%(msecs)03d %(levelname)s: %(message)s',
    datefmt="%Y-%m-%d %H:%M:%S"
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
l
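The attached script is cut off in the archive, but the core condition Peter describes is small enough to sketch on its own: an OSD is "empty" when no PG lists it in "up" or "acting" and its reweight is 0. The field names below mirror jewel-era `ceph pg dump --format=json` and `ceph osd df --format=json` output as I understand it; treat them as assumptions, and the sample dicts as stand-ins for the real command output.

```python
# Minimal sketch of the "empty OSD" check: safe to remove when no PG
# references the OSD in "up" or "acting" AND its reweight is 0.
# Data shapes are assumed from `ceph pg dump --format=json` ("pg_stats")
# and `ceph osd df --format=json` ("nodes").

def empty_osds(pg_dump, osd_df):
    referenced = set()
    for pg in pg_dump["pg_stats"]:
        referenced.update(pg["up"])
        referenced.update(pg["acting"])
    result = {}
    for node in osd_df["nodes"]:
        osd_id = node["id"]
        result[osd_id] = (osd_id not in referenced and node["reweight"] == 0)
    return result

# Sample data shaped like the real commands' output:
pg_dump = {"pg_stats": [
    {"pgid": "3.d9", "up": [0, 1, 2], "acting": [0, 1, 2]},
    {"pgid": "3.9f", "up": [1, 2], "acting": [1, 2]},
]}
osd_df = {"nodes": [
    {"id": 0, "reweight": 1.0},
    {"id": 1, "reweight": 1.0},
    {"id": 2, "reweight": 1.0},
    {"id": 3, "reweight": 0.0},   # out and carrying no PGs -> empty
]}
print(empty_osds(pg_dump, osd_df))   # only osd 3 reports True
```

In a real script the two dicts would come from `json.loads()` over the command output, with the `-nan` values the test cluster shows scrubbed out first (which is presumably what Peter's `json_nan_regex` is for).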
Re: [ceph-users] ceph osd safe to remove
Hello Dan,

Something like this maybe? https://github.com/CanonicalLtd/ceph_safe_disk

Cheers,
Alex

2017-07-28 9:36 GMT-04:00 Dan van der Ster:
> Hi all,
>
> We are trying to outsource the disk replacement process for our ceph
> clusters to some non-expert sysadmins.
> We could really use a tool that reports if a Ceph OSD *would* or
> *would not* be safe to stop, e.g.
>
> # ceph-osd-safe-to-stop osd.X
> Yes it would be OK to stop osd.X
>
> (which of course means that no PGs would go inactive if osd.X were to
> be stopped).
>
> Does anyone have such a script that they'd like to share?
>
> Thanks!
>
> Dan
[ceph-users] CRC mismatch detection on read (XFS OSD)
Hi!

Just found a strange thing while testing deep-scrub on 10.2.7.
1. Stop OSD
2. Change primary copy's contents (using vi)
3. Start OSD

Then 'rados get' returns "No such file or directory". No error messages seen in OSD log, cluster status "HEALTH_OK".

4. ceph pg repair

Then 'rados get' works as expected, "corrupted" data repaired.

One time (I was unable to reproduce this) the error was detected on the fly (without OSD restart):

2017-07-28 17:34:22.362968 7ff8bfa27700 -1 log_channel(cluster) log [ERR] : 16.d full-object read crc 0x78fcc738 != expected 0x5fd86d3e on 16:b36845b2:::testobject1:head

Did I miss that CRC storing/verifying started to work on XFS? If so, where are they stored? xattr? I thought it was only implemented in BlueStore.

--
Dmitry Glushenok
Jet Infosystems
Re: [ceph-users] Networking/naming doubt
Hi David,

Thanks a lot for your comments! I just want to utilize a different network than the public one (where DNS resolves the name) for ceph-deploy and client connections. For example, with 3 NICs:

Nic1: Public (internet access)
Nic2: Ceph-mon (clients and ceph-deploy)
Nic3: Ceph-osd

Thanks a lot for your help!

On 28 Jul 2017, 2:25 a.m., "David Turner" wrote:

The only thing that is supposed to use the cluster network is the OSDs. Not even the MONs access the cluster network. I am sure that if you have a need to make this work you can find a way, but I don't know that one exists in the standard tool set. You might try temporarily setting the /etc/hosts reference for vdicnode02 and vdicnode03 to the cluster network and use the proper host names in the ceph-deploy command. Ceph cluster operations do not use DNS at all, so you could probably leave your /etc/hosts in this state. I don't know if it would work, though. It's really not intended for any communication to happen on this subnet other than inter-OSD traffic.

On Thu, Jul 27, 2017 at 6:31 PM Oscar Segarra wrote:
> Sorry! I'd like to add that I want to use the cluster network for both purposes:
>
> ceph-deploy --username vdicceph new vdicnode01 --cluster-network 192.168.100.0/24 --public-network 192.168.100.0/24
>
> Thanks a lot
>
> 2017-07-28 0:29 GMT+02:00 Oscar Segarra:
>> Hi,
>>
>> Do you mean that for security reasons ceph-deploy can only be executed from the public interface?
>>
>> It looks strange that one cannot decide which network to use for ceph-deploy... I could have a dedicated network for ceph-deploy... :S
>>
>> Thanks a lot
>>
>> 2017-07-28 0:03 GMT+02:00 Roger Brown:
>>> I could be wrong, but I think you cannot achieve this objective. If you declare a cluster network, OSDs will route heartbeat, object replication and recovery traffic over the cluster network. We prefer that the cluster network is NOT reachable from the public network or the Internet, for added security. Therefore it will not work with ceph-deploy actions.
>>> Source: http://docs.ceph.com/docs/master/rados/configuration/network-config-ref/
>>>
>>> On Thu, Jul 27, 2017 at 3:53 PM Oscar Segarra wrote:
>>>> Hi,
>>>>
>>>> In my environment I have 3 hosts; every host has 2 network interfaces:
>>>> public: 192.168.2.0/24
>>>> cluster: 192.168.100.0/24
>>>>
>>>> The hostnames "vdicnode01", "vdicnode02" and "vdicnode03" are resolved by public DNS through the public interface; that means "ping vdicnode01" will resolve 192.168.2.1.
>>>>
>>>> In my environment the "admin" node is the first node, vdicnode01, and I'd like all the deployment (ceph-deploy) and all OSD traffic to go over the cluster network.
>>>>
>>>> 1) To begin with, I create the cluster and I want all traffic to go over the cluster network:
>>>> ceph-deploy --username vdicceph new vdicnode01 --cluster-network 192.168.100.0/24 --public-network 192.168.100.0/24
>>>>
>>>> 2) The problem comes when I have to launch my commands to the other hosts. For example, from node vdicnode01 I execute:
>>>>
>>>> 2.1) ceph-deploy --username vdicceph osd create vdicnode02:sdb
>>>> --> Finishes OK, but communication goes through the public interface.
>>>>
>>>> 2.2) ceph-deploy --username vdicceph osd create vdicnode02.local:sdb
>>>> --> vdicnode02.local is added manually in /etc/hosts (assigned a cluster IP)
>>>> --> It raises some errors/warnings because vdicnode02.local is not the real hostname. Some files are created with vdicnode02.local in the middle of the file name, and some errors appear when starting up the osd service related to "file does not exist".
>>>>
>>>> 2.3) ceph-deploy --username vdicceph osd create vdicnode02-priv:sdb
>>>> --> vdicnode02-priv is added manually in /etc/hosts (assigned a cluster IP)
>>>> --> It raises some errors/warnings because vdicnode02-priv is not the real hostname. Some files are created with vdicnode02-priv in the middle of the file name, and some errors appear when starting up the osd service related to "file does not exist".
>>>>
>>>> What would be the right way to achieve my objective? If there is any documentation I have not found, please redirect me...
>>>>
>>>> Thanks a lot for your help in advance.
[ceph-users] ceph osd safe to remove
Hi all,

We are trying to outsource the disk replacement process for our ceph clusters to some non-expert sysadmins. We could really use a tool that reports if a Ceph OSD *would* or *would not* be safe to stop, e.g.

# ceph-osd-safe-to-stop osd.X
Yes it would be OK to stop osd.X

(which of course means that no PGs would go inactive if osd.X were to be stopped).

Does anyone have such a script that they'd like to share?

Thanks!

Dan
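The check Dan describes can be sketched as: a PG goes inactive if removing osd.X from its acting set would drop it below its pool's min_size. The sketch below runs on sample data shaped like `ceph pg dump --format=json` output (an assumption, as is the `min_size_by_pool` input); a real tool would fetch both from the cluster. (If I recall correctly, later Ceph releases grew a built-in `ceph osd ok-to-stop` command for exactly this.)

```python
# Hedged sketch: osd_id is safe to stop if every PG it serves would still
# have at least min_size acting replicas without it. PG records mimic
# `ceph pg dump --format=json`; pool min_size values are assumed inputs.

def safe_to_stop(osd_id, pg_stats, min_size_by_pool):
    for pg in pg_stats:
        if osd_id not in pg["acting"]:
            continue
        pool = pg["pgid"].split(".")[0]          # pgid is "<pool>.<seed>"
        remaining = [o for o in pg["acting"] if o != osd_id]
        if len(remaining) < min_size_by_pool[pool]:
            return False                          # this PG would go inactive
    return True

pg_stats = [
    {"pgid": "3.48",  "acting": [114, 18, 90, 8]},
    {"pgid": "3.71a", "acting": [3, 78]},
]
min_size = {"3": 2}
print(safe_to_stop(114, pg_stats, min_size))  # True: 3.48 keeps 3 replicas
print(safe_to_stop(3, pg_stats, min_size))    # False: 3.71a would drop to 1
```

A production version would also want to refuse while any PG is peering or degraded, since the acting sets are then in flux.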
[ceph-users] Unable to remove osd from crush map - leads to remapped pg's (v11.2.0)
Hello,

Recently we got an underlying issue with osd.10, which mapped to /dev/sde. So we tried to remove it from the crush map:

===
#systemctl stop ceph-osd@10.service
#for x in {10..10}; do ceph osd out $x; ceph osd crush remove osd.$x; ceph auth del osd.$x; ceph osd rm osd.$x; done
marked out osd.10.
removed item id 10 name 'osd.10' from crush map
updated
removed osd.10

Still I can see the entry in the crush map:

#ceph osd crush dump
<<..>>
    {
        "id": 10,
        "name": "device10"
    },
<<..>>

Then I tried to remove it manually using crushtool, with the steps below:

#ceph osd getcrushmap -o /tmp/test.map
#crushtool -d /tmp/test.map -o /tmp/test1.map

Opened file /tmp/test1.map and removed the entry:

#vim /tmp/test1.map
<..>
device 9 osd.9
device 10 device10   --< removed this entry
device 11 osd.11
<..>

#crushtool -c /tmp/test1.map -o /tmp/test2.map
#ceph osd setcrushmap -i /tmp/test2.map   -- Reinject to crush

Still I can see the device10 info in the crush map:

#ceph osd crush dump 2> /dev/null | grep device10
            "name": "device10"

Even the below commands I tried, no luck...

#ceph osd crush rm osd.10
#ceph osd crush rm 10
#ceph osd crush rm device0

Due to this issue, 9 PGs got affected and landed in "remapped+incomplete" state.
# for i in `cat test`; do ceph pg map $i 2> /dev/null; done
osdmap e2443 pg 3.d9 (3.d9) -> up [8,63,77,35,117] acting [2147483647,63,77,2147483647,117]
osdmap e2443 pg 3.9f (3.9f) -> up [80,47,116,19,3] acting [80,2147483647,116,2147483647,3]
osdmap e2443 pg 3.7fe (3.7fe) -> up [17,27,93,23,102] acting [17,27,93,2147483647,102]
osdmap e2443 pg 3.32f (3.32f) -> up [64,69,94,111,20] acting [2147483647,69,94,111,2147483647]
osdmap e2443 pg 3.34f (3.34f) -> up [102,25,90,1,24] acting [102,2147483647,90,2147483647,24]
osdmap e2443 pg 3.176 (3.176) -> up [9,2,107,13,91] acting [9,2,107,2147483647,91]
osdmap e2443 pg 3.10e (3.10e) -> up [88,61,21,59,100] acting [2147483647,2147483647,21,2147483647,2147483647]
osdmap e2443 pg 3.48 (3.48) -> up [114,18,32,90,8] acting [114,18,2147483647,90,8]
osdmap e2443 pg 3.71a (3.71a) -> up [3,78,58,71,116] acting [3,78,58,2147483647,116]

# ceph pg $i query 2> /dev/null | grep -w -A1 "blocked_by\"\: \[" | grep -v -
    "blocked_by": [
        10   ==>>>

#ceph pg $i query 2> /dev/null | grep -w -A1 down_osds_we_would_probe
    "down_osds_we_would_probe": [   -->>
        10

Then I tried to recreate the PG using the below command:

#ceph pg force_create_pg id

Still no luck...

Here osd.10 is still present in the crush map, which is why I'm unable to recover these 9 PGs. Whenever we reboot the affected node, osd.10 joins back into the cluster again, which is weird.

>> Comments please on how to forcefully remove the device10 / osd.10 info from the crush map.

Attached crushmap file.
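Instead of grepping the `ceph pg <id> query` output for `blocked_by` and `down_osds_we_would_probe`, the same information can be pulled from the JSON directly. The structure below is a trimmed-down assumption of what a jewel/kraken-era pg query returns for an incomplete PG, not the full schema:

```python
import json

# Extract blocking OSDs from a (trimmed, assumed-shape) pg query document.
# The recovery_state entries carry "blocked_by" and
# "down_osds_we_would_probe" lists for an incomplete PG.

query = json.loads("""
{
  "state": "remapped+incomplete",
  "recovery_state": [
    {"name": "Started/Primary/Peering",
     "blocked_by": [10],
     "down_osds_we_would_probe": [10]}
  ]
}
""")

blocked_by = set()
probe = set()
for state in query["recovery_state"]:
    blocked_by.update(state.get("blocked_by", []))
    probe.update(state.get("down_osds_we_would_probe", []))

print("blocked_by:", sorted(blocked_by))
print("down_osds_we_would_probe:", sorted(probe))
```

This makes the diagnosis explicit: every incomplete PG is waiting to probe osd.10, so until the cluster is convinced osd.10 is permanently gone (e.g. via `ceph osd lost`, which is destructive and should be a last resort), peering cannot complete.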
Thanks,
Jayaram

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 device10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18
device 19 osd.19
device 20 osd.20
device 21 osd.21
device 22 osd.22
device 23 osd.23
device 24 osd.24
device 25 osd.25
device 26 osd.26
device 27 osd.27
device 28 osd.28
device 29 osd.29
device 30 osd.30
device 31 osd.31
device 32 osd.32
device 33 osd.33
device 34 osd.34
device 35 osd.35
device 36 osd.36
device 37 osd.37
device 38 osd.38
device 39 osd.39
device 40 osd.40
device 41 osd.41
device 42 osd.42
device 43 osd.43
device 44 osd.44
device 45 osd.45
device 46 osd.46
device 47 osd.47
device 48 osd.48
device 49 osd.49
device 50 osd.50
device 51 osd.51
device 52 osd.52
device 53 osd.53
device 54 osd.54
device 55 osd.55
device 56 osd.56
device 57 osd.57
device 58 osd.58
device 59 osd.59
device 60 osd.60
device 61 osd.61
device 62 osd.62
device 63 osd.63
device 64 osd.64
device 65 osd.65
device 66 osd.66
device 67 osd.67
device 68 osd.68
device 69 osd.69
device 70 osd.70
device 71 osd.71
device 72 osd.72
device 73 osd.73
device 74 osd.74
device 75 osd.75
device 76 osd.76
device 77 osd.77
device 78 osd.78
device 79 osd.79
device 80 osd.80
device 81 osd.81
device 82 osd.82
device 83 osd.83
device 84 osd.84
device 85 osd.85
device 86 osd.86
device 87 osd.87
device 88 osd.88
device 89 osd.89
device 90 osd.90
device 91 osd.91
device 92 osd.92
device 93 osd.93
device 94 osd.94
device 95 osd.95
device 96 osd.96
device 97 osd.97
device 98 osd.98
device 99 osd.99
device 100 osd.100
device 101 osd.101
device 102 osd.102
device 103 osd.103
device 104 osd.104
device 105 osd.105
device 106 osd.106
device 107 osd.107
device 108 osd.108
device 109 osd.109
device 110 o
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
1. You have a size-3 pool; I do not know why you set min_size to 1. It is too dangerous.
2. You had better use the same size and the same number of OSDs on each host for CRUSH.

For now you can try the `ceph osd reweight-by-utilization` command, when there are no users on your cluster. And I will go home.

At 2017-07-28 17:57:11, "Nikola Ciprich" wrote: >On Fri, Jul 28, 2017 at 05:52:29PM +0800, linghucongsong wrote: >> >> >> >> You have two crush rule? One is ssd the other is hdd? >yes, exactly.. > >> >> Can you show ceph osd dump|grep pool >> > >pool 3 'vm' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins >pg_num 1024 pgp_num 1024 last_change 69955 flags hashpspool >min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0 >pool 4 'cephfs_data' replicated size 3 min_size 1 crush_ruleset 0 object_hash >rjenkins pg_num 1024 pgp_num 1024 last_change 74682 flags hashpspool >crash_replay_interval 45 min_write_recency_for_promote 1 stripe_width 0 >pool 5 'cephfs_metadata' replicated size 3 min_size 1 crush_ruleset 0 >object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 74667 flags >hashpspool min_write_recency_for_promote 1 stripe_width 0 >pool 11 'ssd' replicated size 3 min_size 1 crush_ruleset 1 object_hash >rjenkins pg_num 128 pgp_num 128 last_change 46119 flags hashpspool >min_write_recency_for_promote 1 stripe_width 0 > > >> ceph osd crush dump > >{ >"devices": [ >{ >"id": 0, >"name": "osd.0" >}, >{ >"id": 1, >"name": "osd.1" >}, >{ >"id": 2, >"name": "osd.2" >}, >{ >"id": 3, >"name": "osd.3" >}, >{ >"id": 4, >"name": "osd.4" >}, >{ >"id": 5, >"name": "osd.5" >}, >{ >"id": 6, >"name": "osd.6" >}, >{ >"id": 7, >"name": "device7" >}, >{ >"id": 8, >"name": "osd.8" >}, >{ >"id": 9, >"name": "osd.9" >}, >{ >"id": 10, >"name": "osd.10" >}, >{ >"id": 11, >"name": "osd.11" >}, >{ >"id": 12, >"name": "osd.12" >}, >{ >"id": 13, >"name": "osd.13" >}, >{ >"id": 14, >"name": "osd.14" >}, >{ >"id": 15, >"name": "osd.15" >}, >{ >"id": 16, >"name": "osd.16" >}, >{ >"id": 17,
>"name": "osd.17" >}, >{ >"id": 18, >"name": "osd.18" >}, >{ >"id": 19, >"name": "osd.19" >}, >{ >"id": 20, >"name": "osd.20" >}, >{ >"id": 21, >"name": "osd.21" >}, >{ >"id": 22, >"name": "osd.22" >}, >{ >"id": 23, >"name": "osd.23" >}, >{ >"id": 24, >"name": "osd.24" >}, >{ >"id": 25, >"name": "osd.25" >}, >{ >"id": 26, >"name": "osd.26" >} >], >"types": [ >{ >"type_id": 0, >"name": "osd" >}, >{ >"type_id": 1, >"name": "host" >}, >{ >"type_id": 2, >"name": "chassis" >}, >{ >"type_id": 3, >"name": "rack" >}, >{ >"type_id": 4, >"name": "row" >}, >{ >"type_id": 5, >"name": "pdu" >}, >{ >"type_id": 6, >"name": "pod" >}, >{ >"type_id": 7, >"name": "room" >}, >{ >"type_id": 8, >"name": "datacenter" >}, >{ >"type_id": 9, >"name": "region" >}, >{ >"type_id": 10, >"name": "root" >} >], >"buckets": [ >{ >"id": -1, >"name": "default", >"type_id": 10, >"type_name": "root", >"weight": 2575553, >"alg": "straw2", >"hash": "rjenkins1", >"items": [ >{ >"id": -4, >"weight": 779875, >"pos": 0 >}, >{ >"id": -5, >"weight": 681571, >"pos": 1 >}, >{ >"id": -6, >"weight": 511178, >
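The `ceph osd reweight-by-utilization` suggestion above can be illustrated with a rough sketch. This is a simplification of what the real mon command does, not its actual code: OSDs whose utilization exceeds the cluster average by more than a threshold (the real default is 120, i.e. 1.2x the average) get their reweight scaled down toward the average. The sample data reuses the most imbalanced SSD OSDs from this thread's `ceph osd df` output.

```python
# Rough, simplified sketch of reweight-by-utilization: pick OSDs whose
# utilization exceeds (threshold/100) * average, and scale their reweight
# down proportionally so they shed PGs toward emptier OSDs.

def reweight_by_utilization(osds, threshold=120):
    avg = sum(o["util"] for o in osds) / len(osds)
    cutoff = avg * threshold / 100.0
    changes = {}
    for o in osds:
        if o["util"] > cutoff:
            # scale reweight toward the average utilization
            changes[o["id"]] = round(o["reweight"] * avg / o["util"], 4)
    return changes

# Utilization figures taken from the ssd root in this thread's osd df:
osds = [
    {"id": 13, "util": 64.68, "reweight": 1.0},   # the overfull one
    {"id": 10, "util": 56.30, "reweight": 1.0},
    {"id": 5,  "util": 44.77, "reweight": 1.0},
    {"id": 26, "util": 36.07, "reweight": 1.0},
]
print(reweight_by_utilization(osds))  # only osd.13 gets reweighted down
```

This also shows why identically sized OSDs per host make life easier, as suggested above: with mixed sizes the weights never quite balance and periodic reweighting becomes routine.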
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
On Fri, Jul 28, 2017 at 05:52:29PM +0800, linghucongsong wrote: > > > > You have two crush rule? One is ssd the other is hdd? yes, exactly.. > > Can you show ceph osd dump|grep pool > pool 3 'vm' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 69955 flags hashpspool min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0 pool 4 'cephfs_data' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 74682 flags hashpspool crash_replay_interval 45 min_write_recency_for_promote 1 stripe_width 0 pool 5 'cephfs_metadata' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 74667 flags hashpspool min_write_recency_for_promote 1 stripe_width 0 pool 11 'ssd' replicated size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 46119 flags hashpspool min_write_recency_for_promote 1 stripe_width 0 > ceph osd crush dump { "devices": [ { "id": 0, "name": "osd.0" }, { "id": 1, "name": "osd.1" }, { "id": 2, "name": "osd.2" }, { "id": 3, "name": "osd.3" }, { "id": 4, "name": "osd.4" }, { "id": 5, "name": "osd.5" }, { "id": 6, "name": "osd.6" }, { "id": 7, "name": "device7" }, { "id": 8, "name": "osd.8" }, { "id": 9, "name": "osd.9" }, { "id": 10, "name": "osd.10" }, { "id": 11, "name": "osd.11" }, { "id": 12, "name": "osd.12" }, { "id": 13, "name": "osd.13" }, { "id": 14, "name": "osd.14" }, { "id": 15, "name": "osd.15" }, { "id": 16, "name": "osd.16" }, { "id": 17, "name": "osd.17" }, { "id": 18, "name": "osd.18" }, { "id": 19, "name": "osd.19" }, { "id": 20, "name": "osd.20" }, { "id": 21, "name": "osd.21" }, { "id": 22, "name": "osd.22" }, { "id": 23, "name": "osd.23" }, { "id": 24, "name": "osd.24" }, { "id": 25, "name": "osd.25" }, { "id": 26, "name": "osd.26" } ], "types": [ { "type_id": 0, "name": "osd" }, { "type_id": 1, "name": "host" }, { "type_id": 2, "name": "chassis" 
}, { "type_id": 3, "name": "rack" }, { "type_id": 4, "name": "row" }, { "type_id": 5, "name": "pdu" }, { "type_id": 6, "name": "pod" }, { "type_id": 7, "name": "room" }, { "type_id": 8, "name": "datacenter" }, { "type_id": 9, "name": "region" }, { "type_id": 10, "name": "root" } ], "buckets": [ { "id": -1, "name": "default", "type_id": 10, "type_name": "root", "weight": 2575553, "alg": "straw2", "hash": "rjenkins1", "items": [ { "id": -4, "weight": 779875, "pos": 0 }, { "id": -5, "weight": 681571, "pos": 1 }, { "id": -6, "weight": 511178, "pos": 2 }, { "id": -3, "weight": 602929, "pos": 3 } ] }, { "id": -2, "name": "ssd", "type_id": 10, "type_name": "root", "weight": 102233, "alg": "straw2", "hash": "rjenkins1", "items": [ { "id": -9, "weight": 26214, "pos": 0
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
You have two crush rule? One is ssd the other is hdd? Can you show ceph osd dump|grep pool ceph osd crush dump At 2017-07-28 17:47:48, "Nikola Ciprich" wrote: > >On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote: >> >> >> It look like the osd in your cluster is not all the same size. >> >> can you show ceph osd df output? > >you're right, they're not.. here's the output: > >[root@v1b ~]# ceph osd df tree >ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME > -2 1.55995- 1706G 883G 805G 51.78 2.55 0 root ssd > -9 0.3- 393G 221G 171G 56.30 2.78 0 host v1c-ssd > 10 0.3 1.0 393G 221G 171G 56.30 2.78 98 osd.10 >-10 0.59998- 683G 275G 389G 40.39 1.99 0 host v1a-ssd > 5 0.2 1.0 338G 151G 187G 44.77 2.21 65 osd.5 > 26 0.2 1.0 344G 124G 202G 36.07 1.78 52 osd.26 >-11 0.34000- 338G 219G 119G 64.68 3.19 0 host v1b-ssd > 13 0.34000 1.0 338G 219G 119G 64.68 3.19 96 osd.13 > -7 0.21999- 290G 166G 123G 57.43 2.83 0 host v1d-ssd > 19 0.21999 1.0 290G 166G 123G 57.43 2.83 73 osd.19 > -1 39.29982- 43658G 8312G 34787G 19.04 0.94 0 root default > -4 11.89995- 12806G 2422G 10197G 18.92 0.93 0 host v1a > 6 1.5 1.0 1833G 358G 1475G 19.53 0.96 366 osd.6 > 8 1.7 1.0 1833G 313G 1519G 17.11 0.84 370 osd.8 > 2 1.5 1.0 1833G 320G 1513G 17.46 0.86 331 osd.2 > 0 1.7 1.0 1804G 431G 1373G 23.90 1.18 359 osd.0 > 4 1.5 1.0 1833G 294G 1539G 16.07 0.79 360 osd.4 > 25 3.5 1.0 3667G 704G 2776G 19.22 0.95 745 osd.25 > -5 10.39995- 10914G 2154G 8573G 19.74 0.97 0 host v1b > 1 1.5 1.0 1804G 350G 1454G 19.42 0.96 409 osd.1 > 3 1.7 1.0 1804G 360G 1444G 19.98 0.99 412 osd.3 > 9 1.5 1.0 1804G 331G 1473G 18.37 0.91 363 osd.9 > 11 1.7 1.0 1833G 367G 1465G 20.06 0.99 415 osd.11 > 24 3.5 1.0 3667G 744G 2736G 20.30 1.00 834 osd.24 > -6 7.79996- 9051G 1769G 7282G 19.54 0.96 0 host v1c > 14 1.5 1.0 1804G 370G 1433G 20.54 1.01 442 osd.14 > 15 1.7 1.0 1833G 383G 1450G 20.92 1.03 447 osd.15 > 16 1.3 1.0 1804G 295G 1508G 16.38 0.81 355 osd.16 > 18 1.3 1.0 1804G 366G 1438G 20.29 1.00 381 osd.18 > 17 1.5 
1.0 1804G 353G 1451G 19.57 0.97 429 osd.17 > -3 9.19997- 10885G 1965G 8733G 18.06 0.89 0 host v1d-sata > 12 1.3 1.0 1804G 348G 1455G 19.32 0.95 365 osd.12 > 20 1.3 1.0 1804G 335G 1468G 18.60 0.92 371 osd.20 > 21 3.5 1.0 3667G 695G 2785G 18.97 0.94 871 osd.21 > 22 1.3 1.0 1804G 281G 1522G 15.63 0.77 326 osd.22 > 23 1.3 1.0 1804G 303G 1500G 16.83 0.83 321 osd.23 >TOTAL 45365G 9195G 35592G 20.27 >MIN/MAX VAR: 0.77/3.19 STDDEV: 14.69 > > > >apart from replacing OSDs, how can I help it? > > > > >> >> >> At 2017-07-28 17:24:29, "Nikola Ciprich" wrote: >> >I forgot to add that OSD daemons really seem to be idle, no disk >> >activity, no CPU usage.. it just looks to me like some kind of >> >deadlock, as they were waiting for each other.. >> > >> >and so I'm trying to get last 1.5% of misplaced / degraded PGs >> >for almost a week.. >> > >> > >> >On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote: >> >> Hi, >> >> >> >> I'm trying to find reason for strange recovery issues I'm seeing on >> >> our cluster.. >> >> >> >> it's mostly idle, 4 node cluster with 26 OSDs evenly distributed >> >> across nodes. jewel 10.2.9 >> >> >> >> the problem is that after some disk replaces and data moves, recovery >> >> is progressing extremely slowly.. pgs seem to be stuck in >> >> active+recovering+degraded >> >> state: >> >> >> >> [root@v1d ~]# ceph -s >> >> cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33 >> >> health HEALTH_WARN >> >> 159 pgs backfill_wait >> >> 4 pgs backfilling >> >> 259 pgs degraded >> >> 12 pgs recovering >> >> 113 pgs recovery_wait >> >> 215 pgs stuck degraded >> >> 266 pgs stuck unclean >> >> 140 pgs stuck undersized >> >> 151 pgs undersized >> >> recovery 37788/2327775 objects degraded (1.623%) >> >> recovery 23854/2327775 objects misplaced (1.025%) >> >> noout,noin flag(s) set >> >> monmap e21: 3 mons at >> >> {v1a=10.0.0.1
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote: > > > It look like the osd in your cluster is not all the same size. > > can you show ceph osd df output? you're right, they're not.. here's the output: [root@v1b ~]# ceph osd df tree ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME -2 1.55995- 1706G 883G 805G 51.78 2.55 0 root ssd -9 0.3- 393G 221G 171G 56.30 2.78 0 host v1c-ssd 10 0.3 1.0 393G 221G 171G 56.30 2.78 98 osd.10 -10 0.59998- 683G 275G 389G 40.39 1.99 0 host v1a-ssd 5 0.2 1.0 338G 151G 187G 44.77 2.21 65 osd.5 26 0.2 1.0 344G 124G 202G 36.07 1.78 52 osd.26 -11 0.34000- 338G 219G 119G 64.68 3.19 0 host v1b-ssd 13 0.34000 1.0 338G 219G 119G 64.68 3.19 96 osd.13 -7 0.21999- 290G 166G 123G 57.43 2.83 0 host v1d-ssd 19 0.21999 1.0 290G 166G 123G 57.43 2.83 73 osd.19 -1 39.29982- 43658G 8312G 34787G 19.04 0.94 0 root default -4 11.89995- 12806G 2422G 10197G 18.92 0.93 0 host v1a 6 1.5 1.0 1833G 358G 1475G 19.53 0.96 366 osd.6 8 1.7 1.0 1833G 313G 1519G 17.11 0.84 370 osd.8 2 1.5 1.0 1833G 320G 1513G 17.46 0.86 331 osd.2 0 1.7 1.0 1804G 431G 1373G 23.90 1.18 359 osd.0 4 1.5 1.0 1833G 294G 1539G 16.07 0.79 360 osd.4 25 3.5 1.0 3667G 704G 2776G 19.22 0.95 745 osd.25 -5 10.39995- 10914G 2154G 8573G 19.74 0.97 0 host v1b 1 1.5 1.0 1804G 350G 1454G 19.42 0.96 409 osd.1 3 1.7 1.0 1804G 360G 1444G 19.98 0.99 412 osd.3 9 1.5 1.0 1804G 331G 1473G 18.37 0.91 363 osd.9 11 1.7 1.0 1833G 367G 1465G 20.06 0.99 415 osd.11 24 3.5 1.0 3667G 744G 2736G 20.30 1.00 834 osd.24 -6 7.79996- 9051G 1769G 7282G 19.54 0.96 0 host v1c 14 1.5 1.0 1804G 370G 1433G 20.54 1.01 442 osd.14 15 1.7 1.0 1833G 383G 1450G 20.92 1.03 447 osd.15 16 1.3 1.0 1804G 295G 1508G 16.38 0.81 355 osd.16 18 1.3 1.0 1804G 366G 1438G 20.29 1.00 381 osd.18 17 1.5 1.0 1804G 353G 1451G 19.57 0.97 429 osd.17 -3 9.19997- 10885G 1965G 8733G 18.06 0.89 0 host v1d-sata 12 1.3 1.0 1804G 348G 1455G 19.32 0.95 365 osd.12 20 1.3 1.0 1804G 335G 1468G 18.60 0.92 371 osd.20 21 3.5 1.0 3667G 695G 2785G 
18.97 0.94 871 osd.21 22 1.3 1.0 1804G 281G 1522G 15.63 0.77 326 osd.22 23 1.3 1.0 1804G 303G 1500G 16.83 0.83 321 osd.23 TOTAL 45365G 9195G 35592G 20.27 MIN/MAX VAR: 0.77/3.19 STDDEV: 14.69 apart from replacing OSDs, how can I help it? > > > At 2017-07-28 17:24:29, "Nikola Ciprich" wrote: > >I forgot to add that OSD daemons really seem to be idle, no disk > >activity, no CPU usage.. it just looks to me like some kind of > >deadlock, as they were waiting for each other.. > > > >and so I'm trying to get last 1.5% of misplaced / degraded PGs > >for almost a week.. > > > > > >On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote: > >> Hi, > >> > >> I'm trying to find reason for strange recovery issues I'm seeing on > >> our cluster.. > >> > >> it's mostly idle, 4 node cluster with 26 OSDs evenly distributed > >> across nodes. jewel 10.2.9 > >> > >> the problem is that after some disk replaces and data moves, recovery > >> is progressing extremely slowly.. pgs seem to be stuck in > >> active+recovering+degraded > >> state: > >> > >> [root@v1d ~]# ceph -s > >> cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33 > >> health HEALTH_WARN > >> 159 pgs backfill_wait > >> 4 pgs backfilling > >> 259 pgs degraded > >> 12 pgs recovering > >> 113 pgs recovery_wait > >> 215 pgs stuck degraded > >> 266 pgs stuck unclean > >> 140 pgs stuck undersized > >> 151 pgs undersized > >> recovery 37788/2327775 objects degraded (1.623%) > >> recovery 23854/2327775 objects misplaced (1.025%) > >> noout,noin flag(s) set > >> monmap e21: 3 mons at > >> {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0} > >> election epoch 6160, quorum 0,1,2 v1a,v1b,v1c > >> fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby > >> osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs > >> flags noout,n
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
It looks like the OSDs in your cluster are not all the same size.
Can you show ceph osd df output?

At 2017-07-28 17:24:29, "Nikola Ciprich" wrote:
>I forgot to add that OSD daemons really seem to be idle, no disk
>activity, no CPU usage.. it just looks to me like some kind of
>deadlock, as if they were waiting for each other..
>
>and so I'm trying to get the last 1.5% of misplaced / degraded PGs
>for almost a week..
>
>On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote:
>> Hi,
>>
>> I'm trying to find the reason for strange recovery issues I'm seeing
>> on our cluster..
>>
>> it's a mostly idle 4 node cluster with 26 OSDs evenly distributed
>> across nodes. jewel 10.2.9
>>
>> the problem is that after some disk replaces and data moves, recovery
>> is progressing extremely slowly.. pgs seem to be stuck in
>> active+recovering+degraded state:
>>
>> [root@v1d ~]# ceph -s
>>     cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
>>      health HEALTH_WARN
>>             159 pgs backfill_wait
>>             4 pgs backfilling
>>             259 pgs degraded
>>             12 pgs recovering
>>             113 pgs recovery_wait
>>             215 pgs stuck degraded
>>             266 pgs stuck unclean
>>             140 pgs stuck undersized
>>             151 pgs undersized
>>             recovery 37788/2327775 objects degraded (1.623%)
>>             recovery 23854/2327775 objects misplaced (1.025%)
>>             noout,noin flag(s) set
>>      monmap e21: 3 mons at {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
>>             election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
>>       fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
>>      osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
>>             flags noout,noin,sortbitwise,require_jewel_osds
>>       pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
>>             9215 GB used, 35572 GB / 45365 GB avail
>>             37788/2327775 objects degraded (1.623%)
>>             23854/2327775 objects misplaced (1.025%)
>>                 2912 active+clean
>>                  130 active+undersized+degraded+remapped+wait_backfill
>>                   97 active+recovery_wait+degraded
>>                   29 active+remapped+wait_backfill
>>                   12 active+recovery_wait+undersized+degraded+remapped
>>                    6 active+recovering+degraded
>>                    5 active+recovering+undersized+degraded+remapped
>>                    4 active+undersized+degraded+remapped+backfilling
>>                    4 active+recovery_wait+degraded+remapped
>>                    1 active+recovering+degraded+remapped
>>   client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr
>>
>> when I restart affected OSDs, it bumps the recovery, but then other
>> PGs get stuck.. All OSDs were restarted multiple times, none are even
>> close to nearfull, I just can't find what I'm doing wrong..
>>
>> possibly related OSD options:
>>
>> osd max backfills = 4
>> osd recovery max active = 15
>> debug osd = 0/0
>> osd op threads = 4
>> osd backfill scan min = 4
>> osd backfill scan max = 16
>>
>> Any hints would be greatly appreciated
>>
>> thanks
>>
>> nik
>>
>> --
>> Ing. Nikola CIPRICH
>> LinuxBox.cz, s.r.o.
>> 28.rijna 168, 709 00 Ostrava
>>
>> tel.:  +420 591 166 214
>> fax:   +420 596 621 273
>> mobil: +420 777 093 799
>> www.linuxbox.cz
>>
>> mobil servis: +420 737 238 656
>> email servis: ser...@linuxbox.cz
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
I forgot to add that OSD daemons really seem to be idle, no disk
activity, no CPU usage.. it just looks to me like some kind of
deadlock, as if they were waiting for each other..

and so I'm trying to get the last 1.5% of misplaced / degraded PGs
for almost a week..

On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote:
> Hi,
>
> I'm trying to find the reason for strange recovery issues I'm seeing
> on our cluster..
>
> it's a mostly idle 4 node cluster with 26 OSDs evenly distributed
> across nodes. jewel 10.2.9
>
> the problem is that after some disk replaces and data moves, recovery
> is progressing extremely slowly.. pgs seem to be stuck in
> active+recovering+degraded state:
>
> [root@v1d ~]# ceph -s
>     cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
>      health HEALTH_WARN
>             159 pgs backfill_wait
>             4 pgs backfilling
>             259 pgs degraded
>             12 pgs recovering
>             113 pgs recovery_wait
>             215 pgs stuck degraded
>             266 pgs stuck unclean
>             140 pgs stuck undersized
>             151 pgs undersized
>             recovery 37788/2327775 objects degraded (1.623%)
>             recovery 23854/2327775 objects misplaced (1.025%)
>             noout,noin flag(s) set
>      monmap e21: 3 mons at {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
>             election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
>       fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
>      osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
>             flags noout,noin,sortbitwise,require_jewel_osds
>       pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
>             9215 GB used, 35572 GB / 45365 GB avail
>             37788/2327775 objects degraded (1.623%)
>             23854/2327775 objects misplaced (1.025%)
>                 2912 active+clean
>                  130 active+undersized+degraded+remapped+wait_backfill
>                   97 active+recovery_wait+degraded
>                   29 active+remapped+wait_backfill
>                   12 active+recovery_wait+undersized+degraded+remapped
>                    6 active+recovering+degraded
>                    5 active+recovering+undersized+degraded+remapped
>                    4 active+undersized+degraded+remapped+backfilling
>                    4 active+recovery_wait+degraded+remapped
>                    1 active+recovering+degraded+remapped
>   client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr
>
> when I restart affected OSDs, it bumps the recovery, but then other
> PGs get stuck.. All OSDs were restarted multiple times, none are even
> close to nearfull, I just can't find what I'm doing wrong..
>
> possibly related OSD options:
>
> osd max backfills = 4
> osd recovery max active = 15
> debug osd = 0/0
> osd op threads = 4
> osd backfill scan min = 4
> osd backfill scan max = 16
>
> Any hints would be greatly appreciated
>
> thanks
>
> nik
>
> --
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28.rijna 168, 709 00 Ostrava
>
> tel.:  +420 591 166 214
> fax:   +420 596 621 273
> mobil: +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:  +420 591 166 214
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
[ceph-users] jewel - recovery keeps stalling (continues after restarting OSDs)
Hi,

I'm trying to find the reason for strange recovery issues I'm seeing on
our cluster..

it's a mostly idle 4 node cluster with 26 OSDs evenly distributed
across nodes. jewel 10.2.9

the problem is that after some disk replaces and data moves, recovery
is progressing extremely slowly.. pgs seem to be stuck in
active+recovering+degraded state:

[root@v1d ~]# ceph -s
    cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
     health HEALTH_WARN
            159 pgs backfill_wait
            4 pgs backfilling
            259 pgs degraded
            12 pgs recovering
            113 pgs recovery_wait
            215 pgs stuck degraded
            266 pgs stuck unclean
            140 pgs stuck undersized
            151 pgs undersized
            recovery 37788/2327775 objects degraded (1.623%)
            recovery 23854/2327775 objects misplaced (1.025%)
            noout,noin flag(s) set
     monmap e21: 3 mons at {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
            election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
      fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
     osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
            flags noout,noin,sortbitwise,require_jewel_osds
      pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
            9215 GB used, 35572 GB / 45365 GB avail
            37788/2327775 objects degraded (1.623%)
            23854/2327775 objects misplaced (1.025%)
                2912 active+clean
                 130 active+undersized+degraded+remapped+wait_backfill
                  97 active+recovery_wait+degraded
                  29 active+remapped+wait_backfill
                  12 active+recovery_wait+undersized+degraded+remapped
                   6 active+recovering+degraded
                   5 active+recovering+undersized+degraded+remapped
                   4 active+undersized+degraded+remapped+backfilling
                   4 active+recovery_wait+degraded+remapped
                   1 active+recovering+degraded+remapped
  client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr

when I restart affected OSDs, it bumps the recovery, but then other
PGs get stuck.. All OSDs were restarted multiple times, none are even
close to nearfull, I just can't find what I'm doing wrong..

possibly related OSD options:

osd max backfills = 4
osd recovery max active = 15
debug osd = 0/0
osd op threads = 4
osd backfill scan min = 4
osd backfill scan max = 16

Any hints would be greatly appreciated

thanks

nik

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:  +420 591 166 214
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
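As a sanity check of the status output: the degraded and misplaced percentages that ceph -s prints are plain ratios of object counts over total object instances (objects times replicas). An illustrative sketch using the figures from the status above:

```python
# Quick sanity check (illustrative only): the percentages in ceph -s are
# just the degraded/misplaced counts divided by total object instances.
# 757k objects across 4 pools expand to ~2.33M instances with replication.

total = 2327775          # total object instances across all replicas
degraded = 37788
misplaced = 23854

degraded_pct = 100.0 * degraded / total
misplaced_pct = 100.0 * misplaced / total

print(round(degraded_pct, 3))   # 1.623, as reported
print(round(misplaced_pct, 3))  # 1.025, as reported
```

With `osd max backfills = 4` and `osd recovery max active = 15` already set, throttling alone does not explain a week-long stall, which is consistent with the suspicion above that the OSDs are waiting on each other rather than on the recovery limits.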
[ceph-users] Bluestore wal / block db size
Hey,

I am just playing around with the luminous RC. As far as I can see it
works nicely. Studying around I found the following discussion about wal
and block db size:

http://marc.info/?l=ceph-devel&m=149978799900866&w=2

Creating an OSD with the following command:

ceph-deploy osd create --bluestore --block-db=/dev/sdj --block-wal=/dev/sdj osd01:/dev/sdb

creates a wal of 576M and a block db of 1G. In my scenario /dev/sdj is
an SSD. In the discussion mentioned above it is said that bluestore
automatically rolls rocksdb data over to the hdd when the block db gets
full, and performance then decreases.

So what are good values for wal and block db? Is there any documentation
for this? I can hardly find information on this topic.

Thank you.

Tobias
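One rule of thumb that has circulated (a hedged guideline, not an official formula for this release) is to size block.db as a small percentage of the data device so RocksDB rarely spills onto the slow HDD; the helper name and the 1-4% knob below are illustrative assumptions:

```python
# Back-of-the-envelope sizing sketch, NOT an official Ceph formula: a
# commonly cited guideline is to give block.db roughly 1-4% of the data
# device's capacity so RocksDB metadata rarely spills to the slow device.
# The wal is comparatively tiny (the 576M default above is in that range)
# and can simply live inside block.db if not split out separately.

def suggested_db_size_g(data_device_g, pct=4.0):
    """Hypothetical helper: block.db size as a percentage of the data device."""
    return data_device_g * pct / 100.0

# Example: block.db partitions for a 4 TB (4000G) HDD OSD:
print(suggested_db_size_g(4000, 1.0))  # 40.0 G at the conservative 1% end
print(suggested_db_size_g(4000))       # 160.0 G at the generous 4% end
```

The right percentage depends on workload (RGW metadata-heavy pools need more than RBD), so treat these numbers as starting points and watch whether the db actually spills over in practice.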