Re: [ceph-users] strange osd beacon
OSDs send a beacon every 300s; it lets the mon know that the OSD is alive. In some cases the OSD has no peer PGs to report, e.g. when no pools have been created.

Rafał Wądołowski wrote on Fri, Jun 14, 2019 at 12:53 PM:
>
> Hi,
>
> Is it normal for an osd beacon to carry no pgs, like below? This
> drive contains data, but I cannot get it to run.
>
> Ceph v12.2.4
>
>
> {
>     "description": "osd_beacon(pgs [] lec 857158 v869771)",
>     "initiated_at": "2019-06-14 06:39:37.972795",
>     "age": 189.310037,
>     "duration": 189.453167,
>     "type_data": {
>         "events": [
>             {
>                 "time": "2019-06-14 06:39:37.972795",
>                 "event": "initiated"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972954",
>                 "event": "mon:_ms_dispatch"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972956",
>                 "event": "mon:dispatch_op"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972956",
>                 "event": "psvc:dispatch"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972976",
>                 "event": "osdmap:preprocess_query"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972978",
>                 "event": "osdmap:preprocess_beacon"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.972982",
>                 "event": "forward_request_leader"
>             },
>             {
>                 "time": "2019-06-14 06:39:37.973064",
>                 "event": "forwarded"
>             }
>         ],
>         "info": {
>             "seq": 22378,
>             "src_is_mon": false,
>             "source": "osd.1092 10.11.2.33:6842/159188",
>             "forwarded_to_leader": true
>         }
>     }
> }
>
>
> Best Regards,
>
> Rafał Wądołowski
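For reference: the 300s mentioned above is the osd_beacon_report_interval option, and the op dump quoted in the question looks like monitor admin-socket output. A minimal sketch for inspecting both, assuming admin sockets are available on the respective hosts (osd.1092 is taken from the thread, mon.a is just an example id):

    # on the OSD host: show the beacon interval this OSD uses (default 300s)
    ceph daemon osd.1092 config get osd_beacon_report_interval

    # on a monitor host: dump in-flight ops; this is where an osd_beacon
    # entry like the one quoted above shows up
    ceph daemon mon.a ops

If the beacon carries no PGs even though the drive holds data, the OSD is most likely not peering, so the OSD log and 'ceph pg dump' are the next places to look.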
Re: [ceph-users] problem with degraded PG
Can you show us the output of 'ceph osd dump' and 'ceph health detail'? Luk wrote on Fri, Jun 14, 2019 at 8:02 PM:
>
> Hello,
>
> All kudos are going to friends from Wroclaw, PL :)
>
> It was as simple as a typo...
>
> The osd was added to the crushmap twice, due to the following commands (these commands were
> run over a week ago - there was no problem then, it showed up after
> replacing another osd - osd-7):
>
> ceph osd crush add osd.112 0.00 root=hdd
> ceph osd crush move osd.112 0.00 root=hdd rack=rack-a host=stor-a02
> ceph osd crush add osd.112 0.00 host=stor-a02
>
> and the ceph osd tree was like this:
> [root@ceph-mon-01 ~]# ceph osd tree
>  ID  CLASS WEIGHT    TYPE NAME              STATUS REWEIGHT PRI-AFF
> -100       200.27496 root hdd
> -101        67.64999     rack rack-a
>   -2        33.82500         host stor-a01
>    0   hdd   7.27499             osd.0          up   1.0     1.0
>    6   hdd   7.27499             osd.6          up   1.0     1.0
>   12   hdd   7.27499             osd.12         up   1.0     1.0
>  108   hdd   4.0                 osd.108        up   1.0     1.0
>  109   hdd   4.0                 osd.109        up   1.0     1.0
>  110   hdd   4.0                 osd.110        up   1.0     1.0
>   -7        33.82500         host stor-a02
>    5   hdd   7.27499             osd.5          up   1.0     1.0
>    9   hdd   7.27499             osd.9          up   1.0     1.0
>   15   hdd   7.27499             osd.15         up   1.0     1.0
>  111   hdd   4.0                 osd.111        up   1.0     1.0
>  112   hdd   4.0                 osd.112        up   1.0     1.0
>  113   hdd   4.0                 osd.113        up   1.0     1.0
> -102        60.97498     rack rack-b
>   -3        27.14998         host stor-b01
>    1   hdd   7.27499             osd.1          up   1.0     1.0
>    7   hdd   0.5                 osd.7          up   1.0     1.0
>   13   hdd   7.27499             osd.13         up   1.0     1.0
>  114   hdd   4.0                 osd.114        up   1.0     1.0
>  115   hdd   4.0                 osd.115        up   1.0     1.0
>  116   hdd   4.0                 osd.116        up   1.0     1.0
>   -4        33.82500         host stor-b02
>    2   hdd   7.27499             osd.2          up   1.0     1.0
>   10   hdd   7.27499             osd.10         up   1.0     1.0
>   16   hdd   7.27499             osd.16         up   1.0     1.0
>  117   hdd   4.0                 osd.117        up   1.0     1.0
>  118   hdd   4.0                 osd.118        up   1.0     1.0
>  119   hdd   4.0                 osd.119        up   1.0     1.0
> -103        67.64999     rack rack-c
>   -6        33.82500         host stor-c01
>    4   hdd   7.27499             osd.4          up   1.0     1.0
>    8   hdd   7.27499             osd.8          up   1.0     1.0
>   14   hdd   7.27499             osd.14         up   1.0     1.0
>  120   hdd   4.0                 osd.120        up   1.0     1.0
>  121   hdd   4.0                 osd.121        up   1.0     1.0
>  122   hdd   4.0                 osd.122        up   1.0     1.0
>   -5        33.82500         host stor-c02
>    3   hdd   7.27499             osd.3          up   1.0     1.0
>   11   hdd   7.27499             osd.11         up   1.0     1.0
>   17   hdd   7.27499             osd.17         up   1.0     1.0
>  123   hdd   4.0                 osd.123        up   1.0     1.0
>  124   hdd   4.0                 osd.124        up   1.0     1.0
>  125   hdd   4.0                 osd.125        up   1.0     1.0
>  112   hdd   4.0                 osd.112        up   1.0     1.0
>
> [cut]
>
> after editing the crushmap and removing the duplicate osd.112 from the root, ceph started
> to recover and is healthy now :)
>
> Regards
> Lukasz
>
>
> > Here is ceph osd tree; in the first post there is also ceph osd df tree:
> > https://pastebin.com/Vs75gpwZ
>
> >> Ahh, I was thinking of chooseleaf_vary_r, which you already have.
> >> So probably not related to tunables. What is your `ceph osd tree`?
>
> >> By the way, 12.2.9 has an unrelated bug (details
> >> http://tracker.ceph.com/issues/36686)
> >> AFAIU you will just need to update to v12.2.11 or v12.2.12 for that fix.
>
> >> -- Dan
>
> >> On Fri, Jun 14, 2019 at 11:29 AM Luk wrote:
> >>>
> >>> Hi,
> >>>
> >>> here is the output:
> >>>
> >>> ceph osd crush show-tunables
> >>> {
> >>>     "choose_local_tries": 0,
> >>>     "choose_local_fallback_tries": 0,
> >>>     "choose_total_tries": 100,
> >>>     "chooseleaf_descend_once": 1,
> >>>     "chooseleaf_vary_r": 1,
> >>>     "chooseleaf_stable": 0,
> >>>     "straw_calc_version": 1,
> >>>     "allowed_bucket_algs": 22,
> >>>     "profile": "unknown",
> >>>     "optimal_tunables": 0,
> >>>     "legacy_tunables": 0,
> >>>     "minimum_required_version":
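For anyone who hits the same duplicate-entry situation: the usual way to remove a stray item like the extra osd.112 is to decompile the crushmap, delete the duplicate line, and inject the map back, which is what Lukasz describes above. A rough sketch with example file names (keep the original binary map as a backup):

    # export and decompile the current crushmap
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # edit crushmap.txt and remove the duplicate 'item osd.112 ...' line
    # from the bucket it does not belong to, then recompile and inject
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin

    # verify that osd.112 now appears only once
    ceph osd tree | grep 'osd\.112'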
Re: [ceph-users] HEALTH_WARN - 3 modules have failed dependencies
The Ubuntu ceph dashboard failure/regression still exists as of today.

root@nocsupport2:~# uname -a
Linux nocsupport2 5.0.0-16-generic #17-Ubuntu SMP Wed May 15 10:52:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@nocsupport2:~# date
Sat 15 Jun 2019 03:03:52 PM CDT
root@nocsupport2:~# ceph -s
  cluster:
    id:     x
    health: HEALTH_WARN
            ...
            Module 'dashboard' has failed dependency: Interpreter change detected - this module can only be loaded into one interpreter per process.
            ...

On 5/1/19 11:07 AM, Ranjan Ghosh wrote:

Ah, after researching some more I think I got hit by this bug:

https://github.com/ceph/ceph/pull/25585

At least that's exactly what I see in the logs: "Interpreter change detected - this module can only be loaded into one interpreter per process."

Ceph mgr modules don't seem to work at all with the newest Ubuntu version; only one module can be loaded. Sad :-( Hope this will be fixed soon...

On 30.04.19 at 21:18, Ranjan Ghosh wrote:

Hi my beloved Ceph list,

After an upgrade from Ubuntu Cosmic to Ubuntu Disco (and the Ceph packages accordingly updated from 13.2.2 to 13.2.4), I now get this when I enter "ceph health":

HEALTH_WARN 3 modules have failed dependencies

"ceph mgr module ls" reports only those 3 modules as enabled:

"enabled_modules": [
    "dashboard",
    "restful",
    "status"
],
...

Then I found this page here:

docs.ceph.com/docs/master/rados/operations/health-checks

Under "MGR_MODULE_DEPENDENCY" it says: "An enabled manager module is failing its dependency check. This health check should come with an explanatory message from the module about the problem."

What is "this health check"? If the page means "ceph health" or "ceph -s", then no, there is no explanatory message there about what's wrong.

Furthermore, it says: "This health check is only applied to enabled modules. If a module is not enabled, you can see whether it is reporting dependency issues in the output of ceph module ls."

The command "ceph module ls", however, doesn't exist. If "ceph mgr module ls" is really meant, then I get this:

{
    "enabled_modules": [
        "dashboard",
        "restful",
        "status"
    ],
    "disabled_modules": [
        { "name": "balancer",   "can_run": true,  "error_string": "" },
        { "name": "hello",      "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "influx",     "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "iostat",     "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "localpool",  "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "prometheus", "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "selftest",   "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "smart",      "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "telegraf",   "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "telemetry",  "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." },
        { "name": "zabbix",     "can_run": false, "error_string": "Interpreter change detected - this module can only be loaded into one interpreter per process." }
    ]
}

Usually the Ceph documentation is great, very detailed and helpful. But I can find nothing on how to resolve this problem. Any help is much appreciated.

Thank you / Best regards
Ranjan
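There is no clean fix for this from the CLI, since the error comes from the mgr loading modules into multiple Python sub-interpreters (the bug referenced in the PR above). What sometimes helps as a stop-gap, and matches the "only one module can be loaded" observation, is to leave only the module you actually need enabled and restart the active mgr. A hedged sketch only - the systemd unit id depends on your host, and whether this works at all depends on the exact package versions:

    # keep only the dashboard enabled
    ceph mgr module disable restful
    ceph mgr module disable status

    # restart the active mgr so the remaining module loads into a fresh interpreter
    systemctl restart ceph-mgr@$(hostname -s)

    # re-check
    ceph mgr module ls
    ceph health detail

The proper solution is a Ceph build that includes the interpreter-handling fix referenced in the linked PR.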
[ceph-users] RGW Blocking Behaviour on Inactive / Incomplete PG
Hi,

I wanted to understand the nature of RGW threads being blocked on requests for a PG which is currently in an INACTIVE state.

1. As long as the PG is inactive, the requests stay blocked.
2. Could the RGW threads use an event-based model? If a PG is inactive, put the current request into a blocked queue - an event-based model similar to nginx.
3. Could the RGW threads time out if the request stays blocked beyond a certain threshold?
4. Was the design of blocking the RGW threads on an inactive PG a deliberate choice, or is it just the way this model was implemented?
5. Are there any serialisation issues that could arise if an async model is used?

The above questions are based on observations made on Hammer. The goal is to increase the availability of the web-facing service when a small percentage of PGs is down for an extended period of time.

Thanks
Romit
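On question 3 specifically: RGW does not impose its own timeout on a blocked rados operation by default, but librados exposes client-side timeouts that can be set in the RGW client section of ceph.conf so that ops against an inactive PG eventually return an error instead of holding a worker thread indefinitely. A hedged sketch with an example section name; whether returning errors to S3/Swift clients is acceptable for your workload, and how these options behave on your Hammer build, needs testing:

    [client.rgw.gateway-1]
        # fail librados ops that get no reply within 30 seconds
        rados osd op timeout = 30
        rados mon op timeout = 30

This does not make RGW event-driven (questions 2, 4 and 5 are really about the threading model in the code), but it does bound how long a single inactive PG can tie up RGW worker threads.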