OK, looks like clock skew is the problem. I thought this is caused by the
reboot but it did not fix itself after some minutes (mon3 was 6 seconds
ahead).
After forcing time sync from the same server, it seems to be solved now.

Kevin

Am Fr., 20. Sept. 2019 um 07:33 Uhr schrieb Kevin Olbrich <k...@sv01.de>:

> Hi!
>
> Today some OSDs went down, a temporary problem that was solved easily.
> The mimic cluster is working and all OSDs are complete, all active+clean.
>
> Completely new for me is this:
> > 25 slow ops, oldest one blocked for 219 sec, mon.mon03 has slow ops
>
> The cluster itself looks fine, monitoring for the VMs that use RBD are
> fine.
>
> I thought that might be (https://tracker.ceph.com/issues/24531) but I've
> restarted the mon service (and the node as a whole) but both did not help.
> The slop ops slowly increase.
>
> Example:
>
>         {
>             "description": "auth(proto 0 30 bytes epoch 0)",
>             "initiated_at": "2019-09-20 05:31:52.295858",
>             "age": 7.851164,
>             "duration": 7.900068,
>             "type_data": {
>                 "events": [
>                     {
>                         "time": "2019-09-20 05:31:52.295858",
>                         "event": "initiated"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.295858",
>                         "event": "header_read"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.295864",
>                         "event": "throttled"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.295875",
>                         "event": "all_read"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296075",
>                         "event": "dispatched"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296089",
>                         "event": "mon:_ms_dispatch"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296097",
>                         "event": "mon:dispatch_op"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296098",
>                         "event": "psvc:dispatch"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296172",
>                         "event": "auth:wait_for_readable"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296177",
>                         "event": "auth:wait_for_readable/paxos"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.296232",
>                         "event": "paxos:wait_for_readable"
>                     }
>                 ],
>                 "info": {
>                     "seq": 1708,
>                     "src_is_mon": false,
>                     "source": "client.?
> [fd91:462b:4243:47e::1:3]:0/2365414961",
>                     "forwarded_to_leader": false
>                 }
>             }
>         },
>         {
>             "description": "auth(proto 0 30 bytes epoch 0)",
>             "initiated_at": "2019-09-20 05:31:52.314892",
>             "age": 7.832131,
>             "duration": 7.881230,
>             "type_data": {
>                 "events": [
>                     {
>                         "time": "2019-09-20 05:31:52.314892",
>                         "event": "initiated"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.314892",
>                         "event": "header_read"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.314897",
>                         "event": "throttled"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.314907",
>                         "event": "all_read"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315057",
>                         "event": "dispatched"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315072",
>                         "event": "mon:_ms_dispatch"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315082",
>                         "event": "mon:dispatch_op"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315083",
>                         "event": "psvc:dispatch"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315161",
>                         "event": "auth:wait_for_readable"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315167",
>                         "event": "auth:wait_for_readable/paxos"
>                     },
>                     {
>                         "time": "2019-09-20 05:31:52.315230",
>                         "event": "paxos:wait_for_readable"
>                     }
>                 ],
>                 "info": {
>                     "seq": 1709,
>                     "src_is_mon": false,
>                     "source": "client.?
> [fd91:462b:4243:47e::1:3]:0/997594187",
>                     "forwarded_to_leader": false
>                 }
>             }
>         }
>
>
>
> This is a new situation for me. What am I supposed to do in this case?
>
> Thank you!
>
> Kind regards
> Kevin
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Reply via email to