On Mon, Jan 31, 2022 at 5:58 PM Anmol Arora <anmol.ar...@clarisights.com> wrote:
>
> Hi,
> I'm using cephfs as a storage layer for a database.
> And I'm seeing the following message in the ceph health warnings:
> ```
> # ceph health detail
> HEALTH_WARN 1 clients failing to respond to capability release
> [WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability
> release
>     mds.ceph-mon-2(mds.0): Client client-3-4:cephfs failing to respond to
> capability release client_id: 2100909

The MDS is requesting clients to trim their caches. This happens when an MDS
is hitting its cache limit (i.e., experiencing cache pressure); it can also
proactively request clients to release unused caps. However, in your case, the
clients are not releasing caps soon enough.
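
A quick way to confirm whether the MDS is actually up against its cache limit
is to query its cache status on the node hosting it (mds.ceph-mon-2 is taken
from your health detail output; adjust as needed):

```
# on the MDS host: cache memory in use vs. the configured limit
ceph daemon mds.ceph-mon-2 cache status
```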

Are you using default cache configurations for the MDS
(mds_cache_memory_limit)? How many clients do you have?
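
For reference, the running limit and per-rank client counts can be pulled with
something like this (daemon name again taken from your health detail):

```
# cache limit currently in effect on the active MDS (default is 4 GiB)
ceph config show mds.ceph-mon-2 mds_cache_memory_limit

# client session counts per MDS rank
ceph fs status
```

If the limit is still at the default and the MDS host has memory to spare,
raising it with `ceph config set mds mds_cache_memory_limit <bytes>` usually
reduces how aggressively the MDS has to recall caps from clients.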

> ```
> And the `mds session ls` output for the client is:
> ```
>     {
>         "id": 2100909,
>         "entity": {
>             "name": {
>                 "type": "client",
>                 "num": 2100909
>             },
>             "addr": {
>                 "type": "v1",
>                 "addr": "xxxx",
>                 "nonce": 2770840461
>             }
>         },
>         "state": "open",
>         "num_leases": 0,
>         "num_caps": 507799,
>         "request_load_avg": 1066,
>         "uptime": 274389.13036294398,
>         "requests_in_flight": 0,
>         "completed_requests": 0,
>         "reconnecting": false,
>         "recall_caps": {
>             "value": 0,
>             "halflife": 60
>         },
>         "release_caps": {
>             "value": 0,
>             "halflife": 60
>         },
>         "recall_caps_throttle": {
>             "value": 0,
>             "halflife": 1.5
>         },
>         "recall_caps_throttle2o": {
>             "value": 0,
>             "halflife": 0.5
>         },
>         "session_cache_liveness": {
>             "value": 9665.2693316477944,
>             "halflife": 300
>         },
>         "cap_acquisition": {
>             "value": 0,
>             "halflife": 10
>         },
>         "inst": "client.2100909 v1:xxxx:0/2770840461",
>         "completed_requests": [],
>         "prealloc_inos": [],
>         "used_inos": [],
>         "client_metadata": {
>             "client_features": {
>                 "feature_bits": "0x0000000000007bff"
>             },
>             "metric_spec": {
>                 "metric_flags": {
>                     "feature_bits": "0x000000000000001f"
>                 }
>             },
>             "entity_id": "cephfs",
>             "hostname": "client-3-4",
>             "kernel_version": "5.11.0-1024-gcp",
>             "root": "/"
>         }
>     },
> ```
> ceph version: `15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus
> (stable)`
> client kernel version: `Linux 5.11.0-1023-gcp`
> debug details-
> ```
> # cat /sys/kernel/debug/ceph/*/osdc
> REQUESTS 0 homeless 0
> LINGER REQUESTS
> BACKOFFS
>
> # cat /sys/kernel/debug/ceph/*/caps
> total           508281
> avail           45
> used            508230
> reserved        6
> min             1024
>
> # Nothing in mdsc
> ```
> The above warning seems to go away after some time, but it pops up multiple
> times a day with different clients. It also seems to go away for some time
> after I drop the cache using `echo 3 > /proc/sys/vm/drop_caches`.
> Please suggest how I can resolve this issue permanently.
>
> Best,
> Anmol Arora
>


-- 
Cheers,
Venky

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
