Hi David,

What does your filesystem look like? We have a few folders with a lot of subfolders, all of which are accessed randomly, and I suspect the balancer is moving a lot of folders between the MDS nodes.
We noticed that multiple active MDS daemons don't work well in this setup: we get the same errors as you. After restarting the problematic MDS, everything is fine for a few hours and then the errors show up again. So for now we have reverted to 1 MDS (the load is low with the holidays).
The load on the cluster was also very high (1000+ IOPS and 100+ MB traffic) with multiple MDS daemons, as if it kept rebalancing folders across the active MDS nodes. The load is currently around 500 IOPS and 50 MB traffic, or even lower.

After the holidays I'm going to see what I can achieve by manually pinning directories to MDS ranks.
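
A rough sketch of the kind of pinning I have in mind, not something I have run in production yet. It relies on the CephFS export pin, which is set through the `ceph.dir.pin` extended attribute on a directory. The mount point, the number of ranks and the round-robin assignment are assumptions for illustration only; a real plan would probably group directories by observed load.

```python
import os
from pathlib import Path

# CephFS export-pin virtual xattr; a value of -1 removes the pin again.
PIN_XATTR = "ceph.dir.pin"


def plan_pins(dirs, num_ranks):
    """Assign directories to MDS ranks round-robin (sorted for stable results)."""
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    return {d: i % num_ranks for i, d in enumerate(sorted(dirs))}


def apply_pins(plan, dry_run=True):
    """Print the equivalent setfattr commands, or set the xattr when dry_run is False."""
    for path, rank in plan.items():
        if dry_run:
            print(f"setfattr -n {PIN_XATTR} -v {rank} {path}")
        else:
            # Linux only; the path must live on a mounted CephFS.
            os.setxattr(path, PIN_XATTR, str(rank).encode())


if __name__ == "__main__":
    root = Path("/mnt/cephfs/projects")  # hypothetical mount point
    top_level = [str(p) for p in root.iterdir() if p.is_dir()] if root.is_dir() else []
    apply_pins(plan_pins(top_level, num_ranks=3))  # dry run by default
```

With the default dry run this only prints the commands, so the plan can be reviewed before any directory is actually pinned.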

Best regards, 
Sake 


On 31 Dec 2023 09:01, David Yang <gmydw1...@gmail.com> wrote:

I hope this message finds you well.

I have a CephFS cluster with 3 active MDS daemons, and use a 3-node
Samba setup to export it through the kernel client.

Currently, 2 MDS nodes are experiencing slow requests. We have
tried restarting the MDS. After a few hours in log replay, its
status became active again.
But the slow requests reappear. They do not seem to come from
clients, but from requests between the MDS nodes.

Looking forward to your prompt response.

HEALTH_WARN 2 MDSs report slow requests; 2 MDSs behind on trimming
[WRN] MDS_SLOW_REQUEST: 2 MDSs report slow requests
    mds.osd44(mds.0): 2 slow requests are blocked > 30 secs
    mds.osd43(mds.1): 2 slow requests are blocked > 30 secs
[WRN] MDS_TRIM: 2 MDSs behind on trimming
    mds.osd44(mds.0): Behind on trimming (18642/1024) max_segments: 1024, num_segments: 18642
    mds.osd43(mds.1): Behind on trimming (976612/1024) max_segments: 1024, num_segments: 976612

mds.0

{
    "ops": [
        {
            "description": "peer_request:mds.1:1",
            "initiated_at": "2023-12-31T11:19:38.679925+0800",
            "age": 4358.8009461359998,
            "duration": 4358.8009636369998,
            "type_data": {
                "flag_point": "dispatched",
                "reqid": "mds.1:1",
                "op_type": "peer_request",
                "leader_info": {
                    "leader": "1"
                },
                "events": [
                    {
                        "time": "2023-12-31T11:19:38.679925+0800",
                        "event": "initiated"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679925+0800",
                        "event": "throttled"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679925+0800",
                        "event": "header_read"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679936+0800",
                        "event": "all_read"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679940+0800",
                        "event": "dispatched"
                    }
                ]
            }
        },
        {
            "description": "peer_request:mds.1:2",
            "initiated_at": "2023-12-31T11:19:38.679938+0800",
            "age": 4358.8009326969996,
            "duration": 4358.8009763549999,
            "type_data": {
                "flag_point": "dispatched",
                "reqid": "mds.1:2",
                "op_type": "peer_request",
                "leader_info": {
                    "leader": "1"
                },
                "events": [
                    {
                        "time": "2023-12-31T11:19:38.679938+0800",
                        "event": "initiated"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679938+0800",
                        "event": "throttled"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679938+0800",
                        "event": "header_read"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679941+0800",
                        "event": "all_read"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679991+0800",
                        "event": "dispatched"
                    }
                ]
            }
        }
    ],
    "complaint_time": 30,
    "num_blocked_ops": 2
}


mds.1

{
    "ops": [
        {
            "description": "internal op exportdir:mds.1:1",
            "initiated_at": "2023-12-31T11:19:34.416451+0800",
            "age": 4384.38814198,
            "duration": 4384.3881617610004,
            "type_data": {
                "flag_point": "failed to wrlock, waiting",
                "reqid": "mds.1:1",
                "op_type": "internal_op",
                "internal_op": 5377,
                "op_name": "exportdir",
                "events": [
                    {
                        "time": "2023-12-31T11:19:34.416451+0800",
                        "event": "initiated"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416451+0800",
                        "event": "throttled"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416451+0800",
                        "event": "header_read"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416451+0800",
                        "event": "all_read"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416451+0800",
                        "event": "dispatched"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679923+0800",
                        "event": "requesting remote authpins"
                    },
                    {
                        "time": "2023-12-31T11:19:38.693981+0800",
                        "event": "failed to wrlock, waiting"
                    }
                ]
            }
        },
        {
            "description": "internal op exportdir:mds.1:2",
            "initiated_at": "2023-12-31T11:19:34.416482+0800",
            "age": 4384.3881117999999,
            "duration": 4384.3881714600002,
            "type_data": {
                "flag_point": "failed to wrlock, waiting",
                "reqid": "mds.1:2",
                "op_type": "internal_op",
                "internal_op": 5377,
                "op_name": "exportdir",
                "events": [
                    {
                        "time": "2023-12-31T11:19:34.416482+0800",
                        "event": "initiated"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416482+0800",
                        "event": "throttled"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416482+0800",
                        "event": "header_read"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416482+0800",
                        "event": "all_read"
                    },
                    {
                        "time": "2023-12-31T11:19:34.416482+0800",
                        "event": "dispatched"
                    },
                    {
                        "time": "2023-12-31T11:19:38.679929+0800",
                        "event": "requesting remote authpins"
                    },
                    {
                        "time": "2023-12-31T11:19:38.693995+0800",
                        "event": "failed to wrlock, waiting"
                    }
                ]
            }
        }
    ],
    "complaint_time": 30,
    "num_blocked_ops": 2
}
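
To compare dumps like the two above across restarts, a small summary helps. The sketch below is only an illustration and assumes each dump was saved as JSON text (for example from the MDS `dump_blocked_ops` admin command); it uses only the fields visible above (`ops`, `age`, `type_data.op_type`, `type_data.flag_point`). The function name is made up.

```python
import json
import sys
from collections import Counter


def summarize_blocked_ops(dump_text):
    """Return (count per (op_type, flag_point), age of the oldest op in seconds)."""
    dump = json.loads(dump_text)
    counts = Counter()
    oldest = 0.0
    for op in dump.get("ops", []):
        type_data = op.get("type_data", {})
        key = (type_data.get("op_type", "?"), type_data.get("flag_point", "?"))
        counts[key] += 1
        oldest = max(oldest, float(op.get("age", 0.0)))
    return dict(counts), oldest


if __name__ == "__main__":
    # Reads one dump from standard input.
    counts, oldest = summarize_blocked_ops(sys.stdin.read())
    for (op_type, flag_point), n in sorted(counts.items()):
        print(f"{n} x {op_type}: {flag_point}")
    print(f"oldest blocked op: {oldest:.0f} s")
```

For the mds.1 dump this groups both ops under `internal_op` with the flag point `failed to wrlock, waiting`, which makes it easy to see whether the same export is still stuck after a restart.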



I can't find any solution other than restarting the MDS service
that has slow requests.

Currently, the backlog of MDS logs in the metadata pool exceeds 4 TB.

Best regards,
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

