Hi André,

On Sat, Jan 14, 2023 at 12:14 AM André de Freitas Smaira
<afsma...@ufscar.br> wrote:
>
> Hello!
>
> Yesterday we found some errors in our cephadm disks, which are making it
> impossible to access our HPC cluster:
>
> # ceph health detail
> HEALTH_WARN 3 failed cephadm daemon(s); insufficient standby MDS daemons available
> [WRN] CEPHADM_FAILED_DAEMON: 3 failed cephadm daemon(s)
>     daemon mds.cephfs.s1.nvopyf on s1.ceph.infra.ufscar.br is in error state
>     daemon mds.cephfs.s2.qikxmw on s2.ceph.infra.ufscar.br is in error state
>     daemon mds.cftv.s2.anybzk on s2.ceph.infra.ufscar.br is in error state
> [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more

What's in the MDS logs?
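
In case it helps, here is a minimal sketch of one way to pull those logs
on a cephadm deployment. It assumes the daemons run under systemd units
named ceph-<fsid>@<daemon>.service, and that you run it on the host that
carries the daemon (s1 or s2 in your output). The helper names
mds_unit_name and show_mds_log are made up for illustration; only the
daemon names come from your health output.

```shell
#!/bin/sh
# Hypothetical helpers; nothing runs until you call show_mds_log yourself.

# Build the systemd unit name for a cephadm-managed daemon.
# $1 = cluster fsid, $2 = daemon name (e.g. mds.cephfs.s1.nvopyf)
mds_unit_name() {
    printf 'ceph-%s@%s.service\n' "$1" "$2"
}

# Print the last 200 journal lines for one MDS daemon.
# Must be run on the host where that daemon is placed.
show_mds_log() {
    fsid=$(ceph fsid) || return 1
    journalctl --no-pager -n 200 -u "$(mds_unit_name "$fsid" "$1")"
}

# Example calls (left commented out on purpose):
#   show_mds_log mds.cephfs.s1.nvopyf
#   ceph orch ps      # daemon states as seen by the orchestrator
#   ceph fs status    # active ranks and standbys per file system
```

The tail of each failed daemon's log, together with the output of
"ceph fs status", would be the most useful thing to send back.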

>
> Googling, we found that we should remove the failed MDS daemons, but the
> data on these disks is relatively important. We would like to know whether
> they need to be removed or can be fixed and, if they have to be removed,
> whether the data will be lost. Please tell me if you need more information.
>
> Thanks in advance,
> André de Freitas Smaira
> Federal University of São Carlos - UFSCar
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Cheers,
Venky
