Re: [ceph-users] Crashed MDS (segfault)
Well, I couldn't identify which object I need to "rmomapkey" as instructed in https://tracker.ceph.com/issues/38452#note-12.

This is the log around the crash: https://pastebin.com/muw34Qdc

On Fri, Oct 25, 2019 at 11:27 AM Yan, Zheng wrote:
> On Fri, Oct 25, 2019 at 9:42 PM Gustavo Tonini wrote:
> >
> > Running "cephfs-data-scan init --force-init" solved the problem.
> >
> > Then I had to run "cephfs-journal-tool event recover_dentries summary" and truncate the journal to fix the corrupted journal.
> >
> > CephFS worked well for approximately 3 hours and then our MDS crashed again, apparently due to the bug described at https://tracker.ceph.com/issues/38452
> >
>
> does the method in issue #38452 work for you? if not, please set debug_mds to 10, and send the log around the crash to us
>
> Yan, Zheng
>
> > On Wed, Oct 23, 2019, 02:24 Yan, Zheng wrote:
> >>
> >> On Tue, Oct 22, 2019 at 1:49 AM Gustavo Tonini wrote:
> >> >
> >> > Is there a possibility to lose data if I use "cephfs-data-scan init --force-init"?
> >> >
> >>
> >> It only causes an incorrect stat on the root inode; it can't cause data loss.
> >>
> >> running 'ceph daemon mds.a scrub_path / force repair' after mds restart can fix the incorrect stat.
> >>
> >> > On Mon, Oct 21, 2019 at 4:36 AM Yan, Zheng wrote:
> >> >>
> >> >> On Fri, Oct 18, 2019 at 9:10 AM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >> >
> >> >> > Hi Zheng,
> >> >> > the cluster is running ceph mimic. This warning about the network only appears when using nautilus' cephfs-journal-tool.
> >> >> >
> >> >> > "cephfs-data-scan scan_links" does not report any issue.
> >> >> >
> >> >> > How could variable "newparent" be NULL at https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is there a way to fix this?
> >> >> >
> >> >>
> >> >> try 'cephfs-data-scan init'. It will set up the root inode's snaprealm.
> >> >>
> >> >> > On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng wrote:
> >> >> >>
> >> >> >> On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >> >> >
> >> >> >> > No. The cluster was just rebalancing.
> >> >> >> >
> >> >> >> > The journal seems damaged:
> >> >> >> >
> >> >> >> > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect
> >> >> >> > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
> >> >> >>
> >> >> >> a corrupted journal shouldn't cause an error like this. This is more like a
> >> >> >> network issue. please double-check the network config of your cluster.
> >> >> >>
> >> >> >> > Overall journal integrity: DAMAGED
> >> >> >> > Corrupt regions:
> >> >> >> > 0x1c5e4d904ab-1c5e4d9ddbc
> >> >> >> > ceph@deployer:~$
> >> >> >> >
> >> >> >> > Could a journal reset help with this?
> >> >> >> >
> >> >> >> > I could snapshot all FS pools and export the journal beforehand to guarantee a rollback to this state if something goes wrong with the journal reset.
> >> >> >> >
> >> >> >> > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote:
> >> >> >> >>
> >> >> >> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >> >> >> >
> >> >> >> >> > Dear ceph users,
> >> >> >> >> > we're experiencing a segfault during MDS startup (replay process) which is making our FS inaccessible.
> >> >> >> >> >
> >> >> >> >> > MDS log messages:
> >> >> >> >> >
> >> >> >> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2
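For readers hitting the same crash: the note-12 workaround amounts to a raw omap edit with rados, so the sketch below is only a template with hypothetical names. The metadata pool is assumed to be called cephfs_metadata, and the object and key have to be identified from a debug-level MDS log (which is exactly what is proving difficult here). The debug setting Zheng asks for can be raised without touching the omap at all:

  # list omap keys on the suspect dirfrag object (pool and object names are placeholders)
  rados -p cephfs_metadata listomapkeys 100.00000000
  # inspect a single key's value before removing anything
  rados -p cephfs_metadata getomapval 100.00000000 <key-from-tracker-note>
  # only then remove the offending key, as described in note 12 of the tracker issue
  rados -p cephfs_metadata rmomapkey 100.00000000 <key-from-tracker-note>

  # raise MDS debug logging to 10 for the next crash; putting "debug mds = 10" in the
  # [mds] section of ceph.conf before restarting the daemon achieves the same thing
  ceph tell mds.mds1 injectargs '--debug_mds 10'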
Re: [ceph-users] Crashed MDS (segfault)
Running "cephfs-data-scan init --force-init" solved the problem. Then I had to run "cephfs-journal-tool event recover_dentries summary" and truncate the journal to fix the corrupted journal. CephFS worked well for approximately 3 hours and then our MDS crashed again, apparently due to the bug described at https://tracker.ceph.com/issues/38452 On Wed, Oct 23, 2019, 02:24 Yan, Zheng wrote: > On Tue, Oct 22, 2019 at 1:49 AM Gustavo Tonini > wrote: > > > > Is there a possibility to lose data if I use "cephfs-data-scan init > --force-init"? > > > > It only causes incorrect stat on root inode, can't cause data lose. > > running 'ceph daemon mds.a scrub_path / force repair' after mds > restart can fix the incorrect stat. > > > On Mon, Oct 21, 2019 at 4:36 AM Yan, Zheng wrote: > >> > >> On Fri, Oct 18, 2019 at 9:10 AM Gustavo Tonini > wrote: > >> > > >> > Hi Zheng, > >> > the cluster is running ceph mimic. This warning about network only > appears when using nautilus' cephfs-journal-tool. > >> > > >> > "cephfs-data-scan scan_links" does not report any issue. > >> > > >> > How could variable "newparent" be NULL at > https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is > there a way to fix this? > >> > > >> > >> > >> try 'cephfs-data-scan init'. It will setup root inode's snaprealm. > >> > >> > On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng wrote: > >> >> > >> >> On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini < > gustavoton...@gmail.com> wrote: > >> >> > > >> >> > No. The cluster was just rebalancing. > >> >> > > >> >> > The journal seems damaged: > >> >> > > >> >> > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal > inspect > >> >> > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket > couldn't create socket (97) Address family not supported by protocol > >> >> > >> >> corrupted journal shouldn't cause error like this. This is more like > >> >> network issue. please double check network config of your cluster. > >> >> > >> >> > Overall journal integrity: DAMAGED > >> >> > Corrupt regions: > >> >> > 0x1c5e4d904ab-1c5e4d9ddbc > >> >> > ceph@deployer:~$ > >> >> > > >> >> > Could a journal reset help with this? > >> >> > > >> >> > I could snapshot all FS pools and export the journal before to > guarantee a rollback to this state if something goes wrong with jounal > reset. > >> >> > > >> >> > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote: > >> >> >> > >> >> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini < > gustavoton...@gmail.com> wrote: > >> >> >> > > >> >> >> > Dear ceph users, > >> >> >> > we're experiencing a segfault during MDS startup (replay > process) which is making our FS inaccessible. > >> >> >> > > >> >> >> > MDS log messages: > >> >> >> > > >> >> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 > 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. 
[getxattr] > v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 > 0 0) 0x2776340 con 0x18bd500 > >> >> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: > 18C_IO_Inode_Fetched > >> >> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544 > >> >> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs > volume v011' (expecting 'ceph fs volume v011') > >> >> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) > open_parents [1,head] > >> >> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 > 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 > [...2,head] ~mds0/
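For anyone finding this thread later, the recovery sequence described above, plus the follow-up scrub Zheng recommends, is sketched below. This is an outline only, using the names from this thread (filesystem fs_padrao, rank 0, MDS mds1); the journal reset is destructive, so the export comes first and the MDS daemons are assumed to be stopped:

  # keep a copy of the journal before anything destructive
  cephfs-journal-tool --rank=fs_padrao:0 journal export /root/fs_padrao.rank0.journal.bin
  # rebuild the root inode and its snaprealm, as suggested earlier in the thread
  cephfs-data-scan init --force-init
  # salvage whatever dentries the damaged journal still holds into the metadata pool
  cephfs-journal-tool --rank=fs_padrao:0 event recover_dentries summary
  # "truncate the journal": discard the remaining (corrupt) journal entries
  cephfs-journal-tool --rank=fs_padrao:0 journal reset
  # once an MDS is active again, repair the stats that --force-init leaves incorrect
  ceph daemon mds.mds1 scrub_path / force repair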
Re: [ceph-users] Crashed MDS (segfault)
Is there a possibility to lose data if I use "cephfs-data-scan init --force-init"?

On Mon, Oct 21, 2019 at 4:36 AM Yan, Zheng wrote:
> On Fri, Oct 18, 2019 at 9:10 AM Gustavo Tonini wrote:
> >
> > Hi Zheng,
> > the cluster is running ceph mimic. This warning about the network only appears when using nautilus' cephfs-journal-tool.
> >
> > "cephfs-data-scan scan_links" does not report any issue.
> >
> > How could variable "newparent" be NULL at https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is there a way to fix this?
> >
>
> try 'cephfs-data-scan init'. It will set up the root inode's snaprealm.
>
> > On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng wrote:
> >>
> >> On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >
> >> > No. The cluster was just rebalancing.
> >> >
> >> > The journal seems damaged:
> >> >
> >> > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect
> >> > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
> >>
> >> a corrupted journal shouldn't cause an error like this. This is more like a
> >> network issue. please double-check the network config of your cluster.
> >>
> >> > Overall journal integrity: DAMAGED
> >> > Corrupt regions:
> >> > 0x1c5e4d904ab-1c5e4d9ddbc
> >> > ceph@deployer:~$
> >> >
> >> > Could a journal reset help with this?
> >> >
> >> > I could snapshot all FS pools and export the journal beforehand to guarantee a rollback to this state if something goes wrong with the journal reset.
> >> >
> >> > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote:
> >> >>
> >> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >> >
> >> >> > Dear ceph users,
> >> >> > we're experiencing a segfault during MDS startup (replay process) which is making our FS inaccessible.
> >> >> >
> >> >> > MDS log messages:
> >> >> >
> >> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. [getxattr] v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 0 0) 0x2776340 con 0x18bd500
> >> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
> >> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> >> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents [1,head]
> >> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
> >> >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
> >> >> > Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> >> >> > Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x3
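On the rollback idea in the quoted part above (snapshotting the pools and exporting the journal before a reset): the journal side of that is straightforward with cephfs-journal-tool, sketched here with an arbitrary file path. One hedged caveat: RADOS pool snapshots generally cannot coexist with the self-managed snapshots CephFS uses, so the exported journal file is usually the more dependable restore point.

  # dump the current journal to a file before any destructive operation
  cephfs-journal-tool --rank=fs_padrao:0 journal export /root/fs_padrao.rank0.journal.bin
  # if a later reset makes things worse, the saved journal can be written back
  cephfs-journal-tool --rank=fs_padrao:0 journal import /root/fs_padrao.rank0.journal.bin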
Re: [ceph-users] Crashed MDS (segfault)
Hi Zheng,
the cluster is running ceph mimic. This warning about the network only appears when using nautilus' cephfs-journal-tool.

"cephfs-data-scan scan_links" does not report any issue.

How could variable "newparent" be NULL at https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is there a way to fix this?

On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng wrote:
> On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini wrote:
> >
> > No. The cluster was just rebalancing.
> >
> > The journal seems damaged:
> >
> > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect
> > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
>
> a corrupted journal shouldn't cause an error like this. This is more like a
> network issue. please double-check the network config of your cluster.
>
> > Overall journal integrity: DAMAGED
> > Corrupt regions:
> > 0x1c5e4d904ab-1c5e4d9ddbc
> > ceph@deployer:~$
> >
> > Could a journal reset help with this?
> >
> > I could snapshot all FS pools and export the journal beforehand to guarantee a rollback to this state if something goes wrong with the journal reset.
> >
> > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote:
> >>
> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini <gustavoton...@gmail.com> wrote:
> >> >
> >> > Dear ceph users,
> >> > we're experiencing a segfault during MDS startup (replay process) which is making our FS inaccessible.
> >> >
> >> > MDS log messages:
> >> >
> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. [getxattr] v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 0 0) 0x2776340 con 0x18bd500
> >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
> >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents [1,head]
> >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
> >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
> >> > Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> >> > Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) [0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) [0x6e1f63]#012 5: (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >> > Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
> >> > Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 5 none
> >> > Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 1 lockdep
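For context on the two offline tools being discussed, this is roughly how they are invoked with all MDS daemons stopped. A sketch only; as far as I can tell, newer releases of scan_links can also repair snap-related metadata, which is why it keeps coming up for this particular crash:

  # offline consistency check of dentry linkage; it came back clean here
  cephfs-data-scan scan_links
  # dump the snap table that the SnapRealm code reads during replay
  cephfs-table-tool all show snap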
[ceph-users] Crashed MDS (segfault)
Dear ceph users,
we're experiencing a segfault during MDS startup (replay process) which is making our FS inaccessible.

MDS log messages:

Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. [getxattr] v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 0 0) 0x2776340 con 0x18bd500
Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents [1,head]
Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) [0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) [0x6e1f63]#012 5: (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 5 none
Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 1 lockdep

Cluster status information:

  cluster:
    id:     b8205875-e56f-4280-9e52-6aab9c758586
    health: HEALTH_WARN
            1 filesystem is degraded
            1 nearfull osd(s)
            11 pool(s) nearfull

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon1(active), standbys: mon2, mon3
    mds: fs_padrao-1/1/1 up {0=mds1=up:replay(laggy or crashed)}
    osd: 90 osds: 90 up, 90 in

  data:
    pools:   11 pools, 1984 pgs
    objects: 75.99 M objects, 285 TiB
    usage:   457 TiB used, 181 TiB / 639 TiB avail
    pgs:     1896 active+clean
             87   active+clean+scrubbing+deep+repair
             1    active+clean+scrubbing

  io:
    client: 89 KiB/s wr, 0 op/s rd, 3 op/s wr

Has anyone seen anything like this?

Regards,
Gustavo.
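The NOTE at the end of the backtrace is about resolving the raw frame addresses against the ceph-mds binary. A hedged sketch of how that is usually done on the MDS host (Debian/Ubuntu package and path names assumed; they differ on RPM-based distributions):

  # install matching debug symbols so addresses resolve to functions and source lines
  apt-get install ceph-mds-dbg
  # full disassembly with relocations and source interleaved, as the log message suggests
  objdump -rdS /usr/bin/ceph-mds > ceph-mds.asm
  # or resolve a single frame, e.g. SnapRealm::merge_to(SnapRealm*)+0x308 at [0x72f488]
  addr2line -Cfe /usr/bin/ceph-mds 0x72f488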