Dear Xiubo, thanks for your reply.

> BTW, did you enable the async dirop? Currently this is disabled by
> default in 4.18.0-486.el8.x86_64.

I have never heard about that option until now. How do I check it and how do I disable it if necessary? I'm in meetings pretty much all day and will try to send some more info later.
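If the async dirop feature corresponds to the kernel client's wsync/nowsync mount options, which is just my guess from a quick look (please correct me if that's wrong), then I assume I could check it on the file server with something like

# grep ceph /proc/mounts

and look for "nowsync" in the option list of the cephfs mounts. To force it off I would then remount the affected share with "wsync", for example

# mount -o remount,wsync /shares/samba/rit-oil

I haven't tried this yet, so I don't know whether a remount is enough or whether a full umount/mount cycle is needed.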
> Could you reproduce this by enabling the mds debug logs?

Not right now. Our users are annoyed enough already.
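Just so I get it right once I find a quiet window: I assume cranking up the MDS logs would be something like

# ceph tell mds.ceph-23 config set debug_mds 20
# ceph tell mds.ceph-23 config set debug_ms 1

on the MDS that crashes (runtime override only, so it gets reverted on restart), and then triggering the dump inode command again. Is that what you need, or do you want other debug levels as well?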
I first need to figure out how to move the troublesome inode somewhere else where I might be able to do something. The boot message shows up on this one file server every time. Is there any information about what dir/inode might be causing the issue? How could I reproduce this without affecting the users, say, by re-creating the same condition somewhere else? Any hints are appreciated.
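One idea I had for finding the path behind ino 0x20011d3e5cb without touching the crashing MDS (untested on my side, so please tell me if this is a bad idea): read the backtrace xattr of the inode's first object directly from RADOS and decode it, roughly

# rados -p <cephfs data pool> getxattr 20011d3e5cb.00000000 parent > parent.bin
# ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json

where the pool would have to be the metadata pool instead if the inode is a directory. The pool placeholder and the exact object name are my assumptions derived from the inode number in the log, so take this with a grain of salt.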
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Xiubo Li <xiu...@redhat.com>
Sent: Thursday, May 11, 2023 3:45 AM
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: mds dump inode crashes file system

Hey Frank,

On 5/10/23 21:44, Frank Schilder wrote:
> The kernel message that shows up on boot on the file server in text format:
>
> May 10 13:56:59 rit-pfile01 kernel: WARNING: CPU: 3 PID: 34 at fs/ceph/caps.c:689 ceph_add_cap+0x53e/0x550 [ceph]
> May 10 13:56:59 rit-pfile01 kernel: Modules linked in: ceph libceph dns_resolver nls_utf8 isofs cirrus drm_shmem_helper intel_rapl_msr iTCO_wdt intel_rapl_common iTCO_vendor_support drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr joydev virtio_net drm i2c_i801 net_failover virtio_balloon failover lpc_ich nfsd nfs_acl lockd auth_rpcgss grace sunrpc sr_mod cdrom sg xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata serio_raw virtio_blk virtio_console virtio_scsi dm_mirror dm_region_hash dm_log dm_mod fuse
> May 10 13:56:59 rit-pfile01 kernel: CPU: 3 PID: 34 Comm: kworker/3:0 Not tainted 4.18.0-486.el8.x86_64 #1
> May 10 13:56:59 rit-pfile01 kernel: Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
> May 10 13:56:59 rit-pfile01 kernel: Workqueue: ceph-msgr ceph_con_workfn [libceph]
> May 10 13:56:59 rit-pfile01 kernel: RIP: 0010:ceph_add_cap+0x53e/0x550 [ceph]
> May 10 13:56:59 rit-pfile01 kernel: Code: c0 48 c7 c7 c0 69 7f c0 e8 6c 4c 72 c3 0f 0b 44 89 7c 24 04 e9 7e fc ff ff 44 8b 7c 24 04 e9 68 fe ff ff 0f 0b e9 c9 fc ff ff <0f> 0b e9 0a fe ff ff 0f 0b e9 12 fe ff ff 0f 0b 66 90 0f 1f 44 00
> May 10 13:56:59 rit-pfile01 kernel: RSP: 0018:ffffa4d000d87b48 EFLAGS: 00010217
> May 10 13:56:59 rit-pfile01 kernel: RAX: 0000000000000000 RBX: 0000000000000005 RCX: dead000000000200
> May 10 13:56:59 rit-pfile01 kernel: RDX: ffff92d7d7f6e7d0 RSI: ffff92d7d7f6e7d0 RDI: ffff92d7d7f6e7c8
> May 10 13:56:59 rit-pfile01 kernel: RBP: ffff92d7c5588970 R08: ffff92d7d7f6e7d0 R09: 0000000000000001
> May 10 13:56:59 rit-pfile01 kernel: R10: ffff92d80078cbb8 R11: 00000000ffff92c0 R12: 0000000000000155
> May 10 13:56:59 rit-pfile01 kernel: R13: ffff92d80078cbb8 R14: ffff92d80078cbc0 R15: 0000000000000001
> May 10 13:56:59 rit-pfile01 kernel: FS: 0000000000000000(0000) GS:ffff92d937d80000(0000) knlGS:0000000000000000
> May 10 13:56:59 rit-pfile01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> May 10 13:56:59 rit-pfile01 kernel: CR2: 00007f74435b9008 CR3: 00000001099fa000 CR4: 00000000003506e0
> May 10 13:56:59 rit-pfile01 kernel: Call Trace:
> May 10 13:56:59 rit-pfile01 kernel: ceph_handle_caps+0xdf2/0x1780 [ceph]
> May 10 13:56:59 rit-pfile01 kernel: mds_dispatch+0x13a/0x670 [ceph]
> May 10 13:56:59 rit-pfile01 kernel: ceph_con_process_message+0x79/0x140 [libceph]
> May 10 13:56:59 rit-pfile01 kernel: ? calc_signature+0xdf/0x110 [libceph]
> May 10 13:56:59 rit-pfile01 kernel: ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
> May 10 13:56:59 rit-pfile01 kernel: ceph_con_workfn+0x329/0x680 [libceph]
> May 10 13:56:59 rit-pfile01 kernel: process_one_work+0x1a7/0x360
> May 10 13:56:59 rit-pfile01 kernel: worker_thread+0x30/0x390
> May 10 13:56:59 rit-pfile01 kernel: ? create_worker+0x1a0/0x1a0
> May 10 13:56:59 rit-pfile01 kernel: kthread+0x134/0x150
> May 10 13:56:59 rit-pfile01 kernel: ? set_kthread_struct+0x50/0x50
> May 10 13:56:59 rit-pfile01 kernel: ret_from_fork+0x35/0x40
> May 10 13:56:59 rit-pfile01 kernel: ---[ end trace 84e4b3694bbe9fde ]---

BTW, did you enable the async dirop? Currently this is disabled by default in 4.18.0-486.el8.x86_64.

The async dirop is buggy and we have hit a very similar bug to the one above, please see https://tracker.ceph.com/issues/55857. That was a race between client requests and dir migration in the MDS, and it was fixed a long time ago.

If you didn't enable the async dirop then this should be a different issue, but I guess it is also a race between client requests and dir migration in the non-async dirop case. And from the kernel call trace, I guess the MDS was doing dir splitting and migration, and the MDS daemon's crash was possibly caused by: "dirfragtree.dump(f);"

Could you reproduce this by enabling the mds debug logs?

Thanks

- Xiubo

> I can't interpret it, some help is appreciated.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <fr...@dtu.dk>
> Sent: Wednesday, May 10, 2023 3:36 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: mds dump inode crashes file system
>
> For the "mds dump inode" command I could find the crash in the log; see below. Most of the log content is the past OPS dump from the 3 MDS restarts that happened. It contains the last 10000 OPS before the crash and I can upload the log if someone can use it. The crash stack trace is somewhat truncated for readability:
>
> 2023-05-10T12:54:53.142+0200 7fe971ca6700 1 mds.ceph-23 Updating MDS map to version 892464 from mon.4
> 2023-05-10T13:39:50.962+0200 7fe96fca2700 0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 61.705410 seconds ago
> 2023-05-10T13:39:52.550+0200 7fe971ca6700 1 mds.ceph-23 Updating MDS map to version 892465 from mon.4
> 2023-05-10T13:40:50.963+0200 7fe96fca2700 0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 121.706193 seconds ago
> 2023-05-10T13:42:50.966+0200 7fe96fca2700 0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 241.709072 seconds ago
> 2023-05-10T13:44:00.506+0200 7fe972ca8700 1 mds.ceph-23 asok_command: dump inode {number=2199322355147,prefix=dump inode} (starting...)
> 2023-05-10T13:44:00.520+0200 7fe972ca8700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/common/buffer.cc: In function 'const char* ceph::buffer::v15_2_0::ptr::c_str() const' thread 7fe972ca8700 time 2023-05-10T13:44:00.507652+0200
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/common/buffer.cc: 501: FAILED ceph_assert(_raw)
>
> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7fe979ae9b92]
> 2: (()+0x27ddac) [0x7fe979ae9dac]
> 3: (()+0x5ce831) [0x7fe979e3a831]
> 4: (InodeStoreBase::dump(ceph::Formatter*) const+0x153) [0x55c08c59b543]
> 5: (CInode::dump(ceph::Formatter*, int) const+0x144) [0x55c08c59b8d4]
> 6: (MDCache::dump_inode(ceph::Formatter*, unsigned long)+0x7c) [0x55c08c41e00c]
> 7: (MDSRank::command_dump_inode(ceph::Formatter*, ..., std::ostream&)+0xb5) [0x55c08c353e75]
> 8: (MDSRankDispatcher::handle_asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x2296) [0x55c08c36c5f6]
> 9: (MDSDaemon::asok_command(std::basic_string_view<char, ..., ceph::buffer::v15_2_0::list&)>)+0x75b) [0x55c08c340eab]
> 10: (MDSSocketHook::call_async(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x6a) [0x55c08c34f9ca]
> 11: (AdminSocket::execute_command(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, ..., ceph::buffer::v15_2_0::list&)>)+0x6f9) [0x7fe979bece59]
> 12: (AdminSocket::do_tell_queue()+0x289) [0x7fe979bed809]
> 13: (AdminSocket::entry()+0x4d3) [0x7fe979beefd3]
> 14: (()+0xc2ba3) [0x7fe977afaba3]
> 15: (()+0x81ca) [0x7fe9786bf1ca]
> 16: (clone()+0x43) [0x7fe977111dd3]
>
> 2023-05-10T13:44:00.522+0200 7fe972ca8700 -1 *** Caught signal (Aborted) ** in thread 7fe972ca8700 thread_name:admin_socket
>
> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
> 1: (()+0x12ce0) [0x7fe9786c9ce0]
> 2: (gsignal()+0x10f) [0x7fe977126a9f]
> 3: (abort()+0x127) [0x7fe9770f9e05]
> 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7fe979ae9be3]
> 5: (()+0x27ddac) [0x7fe979ae9dac]
> 6: (()+0x5ce831) [0x7fe979e3a831]
> 7: (InodeStoreBase::dump(ceph::Formatter*) const+0x153) [0x55c08c59b543]
> 8: (CInode::dump(ceph::Formatter*, int) const+0x144) [0x55c08c59b8d4]
> 9: (MDCache::dump_inode(ceph::Formatter*, unsigned long)+0x7c) [0x55c08c41e00c]
> 10: (MDSRank::command_dump_inode(ceph::Formatter*, ..., std::ostream&)+0xb5) [0x55c08c353e75]
> 11: (MDSRankDispatcher::handle_asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x2296) [0x55c08c36c5f6]
> 12: (MDSDaemon::asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x75b) [0x55c08c340eab]
> 13: (MDSSocketHook::call_async(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x6a) [0x55c08c34f9ca]
> 14: (AdminSocket::execute_command(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, ..., ceph::buffer::v15_2_0::list&)>)+0x6f9) [0x7fe979bece59]
> 15: (AdminSocket::do_tell_queue()+0x289) [0x7fe979bed809]
> 16: (AdminSocket::entry()+0x4d3) [0x7fe979beefd3]
> 17: (()+0xc2ba3) [0x7fe977afaba3]
> 18: (()+0x81ca) [0x7fe9786bf1ca]
> 19: (clone()+0x43) [0x7fe977111dd3]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <fr...@dtu.dk>
> Sent: Wednesday, May 10, 2023 2:33 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] mds dump inode crashes file system
>
> Hi all,
>
> I have an annoying problem with a specific ceph fs client. We have a file server on which we re-export kernel mounts via samba (all mounts with the noshare option). On one of these re-exports we have recurring problems. Today I caught it with
>
> 2023-05-10T13:39:50.963685+0200 mds.ceph-23 (mds.1) 1761 : cluster [WRN] client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 61.705410 seconds ago
>
> and I wanted to look up what path the inode 0x20011d3e5cb points to. Unfortunately, the command
>
> ceph tell "mds.*" dump inode 0x20011d3e5cb
>
> crashes an MDS in a way that it restarts itself, but doesn't seem to come back clean (it does not fail over to a stand-by). If I repeat the command above, it crashes the MDS again. Execution on other MDS daemons succeeds, for example:
>
> # ceph tell "mds.ceph-24" dump inode 0x20011d3e5cb
> 2023-05-10T14:14:37.091+0200 7fa47ffff700 0 client.210149523 ms_handle_reset on v2:192.168.32.88:6800/3216233914
> 2023-05-10T14:14:37.124+0200 7fa4857fa700 0 client.210374440 ms_handle_reset on v2:192.168.32.88:6800/3216233914
> dump inode failed, wrong inode number or the inode is not cached
>
> The caps recall gets the client evicted at some point, but it doesn't manage to come back clean. On a single ceph fs mount point I see this
>
> # ls /shares/samba/rit-oil
> ls: cannot access '/shares/samba/rit-oil': Stale file handle
>
> All other mount points are fine, just this one acts up. A "mount -o remount /shares/samba/rit-oil" crashed the entire server and I had to do a cold reboot. On reboot I see this message: https://imgur.com/a/bOSLxBb , which only occurs on this one file server (we are running a few of those). Does this point to a more serious problem, like file system corruption? Should I try an fs scrub on the corresponding path?
>
> Some info about the system:
>
> The file server's kernel version is quite recent, updated two weeks ago:
>
> $ uname -r
> 4.18.0-486.el8.x86_64
> # cat /etc/redhat-release
> CentOS Stream release 8
>
> Our ceph cluster is on the latest Octopus and we use the packages from the octopus el8 repo on this server.
>
> We have several such shares and they all work fine. It is only on one share where we have persistent problems with the mount point hanging or the server freezing and crashing.
>
> After working hours I will try a proper fail of the "broken" MDS to see if I can execute the dump inode command without it crashing everything.
>
> In the meantime, any hints would be appreciated. I see that we have an exceptionally large MDS log for the problematic one. Any hint on what to look for would be appreciated; it contains a lot from the recovery operations:
>
> # pdsh -w ceph-[08-17,23-24] ls -lh "/var/log/ceph/ceph-mds.ceph-??.log"
>
> ceph-23: -rw-r--r--. 1 ceph ceph 15M May 10 14:28 /var/log/ceph/ceph-mds.ceph-23.log *** huge ***
> ceph-24: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-24.log
> ceph-10: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-10.log
> ceph-13: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-13.log
> ceph-08: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-08.log
> ceph-15: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-15.log
> ceph-17: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-17.log
> ceph-14: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-14.log
> ceph-09: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-09.log
> ceph-16: -rw-r--r--. 1 ceph ceph 15K May 10 14:28 /var/log/ceph/ceph-mds.ceph-16.log
> ceph-11: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-11.log
> ceph-12: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-12.log
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io