Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-17 Thread Andreas Dilger via lustre-discuss
d_delete()) ahome3-MDD: > error unlinking orphan [0x20001bb76:0xd86e:0x0]: rc = -2) - not sure if this > could be related to the problem. > > We also had one login node locked up with a load of >4000 and many ps > commands uninterruptible and hung. Again, not sure if this coul

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-17 Thread Kaizaad Bilimorya
> Mar 17 09:44:46 amds01b kernel: [] > > ptlrpc_main+0xb34/0x1470 [ptlrpc] > > Mar 17 09:44:46 amds01b kernel: [] ? > > ptlrpc_register_service+0xf80/0xf80 [ptlrpc] > > Mar 17 09:44:46 amds01b kernel: [] kthread+0xd1/0xe0 > > Mar 17 09:44:46 amds01b kernel: [] ? > > i

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-17 Thread Bernd Melchers
t_from_fork_nospec_begin+0x7/0x21 > Mar 17 09:44:46 amds01b kernel: [] ? > insert_kthread_work+0x40/0x40 > > Has anyone seen similar when using Lustre on ZFS and were you able to recover > the filesystem to a workable state - I've found a few similar problems in the > zfs mailing lists over the last

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-17 Thread Mountford, Christopher J. (Dr.) via lustre-discuss
years, but no solutions. Kind Regards, Christopher. From: Mountford, Christopher J. (Dr.) Sent: 15 March 2023 20:35 To: Colin Faber; Mountford, Christopher J. (Dr.) Cc: lustre-discuss Subject: Re: [lustre-discuss] Repeated ZFS panics on MDT The ZFS scrub

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-15 Thread Mountford, Christopher J. (Dr.) via lustre-discuss
rds, Christopher. From: lustre-discuss on behalf of Mountford, Christopher J. (Dr.) via lustre-discuss Sent: 15 March 2023 19:21 To: Colin Faber Cc: lustre-discuss Subject: Re: [lustre-discuss] Repeated ZFS panics on MDT

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-15 Thread Mountford, Christopher J. (Dr.) via lustre-discuss
small ssd based MDT). Thanks, Chris From: Colin Faber Sent: 15 March 2023 18:41 To: Mountford, Christopher J. (Dr.) Cc: lustre-discuss Subject: Re: [lustre-discuss] Repeated ZFS panics on MDT

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-15 Thread Colin Faber via lustre-discuss
Have you tried resilvering the pool? On Wed, Mar 15, 2023, 11:57 AM Mountford, Christopher J. (Dr.) via lustre-discuss wrote: > I'm hoping someone can offer some suggestions. > > We have a problem on our production Lustre/ZFS filesystem (CentOS 7, ZFS > 0.7.13, Lustre 2.12.9); so far I've drawn a

[lustre-discuss] Repeated ZFS panics on MDT

2023-03-15 Thread Mountford, Christopher J. (Dr.) via lustre-discuss
I'm hoping someone can offer some suggestions. We have a problem on our production Lustre/ZFS filesystem (CentOS 7, ZFS 0.7.13, Lustre 2.12.9); so far I've drawn a blank trying to track down the cause of this. We see the following zfs panic message in the logs (in every case the VERIFY3/panic