[lustre-discuss] Disk caching
I'm currently looking at how to get the best performance out of Lustre when deploying to Azure, and I would like to know whether disk caching should be enabled or considered at all, either at the OS level or the Azure level. I have the following Host Caching options available to me in Azure: None, Read-Only, or Read-Write.

I'm seeing high IO_WAIT build-up with my current configuration, which has Host Caching: Read-Write enabled in Azure for my disks (as well as the default OS disk caching), and I was wondering if that might be the cause.

Thanks
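A quick way to see whether the IO_WAIT is actually piling up on the Read-Write-cached data disks (a sketch, assuming a standard Linux server image with the sysstat tools installed; the device name is illustrative) is to watch per-device latency while the workload runs:

    # Extended per-device stats every 2 seconds: high await/%util on the
    # OST data disks (e.g. /dev/sdc) alongside high iowait implicates the
    # disk/cache layer; idle disks with high iowait point elsewhere.
    iostat -x 2

    # System-wide run queue and the wa (iowait) column as a cross-check.
    vmstat 2

If the waits track the cached disks, re-running the same workload with Host Caching set to None would isolate the Azure cache's contribution; many Lustre deployments run server data disks with write caching disabled, since ldiskfs does its own journaling underneath.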
Re: [lustre-discuss] [EXTERNAL] Re: Lustre Timeouts/Filesystem Hanging
Thanks, will take a look. Are there any other areas I should be looking at? Should I be applying any Lustre tuning?

Thanks

From: Oral, H.
Sent: Monday, October 28, 2019 7:06:41 PM
To: Louis Allen; Carlson, Timothy S; lustre-discuss@lists.lustre.org
Subject: Re: [EXTERNAL] Re: [lustre-discuss] Lustre Timeouts/Filesystem Hanging

For inspecting client-side I/O, you can use Darshan.

Thanks,

Sarp

--
Sarp Oral, PhD
National Center for Computational Sciences
Oak Ridge National Laboratory
ora...@ornl.gov
865-574-2173

On 10/28/19, 1:58 PM, "lustre-discuss on behalf of Louis Allen" wrote:

Thanks for the reply, Tim. Are there any tools I can use to see if that is the cause? Could any tuning possibly help the situation?

Thanks

From: Carlson, Timothy S
Sent: Monday, 28 October 2019, 17:24
To: Louis Allen; lustre-discuss@lists.lustre.org
Subject: RE: Lustre Timeouts/Filesystem Hanging

In my experience, this is almost always related to some code doing really bad I/O. Let's say you have a 1000-rank MPI code doing open/read 4k/close on a few specific files on that OST. That will make for a bad day. The other place you can see this, and this isn't your case, is when ZFS refuses to give up on a disk that is failing and your overall I/O suffers from ZFS continuing to try to read from a disk that it should just kick out.

Tim

From: lustre-discuss On Behalf Of Louis Allen
Sent: Monday, October 28, 2019 10:16 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Lustre Timeouts/Filesystem Hanging

Hello,

Lustre (2.12) seems to be hanging quite frequently (5+ times a day) for us, and one of the four OSS servers is reporting an extremely high load average (150+) even though its CPU usage is very low, so the load must be coming from something else, possibly CPU_IO_WAIT. On the OSS server showing the high load averages we can also see multiple LustreError messages in /var/log/messages:

[quoted stack traces snipped; the full dump appears in the original message below]
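On the Darshan suggestion: a minimal sketch of profiling one suspect job with it (the library path, log path, and application binary are illustrative; sites usually provide Darshan as a module):

    # Instrument a single MPI run by preloading the Darshan runtime.
    export LD_PRELOAD=/usr/lib64/libdarshan.so
    mpirun -np 1000 ./suspect_app

    # Each job produces one compressed log; dump its counters with:
    darshan-parser /path/to/logdir/<jobid>.darshan | less

The per-file counters (open counts, access-size histogram) would show an open/read-4k/close pattern of the kind Tim describes, including which files and ranks are responsible.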
Re: [lustre-discuss] Lustre Timeouts/Filesystem Hanging
Thanks for the reply, Tim. Are there any tools I can use to see if that is the cause? Could any tuning possibly help the situation?

Thanks

From: Carlson, Timothy S
Sent: Monday, 28 October 2019, 17:24
To: Louis Allen; lustre-discuss@lists.lustre.org
Subject: RE: Lustre Timeouts/Filesystem Hanging

In my experience, this is almost always related to some code doing really bad I/O. Let's say you have a 1000-rank MPI code doing open/read 4k/close on a few specific files on that OST. That will make for a bad day. The other place you can see this, and this isn't your case, is when ZFS refuses to give up on a disk that is failing and your overall I/O suffers from ZFS continuing to try to read from a disk that it should just kick out.

Tim

From: lustre-discuss On Behalf Of Louis Allen
Sent: Monday, October 28, 2019 10:16 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Lustre Timeouts/Filesystem Hanging

Hello,

Lustre (2.12) seems to be hanging quite frequently (5+ times a day) for us, and one of the four OSS servers is reporting an extremely high load average (150+) even though its CPU usage is very low, so the load must be coming from something else, possibly CPU_IO_WAIT. On the OSS server showing the high load averages we can also see multiple LustreError messages in /var/log/messages:

[quoted stack traces snipped; the full dump appears in the original message below]
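A server-side way to test Tim's small-I/O theory from the overloaded OSS itself (a sketch, assuming ldiskfs OSTs on Lustre 2.12, run as root on the OSS):

    # Histogram of bulk read/write sizes per OST; a distribution
    # dominated by 4K transfers supports the bad-I/O-pattern theory.
    lctl get_param obdfilter.*.brw_stats

    # Service thread usage for the ost_io service; every thread busy
    # matches the LNet "inactive for 200s ... system was overloaded"
    # messages quoted in this thread.
    lctl get_param ost.OSS.ost_io.threads_started ost.OSS.ost_io.threads_max

Since the hung threads are all sitting in tgt_sync/ofd_sync_hdl (i.e. fsync handling waiting on jbd2 journal commits), identifying the client or job issuing the syncs, or capping threads_max so sync storms cannot occupy every service thread, are the usual mitigations to try.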
[lustre-discuss] Lustre Timeouts/Filesystem Hanging
Hello,

Lustre (2.12) seems to be hanging quite frequently (5+ times a day) for us, and one of the four OSS servers is reporting an extremely high load average (150+) even though its CPU usage is very low, so the load must be coming from something else, possibly CPU_IO_WAIT. On the OSS server showing the high load averages we can also see multiple LustreError messages in /var/log/messages:

Oct 28 11:22:23 pazlustreoss001 kernel: LNet: Service thread pid 2403 was inactive for 200.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Oct 28 11:22:23 pazlustreoss001 kernel: LNet: Skipped 4 previous similar messages
Oct 28 11:22:23 pazlustreoss001 kernel: Pid: 2403, comm: ll_ost00_068 3.10.0-957.10.1.el7_lustre.x86_64 #1 SMP Sun May 26 21:48:35 UTC 2019
Oct 28 11:22:23 pazlustreoss001 kernel: Call Trace:
Oct 28 11:22:23 pazlustreoss001 kernel: [] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Oct 28 11:22:23 pazlustreoss001 kernel: [] jbd2_complete_transaction+0x52/0xa0 [jbd2]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ldiskfs_sync_file+0x2e2/0x320 [ldiskfs]
Oct 28 11:22:23 pazlustreoss001 kernel: [] vfs_fsync_range+0x20/0x30
Oct 28 11:22:23 pazlustreoss001 kernel: [] osd_object_sync+0xb1/0x160 [osd_ldiskfs]
Oct 28 11:22:23 pazlustreoss001 kernel: [] tgt_sync+0xb7/0x270 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ofd_sync_hdl+0x111/0x530 [ofd]
Oct 28 11:22:23 pazlustreoss001 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] kthread+0xd1/0xe0
Oct 28 11:22:23 pazlustreoss001 kernel: [] ret_from_fork_nospec_end+0x0/0x39
Oct 28 11:22:23 pazlustreoss001 kernel: [] 0x
Oct 28 11:22:23 pazlustreoss001 kernel: LustreError: dumping log to /tmp/lustre-log.1572261743.2403
Oct 28 11:22:23 pazlustreoss001 kernel: Pid: 2292, comm: ll_ost03_043 3.10.0-957.10.1.el7_lustre.x86_64 #1 SMP Sun May 26 21:48:35 UTC 2019
Oct 28 11:22:23 pazlustreoss001 kernel: Call Trace:
Oct 28 11:22:23 pazlustreoss001 kernel: [] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Oct 28 11:22:23 pazlustreoss001 kernel: [] jbd2_complete_transaction+0x52/0xa0 [jbd2]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ldiskfs_sync_file+0x2e2/0x320 [ldiskfs]
Oct 28 11:22:23 pazlustreoss001 kernel: [] vfs_fsync_range+0x20/0x30
Oct 28 11:22:23 pazlustreoss001 kernel: [] osd_object_sync+0xb1/0x160 [osd_ldiskfs]
Oct 28 11:22:23 pazlustreoss001 kernel: [] tgt_sync+0xb7/0x270 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ofd_sync_hdl+0x111/0x530 [ofd]
Oct 28 11:22:23 pazlustreoss001 kernel: [] tgt_request_handle+0xaea/0x1580 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: LNet: Service thread pid 2403 completed after 200.29s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Oct 28 11:22:23 pazlustreoss001 kernel: LNet: Skipped 48 previous similar messages
Oct 28 11:22:23 pazlustreoss001 kernel: [] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
Oct 28 11:22:23 pazlustreoss001 kernel: [] kthread+0xd1/0xe0
Oct 28 11:22:23 pazlustreoss001 kernel: [] ret_from_fork_nospec_end+0x0/0x39
Oct 28 11:22:23 pazlustreoss001 kernel: [] 0x

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org