[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
Hi Matt,

On 11/15/23 02:40, Matt Larson wrote:
> On CentOS 7 systems with the CephFS kernel client, if the data pool has a
> `nearfull` status there is a slight reduction in write speeds (possibly
> 20-50% fewer IOPS).
>
> On a similar Rocky 8 system with the CephFS kernel client, if the data pool
> has `nearfull` status, a similar test of write speeds at different block
> sizes shows the IOPS bottlenecked at < 150, vs the typical write
> performance that might be with 2-3 IOPS at a particular block size.
>
> Is there any way to avoid the extremely bottlenecked IOPS seen on the Rocky
> 8 system CephFS kernel clients during the `nearfull` condition or to have
> behavior more similar to the CentOS 7 CephFS clients?
>
> Do different OSes or Linux kernels differ greatly in how they respond to
> this condition or limit IOPS? Are there any options to adjust how they
> limit IOPS?

Just to be clear, the kernel on CentOS 7 is older than the kernel on Rocky 8, so they may behave differently in some ways. BTW, were the Ceph versions the same in your tests on CentOS 7 and Rocky 8?

I saw that libceph.ko has some code that handles the OSD FULL case, but I didn't find anything for the nearfull case. Let's get help from Ilya on this.

@Ilya,

Do you know whether the osdc behaves differently when it detects that the pool is nearfull?

Thanks
- Xiubo

> Thanks,
> Matt

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
On Thu, Nov 16, 2023 at 3:21 AM Xiubo Li wrote:
> Just to be clear, the kernel on CentOS 7 is older than the kernel on
> Rocky 8, so they may behave differently in some ways. BTW, were the Ceph
> versions the same in your tests on CentOS 7 and Rocky 8?
>
> I saw that libceph.ko has some code that handles the OSD FULL case,
> but I didn't find anything for the nearfull case. Let's get help from
> Ilya on this.
>
> @Ilya,
>
> Do you know whether the osdc behaves differently when it detects that
> the pool is nearfull?

Hi Xiubo,

It's not libceph or osdc, but CephFS itself. I think Matt is running against this fix:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7614209736fbc4927584d4387faade4f31444fce

It was previously discussed in detail here:

https://lore.kernel.org/ceph-devel/caoi1vp_k2ybx9+jffmuhcuxsyngftqjyh+frusyy4ureprk...@mail.gmail.com/

The solution is to add additional capacity or bump the nearfull threshold:

https://lore.kernel.org/ceph-devel/23f46ca6dd1f45a78beede92fc91d...@mpinat.mpg.de/

Thanks,

Ilya
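For readers following along, the client-side behavior discussed in this thread can be modeled roughly as below. This is a minimal user-space sketch, not the kernel code: `classify_write` and `enum write_mode` are hypothetical names, the map-wide flag values are the ones quoted later in this thread, and the per-pool flag bit positions are assumptions made for illustration. The idea it shows is that a full condition rejects data writes, while a nearfull condition (map-wide or on the file's own pool) turns each write into a synchronous one. A client that must wait for every write to complete would plausibly show the kind of IOPS drop Matt reports.

```c
#include <stdint.h>

/* Map-wide flags, values as quoted later in this thread. */
#define CEPH_OSDMAP_NEARFULL (1u << 0) /* sync writes (near ENOSPC) */
#define CEPH_OSDMAP_FULL     (1u << 1) /* no data writes (ENOSPC) */

/* Per-pool flags; these bit positions are assumed for illustration. */
#define CEPH_POOL_FLAG_FULL     (1ull << 1)
#define CEPH_POOL_FLAG_NEARFULL (1ull << 11)

/* Hypothetical outcome of the check for a single write request. */
enum write_mode { WRITE_BUFFERED, WRITE_SYNC, WRITE_ENOSPC };

/*
 * Hypothetical helper: decide how a write is handled, given the map-wide
 * flags and the flags of the pool that the file's layout points at.
 */
static enum write_mode classify_write(uint32_t map_flags, uint64_t pool_flags)
{
    /* A full map or pool takes priority: data writes are rejected. */
    if ((map_flags & CEPH_OSDMAP_FULL) || (pool_flags & CEPH_POOL_FLAG_FULL))
        return WRITE_ENOSPC;

    /* Nearfull does not reject the write, it forces it to be synchronous. */
    if ((map_flags & CEPH_OSDMAP_NEARFULL) ||
        (pool_flags & CEPH_POOL_FLAG_NEARFULL))
        return WRITE_SYNC;

    return WRITE_BUFFERED;
}
```

Under this model, clearing the nearfull condition (more capacity or a higher nearfull threshold, as suggested above) is what returns writes to the buffered path.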
[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
Ilya,

Thank you for providing these discussion threads on the kernel fixes. They show where the behavior changed and how this affects the clients.

What is the expected behavior in the CephFS client when the file system has multiple data pools? Does a 'nearfull' status on any data pool trigger synchronous writes for clients even if they are writing to a CephFS location mapped to a non-nearfull data pool? I.e., is the 'nearfull' / sync behavior global across the same CephFS file system?

Thanks,
Matt

On Thu, Nov 16, 2023 at 8:39 AM Ilya Dryomov wrote:
> Hi Xiubo,
>
> It's not libceph or osdc, but CephFS itself. I think Matt is running
> against this fix:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7614209736fbc4927584d4387faade4f31444fce
>
> It was previously discussed in detail here:
>
> https://lore.kernel.org/ceph-devel/caoi1vp_k2ybx9+jffmuhcuxsyngftqjyh+frusyy4ureprk...@mail.gmail.com/
>
> The solution is to add additional capacity or bump the nearfull
> threshold:
>
> https://lore.kernel.org/ceph-devel/23f46ca6dd1f45a78beede92fc91d...@mpinat.mpg.de/
>
> Thanks,
>
> Ilya

--
Matt Larson, PhD
Madison, WI 53705 U.S.A.
[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
On Thu, Nov 16, 2023 at 5:26 PM Matt Larson wrote:
>
> Ilya,
>
> Thank you for providing these discussion threads on the Kernel fixes for
> where there was a change and details on this affects the clients.
>
> What is the expected behavior in CephFS client when there are multiple data
> pools in the CephFS? Does having 'nearfull' in any data pool in the CephFS
> then trigger the synchronous writes for clients even if they would be writing
> to a CephFS location mapped to a non-nearfull data pool? I.e. is 'nearfull' /
> sync behavior global across the same CephFS filesystem?

I would expect it to apply only to the pool in question (i.e. not be global), but let's get Xiubo or someone else working on CephFS to confirm.

Thanks,

Ilya
[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
On 11/16/23 22:39, Ilya Dryomov wrote:
> Hi Xiubo,
>
> It's not libceph or osdc, but CephFS itself. I think Matt is running
> against this fix:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7614209736fbc4927584d4387faade4f31444fce
>
> It was previously discussed in detail here:
>
> https://lore.kernel.org/ceph-devel/caoi1vp_k2ybx9+jffmuhcuxsyngftqjyh+frusyy4ureprk...@mail.gmail.com/
>
> The solution is to add additional capacity or bump the nearfull
> threshold:
>
> https://lore.kernel.org/ceph-devel/23f46ca6dd1f45a78beede92fc91d...@mpinat.mpg.de/

Yeah, correct. I just missed this commit. Thanks, Ilya.

> Thanks,
>
> Ilya
[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?
On 11/17/23 00:41, Ilya Dryomov wrote:
> I would expect it to apply only to the pool in question (i.e. not be
> global), but let's get Xiubo or someone else working on CephFS to
> confirm.

It seems that before mimic, any pool being nearfull would affect the whole CephFS. Since mimic, 'CEPH_OSDMAP_NEARFULL' has been deprecated, so it depends on each pool:

-#define CEPH_OSDMAP_NEARFULL (1<<0) /* sync writes (near ENOSPC) */
-#define CEPH_OSDMAP_FULL (1<<1) /* no data writes (ENOSPC) */
+#define CEPH_OSDMAP_NEARFULL (1<<0) /* sync writes (near ENOSPC), deprecated since mimic*/
+#define CEPH_OSDMAP_FULL (1<<1) /* no data writes (ENOSPC), deprecated since mimic */

Thanks
- Xiubo

> Thanks,
>
> Ilya
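To make the multi-pool answer concrete, here is a small user-space sketch of the difference between the map-wide flag and a per-pool flag. It is an illustration under stated assumptions, not kernel code: `struct pool_model`, `struct osdmap_model`, `pool_flags_for` and `writes_forced_sync` are hypothetical names, and the bit position used for `CEPH_POOL_FLAG_NEARFULL` is assumed. Only `CEPH_OSDMAP_NEARFULL (1<<0)` is taken from the diff above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CEPH_OSDMAP_NEARFULL    (1u << 0)    /* map-wide, deprecated since mimic */
#define CEPH_POOL_FLAG_NEARFULL (1ull << 11) /* per-pool; bit position assumed */

/* Hypothetical, heavily reduced view of a pool and of the OSD map. */
struct pool_model {
    int64_t  id;
    uint64_t flags;
};

struct osdmap_model {
    uint32_t                 flags;     /* map-wide flags */
    const struct pool_model *pools;
    size_t                   num_pools;
};

/* Return the flags of the pool with the given id, or 0 if it is unknown. */
static uint64_t pool_flags_for(const struct osdmap_model *map, int64_t pool_id)
{
    for (size_t i = 0; i < map->num_pools; i++)
        if (map->pools[i].id == pool_id)
            return map->pools[i].flags;
    return 0;
}

/* True if writes to a file whose layout points at pool_id are forced sync. */
static bool writes_forced_sync(const struct osdmap_model *map, int64_t pool_id)
{
    return (map->flags & CEPH_OSDMAP_NEARFULL) ||
           (pool_flags_for(map, pool_id) & CEPH_POOL_FLAG_NEARFULL);
}
```

With two data pools where only pool 1 carries the per-pool nearfull flag, `writes_forced_sync` is true for files in pool 1 and false for files in pool 2. If the map-wide flag is set, as described for pre-mimic clusters, it is true for both, which matches the "global" behavior Matt asked about.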