Re: [Gluster-devel] Gluster 9.6 changes to fix gluster NFS bug
Dear team,

I made a new PR (sorry, some GitHub inexperience showing: I created a new PR instead of updating the old one, since it seemed easier to close the old one and use the new one than to fix it). In the new PR, I integrated the feedback. Thank you so much.

https://github.com/gluster/glusterfs/pull/4322

I am attaching to this email my notes on reproducing this environment. I used virtual machines and a constrained test environment to duplicate the problem and test the fix. I hope these notes resolve all the outstanding questions. If not, please let me know! Thanks again to all.

Erik

From: Jacobson, Erik
Date: Monday, March 18, 2024 at 10:22 AM
To: Aravinda
Cc: Gluster Devel
Subject: Re: [Gluster-devel] Gluster 9.6 changes to fix gluster NFS bug

I will need to set up a test case that is isolated. In the meantime, I did a fork and a PR. I marked it as draft while I try to find an easier test case.

https://github.com/gluster/glusterfs/pull/4319

From: Aravinda
Date: Saturday, March 16, 2024 at 9:37 AM
To: Jacobson, Erik
Cc: Gluster Devel
Subject: Re: [Gluster-devel] Gluster 9.6 changes to fix gluster NFS bug

> We ran into some trouble in Gluster 9.3 with the Gluster NFS server. We
> updated to a supported Gluster 9.6 and reproduced the problem.

Please share the reproducer steps. We can include them in our tests if possible.

> We understand the Gluster team recommends the use of Ganesha for NFS but in
> our specific environment and use case, Ganesha isn't fast enough. No
> disrespect intended; we never got the chance to work with the Ganesha team
> on it.

That is totally fine. I think gnfs is disabled in the later versions; you have to build from source to enable it. The only issue I see is that gnfs doesn't support NFS v4, and the NFS+Gluster team shifted its focus to NFS Ganesha.

> We tried to avoid Ganesha and Gluster NFS altogether, using kernel NFS with
> fuse mounts exported, and that was faster, but failover didn't work. We
> could make the mount point highly available but not open files (so when the
> IP failover happened, the mount point would still function but the open
> file – a squashfs in this example – would not fail over).

Was the Gluster backup volfile server option used, or any other method, for high availability?

> So we embarked on a mission to figure out what was going on with the NFS
> server. I am not an expert in network code or distributed filesystems, so
> someone with a careful eye would need to check these changes out. However,
> what I generally found was that the Gluster NFS server requires the layers
> of gluster to report back 'errno' to determine if EINVAL is set (to
> determine is_eof). In some instances, errno was not being passed down the
> chain or was being reset to 0. This resulted in NFS traces showing multiple
> READs for a 1-byte file and the NFS client showing an "I/O" error. It
> seemed like files above 170M worked OK. This is likely due to how the
> layers of gluster change behavior at certain file sizes. However, we did
> not track this part down.
>
> We found in one case that disabling the NFS performance IO cache would fix
> the problem for a non-sharded volume, but the problem persisted in a
> sharded volume. Testing found our environment takes the disabling of the
> NFS performance IO cache quite hard anyway, so it wasn't an option for us.
>
> We were curious why the fuse client wouldn't be impacted, but our quick
> look found that fuse doesn't really use or need errno in the same way
> Gluster NFS does.
>
> So, the attached patch fixed the issue. Accessing small files in either
> case above now works properly. We tried running md5sum against large files
> over NFS and fuse mounts and everything seemed fine. In our environment,
> the NFS-exported directories tend to contain squashfs files representing
> read-only root filesystems for compute nodes, and those worked fine over
> NFS after the change as well.
>
> If you do not wish to include this patch because Gluster NFS is
> deprecated, I would greatly appreciate it if someone could validate my
> work, as our solution will need Gluster NFS enabled for the time being. I
> am concerned I could have missed a nuance and caused a hard-to-detect
> problem.

We can surely include this patch in the Gluster repo, since many tests still use this feature and it is available for interested users.

Thanks for the PR. Please submit the PR to the GitHub repo; I will follow up with the maintainers and update. Let me know if you need any help submitting the PR.

--
Thanks and Regards
Aravinda
Kadalu Technologies

On Thu, 14 Mar 2024 01:32:50 +0530, Jacobson, Erik wrote:

> Hello team. We ran into some trouble in Gluster 9.3 with the Gluster NFS
> server. [...]
[Gluster-devel] Gluster 9.6 changes to fix gluster NFS bug
Hello team. We ran into some trouble in Gluster 9.3 with the Gluster NFS server. We updated to a supported Gluster 9.6 and reproduced the problem.

We understand the Gluster team recommends the use of Ganesha for NFS, but in our specific environment and use case, Ganesha isn't fast enough. No disrespect intended; we never got the chance to work with the Ganesha team on it.

We tried to avoid Ganesha and Gluster NFS altogether, using kernel NFS with fuse mounts exported, and that was faster, but failover didn't work. We could make the mount point highly available but not open files (so when the IP failover happened, the mount point would still function but the open file – a squashfs in this example – would not fail over).

So we embarked on a mission to figure out what was going on with the NFS server. I am not an expert in network code or distributed filesystems, so someone with a careful eye would need to check these changes out. However, what I generally found was that the Gluster NFS server requires the layers of gluster to report back 'errno' to determine if EINVAL is set (to determine is_eof). In some instances, errno was not being passed down the chain or was being reset to 0. This resulted in NFS traces showing multiple READs for a 1-byte file and the NFS client showing an "I/O" error. It seemed like files above 170M worked OK. This is likely due to how the layers of gluster change behavior at certain file sizes. However, we did not track this part down.

We found in one case that disabling the NFS performance IO cache would fix the problem for a non-sharded volume, but the problem persisted in a sharded volume. Testing found our environment takes the disabling of the NFS performance IO cache quite hard anyway, so it wasn't an option for us.

We were curious why the fuse client wouldn't be impacted, but our quick look found that fuse doesn't really use or need errno in the same way Gluster NFS does.

So, the attached patch fixed the issue.
Accessing small files in either case above now works properly. We tried running md5sum against large files over NFS and fuse mounts and everything seemed fine.

In our environment, the NFS-exported directories tend to contain squashfs files representing read-only root filesystems for compute nodes, and those worked fine over NFS after the change as well.

If you do not wish to include this patch because Gluster NFS is deprecated, I would greatly appreciate it if someone could validate my work, as our solution will need Gluster NFS enabled for the time being. I am concerned I could have missed a nuance and caused a hard-to-detect problem.

Thank you all!

patch.txt attached:

diff -Narup glusterfs-9.6.sgi-ORIG/xlators/features/shard/src/shard.c glusterfs-9.6.sgi/xlators/features/shard/src/shard.c
--- glusterfs-9.6.sgi-ORIG/xlators/features/shard/src/shard.c	2022-08-09 05:31:26.738079305 -0500
+++ glusterfs-9.6.sgi/xlators/features/shard/src/shard.c	2024-03-13 12:31:56.110756841 -0500
@@ -4852,8 +4852,11 @@ shard_readv_do_cbk(call_frame_t *frame,
         goto out;
     }
 
-    if (local->op_ret >= 0)
+    if (local->op_ret >= 0) {
         local->op_ret += op_ret;
+        /* gnfs requires op_errno to determine is_eof */
+        local->op_errno = op_errno;
+    }
 
     shard_inode_ctx_get(anon_fd->inode, this, &ctx);
     block_num = ctx->block_num;
diff -Narup glusterfs-9.6.sgi-ORIG/xlators/performance/io-cache/src/page.c glusterfs-9.6.sgi/xlators/performance/io-cache/src/page.c
--- glusterfs-9.6.sgi-ORIG/xlators/performance/io-cache/src/page.c	2022-08-09 05:31:26.825079586 -0500
+++ glusterfs-9.6.sgi/xlators/performance/io-cache/src/page.c	2024-03-13 12:32:01.978748913 -0500
@@ -790,6 +790,8 @@ ioc_frame_unwind(call_frame_t *frame)
     GF_ASSERT(frame);
 
     local = frame->local;
+    /* gnfs requires op_errno to determine is_eof */
+    op_errno = local->op_errno;
     if (local == NULL) {
         gf_smsg(frame->this->name, GF_LOG_WARNING, ENOMEM,
                 IO_CACHE_MSG_LOCAL_NULL, NULL);

---
Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Using kernel NFS with a gluster fuse mount
Dear team,

First, a thank-you, as always, for making the gluster storage solution.

The question: Years back, when we first started using Gluster, we had a requirement to provide NFS services. Documentation and discussions at the time made us feel as if we should not use the kernel NFS server on top of a fuse mount, but rather use Ganesha. Recently I have seen that people are using the Linux kernel NFS server with gluster fuse mounts, and it has been working well in our test environment. What are the drawbacks to exporting a fuse-mounted glusterfs filesystem using the Linux NFS server?

Optional background reading for the question:

At the time, and still today, Ganesha doesn't serve our specific workload fast enough (no disrespect to the great people working on Ganesha). We therefore used and continue to use Gluster NFS. Recently, we have begun to have issues with gluster NFS mounts when the nfs performance io cache is enabled. Files under 200M or so, in certain situations and on some systems but not all, will get an IO error on the NFS mount. Fuse access is fine. Turning off the cache makes the server nearly unusable for our specific workload and scale. We may write about that separately, but we're behind in gluster versions and know we need to get current before we ask for help.

This problem at an important site had us looking for workarounds. We are using Ganesha as the workaround, but it is significantly slower. Internal testing showed exporting with kernel NFS was very slick. Hence the question.

Gluster is used for several things, but for NFS it is largely a collection of squashfs files, representing root filesystems, that makes up most of the load.

Thank you all!!!
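For readers who want to try the kernel-NFS-over-fuse setup asked about above, a minimal configuration sketch follows. The hostnames, volume name, mount point, and fsid value are hypothetical; the one non-obvious requirement is that kernel NFS exports of FUSE filesystems need an explicit fsid= option, because the kernel cannot derive a stable filesystem identifier from a FUSE mount on its own.

```shell
# Sketch only -- server names, volume name, and paths are hypothetical.
# Mount the gluster volume via FUSE; backup-volfile-servers lets the
# client fetch its volfile from another server if server1 is down.
mount -t glusterfs -o backup-volfile-servers=server2:server3 \
    server1:/gv0 /mnt/gv0

# Export it with the kernel NFS server. fsid= is required for FUSE-backed
# exports; no_subtree_check avoids per-lookup subtree verification.
echo '/mnt/gv0 *(rw,fsid=1001,no_subtree_check)' >> /etc/exports
exportfs -ra
```

Note that this only makes the export reachable; as the earlier thread observes, an IP failover alone does not carry open NFS file state across servers, so open files can still break when the address moves.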