Re: [Gluster-users] NFS problem
Hi Shehjar,

Do these logs help you? If you need further information, just tell me.

Thanks,
Christopher
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
Re: [Gluster-users] NFS problem
We'll need the crash stack trace also.

Christopher Anderlik wrote:
> here are our logs when nfs is crashing
> [...]
Re: [Gluster-users] NFS problem
here are our logs when nfs is crashing

[2011-06-10 08:54:14.900049] D [nfs3-helpers.c:2424:nfs3_log_common_res] 0-nfs-nfsv3: XID: f8851fc2, ACCESS: NFS: 0(Call completed successfully.), POSIX: 0(Success)
[2011-06-10 08:54:14.902002] D [rpcsvc.c:1940:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: f9851fc2, Ver: 2, Program: 13, ProgVers: 3, Proc: 1
[2011-06-10 08:54:14.902037] D [rpcsvc.c:1357:nfs_rpcsvc_program_actor] 0-nfsrpc: Actor found: NFS3 - GETATTR
[2011-06-10 08:54:14.902062] D [nfs3-helpers.c:2292:nfs3_log_common_call] 0-nfs-nfsv3: XID: f9851fc2, GETATTR: args: FH: hashcount 3, exportid ea50df7c-ff08-4416-8fb3-59d09667cc51, gfid 74c48fb3-d065-462b-83a9-e4558b042465
[2011-06-10 08:54:14.920579] D [afr-transaction.c:976:afr_post_nonblocking_inodelk_cbk] 0-ksc-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2011-06-10 08:54:14.921099] D [client-lk.c:442:delete_granted_locks_fd] 0-ksc-client-0: Number of locks cleared=0
[2011-06-10 08:54:14.921155] D [client-lk.c:442:delete_granted_locks_fd] 0-ksc-client-1: Number of locks cleared=0
[2011-06-10 08:54:14.932629] D [nfs3-helpers.c:2424:nfs3_log_common_res] 0-nfs-nfsv3: XID: f9851fc2, GETATTR: NFS: 0(Call completed successfully.), POSIX: 0(Success)
[2011-06-10 08:54:14.932863] D [rpcsvc.c:1940:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: fa851fc2, Ver: 2, Program: 13, ProgVers: 3, Proc: 4
[2011-06-10 08:54:14.932890] D [rpcsvc.c:1357:nfs_rpcsvc_program_actor] 0-nfsrpc: Actor found: NFS3 - ACCESS
[2011-06-10 08:54:14.932907] D [nfs3-helpers.c:2292:nfs3_log_common_call] 0-nfs-nfsv3: XID: fa851fc2, ACCESS: args: FH: hashcount 3, exportid ea50df7c-ff08-4416-8fb3-59d09667cc51, gfid 74c48fb3-d065-462b-83a9-e4558b042465
[2011-06-10 08:54:14.961700] D [socket.c:193:__socket_rwv] 0-ksc-client-0: EOF from peer 10.0.1.198:24031
[2011-06-10 08:54:14.961741] W [socket.c:1494:__socket_proto_state_machine] 0-ksc-client-0: reading from socket failed. Error (Transport endpoint is not connected), peer (10.0.1.198:24031)
[2011-06-10 08:54:14.961757] D [socket.c:1768:socket_event_handler] 0-transport: disconnecting now
[2011-06-10 08:54:14.961858] E [rpc-clnt.c:338:saved_frames_unwind] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(rpc_clnt_notify+0x158) [0x7f140e8c0acc] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x101) [0x7f140e8c006a] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(saved_frames_destroy+0x1c) [0x7f140e8bfb78]))) 0-ksc-client-0: forced unwinding frame type(GlusterFS 3.1) op(SETATTR(38)) called at 2011-06-10 08:54:14.920644
[2011-06-10 08:54:14.961880] I [client3_1-fops.c:1640:client3_1_setattr_cbk] 0-ksc-client-0: remote operation failed: Transport endpoint is not connected
[2011-06-10 08:54:14.961880] I [client3_1-fops.c:1640:client3_1_setattr_cbk] 0-ksc-client-0: remote operation failed: Transport endpoint is not connected
[2011-06-10 08:54:14.961940] I [client.c:1883:client_rpc_notify] 0-ksc-client-0: disconnected
[2011-06-10 08:54:14.988468] W [socket.c:204:__socket_rwv] 0-ksc-client-1: readv failed (Connection reset by peer)
[2011-06-10 08:54:14.988490] W [socket.c:1494:__socket_proto_state_machine] 0-ksc-client-1: reading from socket failed. Error (Connection reset by peer), peer (10.0.1.199:24027)
[2011-06-10 08:54:14.988501] D [socket.c:1768:socket_event_handler] 0-transport: disconnecting now
[2011-06-10 08:54:14.988501] D [socket.c:1768:socket_event_handler] 0-transport: disconnecting now
[2011-06-10 08:54:14.988551] E [rpc-clnt.c:338:saved_frames_unwind] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(rpc_clnt_notify+0x158) [0x7f140e8c0acc] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x101) [0x7f140e8c006a] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(saved_frames_destroy+0x1c) [0x7f140e8bfb78]))) 0-ksc-client-1: forced unwinding frame type(GlusterFS 3.1) op(SETATTR(38)) called at 2011-06-10 08:54:14.920655
[2011-06-10 08:54:14.988568] I [client3_1-fops.c:1640:client3_1_setattr_cbk] 0-ksc-client-1: remote operation failed: Transport endpoint is not connected
[2011-06-10 08:54:14.988568] I [client3_1-fops.c:1640:client3_1_setattr_cbk] 0-ksc-client-1: remote operation failed: Transport endpoint is not connected
[2011-06-10 08:54:14.988599] D [client.c:77:client_submit_request] 0-ksc-client-0: connection in disconnected state
[2011-06-10 08:54:14.988630] W [client3_1-fops.c:4379:client3_1_xattrop] 0-ksc-client-0: failed to send the fop: Transport endpoint is not connected
[2011-06-10 08:54:14.988657] D [name.c:157:client_fill_address_family] 0-ksc-client-1: address-family not specified, guessing it to be inet/inet6
[2011-06-10 08:54:14.991719] D [common-utils.c:151:gf_resolve_ip6] 0-resolver: returning ip-10.0.1.199 (port-24007) for hostname: 10.0.1.199 and port: 24007
[2011-06-10 08:54:14.991786] I [socket.c:2272:socket_submit_request] 0-ksc-client-1: not connected (priv->connected = 0)
[2011-06-10 08:54:14.991803] W [rpc-clnt.
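
With loglevel DEBUG the few entries that matter (the socket failures and the forced unwinding of the SETATTR frames) are easy to miss among the per-request noise. The following is a minimal sketch, assuming every entry follows the layout seen in the log above, i.e. `[timestamp] LEVEL [file:line:function] message`, and assuming the single-letter levels D, I, W and E stand for debug, info, warning and error. The names `LogEntry`, `parse_entry` and `warnings_and_errors` are hypothetical helpers for illustration, not part of GlusterFS.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout assumed from the log excerpt: [timestamp] LEVEL [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<source>[^:\]]+):(?P<lineno>\d+):(?P<function>[^\]]+)\] "
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    source: str
    lineno: int
    function: str
    message: str


def parse_entry(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None for lines that do not match (e.g. truncated ones)."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    return LogEntry(**fields)


def warnings_and_errors(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield only the entries logged at level W or E."""
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry.level in ("W", "E"):
            yield entry


# Two entries copied from the log above: one debug line, one warning.
sample = [
    "[2011-06-10 08:54:14.961700] D [socket.c:193:__socket_rwv] "
    "0-ksc-client-0: EOF from peer 10.0.1.198:24031",
    "[2011-06-10 08:54:14.988468] W [socket.c:204:__socket_rwv] "
    "0-ksc-client-1: readv failed (Connection reset by peer)",
]

for entry in warnings_and_errors(sample):
    print(entry.timestamp, entry.level, entry.function, entry.message)
```

Filtering this way only narrows down where the client connections dropped. It does not replace the crash stack trace that was asked for.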
Re: [Gluster-users] NFS problem
Hi Shehjar,

That's good to know. I'll try the unfsd. Thanks for your help.

Kind regards / Met vriendelijke groet,
Jonas Bulthuis

Shehjar Tikoo wrote:
> Jonas Bulthuis wrote:
>> Hi Shehjar,
>>
>> Thanks for your reply. We may be interested in testing an alpha version
>> in the future. I cannot tell for sure right now, but if you can send me
>> an e-mail at the time this version becomes available, we can see if we
>> can fit it in.
>>
>> We're currently running the Gluster FS on Ubuntu (LTS) servers. I can
>> access the volumes through the Gluster client on the same machines. Do
>> you know whether it's possible to export the Gluster client mount point
>> through nfs-kernel-server instead of the user space NFS server? Or would
>> that be unwise?
>>
>
> It is possible but it is not a real solution. Due to the way knfsd
> talks to FUSE, some amount of state in the kernel needs to be kept
> around indefinitely, which causes problems of excessive memory usage.
> unfsd does not cause such a problem.
>
> -Shehjar
>
>> Kind regards / Met vriendelijke groet,
>> Jonas Bulthuis
>>
>> Shehjar Tikoo wrote:
>>> Hi
>>>
>>> Due to time constraints, booster has gone untested for the last couple
>>> of months. I suggest using unfsd over fuse for the time
>>> being. We'll be releasing an alpha of the NFS translator
>>> somewhere in March. Let me know if you'd be interested in doing
>>> early testing?
>>>
>>> Thanks
>>> -Shehjar
>>>
>>> Jonas Bulthuis wrote:
>>>> Hello,
>>>>
>>>> I'm using Gluster with cluster/replicate on two servers. On each of
>>>> these servers I'm exporting the replicated volume through the UNFSv3
>>>> booster provided by Gluster.
>>>>
>>>> Multiple nfs clients are using these storage servers and most of the
>>>> time it seems to work fine. However, sometimes the clients give error
>>>> messages about a 'Stale NFS Handle' when trying to get a directory
>>>> listing of some directory on the volume (not all directories gave this
>>>> problem). Yesterday it happened after reinstalling the client machines.
>>>>
>>>> All the client machines had the same problem. Rebooting the client
>>>> machines did not help. Eventually, restarting the UNFSv3 server solved
>>>> the problem.
>>>>
>>>> At least the problem disappeared for now, but, as it happened twice in a
>>>> short time now, it seems likely that it will occur again.
>>>>
>>>> Does anyone have any suggestion on how to permanently solve this problem?
>>>>
>>>> This is the nfs booster configuration we're currently using:
>>>>
>>>> /etc/glusterfs/cache_acceptation-tcp.vol /nfsexport_acceptation glusterfs subvolume=cache_acceptation,logfile=/usr/local/var/log/glusterfs/booster_acceptation.log,loglevel=DEBUG,attr_timeout=0
>>>>
>>>> Any help will be very much appreciated. Thanks in advance.
Re: [Gluster-users] NFS problem
Jonas Bulthuis wrote:
> Hi Shehjar,
>
> Thanks for your reply. We may be interested in testing an alpha version
> in the future. I cannot tell for sure right now, but if you can send me
> an e-mail at the time this version becomes available, we can see if we
> can fit it in.
>
> We're currently running the Gluster FS on Ubuntu (LTS) servers. I can
> access the volumes through the Gluster client on the same machines. Do
> you know whether it's possible to export the Gluster client mount point
> through nfs-kernel-server instead of the user space NFS server? Or would
> that be unwise?
>

It is possible but it is not a real solution. Due to the way knfsd talks to FUSE, some amount of state in the kernel needs to be kept around indefinitely, which causes problems of excessive memory usage. unfsd does not cause such a problem.

-Shehjar

> Kind regards / Met vriendelijke groet,
> Jonas Bulthuis
>
> Shehjar Tikoo wrote:
>> Hi
>>
>> Due to time constraints, booster has gone untested for the last couple
>> of months. I suggest using unfsd over fuse for the time
>> being. We'll be releasing an alpha of the NFS translator
>> somewhere in March. Let me know if you'd be interested in doing
>> early testing?
>>
>> Thanks
>> -Shehjar
>>
>> Jonas Bulthuis wrote:
>>> Hello,
>>>
>>> I'm using Gluster with cluster/replicate on two servers. On each of
>>> these servers I'm exporting the replicated volume through the UNFSv3
>>> booster provided by Gluster.
>>>
>>> Multiple nfs clients are using these storage servers and most of the
>>> time it seems to work fine. However, sometimes the clients give error
>>> messages about a 'Stale NFS Handle' when trying to get a directory
>>> listing of some directory on the volume (not all directories gave this
>>> problem). Yesterday it happened after reinstalling the client machines.
>>>
>>> All the client machines had the same problem. Rebooting the client
>>> machines did not help. Eventually, restarting the UNFSv3 server solved
>>> the problem.
>>>
>>> At least the problem disappeared for now, but, as it happened twice in a
>>> short time now, it seems likely that it will occur again.
>>>
>>> Does anyone have any suggestion on how to permanently solve this problem?
>>>
>>> This is the nfs booster configuration we're currently using:
>>>
>>> /etc/glusterfs/cache_acceptation-tcp.vol /nfsexport_acceptation glusterfs subvolume=cache_acceptation,logfile=/usr/local/var/log/glusterfs/booster_acceptation.log,loglevel=DEBUG,attr_timeout=0
>>>
>>> Any help will be very much appreciated. Thanks in advance.
Re: [Gluster-users] NFS problem
Hi Shehjar,

Thanks for your reply. We may be interested in testing an alpha version in the future. I cannot tell for sure right now, but if you can send me an e-mail at the time this version becomes available, we can see if we can fit it in.

We're currently running the Gluster FS on Ubuntu (LTS) servers. I can access the volumes through the Gluster client on the same machines. Do you know whether it's possible to export the Gluster client mount point through nfs-kernel-server instead of the user space NFS server? Or would that be unwise?

Kind regards / Met vriendelijke groet,
Jonas Bulthuis

Shehjar Tikoo wrote:
> Hi
>
> Due to time constraints, booster has gone untested for the last couple
> of months. I suggest using unfsd over fuse for the time
> being. We'll be releasing an alpha of the NFS translator
> somewhere in March. Let me know if you'd be interested in doing
> early testing?
>
> Thanks
> -Shehjar
>
> Jonas Bulthuis wrote:
>> Hello,
>>
>> I'm using Gluster with cluster/replicate on two servers. On each of
>> these servers I'm exporting the replicated volume through the UNFSv3
>> booster provided by Gluster.
>>
>> Multiple nfs clients are using these storage servers and most of the
>> time it seems to work fine. However, sometimes the clients give error
>> messages about a 'Stale NFS Handle' when trying to get a directory
>> listing of some directory on the volume (not all directories gave this
>> problem). Yesterday it happened after reinstalling the client machines.
>>
>> All the client machines had the same problem. Rebooting the client
>> machines did not help. Eventually, restarting the UNFSv3 server solved
>> the problem.
>>
>> At least the problem disappeared for now, but, as it happened twice in a
>> short time now, it seems likely that it will occur again.
>>
>> Does anyone have any suggestion on how to permanently solve this problem?
>>
>> This is the nfs booster configuration we're currently using:
>>
>> /etc/glusterfs/cache_acceptation-tcp.vol /nfsexport_acceptation glusterfs subvolume=cache_acceptation,logfile=/usr/local/var/log/glusterfs/booster_acceptation.log,loglevel=DEBUG,attr_timeout=0
>>
>> Any help will be very much appreciated. Thanks in advance.
Re: [Gluster-users] NFS problem
Hi

Due to time constraints, booster has gone untested for the last couple of months. I suggest using unfsd over fuse for the time being. We'll be releasing an alpha of the NFS translator somewhere in March. Let me know if you'd be interested in doing early testing?

Thanks
-Shehjar

Jonas Bulthuis wrote:
> Hello,
>
> I'm using Gluster with cluster/replicate on two servers. On each of
> these servers I'm exporting the replicated volume through the UNFSv3
> booster provided by Gluster.
>
> Multiple nfs clients are using these storage servers and most of the
> time it seems to work fine. However, sometimes the clients give error
> messages about a 'Stale NFS Handle' when trying to get a directory
> listing of some directory on the volume (not all directories gave this
> problem). Yesterday it happened after reinstalling the client machines.
>
> All the client machines had the same problem. Rebooting the client
> machines did not help. Eventually, restarting the UNFSv3 server solved
> the problem.
>
> At least the problem disappeared for now, but, as it happened twice in a
> short time now, it seems likely that it will occur again.
>
> Does anyone have any suggestion on how to permanently solve this problem?
>
> This is the nfs booster configuration we're currently using:
>
> /etc/glusterfs/cache_acceptation-tcp.vol /nfsexport_acceptation glusterfs subvolume=cache_acceptation,logfile=/usr/local/var/log/glusterfs/booster_acceptation.log,loglevel=DEBUG,attr_timeout=0
>
> Any help will be very much appreciated. Thanks in advance.
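
The booster configuration quoted in this thread is a single fstab-like line, which is hard to read once mail clients have wrapped it. Below is a minimal sketch that splits that exact line into its parts. Reading the four whitespace-separated fields as volume file, mount point, filesystem type and comma-separated options is an assumption based on the line's fstab-like shape, and `BoosterEntry` and `parse_booster_entry` are hypothetical names used only for illustration.

```python
from typing import Dict, NamedTuple


class BoosterEntry(NamedTuple):
    volfile: str        # assumed: path to the GlusterFS volume file
    mount_point: str    # assumed: path the booster exposes to unfsd
    fs_type: str        # literal "glusterfs" in the posted line
    options: Dict[str, str]


def parse_booster_entry(entry: str) -> BoosterEntry:
    """Split one fstab-like booster line into its four fields."""
    volfile, mount_point, fs_type, raw_options = entry.split()
    options: Dict[str, str] = {}
    for item in raw_options.split(","):
        key, _, value = item.partition("=")
        options[key] = value
    return BoosterEntry(volfile, mount_point, fs_type, options)


# The configuration line exactly as posted in the thread.
entry = parse_booster_entry(
    "/etc/glusterfs/cache_acceptation-tcp.vol /nfsexport_acceptation glusterfs "
    "subvolume=cache_acceptation,"
    "logfile=/usr/local/var/log/glusterfs/booster_acceptation.log,"
    "loglevel=DEBUG,attr_timeout=0"
)

for key, value in entry.options.items():
    print(f"{key} = {value}")
```

Laid out this way, the options in play are easier to see: the subvolume name, the log file, loglevel=DEBUG and attr_timeout=0.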