While testing Ganesha NFS V2.4.0.3 with the CEPH FSAL against a Ceph
file system, I am seeing the ganesha.nfsd process die on an assert
multiple times per hour. I have also seen it die at the same place in
the code using the VFS FSAL with an ext4 file system, but much less
often.
It is dying at line 917 in src/SAL/state_misc.c, which is reached from
line 1010 of the same file. The assert call is in
dec_state_owner_ref() at the line:
assert(refcount > 0);
Looking at the core files and adding in some debugging code confirms
that refcount is -1 when the assert call is made.
It looks like the owner refcount is being decremented to -1 in
uncache_nfs4_owner(). Because it happens only occasionally, I suspect
a race condition.
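To illustrate the kind of failure I suspect, here is a minimal sketch of how a refcount can reach -1 when two code paths each believe they hold the last reference. This is not the Ganesha code: struct demo_owner, demo_dec_ref() and demo_double_release() are hypothetical names, and the sketch runs the two releases one after the other rather than in real threads, so it only shows the end state of the race.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a reference-counted owner object.
 * This is NOT Ganesha's state_owner_t. */
struct demo_owner {
	atomic_int refcount;
};

/* Drop one reference and return the new count. Unlike the real
 * dec_state_owner_ref(), this does not assert, so the caller can
 * observe an underflow instead of aborting. */
static int demo_dec_ref(struct demo_owner *owner)
{
	/* atomic_fetch_sub() returns the value held before the subtraction */
	return atomic_fetch_sub(&owner->refcount, 1) - 1;
}

/* Simulate two paths (for example, a cache eviction and a normal
 * release) that both drop what each assumes is the only reference. */
static int demo_double_release(void)
{
	struct demo_owner owner;

	atomic_init(&owner.refcount, 1);

	(void)demo_dec_ref(&owner);	/* 1 -> 0: legitimate release */
	return demo_dec_ref(&owner);	/* 0 -> -1: the extra release */
}
```

With an assert(refcount > 0) after the second decrement, as in dec_state_owner_ref(), the process would abort with the refcount at -1, which matches what the core files show. Which two paths are actually racing in Ganesha is still an open question.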
Info on the build:
Host OS is Ubuntu 14.04 with a 4.8.2 x86_64 kernel on an 8-processor system
Cmake command:
# cmake -DCMAKE_INSTALL_PREFIX=/opt/keeper -DALLOCATOR=jemalloc -DUSE_ADMIN_TOOLS=ON -DUSE_DBUS=ON ../src
# ganesha.nfsd -v
ganesha.nfsd compiled on Oct 17 2016 at 16:50:18
Release = V2.4.0.3
Release comment = GANESHA file server is 64 bits compliant and supports NFS v3,4.0,4.1 (pNFS) and 9P
Git HEAD = 0f55a9a97a4bf232fb0e42542e4ca7491fbf84ce
Git Describe = V2.4.0.3-0-g0f55a9a
# ceph -v
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
# cat ganesha.conf
LOG {
    components {
        ALL = INFO;
    }
}
EXPORT_DEFAULTS {
    SecType = none, sys;
    Protocols = 3, 4;
    Transports = TCP;
}
# define CephFS export
EXPORT {
    Export_ID = 42;
    Path = /top;
    Pseudo = /top;
    Access_Type = RW;
    Squash = No_Root_Squash;
    FSAL {
        Name = CEPH;
    }
}
The VFS export for the ext4 tests was:
# define VFS export
EXPORT {
    Export_ID = 43;
    Path = /var/top;
    Pseudo = /var/top;
    Access_Type = RW;
    Squash = No_Root_Squash;
    FSAL {
        Name = VFS;
    }
}
The test used 2 Ubuntu 14.04 NFS clients, each running 6 processes
that write 11,000 256k files into separate directory trees, with 11
files per lowest-level directory. On each client, 3 processes wrote to
an NFS 3 mount and 3 wrote to an NFS 4 mount. The files are then read
back and verified, deleted, and the test restarts.
Regards,
Eric