This list has been deprecated. Please subscribe to the new devel list at
lists.nfs-ganesha.org.
David, another option is to test with Ganesha 2.7, since you are able to
recreate the crash easily with V2.6.3.
On Mon, Oct 1, 2018 at 7:49 PM Daniel Gryniewicz <d...@redhat.com> wrote:
>
> I'm not seeing any easy way that cmpf could be corrupted. The structure
> before it is fairly complex, with its last element being an integer, so
> it's unlikely that something wrote off the end of that. That leaves a
> random memory corruption, which is almost impossible to detect.
>
> David, can you rebuild your Ganesha? If so, can you build with the
> Address Sanitizer on? To do this, install libasan on your distro, and
> then pass -DSANITIZE_ADDRESS=ON to cmake. With ASAN enabled, you may
> get a crash at the time of corruption, rather than at some future point.
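
As a rough illustration of the point above (not Ganesha or ntirpc code), here
is a small C sketch of the kind of bug ASAN is meant to catch. The names
demo_rec and demo_fill are made up for this example. With checked == 0 and
len larger than the buffer, the memcpy writes past the end of the heap
allocation. A normal build may keep running and crash much later somewhere
unrelated, while a build with something like "gcc -g -O0 -fsanitize=address"
should stop with a heap-buffer-overflow report at the memcpy itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NAME_LEN 8

/* Hypothetical heap record; stands in for any structure whose heap
 * neighbour could be silently overwritten. */
struct demo_rec {
	char name[DEMO_NAME_LEN];
};

/*
 * Copy len bytes of src into a freshly allocated demo_rec, then free it.
 * Returns 0 on success, -1 if the allocation fails or if `checked` is
 * nonzero and len does not fit in the record.
 *
 * With checked == 0 and len > DEMO_NAME_LEN the memcpy below writes past
 * the end of the allocation. That is the "time of corruption"; without
 * ASAN nothing may go visibly wrong until much later.
 */
int demo_fill(const char *src, size_t len, int checked)
{
	struct demo_rec *rec;

	assert(src != NULL);

	if (checked && len > DEMO_NAME_LEN)
		return -1;

	rec = malloc(sizeof(*rec));
	if (rec == NULL)
		return -1;

	memcpy(rec->name, src, len);	/* overflows if len is too large */
	free(rec);
	return 0;
}
```

Calling demo_fill("0123456789", 10, 0) is the unchecked, overflowing case to
try under ASAN; the checked calls are safe in any build.
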
>
> Daniel
>
> On 10/01/2018 09:20 AM, Malahal Naineni wrote:
> >
> >
> >
> > Looking at the code, head->cmpf should be the address of the
> > "clnt_req_xid_cmpf" function. Your gdb output didn't show that, but I
> > don't know how that could happen with the V2.6.3 code. @Dan, any
> > insights for this issue?
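
To make the above concrete, here is a minimal C sketch of a tree head that
carries a comparator function pointer, and of what happens to a sanity check
once that pointer is overwritten. All demo_* names are made up for
illustration and are not the ntirpc definitions or layout; only the name
clnt_req_xid_cmpf and the value 0x31fb0b405ba000b7 come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BAD_HEAD (-2)

/* Hypothetical node and head types, much simpler than the real rbtree. */
struct demo_node {
	uint32_t xid;
};

typedef int (*demo_cmpf_t)(const struct demo_node *,
			   const struct demo_node *);

struct demo_head {
	struct demo_node *root;
	demo_cmpf_t cmpf;	/* plays the role of head->cmpf */
};

/* Stand-in for clnt_req_xid_cmpf: orders nodes by xid. */
int demo_xid_cmpf(const struct demo_node *a, const struct demo_node *b)
{
	if (a->xid < b->xid)
		return -1;
	if (a->xid > b->xid)
		return 1;
	return 0;
}

/* Returns 1 if the head still points at the expected comparator. */
int demo_head_is_sane(const struct demo_head *head)
{
	assert(head != NULL);
	return head->cmpf == demo_xid_cmpf;
}

/* Compare through the head only if the pointer is the expected one;
 * calling a garbage pointer is what faults in the real backtrace. */
int demo_compare_checked(const struct demo_head *head,
			 const struct demo_node *a,
			 const struct demo_node *b)
{
	if (!demo_head_is_sane(head))
		return DEMO_BAD_HEAD;
	return head->cmpf(a, b);
}

/* Simulate the corruption by overwriting the pointer's bytes with the
 * value gdb printed in this thread. */
void demo_corrupt(struct demo_head *head)
{
	const uint64_t junk = UINT64_C(0x31fb0b405ba000b7);
	size_t n = sizeof(head->cmpf) < sizeof(junk) ?
		   sizeof(head->cmpf) : sizeof(junk);

	memcpy(&head->cmpf, &junk, n);
}
```

A check like demo_head_is_sane only reports that the pointer is already bad.
It says nothing about what overwrote it, which is why catching the write
itself matters more than checking afterwards.
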
> >
> > On Mon, Oct 1, 2018 at 2:22 PM David C <dcsysengin...@gmail.com> wrote:
> >
> > Hi Malahal
> >
> > Result of that command:
> >
> > (gdb) p head->cmpf
> > $1 = (opr_rbtree_cmpf_t) 0x31fb0b405ba000b7
> >
> > Thanks,
> >
> > On Mon, Oct 1, 2018 at 5:55 AM Malahal Naineni <mala...@gmail.com> wrote:
> >
> > Looks like the head is messed up. Run these in gdb and let us
> > know the second command's output: 1. "frame 0" 2.
> > "p head->cmpf". I believe the head->cmpf function pointer is NULL
> > or bad, leading to this segfault. I haven't seen this crash before
> > and have never used Ganesha 2.6.
> >
> > Regards, Malahal.
> >
> > On Mon, Oct 1, 2018 at 1:25 AM David C <dcsysengin...@gmail.com> wrote:
> >
> > Hi Malahal
> >
> > I've set up ABRT so I'm now getting coredumps for the
> > crashes. I've installed the debuginfo packages for nfs-ganesha
> > and libntirpc.
> >
> > I'd be really grateful if you could give me some guidance on
> > debugging this.
> >
> > Some info on the latest crash:
> >
> > The following was echoed to the kernel log:
> >
> > traps: ganesha.nfsd[28589] general protection
> > ip:7fcf2421dded sp:7fcd9d4d03a0 error:0 in
> > libntirpc.so.1.6.3[7fcf2420d000+3d000]
> >
> >
> > Last lines of output from # gdb /usr/bin/ganesha.nfsd coredump:
> >
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > Core was generated by `/usr/bin/ganesha.nfsd -L
> > /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.c'.
> > Program terminated with signal 11, Segmentation fault.
> > #0 0x00007fcf2421dded in opr_rbtree_insert
> > (head=head@entry=0x7fcef800c528,
> > node=node@entry=0x7fce68004750) at
> > /usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
> > 271 switch (head->cmpf(node, parent)) {
> > Missing separate debuginfos, use: debuginfo-install
> > bzip2-libs-1.0.6-13.el7.x86_64
> > dbus-libs-1.10.24-7.el7.x86_64
> > elfutils-libelf-0.170-4.el7.x86_64
> > elfutils-libs-0.170-4.el7.x86_64 glibc-2.17-222.el7.x86_64
> > gssproxy-0.7.0-17.el7.x86_64
> > keyutils-libs-1.5.8-3.el7.x86_64
> > krb5-libs-1.15.1-19.el7.x86_64 libattr-2.4.46-13.el7.x86_64
> > libblkid-2.23.2-52.el7.x86_64 libcap-2.22-9.el7.x86_64
> > libcom_err-1.42.9-12.el7_5.x86_64
> > libgcc-4.8.5-28.el7_5.1.x86_64 libgcrypt-1.5.3-14.el7.x86_64
> > libgpg-error-1.12-3.el7.x86_64
> > libnfsidmap-0.25-19.el7.x86_64 libselinux-2.5-12.el7.x86_64
> > libuuid-2.23.2-52.el7.x86_64 lz4-1.7.5-2.el7.x86_64
> > pcre-8.32-17.el7.x86_64 systemd-libs-219-57.el7.x86_64
> > xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
> >
> > Output from bt:
> >
> > (gdb) bt
> > #0 0x00007fcf2421dded in opr_rbtree_insert
> > (head=head@entry=0x7fcef800c528,
> > node=node@entry=0x7fce68004750) at
> > /usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
> > #1 0x00007fcf24218eac in clnt_req_setup
> > (cc=cc@entry=0x7fce68004720, timeout=...) at
> > /usr/src/debug/ntirpc-1.6.3/src/clnt_generic.c:515
> > #2 0x000055d62490347f in nsm_unmonitor
> > (host=host@entry=0x7fce00018ea0) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/Protocols/NLM/nsm.c:219
> > #3 0x000055d6249425cf in dec_nsm_client_ref
> > (client=0x7fce00018ea0) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:857
> > #4 0x000055d624942f61 in free_nlm_client
> > (client=0x7fce00017500) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1039
> > #5 0x000055d6249431d3 in dec_nlm_client_ref
> > (client=0x7fce00017500) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1130
> > #6 0x000055d6249439ae in free_nlm_owner
> > (owner=owner@entry=0x7fce00024bc0) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1314
> > #7 0x000055d624929a48 in free_state_owner
> > (owner=0x7fce00024bc0) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/state_misc.c:818
> > #8 0x000055d624929dc0 in dec_state_owner_ref
> > (owner=0x7fce00024bc0) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/state_misc.c:968
> > #9 0x000055d6248ff173 in nlm4_Unlock (args=0x7fce68003b98,
> > req=0x7fce68003490, res=0x7fce68000d70) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/Protocols/NLM/nlm_Unlock.c:127
> > #10 0x000055d6248c0f0f in nfs_rpc_process_request
> > (reqdata=0x7fce68003490) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/MainNFSD/nfs_worker_thread.c:1329
> > #11 0x000055d6248c02ba in nfs_rpc_decode_request
> > (xprt=0x7fcef011b600, xdrs=0x7fce68001480) at
> > /usr/src/debug/nfs-ganesha-2.6.3/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
> > #12 0x00007fcf2422dbcd in svc_rqst_xprt_task
> > (wpe=0x7fcef011b818) at
> > /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:751
> > #13 0x00007fcf2422df2a in svc_rqst_epoll_events
> > (n_events=<optimized out>, sr_rec=0x55d6253b3fd0) at
> > /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:923
> > #14 svc_rqst_epoll_loop (sr_rec=<optimized out>) at
> > /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:996
> > #15 svc_rqst_run_task (wpe=0x55d6253b3fd0) at
> > /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:1032
> > #16 0x00007fcf2423671a in work_pool_thread
> > (arg=0x55d6282753f0) at
> > /usr/src/debug/ntirpc-1.6.3/src/work_pool.c:176
> > #17 0x00007fcf2465ce25 in start_thread () from
> > /lib64/libpthread.so.0
> > #18 0x00007fcf23d28bad in clone () from /lib64/libc.so.6
> >
> > Thanks for your assistance so far on this
> > David
> >
> > On Fri, Sep 28, 2018 at 8:06 PM David C <dcsysengin...@gmail.com> wrote:
> >
> > Thanks, Malahal. I'll get the coredumps enabled. I've
> > had a few more crashes today, hopefully they'll shed
> > some light on the issue.
> >
> > On Fri, Sep 28, 2018 at 1:20 PM Malahal Naineni <mala...@gmail.com> wrote:
> >
> > You need to enable coredumps for ganesha. Here are
> > some instructions. Step 2 is NOT needed, as your
> > packages are signed:
> >
> > https://ganltc.github.io/setup-to-take-ganesha-coredumps.html
> >
> > On Fri, Sep 28, 2018 at 4:38 PM David C <dcsysengin...@gmail.com> wrote:
> >
> > Hi All
> >
> > CentOS 7.5
> > nfs-ganesha-2.6.3-1.el7.x86_64
> > nfs-ganesha-vfs-2.6.3-1.el7.x86_64
> > libntirpc-1.6.3-1.el7.x86_64
> >
> > My Ganesha service crashed and the following was
> > echoed to my kernel log:
> >
> > ganesha.nfsd[28752]: segfault at 0
> > ip (null) sp 00007ff9a2af8458
> > error 14 in ganesha.nfsd[559170ef3000+1a4000]
> >
> >
> > Nothing in my ganesha.log
> >
> > These are the log settings from my ganesha.conf:
> >
> > LOG {
> >     ## Default log level for all components
> >     Default_Log_Level = DEBUG;
> >
> >     ## Configure per-component log levels.
> >     #Components {
> >         #FSAL = INFO;
> >         #NFS4 = EVENT;
> >     #}
> >
> >     ## Where to log
> >     Facility {
> >         name = FILE;
> >         destination = "/var/log/ganesha.log";
> >         enable = active;
> >     }
> > }
> >
> >
> > This is an example of one of my exports (they're
> > all NFSv3 with the VFS FSAL):
> >
> > EXPORT
> > {
> >     Export_Id = 80;
> >     Path = /mnt/dir;
> >     Pseudo = /mnt/dir;
> >     Access_Type = RW;
> >     Protocols = 3;
> >     Transports = TCP;
> >     Squash = no_root_squash;
> >     Disable_ACL = False;
> >     Filesystem_Id = 101.1;
> >     CLIENT {
> >         Clients = *;
> >         Squash = None;
> >         Access_Type = RW;
> >     }
> >     FSAL {
> >         Name = VFS;
> >     }
> > }
> >
> >
> > The exports are mounted on CentOS 7.4 clients
> > with autofs-5.0.7 and
> > nfs-utils-1.3.0-0.48.el7_4.x86_64
> >
> > This crash occurred approx 2 hours after I
> > increased the number of clients accessing the
> > server by approx five clients. I don't know if
> > that's related.
> >
> > Could someone help me troubleshoot this please?
> >
> > Many thanks
> > David
> >
> > _______________________________________________
> > Nfs-ganesha-devel mailing list
> > Nfs-ganesha-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel