Last week, Daniel G had to revert one of my patches, because the CEA
tests segfaulted during shutdown.  Neither of us could reproduce.

Yesterday, I updated my Fedora debuginfos.  Suddenly, I could reproduce!

After _many_ *MANY* hours, I've discovered that it wasn't in my patch. :(

ntirpc git log b4254b92 claims that it is a patch based upon valgrind.
The long log message is almost entirely about Ganesha, not ntirpc.  It
was approved by Matt Benjamin.

In fact, the commit is almost entirely useless for ntirpc.

Here's the change in its entirely.  It's shorter than the log message:

git diff b4254b92^ b4254b92
diff --git a/src/clnt_vc.c b/src/clnt_vc.c
index 174e7e2..57f2b23 100644
--- a/src/clnt_vc.c
+++ b/src/clnt_vc.c
@@ -312,7 +312,7 @@ clnt_vc_ncreate2(int fd,    /* open file descriptor */
                 if (ct->ct_addr.len)
                         mem_free(ct->ct_addr.buf, ct->ct_addr.len);

-               if (xd->refcnt == 1) {
+               if (xd->refcnt == 0) {
                         XDR_DESTROY(&xd->shared.xdrs_in);
                         XDR_DESTROY(&xd->shared.xdrs_out);
                         free_x_vc_data(xd);
diff --git a/src/svc_vc.c b/src/svc_vc.c
index cabd325..e5071c1 100644
--- a/src/svc_vc.c
+++ b/src/svc_vc.c
@@ -658,10 +658,13 @@ svc_rdvs_destroy(SVCXPRT *xprt, u_int flags, const char 
*tag, const int line)
         mem_free(rdvs, sizeof(struct cf_rendezvous));
         mem_free(xprt, sizeof(SVCXPRT));

-       refcnt = rpc_dplx_unref(rec,
-                               RPC_DPLX_FLAG_LOCKED | RPC_DPLX_FLAG_UNLOCK);
-       if (!refcnt)
-               mem_free(xd, sizeof(struct x_vc_data));
+       (void)rpc_dplx_unref(rec, RPC_DPLX_FLAG_LOCKED | RPC_DPLX_FLAG_UNLOCK);
+
+       if (xd->refcnt == 0) {
+               XDR_DESTROY(&xd->shared.xdrs_in);
+               XDR_DESTROY(&xd->shared.xdrs_out);
+               free_x_vc_data(xd);
+       }
  }

  static void
===

It's crashing on the first XDR_DESTROY().

That will never work, because as the code states in svc_vc_ncreate2():

         /* duplex streams are not used by the rendevous transport */
         memset(&xd->shared.xdrs_in, 0, sizeof xd->shared.xdrs_in);
         memset(&xd->shared.xdrs_out, 0, sizeof xd->shared.xdrs_out);

===

That code was copy and pasted from clnt_vc.c in the same patch?!?!

Why did my patch discover the problem?  Because svc_rdvs_destroy() was
not being called on shutdown....  My patch fixes that bug.

*** So valgrind couldn't possibly have found this as an error. ***

I'll fix it in my future patch.  But we need to be a lot more careful,
and should not have irrelevant commit log messages that obscure the
actual underlying changes in the patch.

Time for bed....

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

Reply via email to