Re: unp_gc kernel crash on -current

Martin Pieuchot Tue, 05 Jul 2016 05:15:33 -0700

On 05/07/16(Tue) 12:28, Dimitris Papastamos wrote:
> On Mon, Jun 13, 2016 at 08:42:35AM -0700, Philip Guenther wrote:
> > On Sun, 12 Jun 2016, Dimitris Papastamos wrote:
> > > I was building ports and at the same time started chromium and the 
> > > kernel panic-ed.
> > > 
> > > This has happened a few times in the last week.  It started happening at 
> > > some point towards the end of May.  It is not easy to reproduce locally.
> > 
> > Thank you for the report!  When you say "not easy to reproduce locally", 
> > do you mean you _can_ semireliably reproduce it, such that if we found a 
> > suspicious commit to revert you would be able to be pretty sure whether it 
> > was really fixed?  If so, what's your best guess on how to tickle it?
> 
> I had a look at this.  I am currently exploring the possibility that
> the crash could be due to this patch from May 26th 2016:


Without more information it's hard to find what could be the reason
for this crash.  Being able to reproduce the crash easily is the key
to debugging.  Can you do that?

> [...]
> but I did not see an issue with it.  In the trace that I posted it
> seems to crash on line 905 in unp_gc() in sys/kern/uipc_usrreq.c:
> 
>    899          /* close any fds on the deferred list */
>    900          while ((defer = SLIST_FIRST(&unp_deferred)) != NULL) {
>    901                  SLIST_REMOVE_HEAD(&unp_deferred, ud_link);
>    902                  for (i = 0; i < defer->ud_n; i++) {
>    903                          memcpy(&fp, &((struct file **)(defer + 1))[i],
>    904                              sizeof(fp));
>    905                          FREF(fp); <-- here
>    906                          if ((unp = fptounp(fp)) != NULL)
>    907                                  unp->unp_msgcount--;
>    908                          unp_rights--;
>    909                          (void) closef(fp, NULL);
>    910                  }
>    911                  free(defer, M_TEMP, sizeof(*defer) + sizeof(fp) * defer->ud_n);
>    912          }
> 
> The address that it crashes on seems to be aligned and not fiddled
> with.  So I suspect memory was unmapped and later on triggered a uvm
> fault when accessed via FREF().

Well, the panic you reported (I don't know if you encountered any others)
was triggered by a NULL dereference.  That means the deferral being
processed contained at least one NULL file pointer.

My bet is that unp_discard() is called twice for the same set of fps.  As
you can see, the set of fps is cleared after being enqueued, so a second
call would enqueue nothing but NULL pointers.
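
To illustrate the hypothesis, here is a minimal userland sketch, not kernel
code.  The names struct deferral, sketch_discard() and sketch_drain() are
made up for this example, and struct file is reduced to a bare reference
count.  It assumes that discarding copies the pointers and then zeroes the
caller's array, as the unp_discard() code does, and that the drain loop
walks each deferral the way the quoted unp_gc() loop does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct file { int f_count; };	/* stand-in for the kernel's struct file */

struct deferral {		/* hypothetical stand-in for struct unp_deferral */
	struct deferral *next;
	int n;			/* followed in memory by n file pointers */
};

static struct deferral *deferred;	/* stand-in for the unp_deferred list */

/* Like unp_discard(): copy the pointers into a deferral, then clear the source. */
static void
sketch_discard(struct file **rp, int nfds)
{
	struct deferral *d;

	d = malloc(sizeof(*d) + sizeof(*rp) * nfds);
	assert(d != NULL);
	d->n = nfds;
	memcpy(d + 1, rp, sizeof(*rp) * nfds);
	memset(rp, 0, sizeof(*rp) * nfds);	/* caller's array is now all NULL */
	d->next = deferred;
	deferred = d;
}

/*
 * Like the unp_gc() loop, but counts NULL entries instead of
 * dereferencing them.  Returns the number of NULL pointers seen.
 */
static int
sketch_drain(void)
{
	struct deferral *d;
	struct file *fp;
	int i, nulls = 0;

	while ((d = deferred) != NULL) {
		deferred = d->next;
		for (i = 0; i < d->n; i++) {
			memcpy(&fp, &((struct file **)(d + 1))[i], sizeof(fp));
			if (fp == NULL)
				nulls++;	/* the kernel would fault here */
			else
				fp->f_count++;	/* roughly what taking a reference does */
		}
		free(d);
	}
	return nulls;
}
```

Calling sketch_discard() twice on the same two-element array and then
sketch_drain() reports two NULL entries: the second call only ever sees the
zeroed array.  A check on entry to the discard function would catch that
second call before anything is queued.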

If you can reproduce the crash, could you run with the diff below and
see if you can trigger the panic?

Index: kern/uipc_usrreq.c
===================================================================
RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.97
diff -u -p -u -6 -r1.97 uipc_usrreq.c
--- kern/uipc_usrreq.c  25 Apr 2016 20:18:31 -0000      1.97
+++ kern/uipc_usrreq.c  5 Jul 2016 12:06:59 -0000
@@ -1054,12 +1054,22 @@ unp_mark(struct file **rp, int nfds)
 }
 
 void
 unp_discard(struct file **rp, int nfds)
 {
        struct unp_deferral *defer;
+#ifdef DIAGNOSTIC
+       struct file *fp;
+       int i;
+
+       for (i = 0; i < nfds; i++) {
+               memcpy(&fp, &((struct file **)(rp))[i], sizeof(fp));
+               if (fp == NULL)
+                       panic("tell me what you really really want");
+       }
+#endif
 
        /* copy the file pointers to a deferral structure */
        defer = malloc(sizeof(*defer) + sizeof(*rp) * nfds, M_TEMP, M_WAITOK);
        defer->ud_n = nfds;
        memcpy(defer + 1, rp, sizeof(*rp) * nfds);
        memset(rp, 0, sizeof(*rp) * nfds);
