On 05/07/16(Tue) 12:28, Dimitris Papastamos wrote:
> On Mon, Jun 13, 2016 at 08:42:35AM -0700, Philip Guenther wrote:
> > On Sun, 12 Jun 2016, Dimitris Papastamos wrote:
> > > I was building ports and at the same time started chromium and the
> > > kernel panic-ed.
> > >
> > > This has happened a few times in the last week.  It started happening at
> > > some point towards the end of May.  It is not easy to reproduce locally.
> >
> > Thank you for the report!  When you say "not easy to reproduce locally",
> > do you mean you _can_ semireliably reproduce it, such that if we found a
> > suspicious commit to revert you would be able to be pretty sure whether it
> > was really fixed?  If so, what's your best guess on how to tickle it?
>
> I had a look at this.  I am currently exploring the possibility that
> the crash could be due to this patch from May 26th 2016:

Without more information it's hard to find the reason for this crash.
Being able to reproduce the crash easily is the key to debugging.  Can
you do that?

> [...]
> but I did not see an issue with it.  In the trace that I posted it
> seems to crash on line 905 in unp_gc() in sys/kern/uipc_usrreq.c:
>
>  899         /* close any fds on the deferred list */
>  900         while ((defer = SLIST_FIRST(&unp_deferred)) != NULL) {
>  901                 SLIST_REMOVE_HEAD(&unp_deferred, ud_link);
>  902                 for (i = 0; i < defer->ud_n; i++) {
>  903                         memcpy(&fp, &((struct file **)(defer + 1))[i],
>  904                             sizeof(fp));
>  905                         FREF(fp);      <-- here
>  906                         if ((unp = fptounp(fp)) != NULL)
>  907                                 unp->unp_msgcount--;
>  908                         unp_rights--;
>  909                         (void) closef(fp, NULL);
>  910                 }
>  911                 free(defer, M_TEMP, sizeof(*defer) + sizeof(fp) * defer->ud_n);
>  912         }
>
> The address that it crashes on seems to be aligned and not fiddled
> with.  So I suspect memory was unmapped and later on triggered a uvm
> fault when accessed via FREF().

Well, the panic you reported (I don't know if you encountered any
others) was triggered by a NULL dereference.  That means the deferral
structure contained at least one NULL pointer.

My bet is that unp_discard() is called twice for the same set of fps:
as you can see, the set of fps is cleared after being enqueued, so a
second call would enqueue NULL pointers.

If you can reproduce the crash, could you run with the diff below and
see if you can trigger the panic?

Index: kern/uipc_usrreq.c
===================================================================
RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.97
diff -u -p -u -6 -r1.97 uipc_usrreq.c
--- kern/uipc_usrreq.c	25 Apr 2016 20:18:31 -0000	1.97
+++ kern/uipc_usrreq.c	5 Jul 2016 12:06:59 -0000
@@ -1054,12 +1054,22 @@ unp_mark(struct file **rp, int nfds)
 }
 
 void
 unp_discard(struct file **rp, int nfds)
 {
 	struct unp_deferral *defer;
+#ifdef DIAGNOSTIC
+	struct file *fp;
+	int i;
+
+	for (i = 0; i < nfds; i++) {
+		memcpy(&fp, &((struct file **)(rp))[i], sizeof(fp));
+		if (fp == NULL)
+			panic("tell me what you really really want");
+	}
+#endif
 
 	/* copy the file pointers to a deferral structure */
 	defer = malloc(sizeof(*defer) + sizeof(*rp) * nfds, M_TEMP, M_WAITOK);
 	defer->ud_n = nfds;
 	memcpy(defer + 1, rp, sizeof(*rp) * nfds);
 	memset(rp, 0, sizeof(*rp) * nfds);
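
To make the double-discard hypothesis concrete, here is a rough userland
sketch.  It is not kernel code: struct file, struct deferral,
model_discard(), model_drain() and discard_nulls() are all made-up
stand-ins that only mimic the copy-then-clear step of unp_discard() and
the drain loop of unp_gc().  It assumes a second discard on the same
array is possible at all, which is exactly what the diagnostic above is
meant to find out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct file. */
struct file {
	int f_count;
};

/* Hypothetical stand-in for struct unp_deferral; n pointers follow it. */
struct deferral {
	struct deferral *next;
	int n;
};

static struct deferral *deferred;

/* Mimics unp_discard(): copy the pointers aside, then clear the caller's set. */
static void
model_discard(struct file **rp, int nfds)
{
	struct deferral *d;

	d = malloc(sizeof(*d) + sizeof(*rp) * nfds);
	assert(d != NULL);
	d->n = nfds;
	memcpy(d + 1, rp, sizeof(*rp) * nfds);
	memset(rp, 0, sizeof(*rp) * nfds);
	d->next = deferred;
	deferred = d;
}

/* Mimics the drain loop in unp_gc(), counting NULL entries instead of faulting. */
static int
model_drain(void)
{
	struct deferral *d;
	struct file *fp;
	int i, nulls = 0;

	while ((d = deferred) != NULL) {
		deferred = d->next;
		for (i = 0; i < d->n; i++) {
			memcpy(&fp, &((struct file **)(d + 1))[i], sizeof(fp));
			if (fp == NULL)
				nulls++;	/* the kernel would dereference fp here */
			else
				fp->f_count--;
		}
		free(d);
	}
	return nulls;
}

/* Discard the same two-entry set `times` times; return the NULLs drained. */
int
discard_nulls(int times)
{
	struct file a = { 1 }, b = { 1 };
	struct file *fds[2] = { &a, &b };
	int i;

	for (i = 0; i < times; i++)
		model_discard(fds, 2);
	return model_drain();
}
```

In this model discard_nulls(1) finds no NULL entries, while
discard_nulls(2) finds two, because the second call copies the array
that the first call already zeroed.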