Re: debugging frequent kernel panics on 8.2-RELEASE
On Sat, 20 Aug 2011, Steven Hartland wrote:
> Are you seeing a double fault panic?

We're seeing both. At least one double (or more) fault finishing with "Fatal Trap 12: page fault while in kernel mode". Subsequent panics have been single fault (all visible on the IPMI console): "Fatal Trap 9: general protection fault while in kernel mode".

Could well be unrelated. The system is undergoing hardware diags now.

Roger Marquis
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
On 08/21/11 05:01, Steven Hartland wrote:
> - Original Message - From: "Jamie Gritton"
>> The problem isn't with the conditional locking of tpr in prison_deref. That locking is actually correct, and there's no race condition.
>
> Are you sure? I do think that unlocking the mtx halfway through the call allows the above scenario to create a race condition, albeit very briefly, when ignoring the overriding issue.
>
> In addition, if the code were changed so that pr_uref++ also maintained the parent's uref, this would definitely lead to potential problems in my mind, especially if you had more than one child prison of a given parent entering the dying state at any one time. In this case I believe you would have to acquire the locks of all the parent prisons before it would be safe to proceed.

Lock order requires that I unlock the child if I want to lock the parent. While that does allow periods where neither is locked, it's safe in this case. There may be multiple processes dying in one jail, or in multiple children of a single jail. But as long as a parent jail is locked while decrementing pr_uref, only one of these simultaneous prison_deref calls would set pr_uref to zero and continue in the loop to that prison's parent. This might be mixed with pr_uref being incremented elsewhere, but that's not a problem either as long as the jail in question is locked.

>> The trouble lies in the resurrection of dead jails, as Andriy has noted (though not just attaching: setting its persist flag causes the same problem).
>
> I'm not sure that persistent prisons actually suffer from this in any different way tbh, as they have an additional uref increment so would never hit this case unless they have been actively removed and hence unpersisted first.

Right - both the attach and persist cases are only a problem when a jail has disappeared. There are various ways for a jail to be removed, potentially to be kept around but in the dying state, but only two related ways for it to be resurrected: attaching a new process or setting the persist flag, both via jail_set with the JAIL_DYING flag passed.

>> There are two possible fixes to this. One is the patch you've given, which only decrements a parent jail's pr_uref when the child jail completely goes away (as opposed to when it loses its last uref). This provides symmetry with the current way pr_uref is incremented on the parent, which is only when a jail is created.
>>
>> The other fix is to increment a parent's pr_uref when a jail is resurrected, which will match the current logic in prison_deref. I like the external semantics of this solution: a jail isn't visible if it is not persistent and has no processes and no *visible* sub-jails, as opposed to having no sub-jails at all. But this solution ends up pretty complicated - there are a few places where pr_uref is incremented, where I might need to increment parent jails' pr_uref as well, much like the current tpr loop in prison_deref decrements them.
>
> Ahh yes, in the hierarchical case my patch would indeed mean that non-persistent parent jails would remain visible even when their last child jail is in a dying state.
>
> As you say, making this not the case would likely require replacing all instances of pr_uref++ with a prison_uref method that implements the opposite of the loop in prison_deref should the prison's pr_uref be 0 when called.

Yes, that's the problem. Maybe not all instances, but at least most of them have enough points where the jail is unlocked that we can't assume pr_uref hasn't been set to zero somewhere else, and so we need to do that loop.

>> Your solution removes code instead of adding it, which is generally a good thing. While it does change the semantics of pr_uref in the hierarchical case at least from what I thought it was, those semantics haven't been working properly anyway.
>
> Good to know my interpretation was correct, even if I was missing the visibility factor in the hierarchical case :)
>
>> Bjoern, I'm adding you to the CC list for this because the whole pr_uref thing was your idea (though it was pr_nprocs at the time), so you might care about the hierarchical semantics of it - or you may not. Also, this is a panic-inducing bug in current and may interest you for that reason.
>
> From an admin perspective the current jail dying state does cause confusion when you're not aware of its existence. You ask a jail to stop and it appears to have completed that request, but really hasn't, generally due to just a lingering TCP connection.
>
> With the introduction of hierarchical jails that gets a little worse, where a whole series of jails could disappear from normal view only to be resurrected shortly after. Something to bear in mind when deciding which solution of the two presented to use.

The good news is that the only time a jail (or perhaps a whole set of jails) can come back from the dead is when the administrator makes a concerted effort to do so. So it at least shouldn't surprise the administrator w
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Jamie Gritton"
>> In essence I think we can get the following flow, where 1# = process1 and 2# = process2:
>> 1#1. prison1.pr_uref = 1 (single process jail)
>> 1#2. prison_deref( prison1,...
>> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
>> 1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>> 1#5. prison0.pr_uref--
>> 2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
>> 2#2. process2.exit
>> 2#3. prison_deref( prison1,...
>> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
>> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>> 2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)

First off, thanks for the feedback Jamie, most appreciated :)

> The problem isn't with the conditional locking of tpr in prison_deref. That locking is actually correct, and there's no race condition.

Are you sure? I do think that unlocking the mtx halfway through the call allows the above scenario to create a race condition, albeit very briefly, when ignoring the overriding issue.

In addition, if the code were changed so that pr_uref++ also maintained the parent's uref, this would definitely lead to potential problems in my mind, especially if you had more than one child prison of a given parent entering the dying state at any one time. In this case I believe you would have to acquire the locks of all the parent prisons before it would be safe to proceed.

> The trouble lies in the resurrection of dead jails, as Andriy has noted (though not just attaching: setting its persist flag causes the same problem).

I'm not sure that persistent prisons actually suffer from this in any different way tbh, as they have an additional uref increment so would never hit this case unless they have been actively removed and hence unpersisted first.

> There are two possible fixes to this. One is the patch you've given, which only decrements a parent jail's pr_uref when the child jail completely goes away (as opposed to when it loses its last uref). This provides symmetry with the current way pr_uref is incremented on the parent, which is only when a jail is created.
>
> The other fix is to increment a parent's pr_uref when a jail is resurrected, which will match the current logic in prison_deref. I like the external semantics of this solution: a jail isn't visible if it is not persistent and has no processes and no *visible* sub-jails, as opposed to having no sub-jails at all. But this solution ends up pretty complicated - there are a few places where pr_uref is incremented, where I might need to increment parent jails' pr_uref as well, much like the current tpr loop in prison_deref decrements them.

Ahh yes, in the hierarchical case my patch would indeed mean that non-persistent parent jails would remain visible even when their last child jail is in a dying state.

As you say, making this not the case would likely require replacing all instances of pr_uref++ with a prison_uref method that implements the opposite of the loop in prison_deref should the prison's pr_uref be 0 when called.

> Your solution removes code instead of adding it, which is generally a good thing. While it does change the semantics of pr_uref in the hierarchical case at least from what I thought it was, those semantics haven't been working properly anyway.

Good to know my interpretation was correct, even if I was missing the visibility factor in the hierarchical case :)

> Bjoern, I'm adding you to the CC list for this because the whole pr_uref thing was your idea (though it was pr_nprocs at the time), so you might care about the hierarchical semantics of it - or you may not. Also, this is a panic-inducing bug in current and may interest you for that reason.

From an admin perspective the current jail dying state does cause confusion when you're not aware of its existence. You ask a jail to stop and it appears to have completed that request, but really hasn't, generally due to just a lingering TCP connection.

With the introduction of hierarchical jails that gets a little worse, where a whole series of jails could disappear from normal view only to be resurrected shortly after. Something to bear in mind when deciding which solution of the two presented to use.

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
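The second fix discussed in this message, a helper that does "the opposite of the loop in prison_deref", can be sketched in user space as follows. This is only an illustration under stated assumptions: `toy_prison`, `toy_uref_drop` and `toy_uref_hold` are made-up names (the thread only floats the name prison_uref), the structure carries just a counter and a parent pointer, and all locking is left out even though, as the thread makes clear, locking is the hard part in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct prison; only what the sketch needs. */
struct toy_prison {
	int uref;			/* models pr_uref */
	struct toy_prison *parent;	/* models pr_parent, NULL above the root */
};

/*
 * Models the existing PD_DEUREF loop: drop one user reference and keep
 * walking up for as long as each count reaches zero.
 */
static void
toy_uref_drop(struct toy_prison *pr)
{
	struct toy_prison *tpr;

	assert(pr != NULL);
	for (tpr = pr; tpr != NULL; tpr = tpr->parent) {
		if (--tpr->uref > 0)
			break;
	}
}

/*
 * Hypothetical counterpart: take one user reference and keep walking up
 * for as long as each jail is being brought back from zero.
 */
static void
toy_uref_hold(struct toy_prison *pr)
{
	struct toy_prison *tpr;

	assert(pr != NULL);
	for (tpr = pr; tpr != NULL; tpr = tpr->parent) {
		/* A jail that already had references needs no change above it. */
		if (tpr->uref++ > 0)
			break;
	}
}
```

With a three-level hierarchy, a drop on the leaf followed by a hold on the leaf restores every counter on the path to its earlier value, which is the symmetry the thread says the current code lacks.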
Re: debugging frequent kernel panics on 8.2-RELEASE
On 08/20/11 19:19, Steven Hartland wrote:
> - Original Message - From: "Andriy Gapon"
>> on 20/08/2011 23:24 Steven Hartland said the following:
>>> - Original Message - From: "Steven Hartland"
>>>> Looking through the code I believe I may have noticed a scenario which could trigger the problem.
>>>>
>>>> Given the following code:-
>>>>
>>>> static void
>>>> prison_deref(struct prison *pr, int flags)
>>>> {
>>>>     struct prison *ppr, *tpr;
>>>>     int vfslocked;
>>>>
>>>>     if (!(flags & PD_LOCKED))
>>>>         mtx_lock(&pr->pr_mtx);
>>>>     /* Decrement the user references in a separate loop. */
>>>>     if (flags & PD_DEUREF) {
>>>>         for (tpr = pr;; tpr = tpr->pr_parent) {
>>>>             if (tpr != pr)
>>>>                 mtx_lock(&tpr->pr_mtx);
>>>>             if (--tpr->pr_uref > 0)
>>>>                 break;
>>>>             KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>>>>             mtx_unlock(&tpr->pr_mtx);
>>>>         }
>>>>         /* Done if there were only user references to remove. */
>>>>         if (!(flags & PD_DEREF)) {
>>>>             mtx_unlock(&tpr->pr_mtx);
>>>>             if (flags & PD_LIST_SLOCKED)
>>>>                 sx_sunlock(&allprison_lock);
>>>>             else if (flags & PD_LIST_XLOCKED)
>>>>                 sx_xunlock(&allprison_lock);
>>>>             return;
>>>>         }
>>>>         if (tpr != pr) {
>>>>             mtx_unlock(&tpr->pr_mtx);
>>>>             mtx_lock(&pr->pr_mtx);
>>>>         }
>>>>     }
>>>>
>>>> Take a scenario of a simple one-level prison setup running a single process, where the prison has just been stopped.
>>>>
>>>> In the above code pr_uref of the process's prison is decremented. As this is the last process, pr_uref will hit 0 and the loop continues instead of breaking early.
>>>>
>>>> Now at the end of the loop iteration the mtx is unlocked, so other processes can now manipulate the jail; this is where I think the problem may be.
>>>>
>>>> If we now have another process come in and attach to the jail but then instantly exit, this process may allow another kernel thread to hit this same bit of code, and so two processes for the same prison get into the section which decrements prison0's pr_uref, instead of only one.
>>>>
>>>> In essence I think we can get the following flow, where 1# = process1 and 2# = process2:
>>>> 1#1. prison1.pr_uref = 1 (single process jail)
>>>> 1#2. prison_deref( prison1,...
>>>> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
>>>> 1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>>>> 1#5. prison0.pr_uref--
>>>> 2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
>>>> 2#2. process2.exit
>>>> 2#3. prison_deref( prison1,...
>>>> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
>>>> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>>>> 2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)
>>>>
>>>> It seems like the action on the parent prison to decrement the pr_uref is happening too early, while the jail can still be used and without the lock on the child jail's mtx, so causing a race condition.
>>>>
>>>> I think the fix is to move the decrement of the parent prison's pr_uref down so it only takes place if the jail is "really" being removed. Either that or change the locking semantics so that once the lock is acquired in prison_deref it's not unlocked until the function completes.
>>>>
>>>> What do people think?
>>>
>>> After reviewing the changes to prison_deref in the commit which added hierarchical jails, the removal of the lock by the initial loop on the passed-in prison may be unintentional.
>>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_jail.c.diff?r1=1.101;r2=1.102;f=h
>>>
>>> If so the following may be all that's needed to fix this issue:-
>>>
>>> diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
>>> --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
>>> +++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100
>>> @@ -2455,7 +2455,8 @@
>>>              if (--tpr->pr_uref > 0)
>>>                  break;
>>>              KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>>> -            mtx_unlock(&tpr->pr_mtx);
>>> +            if (tpr != pr)
>>> +                mtx_unlock(&tpr->pr_mtx);
>>>          }
>>>          /* Done if there were only user references to remove. */
>>>          if (!(flags & PD_DEREF)) {
>>
>> Not sure if this would fly as is - please double check the later block where pr->pr_mtx is re-locked.
>
> You're right, and it's actually more complex than that. Although changing it to not unlock in the middle of prison_deref fixes that race condition, it doesn't prevent pr_uref being incorrectly decremented each time the jail gets into the dying state, which is really the problem we are seeing.
>
> If hierarchical prisons are used there seems to be an additional problem where the counters of all prisons in the hierarchy are decremented, but as far as I can tell only the immediate parent is ever incremented, so there's another reference problem there as well I think.
>
> The following patch I believe fixes both of these issues. I've tested with debug added and confirmed prison0's pr_uref is maintained correctly even when a jail hits the dying state multiple times.
>
> It essentially reverts the changes to the "if (flags & PD_DEUREF)" by 192895 and moves it to after the jail has been actually removed.
>
> diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
> --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
> +++ sys/kern/kern_jail.c 2011-08-21 01:56:58.429894825 +0100
> @@ -2449,27 +2449,16 @@
>          mtx_lock(&pr->pr_mtx);
>      /* Decrement the user references in a separate loop. */
>      if (flags & PD_DEUREF) {
> -        for
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"
> on 20/08/2011 23:24 Steven Hartland said the following:
>> - Original Message - From: "Steven Hartland"
>>> Looking through the code I believe I may have noticed a scenario which could trigger the problem.
>>>
>>> Given the following code:-
>>>
>>> static void
>>> prison_deref(struct prison *pr, int flags)
>>> {
>>>     struct prison *ppr, *tpr;
>>>     int vfslocked;
>>>
>>>     if (!(flags & PD_LOCKED))
>>>         mtx_lock(&pr->pr_mtx);
>>>     /* Decrement the user references in a separate loop. */
>>>     if (flags & PD_DEUREF) {
>>>         for (tpr = pr;; tpr = tpr->pr_parent) {
>>>             if (tpr != pr)
>>>                 mtx_lock(&tpr->pr_mtx);
>>>             if (--tpr->pr_uref > 0)
>>>                 break;
>>>             KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>>>             mtx_unlock(&tpr->pr_mtx);
>>>         }
>>>         /* Done if there were only user references to remove. */
>>>         if (!(flags & PD_DEREF)) {
>>>             mtx_unlock(&tpr->pr_mtx);
>>>             if (flags & PD_LIST_SLOCKED)
>>>                 sx_sunlock(&allprison_lock);
>>>             else if (flags & PD_LIST_XLOCKED)
>>>                 sx_xunlock(&allprison_lock);
>>>             return;
>>>         }
>>>         if (tpr != pr) {
>>>             mtx_unlock(&tpr->pr_mtx);
>>>             mtx_lock(&pr->pr_mtx);
>>>         }
>>>     }
>>>
>>> Take a scenario of a simple one-level prison setup running a single process, where the prison has just been stopped.
>>>
>>> In the above code pr_uref of the process's prison is decremented. As this is the last process, pr_uref will hit 0 and the loop continues instead of breaking early.
>>>
>>> Now at the end of the loop iteration the mtx is unlocked, so other processes can now manipulate the jail; this is where I think the problem may be.
>>>
>>> If we now have another process come in and attach to the jail but then instantly exit, this process may allow another kernel thread to hit this same bit of code, and so two processes for the same prison get into the section which decrements prison0's pr_uref, instead of only one.
>>>
>>> In essence I think we can get the following flow, where 1# = process1 and 2# = process2:
>>> 1#1. prison1.pr_uref = 1 (single process jail)
>>> 1#2. prison_deref( prison1,...
>>> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
>>> 1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>>> 1#5. prison0.pr_uref--
>>> 2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
>>> 2#2. process2.exit
>>> 2#3. prison_deref( prison1,...
>>> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
>>> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>>> 2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)
>>>
>>> It seems like the action on the parent prison to decrement the pr_uref is happening too early, while the jail can still be used and without the lock on the child jail's mtx, so causing a race condition.
>>>
>>> I think the fix is to move the decrement of the parent prison's pr_uref down so it only takes place if the jail is "really" being removed. Either that or change the locking semantics so that once the lock is acquired in prison_deref it's not unlocked until the function completes.
>>>
>>> What do people think?
>>
>> After reviewing the changes to prison_deref in the commit which added hierarchical jails, the removal of the lock by the initial loop on the passed-in prison may be unintentional.
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_jail.c.diff?r1=1.101;r2=1.102;f=h
>>
>> If so the following may be all that's needed to fix this issue:-
>>
>> diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
>> --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
>> +++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100
>> @@ -2455,7 +2455,8 @@
>>              if (--tpr->pr_uref > 0)
>>                  break;
>>              KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>> -            mtx_unlock(&tpr->pr_mtx);
>> +            if (tpr != pr)
>> +                mtx_unlock(&tpr->pr_mtx);
>>          }
>>          /* Done if there were only user references to remove. */
>>          if (!(flags & PD_DEREF)) {
>
> Not sure if this would fly as is - please double check the later block where pr->pr_mtx is re-locked.

You're right, and it's actually more complex than that. Although changing it to not unlock in the middle of prison_deref fixes that race condition, it doesn't prevent pr_uref being incorrectly decremented each time the jail gets into the dying state, which is really the problem we are seeing.

If hierarchical prisons are used there seems to be an additional problem where the counters of all prisons in the hierarchy are decremented, but as far as I can tell only the immediate parent is ever incremented, so there's another reference problem there as well I think.

The following patch I believe fixes both of these issues. I've tested with debug added and confirmed prison0's pr_uref is maintained correctly even when a jail hits the dying state multiple times.

It essentially reverts the changes to the "i
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Steven Hartland"
> Something else you may be more interested in, Andriy:-
> I added in debugging options DDB & INVARIANTS to see if I can get more useful info, and the panic results in a looping panic constantly scrolling up the console.
>
> Not sure if this is a side effect of the patches we've been trying. Going to see if I can confirm that, lmk if there's something you want me to try?

Seems the stop_scheduler_on_panic.8.x.patch is the cause of this. Removing it allows me to drop to ddb when the panic due to the KASSERT happens.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"
>> diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
>> --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
>> +++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100
>> @@ -2455,7 +2455,8 @@
>>              if (--tpr->pr_uref > 0)
>>                  break;
>>              KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>> -            mtx_unlock(&tpr->pr_mtx);
>> +            if (tpr != pr)
>> +                mtx_unlock(&tpr->pr_mtx);
>>          }
>>          /* Done if there were only user references to remove. */
>>          if (!(flags & PD_DEREF)) {
>
> Not sure if this would fly as is - please double check the later block where pr->pr_mtx is re-locked.

Will do, I'm now 99.9% sure this is the problem and even better I now have a reproducible scenario :)

Something else you may be more interested in, Andriy:-
I added in debugging options DDB & INVARIANTS to see if I can get more useful info, and the panic results in a looping panic constantly scrolling up the console.

Not sure if this is a side effect of the patches we've been trying. Going to see if I can confirm that, lmk if there's something you want me to try?

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 20/08/2011 23:24 Steven Hartland said the following:
> - Original Message - From: "Steven Hartland"
>> Looking through the code I believe I may have noticed a scenario which could trigger the problem.
>>
>> Given the following code:-
>>
>> static void
>> prison_deref(struct prison *pr, int flags)
>> {
>>     struct prison *ppr, *tpr;
>>     int vfslocked;
>>
>>     if (!(flags & PD_LOCKED))
>>         mtx_lock(&pr->pr_mtx);
>>     /* Decrement the user references in a separate loop. */
>>     if (flags & PD_DEUREF) {
>>         for (tpr = pr;; tpr = tpr->pr_parent) {
>>             if (tpr != pr)
>>                 mtx_lock(&tpr->pr_mtx);
>>             if (--tpr->pr_uref > 0)
>>                 break;
>>             KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>>             mtx_unlock(&tpr->pr_mtx);
>>         }
>>         /* Done if there were only user references to remove. */
>>         if (!(flags & PD_DEREF)) {
>>             mtx_unlock(&tpr->pr_mtx);
>>             if (flags & PD_LIST_SLOCKED)
>>                 sx_sunlock(&allprison_lock);
>>             else if (flags & PD_LIST_XLOCKED)
>>                 sx_xunlock(&allprison_lock);
>>             return;
>>         }
>>         if (tpr != pr) {
>>             mtx_unlock(&tpr->pr_mtx);
>>             mtx_lock(&pr->pr_mtx);
>>         }
>>     }
>>
>> Take a scenario of a simple one-level prison setup running a single process, where the prison has just been stopped.
>>
>> In the above code pr_uref of the process's prison is decremented. As this is the last process, pr_uref will hit 0 and the loop continues instead of breaking early.
>>
>> Now at the end of the loop iteration the mtx is unlocked, so other processes can now manipulate the jail; this is where I think the problem may be.
>>
>> If we now have another process come in and attach to the jail but then instantly exit, this process may allow another kernel thread to hit this same bit of code, and so two processes for the same prison get into the section which decrements prison0's pr_uref, instead of only one.
>>
>> In essence I think we can get the following flow, where 1# = process1 and 2# = process2:
>> 1#1. prison1.pr_uref = 1 (single process jail)
>> 1#2. prison_deref( prison1,...
>> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
>> 1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>> 1#5. prison0.pr_uref--
>> 2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
>> 2#2. process2.exit
>> 2#3. prison_deref( prison1,...
>> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
>> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
>> 2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)
>>
>> It seems like the action on the parent prison to decrement the pr_uref is happening too early, while the jail can still be used and without the lock on the child jail's mtx, so causing a race condition.
>>
>> I think the fix is to move the decrement of the parent prison's pr_uref down so it only takes place if the jail is "really" being removed. Either that or change the locking semantics so that once the lock is acquired in prison_deref it's not unlocked until the function completes.
>>
>> What do people think?
>
> After reviewing the changes to prison_deref in the commit which added hierarchical jails, the removal of the lock by the initial loop on the passed-in prison may be unintentional.
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_jail.c.diff?r1=1.101;r2=1.102;f=h
>
> If so the following may be all that's needed to fix this issue:-
>
> diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
> --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
> +++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100
> @@ -2455,7 +2455,8 @@
>              if (--tpr->pr_uref > 0)
>                  break;
>              KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
> -            mtx_unlock(&tpr->pr_mtx);
> +            if (tpr != pr)
> +                mtx_unlock(&tpr->pr_mtx);
>          }
>          /* Done if there were only user references to remove. */
>          if (!(flags & PD_DEREF)) {

Not sure if this would fly as is - please double check the later block where pr->pr_mtx is re-locked.
-- 
Andriy Gapon
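The concern about the later block can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: `toy_mtx` only remembers whether it is held, `toy_prison`, `toy_deuref_patched` and `TOY_PD_DEREF` are made-up stand-ins, and the function is entered with the child's mutex already held, as the quoted code implies after its first mtx_lock. The model applies the proposed one-line change and then runs the two blocks that follow the loop unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Toy non-recursive mutex: it only remembers whether it is held. */
struct toy_mtx {
	int held;
};

static int toy_relock_count;	/* attempts to lock a mutex that was already held */

static void
toy_lock(struct toy_mtx *m)
{
	if (m->held)
		toy_relock_count++;	/* a real non-recursive mutex would not allow this */
	m->held = 1;
}

static void
toy_unlock(struct toy_mtx *m)
{
	assert(m->held);
	m->held = 0;
}

struct toy_prison {
	int uref;			/* models pr_uref */
	struct toy_mtx mtx;		/* models pr_mtx */
	struct toy_prison *parent;	/* models pr_parent */
};

#define	TOY_PD_DEREF	0x01		/* stand-in for PD_DEREF */

/* The PD_DEUREF path with the proposed change; pr->mtx is held on entry. */
static void
toy_deuref_patched(struct toy_prison *pr, int flags)
{
	struct toy_prison *tpr;

	assert(pr->mtx.held);
	for (tpr = pr;; tpr = tpr->parent) {
		if (tpr != pr)
			toy_lock(&tpr->mtx);
		if (--tpr->uref > 0)
			break;
		assert(tpr->parent != NULL);	/* stands in for the KASSERT */
		if (tpr != pr)			/* the proposed change */
			toy_unlock(&tpr->mtx);
	}
	if (!(flags & TOY_PD_DEREF)) {
		toy_unlock(&tpr->mtx);	/* releases only the jail the loop stopped at */
		return;
	}
	if (tpr != pr) {
		toy_unlock(&tpr->mtx);
		toy_lock(&pr->mtx);	/* pr->mtx was never released above */
	}
}
```

In this model, when the loop stops at a parent, the early-return path leaves the child's mutex held and the other path tries to lock it a second time. That appears to be what the request to double check the re-lock block is pointing at.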
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Steven Hartland"
> Looking through the code I believe I may have noticed a scenario which could trigger the problem.
>
> Given the following code:-
>
> static void
> prison_deref(struct prison *pr, int flags)
> {
>     struct prison *ppr, *tpr;
>     int vfslocked;
>
>     if (!(flags & PD_LOCKED))
>         mtx_lock(&pr->pr_mtx);
>     /* Decrement the user references in a separate loop. */
>     if (flags & PD_DEUREF) {
>         for (tpr = pr;; tpr = tpr->pr_parent) {
>             if (tpr != pr)
>                 mtx_lock(&tpr->pr_mtx);
>             if (--tpr->pr_uref > 0)
>                 break;
>             KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
>             mtx_unlock(&tpr->pr_mtx);
>         }
>         /* Done if there were only user references to remove. */
>         if (!(flags & PD_DEREF)) {
>             mtx_unlock(&tpr->pr_mtx);
>             if (flags & PD_LIST_SLOCKED)
>                 sx_sunlock(&allprison_lock);
>             else if (flags & PD_LIST_XLOCKED)
>                 sx_xunlock(&allprison_lock);
>             return;
>         }
>         if (tpr != pr) {
>             mtx_unlock(&tpr->pr_mtx);
>             mtx_lock(&pr->pr_mtx);
>         }
>     }
>
> Take a scenario of a simple one-level prison setup running a single process, where the prison has just been stopped.
>
> In the above code pr_uref of the process's prison is decremented. As this is the last process, pr_uref will hit 0 and the loop continues instead of breaking early.
>
> Now at the end of the loop iteration the mtx is unlocked, so other processes can now manipulate the jail; this is where I think the problem may be.
>
> If we now have another process come in and attach to the jail but then instantly exit, this process may allow another kernel thread to hit this same bit of code, and so two processes for the same prison get into the section which decrements prison0's pr_uref, instead of only one.
>
> In essence I think we can get the following flow, where 1# = process1 and 2# = process2:
> 1#1. prison1.pr_uref = 1 (single process jail)
> 1#2. prison_deref( prison1,...
> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
> 1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
> 1#5. prison0.pr_uref--
> 2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
> 2#2. process2.exit
> 2#3. prison_deref( prison1,...
> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
> 2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)
>
> It seems like the action on the parent prison to decrement the pr_uref is happening too early, while the jail can still be used and without the lock on the child jail's mtx, so causing a race condition.
>
> I think the fix is to move the decrement of the parent prison's pr_uref down so it only takes place if the jail is "really" being removed. Either that or change the locking semantics so that once the lock is acquired in prison_deref it's not unlocked until the function completes.
>
> What do people think?

After reviewing the changes to prison_deref in the commit which added hierarchical jails, the removal of the lock by the initial loop on the passed-in prison may be unintentional.
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_jail.c.diff?r1=1.101;r2=1.102;f=h

If so the following may be all that's needed to fix this issue:-

diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c
--- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100
+++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100
@@ -2455,7 +2455,8 @@
             if (--tpr->pr_uref > 0)
                 break;
             KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
-            mtx_unlock(&tpr->pr_mtx);
+            if (tpr != pr)
+                mtx_unlock(&tpr->pr_mtx);
         }
         /* Done if there were only user references to remove. */
         if (!(flags & PD_DEREF)) {

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" thanks for doing this! I'll reiterate my suspicion just in case - I think that you should look for the cases where you stop a jail, but then re-attach and resurrect the jail before it's completely dead. Yer that's where I think its happening too, but I also suspect its not just dieing jail that's needed, I think its a dieing jail in the final stages of cleanup. Looking through the code I believe I may have noticed a scenario which could trigger the problem. Given the following code:- static void prison_deref(struct prison *pr, int flags) { struct prison *ppr, *tpr; int vfslocked; if (!(flags & PD_LOCKED)) mtx_lock(&pr->pr_mtx); /* Decrement the user references in a separate loop. */ if (flags & PD_DEUREF) { for (tpr = pr;; tpr = tpr->pr_parent) { if (tpr != pr) mtx_lock(&tpr->pr_mtx); if (--tpr->pr_uref > 0) break; KASSERT(tpr != &prison0, ("prison0 pr_uref=0")); mtx_unlock(&tpr->pr_mtx); } /* Done if there were only user references to remove. */ if (!(flags & PD_DEREF)) { mtx_unlock(&tpr->pr_mtx); if (flags & PD_LIST_SLOCKED) sx_sunlock(&allprison_lock); else if (flags & PD_LIST_XLOCKED) sx_xunlock(&allprison_lock); return; } if (tpr != pr) { mtx_unlock(&tpr->pr_mtx); mtx_lock(&pr->pr_mtx); } } If you take a scenario of a simple one level prison setup running a single process where a prison has just been stopped. In the above code pr_uref of the processes prison is decremented. As this is the last process then pr_uref will hit 0 and the loop continues instead of breaking early. Now at the end of the loop iteration the mtx is unlocked so other process can now manipulate the jail, this is where I think the problem may be. If we now have another process come in and attach to the jail but then instantly exit, this process may allow another kernel thread to hit this same bit of code and so two process for the same prison get into the section which decrements prison0's pr_uref, instead of only one. 
In essence I think we can get the following flow, where 1# = process1 and 2# = process2:

1#1. prison1.pr_uref = 1 (single process jail)
1#2. prison_deref( prison1,...
1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
1#4. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
1#5. prison0.pr_uref--
2#1. process2.attach( prison1 ) (prison1.pr_uref = 1)
2#2. process2.exit
2#3. prison_deref( prison1,...
2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref
2#6. prison0.pr_uref-- (prison0.pr_uref has now been decremented twice by prison1)

It seems like the action on the parent prison to decrement the pr_uref is happening too early, while the jail can still be used and without the lock on the child jail's mtx, so causing a race condition.

I think the fix is to move the decrement of the parent prison's pr_uref down so it only takes place if the jail is "really" being removed. Either that or change the locking semantics so that once the lock is acquired in prison_deref it's not unlocked until the function completes.

What do people think?

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
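For illustration only, the flow above can be replayed in a small single-threaded user-space model. This is a sketch, not kernel code: struct model_prison, model_deuref(), model_attach() and model_leak() are hypothetical names, locking is left out, and the assumption that an attach raises only the jail's own pr_uref is the hypothesis under discussion, not something confirmed from the source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct prison: only the fields the flow needs. */
struct model_prison {
	struct model_prison *pr_parent;	/* NULL for the root (prison0) */
	int pr_uref;			/* user reference count */
};

/*
 * Mirrors the PD_DEUREF loop: drop one user reference and, each time a
 * count reaches zero, continue to the parent. The tpr != NULL guard is
 * an addition so the model cannot walk past the root.
 */
static void
model_deuref(struct model_prison *pr)
{
	struct model_prison *tpr;

	assert(pr != NULL);
	for (tpr = pr; tpr != NULL; tpr = tpr->pr_parent) {
		if (--tpr->pr_uref > 0)
			break;
	}
}

/*
 * Assumed attach behaviour: only the jail's own count is raised, the
 * parent's count is not restored.
 */
static void
model_attach(struct model_prison *pr)
{
	assert(pr != NULL);
	pr->pr_uref++;
}

/* Run `cycles` exit/attach rounds and return the root's final pr_uref. */
static int
model_leak(int root_uref, int cycles)
{
	struct model_prison root = { NULL, root_uref };
	struct model_prison jail = { &root, 1 };
	int i;

	for (i = 0; i < cycles; i++) {
		model_deuref(&jail);	/* last process exits: jail 1 -> 0, root-- */
		model_attach(&jail);	/* resurrect: jail 0 -> 1, root untouched */
	}
	return (root.pr_uref);
}
```

Under those assumptions model_leak(10, 3) leaves the root at 7, i.e. one root reference is lost per exit/attach cycle.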
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Roger Marquis" To: ; Sent: Saturday, August 20, 2011 7:10 PM Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

>> Repeat this enough times and prison0.pr_uref reaches zero.
>> To reach zero even sooner just kill enough of non-jailed processes.
>
> Interesting. We've been getting kernel panics in -stable but with only one
> jail started at boot without being restarted.
>
> Are you using SAS drives by any chance? Setting ethernet polling and HZ?
> How about softupdates, gmirror, and/or anything in sysctl.conf?

If you're not restarting things it may be unrelated. No SAS; polling is compiled in but no devices have it active, and we're using ZFS only.

Are you seeing a double fault panic?

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
> Repeat this enough times and prison0.pr_uref reaches zero.
> To reach zero even sooner just kill enough of non-jailed processes.

Interesting. We've been getting kernel panics in -stable but with only one jail started at boot without being restarted.

Are you using SAS drives by any chance? Setting ethernet polling and HZ? How about softupdates, gmirror, and/or anything in sysctl.conf?

Roger Marquis
Re: debugging frequent kernel panics on 8.2-RELEASE
on 20/08/2011 18:51 Steven Hartland said the following: > - Original Message - From: "Andriy Gapon" > >> BTW, I suspect the following scenario, but I am not able to verify it either >> via >> testing or in the code: >> - last process in a dying jail exits >> - pr_uref of the jail reaches zero >> - pr_uref of prison0 gets decremented >> - you attach to the jail and resurrect it >> - but pr_uref of prison0 stays decremented >> >> Repeat this enough times and prison0.pr_uref reaches zero. >> To reach zero even sooner just kill enough of non-jailed processes. > > I've just checked across a number of the panic dumps from the > past few days and they all have prison0.pr_uref = 0 which confirms > the cause of the panic. > > I've tried scripting continuous jail start stops, but even after 1000's > of iterations have been unable to trigger this on my test machine, so > I'm going to dig into the jail code to see if I can find out how its > incorrectly decrementing prison0 via inspection. Steve, thanks for doing this! I'll reiterate my suspicion just in case - I think that you should look for the cases where you stop a jail, but then re-attach and resurrect the jail before it's completely dead. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

> BTW, I suspect the following scenario, but I am not able to verify it either
> via testing or in the code:
> - last process in a dying jail exits
> - pr_uref of the jail reaches zero
> - pr_uref of prison0 gets decremented
> - you attach to the jail and resurrect it
> - but pr_uref of prison0 stays decremented
>
> Repeat this enough times and prison0.pr_uref reaches zero.
> To reach zero even sooner just kill enough of non-jailed processes.

I've just checked across a number of the panic dumps from the past few days and they all have prison0.pr_uref = 0, which confirms the cause of the panic.

I've tried scripting continuous jail start/stops, but even after thousands of iterations I have been unable to trigger this on my test machine, so I'm going to dig into the jail code to see if I can find out, via inspection, how it's incorrectly decrementing prison0.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

> BTW, I suspect the following scenario, but I am not able to verify it either
> via testing or in the code:
> - last process in a dying jail exits
> - pr_uref of the jail reaches zero
> - pr_uref of prison0 gets decremented
> - you attach to the jail and resurrect it
> - but pr_uref of prison0 stays decremented
>
> Repeat this enough times and prison0.pr_uref reaches zero.
> To reach zero even sooner just kill enough of non-jailed processes.

Ahh, now that explains all of our experienced panic scenarios:-

1. A jail stop / start causing the panic, but only after at least a few days' worth of uptime. Here what we're seeing is enough "leak" of pr_uref from the restarted jails to decrement prison0.pr_uref to 0 even with all the standard unjailed processes still running.

2. A machine reboot, after all jails have been stopped but after less time than #1. In this case we haven't seen enough leakage to decrement prison0.pr_uref to 0 given the number of prison0 processes, but it has been incorrectly decremented, so as soon as the reboot kicks in and prison0 processes start exiting, prison0.pr_uref gets further decremented and again hits 0 when it shouldn't.

Now if this is the case, we should be able to confirm it with a little more info.

1. What exactly does pr_uref represent?
2. Can its correct value be calculated from other details of the system, i.e. number of running processes, number of running jails?

If we can calculate the value that prison0.pr_uref should be, then by examining the machines we have which have been up for a while, we should be able to confirm if an incorrect value is present on them and hence prove this is the case.

Ideally a little script to run in kgdb to test this would be the best way to go.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 20/08/2011 13:02 Andriy Gapon said the following: > on 18/08/2011 02:15 Steven Hartland said the following: >> In a nutshell the jail manager we're using will attempt to resurrect the jail >> from a dieing state in a few specific scenarios. >> >> Here's an exmaple:- >> 1. jail restart requested >> 2. jail is stopped, so the java processes is killed off, but active tcp >> sessions >> may prevent the timely full shutdown of the jail. >> 3. if an existing jail is detected, i.e. a dieing jail from #2, instead of >> starting a new jail we attach to the old one and exec the new java process. >> 4. if an existing jail isnt detected, i.e. where there where not hanging tcp >> sessions and #2 cleanly shutdown the jail, a new jail is created, attached to >> and the java exec'ed. >> >> The system uses static jailid's so its possible to determine if an existing >> jail for this "service" exists or not. This prevents duplicate services as >> well as making services easy to identify by their jailid. >> >> So what we could be seeing is a race between the jail shutdown and the attach >> of the new process? > > Not a jail expert at all, but a few suggestions... > > First, wouldn't the 'persist' jail option simplify your life a little bit? > > Second, you may want to try to monitor value of prison0.pr_uref variable (e.g. > via kgdb) while executing various scenarios of what you do now. If after > finishing a certain scenario you end up with a value lower than at the start > of > scenario, then this is the troublesome one. > Please note that prison0.pr_uref is composed from a number of non-jailed > processes plus a number of top-level jails. So take this into account when > comparing prison0.pr_uref values - it's better to record the initial value > when > no jails are started and it's important to keep the number of non-jailed > processes the same (or to account for its changes). 
BTW, I suspect the following scenario, but I am not able to verify it either via testing or in the code: - last process in a dying jail exits - pr_uref of the jail reaches zero - pr_uref of prison0 gets decremented - you attach to the jail and resurrect it - but pr_uref of prison0 stays decremented Repeat this enough times and prison0.pr_uref reaches zero. To reach zero even sooner just kill enough of non-jailed processes. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
on 18/08/2011 02:15 Steven Hartland said the following: > In a nutshell the jail manager we're using will attempt to resurrect the jail > from a dieing state in a few specific scenarios. > > Here's an exmaple:- > 1. jail restart requested > 2. jail is stopped, so the java processes is killed off, but active tcp > sessions > may prevent the timely full shutdown of the jail. > 3. if an existing jail is detected, i.e. a dieing jail from #2, instead of > starting a new jail we attach to the old one and exec the new java process. > 4. if an existing jail isnt detected, i.e. where there where not hanging tcp > sessions and #2 cleanly shutdown the jail, a new jail is created, attached to > and the java exec'ed. > > The system uses static jailid's so its possible to determine if an existing > jail for this "service" exists or not. This prevents duplicate services as > well as making services easy to identify by their jailid. > > So what we could be seeing is a race between the jail shutdown and the attach > of the new process? Not a jail expert at all, but a few suggestions... First, wouldn't the 'persist' jail option simplify your life a little bit? Second, you may want to try to monitor value of prison0.pr_uref variable (e.g. via kgdb) while executing various scenarios of what you do now. If after finishing a certain scenario you end up with a value lower than at the start of scenario, then this is the troublesome one. Please note that prison0.pr_uref is composed from a number of non-jailed processes plus a number of top-level jails. So take this into account when comparing prison0.pr_uref values - it's better to record the initial value when no jails are started and it's important to keep the number of non-jailed processes the same (or to account for its changes). -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
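The bookkeeping described above can be written down as a tiny helper, assuming prison0.pr_uref really is just the non-jailed process count plus the top-level jail count. The function names are made up for this sketch, and the inputs would have to be collected by hand (for example from ps and jls output) at the same moment the kernel value is read.

```c
#include <assert.h>

/*
 * Expected prison0.pr_uref per the description above: one user reference
 * per non-jailed process plus one per top-level jail. How dying jails are
 * counted is not stated in the thread; this sketch assumes the caller
 * passes only jails that still hold user references.
 */
static int
expected_prison0_uref(int nonjailed_procs, int toplevel_jails)
{
	assert(nonjailed_procs >= 0 && toplevel_jails >= 0);
	return (nonjailed_procs + toplevel_jails);
}

/* A positive result means the observed value is lower than expected. */
static int
uref_leak(int observed_uref, int nonjailed_procs, int toplevel_jails)
{
	return (expected_prison0_uref(nonjailed_procs, toplevel_jails) -
	    observed_uref);
}
```

A non-zero uref_leak() after a given scenario, with the process and jail counts accounted for, would point at that scenario as the troublesome one.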
Re: debugging frequent kernel panics on 8.2-RELEASE
on 19/08/2011 15:14 John Baldwin said the following: > Yes, it is a bug in kgdb that it only walks allproc and not zombproc. Try > this: The patch worked perfectly well for me, thank you! > Index: kthr.c > === > --- kthr.c(revision 224879) > +++ kthr.c(working copy) > @@ -73,11 +73,52 @@ kgdb_thr_first(void) > return (first); > } > > +static void > +kgdb_thr_add_procs(uintptr_t paddr) > +{ > + struct proc p; > + struct thread td; > + struct kthr *kt; > + CORE_ADDR addr; > + > + while (paddr != 0) { > + if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p)) { > + warnx("kvm_read: %s", kvm_geterr(kvm)); > + break; > + } > + addr = (uintptr_t)TAILQ_FIRST(&p.p_threads); > + while (addr != 0) { > + if (kvm_read(kvm, addr, &td, sizeof(td)) != > + sizeof(td)) { > + warnx("kvm_read: %s", kvm_geterr(kvm)); > + break; > + } > + kt = malloc(sizeof(*kt)); > + kt->next = first; > + kt->kaddr = addr; > + if (td.td_tid == dumptid) > + kt->pcb = dumppcb; > + else if (td.td_state == TDS_RUNNING && stoppcbs != 0 && > + CPU_ISSET(td.td_oncpu, &stopped_cpus)) > + kt->pcb = (uintptr_t)stoppcbs + > + sizeof(struct pcb) * td.td_oncpu; > + else > + kt->pcb = (uintptr_t)td.td_pcb; > + kt->kstack = td.td_kstack; > + kt->tid = td.td_tid; > + kt->pid = p.p_pid; > + kt->paddr = paddr; > + kt->cpu = td.td_oncpu; > + first = kt; > + addr = (uintptr_t)TAILQ_NEXT(&td, td_plist); > + } > + paddr = (uintptr_t)LIST_NEXT(&p, p_list); > + } > +} > + > struct kthr * > kgdb_thr_init(void) > { > - struct proc p; > - struct thread td; > long cpusetsize; > struct kthr *kt; > CORE_ADDR addr; > @@ -113,37 +154,11 @@ kgdb_thr_init(void) > > stoppcbs = kgdb_lookup("stoppcbs"); > > - while (paddr != 0) { > - if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p)) { > - warnx("kvm_read: %s", kvm_geterr(kvm)); > - break; > - } > - addr = (uintptr_t)TAILQ_FIRST(&p.p_threads); > - while (addr != 0) { > - if (kvm_read(kvm, addr, &td, sizeof(td)) != > - sizeof(td)) { > - warnx("kvm_read: %s", kvm_geterr(kvm)); > - break; 
> - } > - kt = malloc(sizeof(*kt)); > - kt->next = first; > - kt->kaddr = addr; > - if (td.td_tid == dumptid) > - kt->pcb = dumppcb; > - else if (td.td_state == TDS_RUNNING && stoppcbs != 0 && > - CPU_ISSET(td.td_oncpu, &stopped_cpus)) > - kt->pcb = (uintptr_t) stoppcbs + sizeof(struct > pcb) * td.td_oncpu; > - else > - kt->pcb = (uintptr_t)td.td_pcb; > - kt->kstack = td.td_kstack; > - kt->tid = td.td_tid; > - kt->pid = p.p_pid; > - kt->paddr = paddr; > - kt->cpu = td.td_oncpu; > - first = kt; > - addr = (uintptr_t)TAILQ_NEXT(&td, td_plist); > - } > - paddr = (uintptr_t)LIST_NEXT(&p, p_list); > + kgdb_thr_add_procs(paddr); > + addr = kgdb_lookup("zombproc"); > + if (addr != 0) { > + kvm_read(kvm, addr, &paddr, sizeof(paddr)); > + kgdb_thr_add_procs(paddr); > } > curkthr = kgdb_thr_lookup_tid(dumptid); > if (curkthr == NULL) > >> is there an easy way to examine its stack in this case? > > Hmm, you can use something like this from my kgdb macros. Oh, I completely forgot about them. I hope I will remember where to search for the tricks next time I need them :-) Thank you again! > For amd64: > > # Do a backtrace given %rip and %rbp as args > define bt > set $_rip = $arg0 > set $_rbp = $arg1 > set $i = 0 > while ($_rbp != 0 || $_rip != 0) > printf "%2d: pc ", $i > if ($_rip != 0) > x/1i $_rip > else > printf "\n" > end > if ($_rbp == 0) > set $_rip = 0 > else > set $fr = (struct amd64_frame *)$_rbp > set $_rbp = $fr->f_frame >
Re: debugging frequent kernel panics on 8.2-RELEASE
On Thursday, August 18, 2011 4:09:35 pm Andriy Gapon wrote: > on 17/08/2011 23:21 Andriy Gapon said the following: > > It seems like everything starts with some kind of a race between terminating > > processes in a jail and termination of the jail itself. This is where the > > details are very thin so far. What we see is that a process (http) is in > > exit(2) syscall, in exit1() function actually, and past the place where > > P_WEXIT > > flag is set and even past the place where p_limit is freed and reset to > > NULL. > > At that place the thread calls prison_proc_free(), which calls > > prison_deref(). > > Then, we see that in prison_deref() the thread gets a page fault because of > > what > > seems like a NULL pointer dereference. That's just the start of the > > problem and > > its root cause. > > > > Then, trap_pfault() gets invoked and, because addresses close to NULL look > > like > > userspace addresses, vm_fault/vm_fault_hold gets called, which in its turn > > goes > > on to call vm_map_growstack. First thing that vm_map_growstack does is a > > call > > to lim_cur(), but because p_limit is already NULL, that call results in a > > NULL > > pointer dereference and a page fault. Goto the beginning of this paragraph. > > > > So we get this recursion of sorts, which only ends when a stack is > > exhausted and > > a CPU generates a double-fault. > > BTW, does anyone has an idea why the thread in question would "disappear" from > the kgdb's point of view? > > (kgdb) p cpuid_to_pcpu[2]->pc_curthread->td_tid > $3 = 102057 > (kgdb) tid 102057 > invalid tid > > info threads also doesn't list the thread. > > Is it because the panic happened while the thread was somewhere in exit1()? Yes, it is a bug in kgdb that it only walks allproc and not zombproc. 
Try this:

Index: kthr.c
===
--- kthr.c	(revision 224879)
+++ kthr.c	(working copy)
@@ -73,11 +73,52 @@ kgdb_thr_first(void)
 	return (first);
 }
 
+static void
+kgdb_thr_add_procs(uintptr_t paddr)
+{
+	struct proc p;
+	struct thread td;
+	struct kthr *kt;
+	CORE_ADDR addr;
+
+	while (paddr != 0) {
+		if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p)) {
+			warnx("kvm_read: %s", kvm_geterr(kvm));
+			break;
+		}
+		addr = (uintptr_t)TAILQ_FIRST(&p.p_threads);
+		while (addr != 0) {
+			if (kvm_read(kvm, addr, &td, sizeof(td)) !=
+			    sizeof(td)) {
+				warnx("kvm_read: %s", kvm_geterr(kvm));
+				break;
+			}
+			kt = malloc(sizeof(*kt));
+			kt->next = first;
+			kt->kaddr = addr;
+			if (td.td_tid == dumptid)
+				kt->pcb = dumppcb;
+			else if (td.td_state == TDS_RUNNING && stoppcbs != 0 &&
+			    CPU_ISSET(td.td_oncpu, &stopped_cpus))
+				kt->pcb = (uintptr_t)stoppcbs +
+				    sizeof(struct pcb) * td.td_oncpu;
+			else
+				kt->pcb = (uintptr_t)td.td_pcb;
+			kt->kstack = td.td_kstack;
+			kt->tid = td.td_tid;
+			kt->pid = p.p_pid;
+			kt->paddr = paddr;
+			kt->cpu = td.td_oncpu;
+			first = kt;
+			addr = (uintptr_t)TAILQ_NEXT(&td, td_plist);
+		}
+		paddr = (uintptr_t)LIST_NEXT(&p, p_list);
+	}
+}
+
 struct kthr *
 kgdb_thr_init(void)
 {
-	struct proc p;
-	struct thread td;
 	long cpusetsize;
 	struct kthr *kt;
 	CORE_ADDR addr;
@@ -113,37 +154,11 @@ kgdb_thr_init(void)
 
 	stoppcbs = kgdb_lookup("stoppcbs");
 
-	while (paddr != 0) {
-		if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p)) {
-			warnx("kvm_read: %s", kvm_geterr(kvm));
-			break;
-		}
-		addr = (uintptr_t)TAILQ_FIRST(&p.p_threads);
-		while (addr != 0) {
-			if (kvm_read(kvm, addr, &td, sizeof(td)) !=
-			    sizeof(td)) {
-				warnx("kvm_read: %s", kvm_geterr(kvm));
-				break;
-			}
-			kt = malloc(sizeof(*kt));
-			kt->next = first;
-			kt->kaddr = addr;
-			if (td.td_tid == dumptid)
-				kt->pcb = dumppcb;
-			else if (td.td_state == TDS_RUNNING && stoppcbs != 0 &&
-			    CPU_ISSET(td.td_oncpu, &stopped_cpus))
-				kt
Re: debugging frequent kernel panics on 8.2-RELEASE
2011/8/18 Andriy Gapon:
> on 17/08/2011 23:21 Andriy Gapon said the following:
>>
>> It seems like everything starts with some kind of a race between
>> terminating processes in a jail and termination of the jail itself. This
>> is where the details are very thin so far. What we see is that a process
>> (http) is in exit(2) syscall, in exit1() function actually, and past the
>> place where P_WEXIT flag is set and even past the place where p_limit is
>> freed and reset to NULL.
>> At that place the thread calls prison_proc_free(), which calls
>> prison_deref().
>> Then, we see that in prison_deref() the thread gets a page fault because
>> of what seems like a NULL pointer dereference. That's just the start of
>> the problem and its root cause.
>>
>> Then, trap_pfault() gets invoked and, because addresses close to NULL look
>> like userspace addresses, vm_fault/vm_fault_hold gets called, which in its
>> turn goes on to call vm_map_growstack. First thing that vm_map_growstack
>> does is a call to lim_cur(), but because p_limit is already NULL, that
>> call results in a NULL pointer dereference and a page fault. Goto the
>> beginning of this paragraph.
>>
>> So we get this recursion of sorts, which only ends when a stack is
>> exhausted and a CPU generates a double-fault.
>
> BTW, does anyone has an idea why the thread in question would "disappear"
> from the kgdb's point of view?
>
> (kgdb) p cpuid_to_pcpu[2]->pc_curthread->td_tid
> $3 = 102057
> (kgdb) tid 102057
> invalid tid
>
> info threads also doesn't list the thread.
>
> Is it because the panic happened while the thread was somewhere in exit1()?
> is there an easy way to examine its stack in this case?

Yes, that is likely it. The 'tid' command should look up the tid_to_thread() table (or similar name), which returns NULL, meaning the thread has gone past the point where it was in the lookup table.

Attilio

--
Peace can only be achieved by understanding - A. Einstein
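A toy model of why a lookup that walks only one process list misses an exiting thread, which is the explanation John Baldwin gives elsewhere in this thread (kgdb walking allproc but not zombproc). The types and function names are hypothetical and far simpler than the real kgdb code; tid 100001 is invented, 102057 is the one from the session above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified process record: one thread id per process. */
struct model_proc {
	int tid;
	struct model_proc *next;
};

/* Return the entry with the given tid from one list, or NULL. */
static const struct model_proc *
list_find(const struct model_proc *head, int tid)
{
	for (; head != NULL; head = head->next)
		if (head->tid == tid)
			return (head);
	return (NULL);
}

/* search_zombies == 0 models a lookup that only knows about allproc. */
static const struct model_proc *
model_tid_lookup(const struct model_proc *allproc,
    const struct model_proc *zombproc, int tid, int search_zombies)
{
	const struct model_proc *p;

	assert(tid >= 0);
	p = list_find(allproc, tid);
	if (p == NULL && search_zombies)
		p = list_find(zombproc, tid);
	return (p);
}

/* Scenario: the thread of interest belongs to a process that is exiting. */
static int
model_can_find_exiting_thread(int search_zombies)
{
	struct model_proc live = { 100001, NULL };	/* invented tid */
	struct model_proc zomb = { 102057, NULL };	/* tid from the session */

	return (model_tid_lookup(&live, &zomb, 102057, search_zombies) != NULL);
}
```

With search_zombies set to 0 the exiting thread is not found, which corresponds to the "invalid tid" answer; with it set to 1 the same thread is found.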
Re: debugging frequent kernel panics on 8.2-RELEASE
on 17/08/2011 23:21 Andriy Gapon said the following:
> It seems like everything starts with some kind of a race between terminating
> processes in a jail and termination of the jail itself. This is where the
> details are very thin so far. What we see is that a process (http) is in
> exit(2) syscall, in exit1() function actually, and past the place where
> P_WEXIT flag is set and even past the place where p_limit is freed and reset
> to NULL.
> At that place the thread calls prison_proc_free(), which calls
> prison_deref().
> Then, we see that in prison_deref() the thread gets a page fault because of
> what seems like a NULL pointer dereference. That's just the start of the
> problem and its root cause.
>
> Then, trap_pfault() gets invoked and, because addresses close to NULL look
> like userspace addresses, vm_fault/vm_fault_hold gets called, which in its
> turn goes on to call vm_map_growstack. First thing that vm_map_growstack does
> is a call to lim_cur(), but because p_limit is already NULL, that call
> results in a NULL pointer dereference and a page fault. Goto the beginning of
> this paragraph.
>
> So we get this recursion of sorts, which only ends when a stack is exhausted
> and a CPU generates a double-fault.

BTW, does anyone have an idea why the thread in question would "disappear" from kgdb's point of view?

(kgdb) p cpuid_to_pcpu[2]->pc_curthread->td_tid
$3 = 102057
(kgdb) tid 102057
invalid tid

info threads also doesn't list the thread.

Is it because the panic happened while the thread was somewhere in exit1()? Is there an easy way to examine its stack in this case?

--
Andriy Gapon
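The fault recursion described above, reduced to a toy user-space model. Nothing here is real kernel API: model_page_fault(), struct model_plimit and model_limits are hypothetical, the frame budget stands in for the kernel stack, and -1 stands in for the double fault.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process resource limits. */
struct model_plimit {
	long stack_cur;
};

/* An arbitrary valid limits object for the non-faulting case. */
static const struct model_plimit model_limits = { 8L * 1024 * 1024 };

/*
 * Handling a fault needs p_limit, as lim_cur() does in the description.
 * If p_limit is NULL the handler itself faults and re-enters, using one
 * more frame each time. Returns the nesting depth at which the fault was
 * handled, or -1 once the frame budget is exhausted.
 */
static int
model_page_fault(const struct model_plimit *p_limit, int depth,
    int max_frames)
{
	assert(max_frames > 0);
	if (depth >= max_frames)
		return (-1);	/* stack exhausted: the double fault */
	if (p_limit == NULL)	/* lim_cur() would dereference NULL here */
		return (model_page_fault(p_limit, depth + 1, max_frames));
	return (depth);		/* fault handled normally */
}
```

With a valid limits pointer the model handles the fault at depth 0; with NULL it only ever stops by running out of frames.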
Re: debugging frequent kernel panics on 8.2-RELEASE
on 18/08/2011 14:11 Andriy Gapon said the following:
> Probably I have mistakenly assumed that the 'prison' in prison_derefer() has
> something to do with an actual jail, while it could have been just prison0
> where all non-jailed processes belong.

So, indeed:

(kgdb) p $2->p_ucred->cr_prison
$10 = (struct prison *) 0x807d5080
(kgdb) p &prison0
$11 = (struct prison *) 0x807d5080
(kgdb) p *$2->p_ucred->cr_prison
$12 = {pr_list = {tqe_next = 0x0, tqe_prev = 0x0}, pr_id = 0, pr_ref = 398,
  pr_uref = 0, pr_flags = 386, pr_children = {lh_first = 0x0},
  pr_sibling = {le_next = 0x0, le_prev = 0x0}, pr_parent = 0x0,
  pr_mtx = {lock_object = {lo_name = 0x8063007c "jail mutex",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
  pr_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0,
    ta_func = 0, ta_context = 0x0},
  pr_osd = {osd_nslots = 0, osd_slots = 0x0,
    osd_next = {le_next = 0x0, le_prev = 0x0}},
  pr_cpuset = 0xff0012d65dc8, pr_vnet = 0x0, pr_root = 0xff00166ebce8,
  pr_ip4s = 0, pr_ip6s = 0, pr_ip4 = 0x0, pr_ip6 = 0x0,
  pr_sparep = {0x0, 0x0, 0x0, 0x0}, pr_childcount = 0, pr_childmax = 99,
  pr_allow = 127, pr_securelevel = -1, pr_enforce_statfs = 0,
  pr_spare = {0, 0, 0, 0, 0}, pr_hostid = 3251597242,
  pr_name = "0", '\0' , pr_path = "/", '\0' ,
  pr_hostname = "censored", '\0' , pr_domainname = '\0' ,
  pr_hostuuid = "54443842-0054-2500-902c-0025902c3cb0", '\0' }

Also, let's consider this code:

	if (flags & PD_DEUREF) {
		for (tpr = pr;; tpr = tpr->pr_parent) {
			if (tpr != pr)
				mtx_lock(&tpr->pr_mtx);
			if (--tpr->pr_uref > 0)
				break;
			KASSERT(tpr != &prison0, ("prison0 pr_uref=0"));
			mtx_unlock(&tpr->pr_mtx);
		}
		/* Done if there were only user references to remove. */
		if (!(flags & PD_DEREF)) {
			mtx_unlock(&tpr->pr_mtx);
			if (flags & PD_LIST_SLOCKED)
				sx_sunlock(&allprison_lock);
			else if (flags & PD_LIST_XLOCKED)
				sx_xunlock(&allprison_lock);
			return;
		}
		if (tpr != pr) {
			mtx_unlock(&tpr->pr_mtx);
			mtx_lock(&pr->pr_mtx);
		}
	}

The most suspicious thing is that pr_uref is zero in the debug data. With INVARIANTS we would hit the "prison0 pr_uref=0" KASSERT. Then, because this is prison0 and because pr_uref reached zero, tpr gets assigned to NULL. And then because tpr != pr we try to execute mtx_unlock(&tpr->pr_mtx). That's where the NULL pointer deref happens.

So, now the big question is how/why we reached pr_uref == 0.

--
Andriy Gapon
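To make the walk off the top of the hierarchy concrete, here is a user-space sketch of the same loop with locking removed. The names are hypothetical, and the tpr != NULL test is added only so that the model can report the NULL instead of crashing on it as the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct prison. */
struct model_prison {
	struct model_prison *pr_parent;	/* NULL for the root, as in the dump */
	int pr_uref;
};

/*
 * Same control flow as the quoted PD_DEUREF loop, minus locking. Returns
 * the value tpr holds when the loop stops. A NULL result marks the case
 * where the kernel code would go on to use &tpr->pr_mtx with tpr == NULL.
 */
static struct model_prison *
model_deuref_walk(struct model_prison *pr)
{
	struct model_prison *tpr;

	assert(pr != NULL);
	for (tpr = pr; tpr != NULL; tpr = tpr->pr_parent)
		if (--tpr->pr_uref > 0)
			break;
	return (tpr);
}

/* A non-jailed process exits while the root holds root_uref references. */
static int
model_root_exit_hits_null(int root_uref)
{
	struct model_prison root = { NULL, root_uref };

	return (model_deuref_walk(&root) == NULL);
}
```

In this model the walk ends on NULL exactly when the root's count drops to zero or below, so a root that has lost references to earlier leaks reaches that state while processes still belong to it.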
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

> Probably I have mistakenly assumed that the 'prison' in prison_derefer() has
> something to do with an actual jail, while it could have been just prison0
> where all non-jailed processes belong.

That makes sense, as this particular panic was caused by a machine reboot, which is slightly different from the more common jail panic we're seeing. Unfortunately it doesn't help with our reproduction scenario though.

If we don't have any joy reproducing on our single test machine I'll have this kernel rolled out across a portion of the farm, which should mean we see the panic results in a few days' time.

I understand there's a risk involved in this, but it's important for us to determine the cause and get a confirmed fix, as well as being able to prove that the panic fix works, which will help everyone in the long run.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 18/08/2011 13:35 Steven Hartland said the following: > - Original Message - From: "Andriy Gapon" >>> Thats interesting, are you using http as an example or is that something >>> thats >>> been gleaned from the debugging of our output? I ask as there's only one >>> process >>> running in each of our jails and thats a single java process. >> >> >> It's from the debug data: p_comm = "httpd" > > Hmm, there's only one httpd thats ever run on the machine and thats not in > the jail > its on the raw machine. Probably I have mistakenly assumed that the 'prison' in prison_derefer() has something to do with an actual jail, while it could have been just prison0 where all non-jailed processes belong. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" Thats interesting, are you using http as an example or is that something thats been gleaned from the debugging of our output? I ask as there's only one process running in each of our jails and thats a single java process. It's from the debug data: p_comm = "httpd" Hmm, there's only one httpd thats ever run on the machine and thats not in the jail its on the raw machine. I also would like to ask you to revert the last patch that I sent you (with tf_rip comparisons) and try the patch from Kostik instead. Sure. Given what we suspect about the problem, can please also try to provoke the problem by e.g. doing frequent jail restarts or something else that supposedly should hit the bug. I've tried doing this for quite some days on the test machine, but I've been unable to provoke it, will continue to try. Regards Steve This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
on 18/08/2011 02:15 Steven Hartland said the following: > - Original Message - From: "Andriy Gapon" > >> Thanks to the debug that Steven provided and to the help that I received from >> Kostik, I think that now I understand the basic mechanics of this panic, but, >> unfortunately, not the details of its root cause. >> >> It seems like everything starts with some kind of a race between terminating >> processes in a jail and termination of the jail itself. This is where the >> details are very thin so far. What we see is that a process (http) is in >> exit(2) syscall, in exit1() function actually, and past the place where >> P_WEXIT >> flag is set and even past the place where p_limit is freed and reset to NULL. >> At that place the thread calls prison_proc_free(), which calls >> prison_deref(). >> Then, we see that in prison_deref() the thread gets a page fault because of >> what >> seems like a NULL pointer dereference. That's just the start of the problem >> and >> its root cause. > > Thats interesting, are you using http as an example or is that something thats > been gleaned from the debugging of our output? I ask as there's only one > process > running in each of our jails and thats a single java process. It's from the debug data: p_comm = "httpd" I also would like to ask you to revert the last patch that I sent you (with tf_rip comparisons) and try the patch from Kostik instead. Given what we suspect about the problem, can please also try to provoke the problem by e.g. doing frequent jail restarts or something else that supposedly should hit the bug. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
----- Original Message ----- From: "Andriy Gapon"

> Thanks to the debug that Steven provided and to the help that I received from
> Kostik, I think that now I understand the basic mechanics of this panic, but,
> unfortunately, not the details of its root cause.
>
> It seems like everything starts with some kind of a race between terminating
> processes in a jail and termination of the jail itself. This is where the
> details are very thin so far. What we see is that a process (http) is in
> exit(2) syscall, in exit1() function actually, and past the place where
> P_WEXIT flag is set and even past the place where p_limit is freed and reset
> to NULL. At that place the thread calls prison_proc_free(), which calls
> prison_deref(). Then, we see that in prison_deref() the thread gets a page
> fault because of what seems like a NULL pointer dereference. That's just the
> start of the problem and its root cause.

That's interesting, are you using http as an example or is that something
that's been gleaned from the debugging of our output? I ask as there's only one
process running in each of our jails and that's a single java process.

Now given your description there may be something I can add that may help
clarify what the cause could be.

In a nutshell, the jail manager we're using will attempt to resurrect the jail
from a dying state in a few specific scenarios.

Here's an example:-
1. A jail restart is requested.
2. The jail is stopped, so the java process is killed off, but active tcp
   sessions may prevent the timely full shutdown of the jail.
3. If an existing jail is detected, i.e. a dying jail from #2, instead of
   starting a new jail we attach to the old one and exec the new java process.
4. If an existing jail isn't detected, i.e. where there were no hanging tcp
   sessions and #2 cleanly shut down the jail, a new jail is created, attached
   to and the java process exec'ed.

The system uses static jailids, so it's possible to determine if an existing
jail for this "service" exists or not. This prevents duplicate services as well
as making services easy to identify by their jailid.

So what we could be seeing is a race between the jail shutdown and the attach
of the new process?

Now man 2 jail seems to indicate this is a valid use case for jail_set, as it
documents its support for JAIL_DYING as a valid option for flags, but I suspect
it's something quite out of the ordinary to actually do, which may be why this
panic hasn't been seen before now.

As some background, the reason we use static jailids is to ensure only one
instance of the jailed service is running, and the reason we re-attach to the
dying jail is so that jails can be restarted in a timely manner. Without the
re-attach we would need to wait for all aborted tcp sessions to time out.

> So, of course, Steven is interested in finding and fixing the root cause.
> I hope we will get to that with some help from the "prison guards" :-)

Does the above potentially explain how we're getting to the situation which
generates the panic?

If so, we can certainly look at using alternatives to the current design to
work around this issue. Flagging the jail as permanent and using manual process
management and additional external locking to prevent duplicates is what
instantly springs to mind.

Regards
Steve
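The restart steps described in the message above can be sketched as a small decision function. This is only an illustration under stated assumptions: plan_restart, enum restart_action and the array of existing jail IDs are hypothetical names, and the actual creation or attach would go through jail_set(2), which is not shown here.

```c
#include <stddef.h>

enum restart_action {
	RESTART_CREATE_NEW,		/* step 4: no leftover jail, create a fresh one */
	RESTART_ATTACH_EXISTING		/* step 3: leftover (dying) jail found, attach to it */
};

/*
 * Decide what a restart should do once the old java process has been
 * killed (step 2).  Any jail still listed under the service's static
 * jail ID is assumed to be the dying leftover from that stop.
 */
static enum restart_action
plan_restart(const int *existing_jids, size_t njids, int service_jid)
{
	for (size_t i = 0; i < njids; i++) {
		if (existing_jids[i] == service_jid)
			return (RESTART_ATTACH_EXISTING);
	}
	return (RESTART_CREATE_NEW);
}
```

The RESTART_ATTACH_EXISTING branch is the one suspected of racing with the teardown of the old jail.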
Re: debugging frequent kernel panics on 8.2-RELEASE
On Wed, Aug 17, 2011 at 11:21:42PM +0300, Andriy Gapon wrote:
[skip]

> But I also would like to use this opportunity to discuss how we can
> make it easier to debug such issue as this. I think that this problem
> demonstrates that when we treat certain junk in kernel address value
> as a userland address value, we throw additional heaps of irrelevant
> stuff on top of an actual problem. One solution could be to use a
> special flag that would mark all actual attempts to access userland
> address (e.g. setting the flag on entrance to copyin and clearing it
> upon return), so that in the page fault handler we could distinguish
> actual faults on userland addresses from faults on garbage kernel
> addresses. I am sure that there could be other clever techniques to
> catch such garbage addresses early.

We already have such a mechanism: kernel code that is aware of the usermode
page access sets pcb_onfault. See the end of the trap_pfault() handler.
In fact, we can catch it earlier, before even calling vm_fault().

BTW, I think this is esp. useful in combination with the support
for SMEP in recent Intel CPUs.

commit 2e1b36fa93f9499e37acf04a66ff0646d4f13536
Author: Konstantin Belousov
Date:   Thu Aug 18 00:08:50 2011 +0300

    Assert that the exiting process does not return to usermode.
    On x86, do not call vm_fault() when the kernel is not prepared
    to handle unsuccessful page fault.

diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index 4e5f8b8..55e1e5a 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -674,6 +674,19 @@ trap_pfault(frame, usermode)
 			goto nogo;
 
 		map = &vm->vm_map;
+
+		/*
+		 * When accessing a usermode address, kernel must be
+		 * ready to accept the page fault, and provide a
+		 * handling routine.  Since accessing the address
+		 * without the handler is a bug, do not try to handle
+		 * it normally, and panic immediately.
+		 */
+		if (!usermode && (td->td_intr_nesting_level != 0 ||
+		    PCPU_GET(curpcb)->pcb_onfault == NULL)) {
+			trap_fatal(frame, eva);
+			return (-1);
+		}
 	}
 
 	/*
diff --git a/sys/i386/i386/trap.c b/sys/i386/i386/trap.c
index 5a8016c..e6d2b5a 100644
--- a/sys/i386/i386/trap.c
+++ b/sys/i386/i386/trap.c
@@ -831,6 +831,11 @@ trap_pfault(frame, usermode, eva)
 			goto nogo;
 
 		map = &vm->vm_map;
+		if (!usermode && (td->td_intr_nesting_level != 0 ||
+		    PCPU_GET(curpcb)->pcb_onfault == NULL)) {
+			trap_fatal(frame, eva);
+			return (-1);
+		}
 	}
 
 	/*
diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c
index 3527ed1..a69b7b8 100644
--- a/sys/kern/subr_trap.c
+++ b/sys/kern/subr_trap.c
@@ -99,6 +99,8 @@ userret(struct thread *td, struct trapframe *frame)
 
 	CTR3(KTR_SYSC, "userret: thread %p (pid %d, %s)", td, p->p_pid,
 	    td->td_name);
+	KASSERT((p->p_flag & P_WEXIT) == 0,
+	    ("Exiting process returns to usermode"));
 #if 0
 #ifdef DIAGNOSTIC
 	/* Check that we called signotify() enough. */
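The condition added to trap_pfault() in the patch above can be restated as a standalone predicate. This is a user-space sketch only, not kernel code: struct fault_ctx and kernel_fault_is_fatal() are hypothetical names, and the fields merely stand in for td_intr_nesting_level and pcb_onfault.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state that trap_pfault() consults. */
struct fault_ctx {
	bool usermode;			/* fault was taken while running in user mode */
	int intr_nesting_level;		/* models td->td_intr_nesting_level */
	const void *onfault;		/* models PCPU_GET(curpcb)->pcb_onfault */
};

/*
 * Returns true when a fault on a user-range address should be treated
 * as fatal at once rather than handed to vm_fault(): the kernel touched
 * the address from interrupt context, or without first registering a
 * recovery handler.
 */
static bool
kernel_fault_is_fatal(const struct fault_ctx *ctx)
{
	if (ctx->usermode)
		return (false);
	return (ctx->intr_nesting_level != 0 || ctx->onfault == NULL);
}
```

With a check of this shape, a stray NULL dereference in kernel code panics at the first fault instead of recursing through vm_fault().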
Re: debugging frequent kernel panics on 8.2-RELEASE
Thanks to the debug that Steven provided and to the help that I received from
Kostik, I think that now I understand the basic mechanics of this panic, but,
unfortunately, not the details of its root cause.

It seems like everything starts with some kind of a race between terminating
processes in a jail and termination of the jail itself. This is where the
details are very thin so far. What we see is that a process (http) is in
exit(2) syscall, in exit1() function actually, and past the place where P_WEXIT
flag is set and even past the place where p_limit is freed and reset to NULL.
At that place the thread calls prison_proc_free(), which calls prison_deref().
Then, we see that in prison_deref() the thread gets a page fault because of
what seems like a NULL pointer dereference. That's just the start of the
problem and its root cause.

Then, trap_pfault() gets invoked and, because addresses close to NULL look like
userspace addresses, vm_fault/vm_fault_hold gets called, which in its turn goes
on to call vm_map_growstack. The first thing that vm_map_growstack does is call
lim_cur(), but because p_limit is already NULL, that call results in a NULL
pointer dereference and a page fault. Go to the beginning of this paragraph.

So we get this recursion of sorts, which only ends when the stack is exhausted
and the CPU generates a double fault.

So, of course, Steven is interested in finding and fixing the root cause.
I hope we will get to that with some help from the "prison guards" :-)

But I also would like to use this opportunity to discuss how we can make it
easier to debug issues such as this. I think that this problem demonstrates
that when we treat certain junk in a kernel address value as a userland address
value, we throw additional heaps of irrelevant stuff on top of an actual
problem. One solution could be to use a special flag that would mark all actual
attempts to access a userland address (e.g. setting the flag on entrance to
copyin and clearing it upon return), so that in the page fault handler we could
distinguish actual faults on userland addresses from faults on garbage kernel
addresses. I am sure that there could be other clever techniques to catch such
garbage addresses early.

-- 
Andriy Gapon
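The recursion described in this message can be modelled in user space. The sketch below is an assumption-laden toy, not kernel code: the model_* names and MODEL_MAX_DEPTH (a stand-in for the finite kernel stack) are hypothetical, and the only behaviour reproduced is "fault handler dereferences a NULL p_limit and faults again".

```c
#include <stddef.h>

#define MODEL_MAX_DEPTH 16	/* hypothetical stand-in for the kernel stack size */

struct model_proc {
	const long *p_limit;	/* NULL once the exiting process has freed its limits */
};

enum model_result { MODEL_HANDLED, MODEL_DOUBLE_FAULT };

static enum model_result model_trap_pfault(const struct model_proc *p,
    int depth, int *deepest);

/* Models vm_fault() -> vm_map_growstack() -> lim_cur(). */
static enum model_result
model_vm_fault(const struct model_proc *p, int depth, int *deepest)
{
	if (p->p_limit == NULL) {
		/* lim_cur() would dereference NULL here: a nested page fault. */
		return (model_trap_pfault(p, depth + 1, deepest));
	}
	return (MODEL_HANDLED);
}

/* Models trap_pfault() treating a near-NULL address as a userland address. */
static enum model_result
model_trap_pfault(const struct model_proc *p, int depth, int *deepest)
{
	if (depth > *deepest)
		*deepest = depth;
	if (depth >= MODEL_MAX_DEPTH)
		return (MODEL_DOUBLE_FAULT);	/* stack exhausted */
	return (model_vm_fault(p, depth, deepest));
}
```

Calling model_trap_pfault() with p_limit set to NULL walks all the way down to MODEL_MAX_DEPTH, which is why enlarging the kernel stack only delays the double fault.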
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" To: "Steven Hartland" Cc: Sent: Wednesday, August 17, 2011 1:56 PM Subject: Re: debugging frequent kernel panics on 8.2-RELEASE on 17/08/2011 15:15 Steven Hartland said the following: define allpcpu set $i = 0 while ($i <= mp_maxid) p *cpuid_to_pcpu[$i] set $i = $i + 1 end end allpcpu Here's the output. [snip] $3 = {pc_curthread = 0xff06b7f9c000, pc_idlethread = 0xff0012d85460, pc_fpcurthread = 0x0, pc_deadthread = 0x0, pc_curpcb = 0xff8d8f35ad00, pc_switchtime = 564139963042291, pc_switchticks = 247796550, pc_cpuid = 2, pc_cpumask = 4, pc_other_cpus = 16777211, pc_allcpu = {sle_next = 0x808af680}, pc_spinlocks = 0x0, pc_cnt = {v_swtch = 1005391948, v_trap = 95927887, v_syscall = 2033274537, v_intr = 137253, v_soft = 151981308, v_vm_faults = 14199910, v_cow_faults = 1468132, v_cow_optim = 533, v_zfod = 11032593, v_ozfod = 0, v_swapin = 0, v_swapout = 0, v_swappgsin = 0, v_swappgsout = 0, v_vnodein = 17238, v_vnodeout = 48, v_vnodepgsin = 17238, v_vnodepgsout = 378, v_intrans = 6753, v_reactivated = 0, v_pdwakeups = 0, v_pdpages = 0, v_tcached = 0, v_dfree = 0, v_pfree = 0, v_tfree = 15435380, v_page_size = 0, v_page_count = 0, v_free_reserved = 0, v_free_target = 0, v_free_min = 0, v_free_count = 0, v_wire_count = 0, v_active_count = 0, v_inactive_target = 0, v_inactive_count = 0, v_cache_count = 0, v_cache_min = 0, v_cache_max = 0, v_pageout_free_min = 0, v_interrupt_free_min = 0, v_free_severe = 0, v_forks = 24041, v_vforks = 16857, v_rforks = 0, v_kthreads = 0, v_forkpages = 6281292, v_vforkpages = 3606842, v_rforkpages = 0, v_kthreadpages = 0}, pc_cp_time = {8629094, 693, 594838, 24425, 23707811}, pc_device = 0xff0012da2500, pc_netisr = 0x0, pc_rm_queue = {rmq_next = 0x808afa50, rmq_prev = 0x808afa50}, pc_dynamic = 18446743526093326592, pc_monitorbuf = '\0' , pc_prvspace = 0x808af900, pc_curpmap = 0x8083ea50, pc_tssp = 0x808ae7d0, pc_commontssp = 0x808ae7d0, pc_rsp0 = -491518579456, pc_scratch_rsp = 140737488347240, 
pc_apic_id = 2, pc_acpi_id = 2, pc_fs32p = 0x808ad600, pc_gs32p = 0x808ad608, pc_ldt = 0x808ad648, pc_tss = 0x808ad638, pc_cmci_mask = 8} [snip] Thank you. A few more questions: 1. more kgdb info for the core: p *(cpuid_to_pcpu[2]->pc_curthread) p *(cpuid_to_pcpu[2]->pc_curthread->td_proc) p *(cpuid_to_pcpu[2]->pc_curthread->td_proc->p_limit) (kgdb) p *(cpuid_to_pcpu[2]->pc_curthread) $1 = {td_lock = 0x8084a440, td_proc = 0xff070b5a48c0, td_plist = {tqe_next = 0x0, tqe_prev = 0xff070b5a48d0}, td_runq = {tqe_next = 0x0, tqe_prev = 0x8084a688}, td_slpq = {tqe_next = 0x0, tqe_prev = 0xff0296460900}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xff8d8fb5c8b0}, td_cpuset = 0xff0012d65dc8, td_sel = 0xff0a1b76c700, td_sleepqueue = 0xff0296460900, td_turnstile = 0xff05f31d8000, td_umtxq = 0xff05513d9780, td_tid = 102057, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xff06b7f9c0a0}, sq_proc = 0xff070b5a48c0, sq_flags = 1}, td_flags = 6, td_inhibitors = 0, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, td_lastcpu = 2 '\002', td_oncpu = 2 '\002', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 998, td_rw_rlocks = 0, td_lk_slocks = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 1, td_ucred = 0xff0551cf9900, td_estcpu = 0, td_slptick = 0, td_blktick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 2068, ru_ixrss = 5280, ru_idrss = 19296, ru_isrss = 6144, ru_minflt = 5015, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 241, ru_msgrcv = 2076, ru_nsignals = 1, ru_nvcsw = 2264, ru_nivcsw = 159}, td_incruntime = 4257692, td_runtime = 487523210, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 4, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {16384, 0, 0, 0}}, 
td_generation = 2423, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "httpd", '\0' , td_fpop = 0x0, td_dbgflags = 0, td_dbgksi = { ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0, ksi_flags = 0, k
Re: debugging frequent kernel panics on 8.2-RELEASE
on 17/08/2011 15:15 Steven Hartland said the following: >> define allpcpu >> set $i = 0 >> while ($i <= mp_maxid) >> p *cpuid_to_pcpu[$i] >> set $i = $i + 1 >> end >> end >> allpcpu > > Here's the output. [snip] > $3 = {pc_curthread = 0xff06b7f9c000, pc_idlethread = 0xff0012d85460, > pc_fpcurthread = 0x0, pc_deadthread = 0x0, pc_curpcb = 0xff8d8f35ad00, > pc_switchtime = 564139963042291, pc_switchticks = 247796550, pc_cpuid = 2, > pc_cpumask = 4, pc_other_cpus = 16777211, pc_allcpu = {sle_next = > 0x808af680}, pc_spinlocks = 0x0, pc_cnt = {v_swtch = 1005391948, > v_trap = > 95927887, v_syscall = 2033274537, v_intr = 137253, v_soft = 151981308, >v_vm_faults = 14199910, v_cow_faults = 1468132, v_cow_optim = 533, v_zfod = > 11032593, v_ozfod = 0, v_swapin = 0, v_swapout = 0, v_swappgsin = 0, > v_swappgsout > = 0, v_vnodein = 17238, v_vnodeout = 48, v_vnodepgsin = 17238, >v_vnodepgsout = 378, v_intrans = 6753, v_reactivated = 0, v_pdwakeups = 0, > v_pdpages = 0, v_tcached = 0, v_dfree = 0, v_pfree = 0, v_tfree = 15435380, > v_page_size = 0, v_page_count = 0, v_free_reserved = 0, >v_free_target = 0, v_free_min = 0, v_free_count = 0, v_wire_count = 0, > v_active_count = 0, v_inactive_target = 0, v_inactive_count = 0, > v_cache_count = > 0, v_cache_min = 0, v_cache_max = 0, v_pageout_free_min = 0, >v_interrupt_free_min = 0, v_free_severe = 0, v_forks = 24041, v_vforks = > 16857, > v_rforks = 0, v_kthreads = 0, v_forkpages = 6281292, v_vforkpages = 3606842, > v_rforkpages = 0, v_kthreadpages = 0}, pc_cp_time = {8629094, >693, 594838, 24425, 23707811}, pc_device = 0xff0012da2500, pc_netisr = > 0x0, > pc_rm_queue = {rmq_next = 0x808afa50, rmq_prev = 0x808afa50}, > pc_dynamic = 18446743526093326592, > pc_monitorbuf = '\0' , pc_prvspace = 0x808af900, > pc_curpmap = 0x8083ea50, pc_tssp = 0x808ae7d0, pc_commontssp = > 0x808ae7d0, pc_rsp0 = -491518579456, > pc_scratch_rsp = 140737488347240, pc_apic_id = 2, pc_acpi_id = 2, pc_fs32p = > 0x808ad600, pc_gs32p = 0x808ad608, pc_ldt = > 
0x808ad648, pc_tss = 0x808ad638, pc_cmci_mask = 8}

[snip]

Thank you.
A few more questions:
1. more kgdb info for the core:
p *(cpuid_to_pcpu[2]->pc_curthread)
p *(cpuid_to_pcpu[2]->pc_curthread->td_proc)
p *(cpuid_to_pcpu[2]->pc_curthread->td_proc->p_limit)
2. do you have any additional patches in your source tree besides those
debugging patches that I provided to you?
3. do you have any thirdparty/out-of-tree kernel modules?
4. could you please send me your kernel config?

-- 
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
----- Original Message -----
From: "Andriy Gapon"
To: "Steven Hartland"
Cc:
Sent: Wednesday, August 17, 2011 12:12 PM
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

> on 16/08/2011 23:43 Steven Hartland said the following:
>> ----- Original Message -----
>> From: "Andriy Gapon"
>> To: "Steven Hartland"
>> Cc:
>> Sent: Tuesday, August 16, 2011 9:30 PM
>> Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
>>
>>> on 15/08/2011 17:56 Steven Hartland said the following:
>>>> (kgdb) x/512a 0xff8d8f357210
>>> [snip]
>>>
>>> Can you please also provide the following for this core?
>>> list *vm_map_growstack+93
>>> list *lim_cur+17
>>> list *lim_rlimit+18
>>>
>>> Also, it would be interesting to get panic output with DDB option.
>>
>> Here's the info:-
>>
>> (kgdb) list *vm_map_growstack+93
>> 0x80543ffd is in vm_map_growstack (/usr/src/sys/vm/vm_map.c:3305).
>> 3300		struct uidinfo *uip;
>> 3301
>> 3302	Retry:
>> 3303		PROC_LOCK(p);
>> 3304		stacklim = lim_cur(p, RLIMIT_STACK);
>> 3305		vmemlim = lim_cur(p, RLIMIT_VMEM);
>> 3306		PROC_UNLOCK(p);
>> 3307
>> 3308		vm_map_lock_read(map);
>> 3309
>> (kgdb) list *lim_cur+17
>> 0x80384681 is in lim_cur (/usr/src/sys/kern/kern_resource.c:1150).
>> 1145	rlim_t
>> 1146	lim_cur(struct proc *p, int which)
>> 1147	{
>> 1148		struct rlimit rl;
>> 1149
>> 1150		lim_rlimit(p, which, &rl);
>> 1151		return (rl.rlim_cur);
>> 1152	}
>> 1153
>> 1154	/*
>> (kgdb) list *lim_rlimit+18
>> 0x80384632 is in lim_rlimit (/usr/src/sys/kern/kern_resource.c:1165).
>> 1160	{
>> 1161
>> 1162		PROC_LOCK_ASSERT(p, MA_OWNED);
>> 1163		KASSERT(which >= 0 && which < RLIM_NLIMITS,
>> 1164		    ("request for invalid resource limit"));
>> 1165		*rlp = p->p_limit->pl_rlimit[which];
>> 1166		if (p->p_sysent->sv_fixlimit != NULL)
>> 1167			p->p_sysent->sv_fixlimit(rlp, which);
>> 1168	}
>> 1169
>>
>> I've yet to have the machine with DDB + expanded stack panic.
>>
>> I plan to leave it a day or so more then try a reboot to see if that
>> triggers it. If not I'll drop the stack back down to 4 and see if that
>> enables us to get another panic.
>
> OK, thank you for continuing to debug this!

No thank you for the help :) Another request: could you please execute the following commands in kgdb on the above core file? define allpcpu set $i = 0 while ($i <= mp_maxid) p *cpuid_to_pcpu[$i] set $i = $i + 1 end end allpcpu Here's the output. $1 = {pc_curthread = 0xff0012d708c0, pc_idlethread = 0xff0012d838c0, pc_fpcurthread = 0x0, pc_deadthread = 0x0, pc_curpcb = 0xff8000149d00, pc_switchtime = 564139965450231, pc_switchticks = 247796551, pc_cpuid = 0, pc_cpumask = 1, pc_other_cpus = 16777214, pc_allcpu = {sle_next = 0x0}, pc_spinlocks = 0x0, pc_cnt = {v_swtch = 1246344506, v_trap = 121031682, v_syscall = 2590785278, v_intr = 866415, v_soft = 174249227, v_vm_faults = 24640099, v_cow_faults = 2606934, v_cow_optim = 678, v_zfod = 19177479, v_ozfod = 0, v_swapin = 0, v_swapout = 0, v_swappgsin = 0, v_swappgsout = 0, v_vnodein = 24007, v_vnodeout = 41, v_vnodepgsin = 24007, v_vnodepgsout = 322, v_intrans = 7300, v_reactivated = 0, v_pdwakeups = 0, v_pdpages = 0, v_tcached = 0, v_dfree = 0, v_pfree = 0, v_tfree = 25056637, v_page_size = 0, v_page_count = 0, v_free_reserved = 0, v_free_target = 0, v_free_min = 0, v_free_count = 0, v_wire_count = 0, v_active_count = 0, v_inactive_target = 0, v_inactive_count = 0, v_cache_count = 0, v_cache_min = 0, v_cache_max = 0, v_pageout_free_min = 0, v_interrupt_free_min = 0, v_free_severe = 0, v_forks = 35906, v_vforks = 21218, v_rforks = 0, v_kthreads = 20, v_forkpages = 9357854, v_vforkpages = 4445028, v_rforkpages = 0, v_kthreadpages = 0}, pc_cp_time = {9035196, 1438, 426481, 1091491, 22402335}, pc_device = 0xff0012da2700, pc_netisr = 0xff0012cfe500, pc_rm_queue = {rmq_next = 0x808af550, rmq_prev = 0x808af550}, pc_dynamic = 3737856, pc_monitorbuf = '\0' , pc_prvspace = 0x808af400, pc_curpmap = 0xff0012d74ef8, pc_tssp = 0x808ae700, pc_commontssp = 0x808ae700, pc_rsp0 = -549754462976, pc_scratch_rsp = 140737488348968, pc_apic_id = 0, pc_acpi_id = 1, pc_fs32p = 0x808ad530, pc_gs32p = 0x808ad538, pc_ldt = 0x808ad578, pc_tss = 
0x808ad568, pc_cmci_mask = 364} $2 = {pc_curthread = 0xff0012d85000, pc_idlethread = 0xff0012d85000, pc_fpcurthread = 0x0, pc_deadthread = 0x0, pc_curpcb = 0xff80001bcd00, pc_switchtime = 564139964769035, pc_switchticks = 247796551, pc_cpuid = 1, pc_cpumask = 2, pc_other_cpus = 16777213, pc_allcpu = {sle_next = 0x808af400}, pc_spinlocks = 0x0, pc_cnt = {v_swtch = 457697994, v_trap = 61700571, v_syscall = 670428238, v_intr = 298981, v_soft = 58852682, v_vm_faults = 7228810, v_cow_faults = 442573, v_cow_optim = 116, v_zfod = 6
Re: debugging frequent kernel panics on 8.2-RELEASE
on 17/08/2011 14:12 Andriy Gapon said the following:
> A little bit later I will send you another patch that, I hope, will produce
> better diagnostics for this crash (without DDB in kernel).

The patch:
Index: sys/amd64/amd64/trap.c
===================================================================
--- sys/amd64/amd64/trap.c	(revision 224782)
+++ sys/amd64/amd64/trap.c	(working copy)
@@ -198,6 +198,10 @@
 	PCPU_INC(cnt.v_trap);
 	type = frame->tf_trapno;
 
+	if ((uintptr_t)frame->tf_rip >= (uintptr_t)&lim_rlimit
+	    && (uintptr_t)frame->tf_rip < (uintptr_t)&lim_rlimit + 40)
+		panic("trap in lim_rlimit");
+
 #ifdef SMP
 	/* Handler for NMI IPIs used for stopping CPUs. */
 	if (type == T_NMI) {

-- 
Andriy Gapon
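The tf_rip comparison in the patch above is a plain "is this address inside that function" test. A minimal user-space sketch follows; addr_in_function() is a hypothetical helper name, and the 40-byte length is taken from the patch as a rough window that covers the faulting instruction at lim_rlimit+18, not a measured function size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when addr lies in the half-open range
 * [func_start, func_start + func_len).
 */
static bool
addr_in_function(uintptr_t addr, uintptr_t func_start, uintptr_t func_len)
{
	return (addr >= func_start && addr - func_start < func_len);
}
```

Writing the upper bound as a subtraction keeps the check correct even when func_start + func_len would wrap around the address space.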
Re: debugging frequent kernel panics on 8.2-RELEASE
on 16/08/2011 23:43 Steven Hartland said the following:
>
> ----- Original Message -----
> From: "Andriy Gapon"
> To: "Steven Hartland"
> Cc:
> Sent: Tuesday, August 16, 2011 9:30 PM
> Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
>
>
>> on 15/08/2011 17:56 Steven Hartland said the following:
>>> (kgdb) x/512a 0xff8d8f357210
>> [snip]
>>
>> Can you please also provide the following for this core?
>> list *vm_map_growstack+93
>> list *lim_cur+17
>> list *lim_rlimit+18
>>
>> Also, it would be interesting to get panic output with DDB option.
>
> Here's the info:-
>
> (kgdb) list *vm_map_growstack+93
> 0x80543ffd is in vm_map_growstack (/usr/src/sys/vm/vm_map.c:3305).
> 3300		struct uidinfo *uip;
> 3301
> 3302	Retry:
> 3303		PROC_LOCK(p);
> 3304		stacklim = lim_cur(p, RLIMIT_STACK);
> 3305		vmemlim = lim_cur(p, RLIMIT_VMEM);
> 3306		PROC_UNLOCK(p);
> 3307
> 3308		vm_map_lock_read(map);
> 3309
> (kgdb) list *lim_cur+17
> 0x80384681 is in lim_cur (/usr/src/sys/kern/kern_resource.c:1150).
> 1145	rlim_t
> 1146	lim_cur(struct proc *p, int which)
> 1147	{
> 1148		struct rlimit rl;
> 1149
> 1150		lim_rlimit(p, which, &rl);
> 1151		return (rl.rlim_cur);
> 1152	}
> 1153
> 1154	/*
> (kgdb) list *lim_rlimit+18
> 0x80384632 is in lim_rlimit (/usr/src/sys/kern/kern_resource.c:1165).
> 1160	{
> 1161
> 1162		PROC_LOCK_ASSERT(p, MA_OWNED);
> 1163		KASSERT(which >= 0 && which < RLIM_NLIMITS,
> 1164		    ("request for invalid resource limit"));
> 1165		*rlp = p->p_limit->pl_rlimit[which];
> 1166		if (p->p_sysent->sv_fixlimit != NULL)
> 1167			p->p_sysent->sv_fixlimit(rlp, which);
> 1168	}
> 1169
>
> I've yet to have the machine with DDB + expanded stack panic.
>
> I plan to leave it a day or so more then try a reboot to see if that
> triggers it. If not I'll drop the stack back down to 4 and see if that
> enables us to get another panic.

OK, thank you for continuing to debug this!

Another request: could you please execute the following commands in kgdb on the
above core file?

define allpcpu
set $i = 0
while ($i <= mp_maxid)
p *cpuid_to_pcpu[$i]
set $i = $i + 1
end
end
allpcpu

A little bit later I will send you another patch that, I hope, will produce
better diagnostics for this crash (without DDB in kernel).

-- 
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
----- Original Message -----
From: "Andriy Gapon"
To: "Steven Hartland"
Cc:
Sent: Tuesday, August 16, 2011 9:30 PM
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

> on 15/08/2011 17:56 Steven Hartland said the following:
>> (kgdb) x/512a 0xff8d8f357210
> [snip]
>
> Can you please also provide the following for this core?
> list *vm_map_growstack+93
> list *lim_cur+17
> list *lim_rlimit+18
>
> Also, it would be interesting to get panic output with DDB option.

Here's the info:-

(kgdb) list *vm_map_growstack+93
0x80543ffd is in vm_map_growstack (/usr/src/sys/vm/vm_map.c:3305).
3300		struct uidinfo *uip;
3301
3302	Retry:
3303		PROC_LOCK(p);
3304		stacklim = lim_cur(p, RLIMIT_STACK);
3305		vmemlim = lim_cur(p, RLIMIT_VMEM);
3306		PROC_UNLOCK(p);
3307
3308		vm_map_lock_read(map);
3309
(kgdb) list *lim_cur+17
0x80384681 is in lim_cur (/usr/src/sys/kern/kern_resource.c:1150).
1145	rlim_t
1146	lim_cur(struct proc *p, int which)
1147	{
1148		struct rlimit rl;
1149
1150		lim_rlimit(p, which, &rl);
1151		return (rl.rlim_cur);
1152	}
1153
1154	/*
(kgdb) list *lim_rlimit+18
0x80384632 is in lim_rlimit (/usr/src/sys/kern/kern_resource.c:1165).
1160	{
1161
1162		PROC_LOCK_ASSERT(p, MA_OWNED);
1163		KASSERT(which >= 0 && which < RLIM_NLIMITS,
1164		    ("request for invalid resource limit"));
1165		*rlp = p->p_limit->pl_rlimit[which];
1166		if (p->p_sysent->sv_fixlimit != NULL)
1167			p->p_sysent->sv_fixlimit(rlp, which);
1168	}
1169

I've yet to have the machine with DDB + expanded stack panic.

I plan to leave it a day or so more then try a reboot to see if that
triggers it. If not I'll drop the stack back down to 4 and see if that
enables us to get another panic.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 15/08/2011 17:56 Steven Hartland said the following:
> (kgdb) x/512a 0xff8d8f357210
[snip]

Can you please also provide the following for this core?
list *vm_map_growstack+93
list *lim_cur+17
list *lim_rlimit+18

Also, it would be interesting to get panic output with DDB option.

-- 
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
----- Original Message -----
From: "Andriy Gapon"
To: "Steven Hartland"
Cc:
Sent: Monday, August 15, 2011 4:36 PM
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

> on 15/08/2011 17:56 Steven Hartland said the following:
>> ----- Original Message -----
>> From: "Andriy Gapon"
>> To: "Steven Hartland"
>> Cc:
>> Sent: Monday, August 15, 2011 2:20 PM
>> Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
>>
>>> on 15/08/2011 15:51 Steven Hartland said the following:
>>>> ----- Original Message ----- From: "Andriy Gapon"
>>>>
>>>>> on 15/08/2011 13:34 Steven Hartland said the following:
>>>>>> (kgdb) list *0x8053b691
>>>>>> 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>>>>>> 234		/*
>>>>>> 235		 * Find the backing store object and offset into it to begin the
>>>>>> 236		 * search.
>>>>>> 237		 */
>>>>>> 238		fs.map = map;
>>>>>> 239		result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>>>>>> 240		    &fs.first_object, &fs.first_pindex, &prot, &wired);
>>>>>> 241		if (result != KERN_SUCCESS) {
>>>>>> 242			if (result != KERN_PROTECTION_FAILURE ||
>>>>>> 243			    (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>>>>>
>>>>> Interesting... thanks!
> [snip]
>> (kgdb) x/512a 0xff8d8f357210
>
> This is not conclusive, but that stack looks like the following recursive
> chain:
> vm_fault -> {vm_map_lookup, vm_map_growstack} -> trap -> trap_pfault -> vm_fault
> So I suspect that increasing kernel stack size won't help here much.
> Where does this chain come from?
> I have no answer at the moment, maybe other developers could help here.
> I suspect that we shouldn't be getting that trap in vm_map_growstack or should
> handle it in a different way.

Just in case it's relevant, I've checked other crashes and all rip entries
point to: vm_fault (/usr/src/sys/vm/vm_fault.c:239).

A more typical layout, from a selection of machines, is:-

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8053b061
rsp = 0xff86ccf8ffb0
rbp = 0xff86ccf90210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0x803bb28e at kdb_backtrace+0x5e
#1 0x80389187 at panic+0x187
#2 0x8057fc86 at dblfault_handler+0x96
#3 0x805689dd at Xdblfault+0xad
Uptime: 2d21h25m4s
Physical memory: 24555 MB
Dumping 4184 MB:...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0x8053b061
rsp = 0xff86cc742fb0
rbp = 0xff86cc743210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0x803bb28e at kdb_backtrace+0x5e
#1 0x80389187 at panic+0x187
#2 0x8057fc86 at dblfault_handler+0x96
#3 0x805689dd at Xdblfault+0xad
Uptime: 2d4h30m58s
Physical memory: 24555 MB
Dumping 5088 MB:...

Fatal double fault
rip = 0x8053b061
rsp = 0xff86caeabfb0
rbp = 0xff86caeac210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0x803bb28e at kdb_backtrace+0x5e
#1 0x80389187 at panic+0x187
#2 0x8057fc86 at dblfault_handler+0x96
#3 0x805689dd at Xdblfault+0xad
Uptime: 3d1h56m45s
Physical memory: 24555 MB
Dumping 4690 MB:...

Fatal double fault
rip = 0x8053b061
rsp = 0xff86cb1c7fb0
rbp = 0xff86cb1c8210
cpuid = 4; apic id = 04
panic: double fault
cpuid = 4
KDB: stack backtrace:
#0 0x803bb28e at kdb_backtrace+0x5e
#1 0x80389187 at panic+0x187
#2 0x8057fc86 at dblfault_handler+0x96
#3 0x805689dd at Xdblfault+0xad
Uptime: 1d13h41m19s
Physical memory: 24555 MB
Dumping 3626 MB:...

And in case any of the changes to loader.conf or sysctl.conf are relevant, here
they are:-

[loader.conf]
zfs_load="YES"
vfs.root.mountfrom="zfs:tank/root"
# fix swap zone exhausted, increase kern.maxswzone
kern.maxswzone=67108864
# Reduce the minimum arc level we want our apps to have the memory
vfs.zfs.arc_min="512M"
[/loader.conf]

[sysctl.conf]
vfs.read_max=32
net.inet.tcp.inflight.enable=0
net.inet.tcp.sendspace=65536
kern.ipc.maxsockbuf=524288
kern.maxfiles=5
kern.ipc.nmbclusters=51200
[/sysctl.conf]

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 15/08/2011 17:56 Steven Hartland said the following:
>
> ----- Original Message -----
> From: "Andriy Gapon"
> To: "Steven Hartland"
> Cc:
> Sent: Monday, August 15, 2011 2:20 PM
> Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
>
>
>> on 15/08/2011 15:51 Steven Hartland said the following:
>>> ----- Original Message ----- From: "Andriy Gapon"
>>>
>>>
>>>> on 15/08/2011 13:34 Steven Hartland said the following:
>>>>> (kgdb) list *0x8053b691
>>>>> 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>>>>> 234		/*
>>>>> 235		 * Find the backing store object and offset into it to begin the
>>>>> 236		 * search.
>>>>> 237		 */
>>>>> 238		fs.map = map;
>>>>> 239		result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>>>>> 240		    &fs.first_object, &fs.first_pindex, &prot, &wired);
>>>>> 241		if (result != KERN_SUCCESS) {
>>>>> 242			if (result != KERN_PROTECTION_FAILURE ||
>>>>> 243			    (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>>>>>
>>>>
>>>> Interesting... thanks!
[snip]
> (kgdb) x/512a 0xff8d8f357210

This is not conclusive, but that stack looks like the following recursive chain:
vm_fault -> {vm_map_lookup, vm_map_growstack} -> trap -> trap_pfault -> vm_fault
So I suspect that increasing kernel stack size won't help here much.
Where does this chain come from?
I have no answer at the moment, maybe other developers could help here.
I suspect that we shouldn't be getting that trap in vm_map_growstack or should
handle it in a different way.

> 0xff8d8f357210: 0xff8d8f357280 0x805807d3
> 0xff8d8f357220: 0x0 0xff8d8f357370
> 0xff8d8f357230: 0xff06b7f9c000 0x30
> 0xff8d8f357240: 0x1 0x0
> 0xff8d8f357250: 0x0 0x9
> 0xff8d8f357260: 0xc 0xff8d8f357370
> 0xff8d8f357270: 0xff06b7f9c000 0x0
> 0xff8d8f357280: 0xff8d8f357360 0x80580e0f
> 0xff8d8f357290: 0x0 0x0
> 0xff8d8f3572a0: 0x80074e49e 0x2
> 0xff8d8f3572b0: 0x80071cba0 0x80071cdc0
> 0xff8d8f3572c0: 0x80071c9a0 0x0
> 0xff8d8f3572d0: 0x0 0x0
> 0xff8d8f3572e0: 0x0 0x0
> 0xff8d8f3572f0: 0x0 0x0
> 0xff8d8f357300: 0x80074e49e 0x1
> 0xff8d8f357310: 0x80071cba0 0x80071cdc0
> 0xff8d8f357320: 0x80071c9a0 0x0
> 0xff8d8f357330: 0x0 0x4
> 0xff8d8f357340: 0xff070b5a48c0 0xff06b7f9c000
> 0xff8d8f357350: 0x0 0x8083e920
> 0xff8d8f357360: 0xff8d8f357430 0x80568f04
> 0xff8d8f357370: 0xff070b5a48c0 0x3
> 0xff8d8f357380: 0xff8d8f357440 0x0
> 0xff8d8f357390: 0xff8d8f357440 0x30
> 0xff8d8f3573a0: 0xff06b7f9c000 0x4
> 0xff8d8f3573b0: 0xff8d8f357430 0x8083e920
> 0xff8d8f3573c0: 0xff06b7f9c000 0xff070b5a48c0
> 0xff8d8f3573d0: 0xff06b7f9c000 0x0
> 0xff8d8f3573e0: 0x8083e920 0x1b0013000c
> 0xff8d8f3573f0: 0x30 0x3b003b0001
> 0xff8d8f357400: 0x0 0x80384632
> 0xff8d8f357410: 0x20 0x10206
> 0xff8d8f357420: 0xff8d8f357430 0x28
> 0xff8d8f357430: 0xff8d8f357450 0x80384681
> 0xff8d8f357440: 0x4 0xff070b5a48c0
> 0xff8d8f357450: 0xff8d8f357500 0x80543ffd
> 0xff8d8f357460: 0xff8d8f357470 0xff8d8f3576d8
> 0xff8d8f357470: 0xff8d8f357500 0x80544ef8
> [trim]

--
Andriy Gapon
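One way to read a call chain like the one above off a raw x/512a dump is to follow the saved frame pointers. The following is only a minimal Python sketch of that idea, assuming the usual amd64 frame layout (saved %rbp at the frame address, return address eight bytes above it). The helper names parse_dump and walk_frames are made up for illustration and are not part of kgdb; the sample data is a handful of lines copied from the dump in this thread, starting at the rbp value from the panic message.

```python
import re

# A few lines copied from the x/512a output in this thread (addresses as archived).
DUMP = """\
0xff8d8f357210: 0xff8d8f357280 0x805807d3
0xff8d8f357280: 0xff8d8f357360 0x80580e0f
0xff8d8f357360: 0xff8d8f357430 0x80568f04
0xff8d8f357430: 0xff8d8f357450 0x80384681
0xff8d8f357450: 0xff8d8f357500 0x80543ffd
0xff8d8f357500: 0xff8d8f357770 0x8053c723
"""

LINE_RE = re.compile(r"\s*(0x[0-9a-f]+):\s+(0x[0-9a-f]+)\s+(0x[0-9a-f]+)\s*$")


def parse_dump(text):
    """Map each 8-byte slot address to the word stored there (hypothetical helper)."""
    mem = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank or garbled lines
        addr, word0, word1 = (int(tok, 16) for tok in match.groups())
        mem[addr] = word0
        mem[addr + 8] = word1
    return mem


def walk_frames(mem, rbp, limit=64):
    """Follow saved-rbp links, collecting (frame address, return address) pairs."""
    frames = []
    while rbp in mem and rbp + 8 in mem and len(frames) < limit:
        frames.append((rbp, mem[rbp + 8]))
        next_rbp = mem[rbp]
        if next_rbp <= rbp:  # the chain must move towards higher addresses
            break
        rbp = next_rbp
    return frames


frames = walk_frames(parse_dump(DUMP), 0xff8d8f357210)
for frame_addr, ret_addr in frames:
    print(f"frame at {frame_addr:#x}: return address {ret_addr:#x}")
```

Each return address found this way can then be resolved with 'list *ADDR' in kgdb, as was done for 0x8053b691 earlier in the thread. Frames that cross a trap boundary may not follow the plain layout assumed here, so the result is a hint rather than a proof.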
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"
To: "Steven Hartland"
Cc:
Sent: Monday, August 15, 2011 2:20 PM
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

> on 15/08/2011 15:51 Steven Hartland said the following:
>> - Original Message - From: "Andriy Gapon"
>>
>>> on 15/08/2011 13:34 Steven Hartland said the following:
>>>> (kgdb) list *0x8053b691
>>>> 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>>>> 234         /*
>>>> 235          * Find the backing store object and offset into it to begin the
>>>> 236          * search.
>>>> 237          */
>>>> 238         fs.map = map;
>>>> 239         result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>>>> 240             &fs.first_object, &fs.first_pindex, &prot, &wired);
>>>> 241         if (result != KERN_SUCCESS) {
>>>> 242                 if (result != KERN_PROTECTION_FAILURE ||
>>>> 243                     (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>>>
>>> Interesting... thanks!
>>> Can you please also additionally provide (lengthy) output of x/512a
>>> 0xff8d8f356fb0 ?
>>
>> Sorry, I'm not sure I follow you there?
>
> It seems that you got me correctly :)
>
>> Do you mean any of the following:-
>> (kgdb) x/512a
>> 0xff8d8f35b000: Cannot access memory at address 0xff8d8f35b000
>>
>> (kgdb) list *0xff8d8f356fb0
>> No source file for address 0xff8d8f356fb0.
>>
>> or:
>> (kgdb) x/512a 0xff8d8f356fb0
>> 0xff8d8f356fb0: Cannot access memory at address 0xff8d8f356fb0
>
> Can you please try this (the last command) with 0xff8d8f357210 instead?
(kgdb) x/512a 0xff8d8f357210 0xff8d8f357210: 0xff8d8f357280 0x805807d3 0xff8d8f357220: 0x0 0xff8d8f357370 0xff8d8f357230: 0xff06b7f9c000 0x30 0xff8d8f357240: 0x1 0x0 0xff8d8f357250: 0x0 0x9 0xff8d8f357260: 0xc 0xff8d8f357370 0xff8d8f357270: 0xff06b7f9c000 0x0 0xff8d8f357280: 0xff8d8f357360 0x80580e0f 0xff8d8f357290: 0x0 0x0 0xff8d8f3572a0: 0x80074e49e 0x2 0xff8d8f3572b0: 0x80071cba0 0x80071cdc0 0xff8d8f3572c0: 0x80071c9a0 0x0 0xff8d8f3572d0: 0x0 0x0 0xff8d8f3572e0: 0x0 0x0 0xff8d8f3572f0: 0x0 0x0 0xff8d8f357300: 0x80074e49e 0x1 0xff8d8f357310: 0x80071cba0 0x80071cdc0 0xff8d8f357320: 0x80071c9a0 0x0 0xff8d8f357330: 0x0 0x4 0xff8d8f357340: 0xff070b5a48c0 0xff06b7f9c000 0xff8d8f357350: 0x0 0x8083e920 0xff8d8f357360: 0xff8d8f357430 0x80568f04 0xff8d8f357370: 0xff070b5a48c0 0x3 0xff8d8f357380: 0xff8d8f357440 0x0 0xff8d8f357390: 0xff8d8f357440 0x30 0xff8d8f3573a0: 0xff06b7f9c000 0x4 0xff8d8f3573b0: 0xff8d8f357430 0x8083e920 0xff8d8f3573c0: 0xff06b7f9c000 0xff070b5a48c0 0xff8d8f3573d0: 0xff06b7f9c000 0x0 0xff8d8f3573e0: 0x8083e9200x1b0013000c 0xff8d8f3573f0: 0x300x3b003b0001 0xff8d8f357400: 0x0 0x80384632 0xff8d8f357410: 0x200x10206 0xff8d8f357420: 0xff8d8f357430 0x28 0xff8d8f357430: 0xff8d8f357450 0x80384681 0xff8d8f357440: 0x4 0xff070b5a48c0 0xff8d8f357450: 0xff8d8f357500 0x80543ffd 0xff8d8f357460: 0xff8d8f357470 0xff8d8f3576d8 0xff8d8f357470: 0xff8d8f357500 0x80544ef8 0xff8d8f357480: 0xff070b5a49b8 0x0 0xff8d8f357490: 0x8 0xff06b7f9c000 0xff8d8f3574a0: 0xff06b7f9c000 0xff8d8f3576d8 0xff8d8f3574b0: 0xff8d8f3576d0 0xff8d8f3576e8 0xff8d8f3574c0: 0x0 0xff8d8f3576e0 0xff8d8f3574d0: 0x10001 0x1 0xff8d8f3574e0: 0xff06b7f9c000 0x1 0xff8d8f3574f0: 0x0 0x8083e920 0xff8d8f357500: 0xff8d8f357770 0x8053c723 0xff8d8f357510: 0xff8d8f35773f 0xff8d8f357738 0xff8d8f357520: 0x80085e4f9 0x80085e4f8 0xff8d8f357530: 0xff06b7f9c000 0xff8d8f3576e0 0xff8d8f357540: 0xff8d8f3576e8 0xff8d8f3576d0 0xff8d8f357550: 0xff8d8f3576d8 0x80085e4f9 0xff8d8f357560: 0x80085e4f9 0x80085e4f9 0xff8d8f357570: 
0x80085e4f9 0x80085e4f9 0xff8d8f357580: 0x80085e4f9 0x80085e4f9 0xff8d8f357590: 0x80085e4f9 0x80085e4f9 0xff8d8f3575a0: 0x80085e4f9 0x734210 0xff8d8f3575b0: 0x101 0x80073ada0 0xff8d8f3575c0: 0x0 0x8083e920 0xff8d8f3575d0: 0x80073aec0 0x1 0xff8d8f3575e0: 0x6967614d 0x
Re: debugging frequent kernel panics on 8.2-RELEASE
on 15/08/2011 15:51 Steven Hartland said the following:
> - Original Message - From: "Andriy Gapon"
>
>> on 15/08/2011 13:34 Steven Hartland said the following:
>>> (kgdb) list *0x8053b691
>>> 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>>> 234         /*
>>> 235          * Find the backing store object and offset into it to begin the
>>> 236          * search.
>>> 237          */
>>> 238         fs.map = map;
>>> 239         result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>>> 240             &fs.first_object, &fs.first_pindex, &prot, &wired);
>>> 241         if (result != KERN_SUCCESS) {
>>> 242                 if (result != KERN_PROTECTION_FAILURE ||
>>> 243                     (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>>
>> Interesting... thanks!
>> Can you please also additionally provide (lengthy) output of x/512a
>> 0xff8d8f356fb0 ?
>
> Sorry, I'm not sure I follow you there?

It seems that you got me correctly :)

> Do you mean any of the following:-
> (kgdb) x/512a
> 0xff8d8f35b000: Cannot access memory at address 0xff8d8f35b000
>
> (kgdb) list *0xff8d8f356fb0
> No source file for address 0xff8d8f356fb0.
>
> or:
> (kgdb) x/512a 0xff8d8f356fb0
> 0xff8d8f356fb0: Cannot access memory at address 0xff8d8f356fb0

Can you please try this (the last command) with 0xff8d8f357210 instead?

--
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

> on 15/08/2011 13:34 Steven Hartland said the following:
>> (kgdb) list *0x8053b691
>> 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>> 234         /*
>> 235          * Find the backing store object and offset into it to begin the
>> 236          * search.
>> 237          */
>> 238         fs.map = map;
>> 239         result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>> 240             &fs.first_object, &fs.first_pindex, &prot, &wired);
>> 241         if (result != KERN_SUCCESS) {
>> 242                 if (result != KERN_PROTECTION_FAILURE ||
>> 243                     (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>
> Interesting... thanks!
> Can you please also additionally provide (lengthy) output of x/512a
> 0xff8d8f356fb0 ?

Sorry, I'm not sure I follow you there?

Do you mean any of the following:-
(kgdb) x/512a
0xff8d8f35b000: Cannot access memory at address 0xff8d8f35b000

(kgdb) list *0xff8d8f356fb0
No source file for address 0xff8d8f356fb0.

or:
(kgdb) x/512a 0xff8d8f356fb0
0xff8d8f356fb0: Cannot access memory at address 0xff8d8f356fb0

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 15/08/2011 13:34 Steven Hartland said the following: > (kgdb) list *0x8053b691 > 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239). > 234 /* > 235 * Find the backing store object and offset into it to begin > the > 236 * search. > 237 */ > 238 fs.map = map; > 239 result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry, > 240 &fs.first_object, &fs.first_pindex, &prot, &wired); > 241 if (result != KERN_SUCCESS) { > 242 if (result != KERN_PROTECTION_FAILURE || > 243 (fault_flags & VM_FAULT_WIRE_MASK) != > VM_FAULT_USER_WIRE) { > Interesting... thanks! Can you please also additionally provide (lengthy) output of x/512a 0xff8d8f356fb0 ? -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
on 15/08/2011 13:34 Steven Hartland said the following:
> - Original Message - From: "Andriy Gapon"
>> I think (not 100% sure) that with DDB in kernel we could get a better
>> backtrace here, possibly with pre-dblfault stack frames, because DDB
>> backend is a bit smarter than the trivial stack(9) printer.
>
> I've added this into the kernel on my test machine and will try
> to get it to panic over the next few days. Seems to need a few days of
> uptime before the panics start happening. I've also increased
> KSTACK_PAGES to 12; if you believe this may be stack exhaustion, do
> you want me to remove this increase?

Yes, I think it would make sense to change KSTACK_PAGES to the default value. But, OTOH, if you can afford to have DDB in a few more machines, then it would be interesting to compare behavior with different stack sizes.

BTW, if you don't want your machines to sit at the ddb prompt after a panic, then you'd also need either the KDB_UNATTENDED option or to set debug.debugger_on_panic=0.

--
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" We have 352 thread entries starting with:- #0 sched_switch (td=0x8083e4e0, newtd=0xff0012d838c0, flags=Variable "flags" is not available. 23 with:- cpustop_handler () at atomic.h:285 and 16 with:- #0 fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:562 I would like to get a full output of thread apply all bt. http://blog.multplay.co.uk/dropzone/freebsd/panic-2011-08-14-1524.txt The main message being:- panic: double fault GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Unread portion of the kernel message buffer: <118>Aug 14 15:13:33 amsbld15 syslogd: exiting on signal 15 So this line, does it indicate a shutdown of a jail or of the whole system? This specific panic was caused by me running "reboot" after all jails (~40) where shutdown, which is slightly different from what my collegue was seeing last friday, where the machines where panicing when the jails themselves where stopped. I may have a crash from one of these if needed. Fatal double fault rip = 0x8053b691 Can you please provide output of 'list *0x8053b691' in kgdb? (kgdb) list *0x8053b691 0x8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239). 234 /* 235 * Find the backing store object and offset into it to begin the 236 * search. 
237 */ 238 fs.map = map; 239 result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry, 240 &fs.first_object, &fs.first_pindex, &prot, &wired); 241 if (result != KERN_SUCCESS) { 242 if (result != KERN_PROTECTION_FAILURE || 243 (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) { rsp = 0xff8d8f356fb0 rbp = 0xff8d8f357210 cpuid = 2; apic id = 02 panic: double fault cpuid = 2 KDB: stack backtrace: #0 0x803bb75e at kdb_backtrace+0x5e #1 0x8038956e at panic+0x2ae #2 0x805802b6 at dblfault_handler+0x96 #3 0x8056900d at Xdblfault+0xad I think (not 100% sure) that with DDB in kernel we could get a better backtrace here, possibly with pre-dblfault stack frames, because DDB backend is a bit more smarter than the trivial stack(9) printer. I've added this into the the kernel on my test machine and will try to get it panic over the next few days. Seems to need a few days on uptime before the panics start happening. In addition to increasing KSTACK_PAGES to 12, if you believe this may be stack exhaustion, do you want me to remove this increase? stack: 0xff8d8f357000, 4 One thing I can say is that this looks like like a double-fault because of stack exhaustion (the most typical cause): rsp value is below td_kstack. Can you please also provide the following information: p *((struct pcb *)((char *)0xff8d8f357000 + KSTACK_PAGES * PAGE_SIZE) - 1) where KSTACK_PAGES is a value of KSTACK_PAGES option (amd64 default is 4) and PAGE_SIZE is 4096. 
(kgdb) p *((struct pcb *)((char *)0xff8d8f357000 + 4 * 4096) - 1) $1 = {pcb_r15 = -2138686968, pcb_r14 = -1070655224792, pcb_r13 = 0, pcb_r12 = -1070655225856, pcb_rbp = -491518580864, pcb_rsp = -491518580952, pcb_rbx = -1099195460512, pcb_rip = -2143622375, pcb_fsbase = 34365428376, pcb_gsbase = 0, pcb_kgsbase = 0, pcb_cr0 = 0, pcb_cr2 = 0, pcb_cr3 = 12406784, pcb_cr4 = 0, pcb_dr0 = 0, pcb_dr1 = 0, pcb_dr2 = 0, pcb_dr3 = 0, pcb_dr6 = 0, pcb_dr7 = 0, pcb_flags = 0, pcb_initial_fpucw = 895, pcb_onfault = 0x0, pcb_gs32sd = {sd_lolimit = 0, sd_lobase = 0, sd_type = 0, sd_dpl = 0, sd_p = 0, sd_hilimit = 0, sd_xx = 0, sd_long = 0, sd_def32 = 0, sd_gran = 0, sd_hibase = 0}, pcb_tssp = 0x0, pcb_save = 0xff8d8f35ae00, pcb_full_iret = 0 '\0', pcb_gdt = {rd_limit = 0, rd_base = 0}, pcb_idt = {rd_limit = 0, rd_base = 0}, pcb_ldt = {rd_limit = 0, rd_base = 0}, pcb_tr = 0, pcb_user_save = {sv_env = {en_cw = 895, en_sw = 0, en_tw = 0 '\0', en_zero = 0 '\0', en_opcode = 0, en_rip = 0, en_rdp = 0, en_mxcsr = 8096, en_mxcsr_mask = 65535}, sv_fp = {{fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"}, fp_pad = "\000\000\000\000\000"}, {fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"}, fp_pad = "\000\000\000\000\000"}, {fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"}, fp_pad = "\000\000\000\000\000"}, {fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"}, fp_pad = "\000\000\000\000\000"}, {fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"}, fp_pad = "\000\000\000\000\000"}, {fp_acc = {fp_bytes = "\000\000\000\000\000\000\000\000\000"},
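kgdb prints the pcb registers above as signed decimals, which hides the addresses they hold. A small Python sketch that converts a few of them back to unsigned 64-bit hex follows; the helper name to_u64_hex is made up for illustration. Note that the results carry the full 64-bit value, whereas addresses elsewhere in this archived thread appear with fewer leading f digits.

```python
MASK64 = (1 << 64) - 1


def to_u64_hex(value):
    """Reinterpret a signed 64-bit integer as unsigned and format it as hex."""
    return hex(value & MASK64)


# Values copied from the pcb printed above.
pcb = {
    "pcb_rip": -2143622375,
    "pcb_rsp": -491518580952,
    "pcb_rbp": -491518580864,
}

for name, value in pcb.items():
    print(f"{name} = {to_u64_hex(value)}")
```

The low 32 bits of pcb_rip come out as 0x803aeb19, which is the address kgdb shows for the sched_switch frame (#14) in the backtrace posted earlier, and pcb_rsp ends in 8d8f35a728, which lies inside the 4-page kernel stack starting at the address ending in 8d8f357000.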
Re: debugging frequent kernel panics on 8.2-RELEASE
on 14/08/2011 17:43 Steven Hartland said the following: > - Original Message - From: "Andriy Gapon" >> >> Maybe test it on couple of machines first just in case I overlooked something >> essential, although I have a report from another use that the patch didn't >> break >> anything for him (it was tested for an unrelated issue). > > We've got this running on a ~40 machines and just had the first panic > since the update. Unfortunately it doesn't seem to have changed anything :( > > We have 352 thread entries starting with:- > #0 sched_switch (td=0x8083e4e0, newtd=0xff0012d838c0, > flags=Variable "flags" is not available. > 23 with:- > cpustop_handler () at atomic.h:285 > and 16 with:- > #0 fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:562 I would like to get a full output of thread apply all bt. > The main message being:- > panic: double fault > > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "amd64-marcel-freebsd"... > > Unread portion of the kernel message buffer: > <118>Aug 14 15:13:33 amsbld15 syslogd: exiting on signal 15 So this line, does it indicate a shutdown of a jail or of the whole system? > Fatal double fault > rip = 0x8053b691 Can you please provide output of 'list *0x8053b691' in kgdb? 
> rsp = 0xff8d8f356fb0 > rbp = 0xff8d8f357210 > cpuid = 2; apic id = 02 > panic: double fault > cpuid = 2 > KDB: stack backtrace: > #0 0x803bb75e at kdb_backtrace+0x5e > #1 0x8038956e at panic+0x2ae > #2 0x805802b6 at dblfault_handler+0x96 > #3 0x8056900d at Xdblfault+0xad I think (not 100% sure) that with DDB in kernel we could get a better backtrace here, possibly with pre-dblfault stack frames, because DDB backend is a bit more smarter than the trivial stack(9) printer. > stack: 0xff8d8f357000, 4 One thing I can say is that this looks like like a double-fault because of stack exhaustion (the most typical cause): rsp value is below td_kstack. Can you please also provide the following information: p *((struct pcb *)((char *)0xff8d8f357000 + KSTACK_PAGES * PAGE_SIZE) - 1) where KSTACK_PAGES is a value of KSTACK_PAGES option (amd64 default is 4) and PAGE_SIZE is 4096. > rsp = 0xff89ae10 [snip] > There are some indications that stopping jails could be the > cause of the panics so on one test box I've added in invariants > to see if we get anything shows up from that. OK. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
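The stack-exhaustion argument above is plain address arithmetic. A short Python sketch using the values exactly as printed in the panic message (stack base 0xff8d8f357000, 4 pages, rsp 0xff8d8f356fb0) and the 4096-byte page size stated above; the function name stack_report is hypothetical.

```python
PAGE_SIZE = 4096  # as stated in the message above


def stack_report(kstack_base, kstack_pages, rsp):
    """Compare rsp with the kernel stack bounds (the stack grows downwards)."""
    top = kstack_base + kstack_pages * PAGE_SIZE
    return {
        "top": top,                                     # one past the last stack byte
        "overflowed": rsp < kstack_base,                # rsp fell below the lowest stack address
        "bytes_below_base": max(0, kstack_base - rsp),  # how far past the end it went
    }


report = stack_report(0xff8d8f357000, 4, 0xff8d8f356fb0)
print(hex(report["top"]), report["overflowed"], report["bytes_below_base"])
# prints: 0xff8d8f35b000 True 80
```

With these numbers rsp sits 80 bytes below the lowest stack address, which is what "rsp value is below td_kstack" refers to. The kgdb expression above places the pcb immediately below the computed top, hence the "- 1" after the cast to struct pcb *.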
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Attilio Rao"

> Anyway, we really would need much more information in order to take
> proactive action. Would it be possible to get access to one of the
> panicking machines? Is it always the same panic that is happening, or
> does it vary (like: once page fault, once fatal double fault, once
> fatal trap, etc.)?

They are always double faults, 99% of the time with no additional info. We've seen one mention of java on one of the machines, but the vmcore didn't seem to mention anything to do with that after the dump.

My colleague informs me that when he did the upgrade to add in the schedule stop patch, pretty much every machine panicked when shutting the java servers down, which is essentially a jail stop.

I've also had two panics when rebooting my test machine to change kernel settings, although this could be a side effect of the scheduler patch?

This single test machine is now running with the following non-standard settings:-
options INVARIANTS
options INVARIANT_SUPPORT
options DDB
options KSTACK_PAGES=12

I've got several vmcores from a number of different machines but none seem to be any use, as they don't seem to list any thread that caused the panic, i.e. no mention of dump or fault. Is there something else in particular I should be looking for?

Circumstantial evidence seems to indicate uptime may be a factor; machines with under 2 days of uptime seem much less likely to panic.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" Maybe test it on couple of machines first just in case I overlooked something essential, although I have a report from another use that the patch didn't break anything for him (it was tested for an unrelated issue). We've got this running on a ~40 machines and just had the first panic since the update. Unfortunately it doesn't seem to have changed anything :( We have 352 thread entries starting with:- #0 sched_switch (td=0x8083e4e0, newtd=0xff0012d838c0, flags=Variable "flags" is not available. 23 with:- cpustop_handler () at atomic.h:285 and 16 with:- #0 fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:562 The main message being:- panic: double fault GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Unread portion of the kernel message buffer: <118>Aug 14 15:13:33 amsbld15 syslogd: exiting on signal 15 Fatal double fault rip = 0x8053b691 rsp = 0xff8d8f356fb0 rbp = 0xff8d8f357210 cpuid = 2; apic id = 02 panic: double fault cpuid = 2 KDB: stack backtrace: #0 0x803bb75e at kdb_backtrace+0x5e #1 0x8038956e at panic+0x2ae #2 0x805802b6 at dblfault_handler+0x96 #3 0x8056900d at Xdblfault+0xad stack: 0xff8d8f357000, 4 rsp = 0xff89ae10 Uptime: 2d21h6m18s Physical memory: 49132 MB Dumping 17080 MB: 17065... Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done. done. 
Loaded symbols for /boot/kernel/opensolaris.ko Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from /boot/kernel/linprocfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/linprocfs.ko Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/nullfs.ko #0 sched_switch (td=0x8083e4e0, newtd=0xff0012d838c0, flags=Variable "flags" is not available.) at /usr/src/sys/kern/sched_ule.c:1858 1858cpuid = PCPU_GET(cpuid); (kgdb) #0 sched_switch (td=0x8083e4e0, newtd=0xff0012d838c0, flags=Variable "flags" is not available.) at /usr/src/sys/kern/sched_ule.c:1858 #1 0x80391a99 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:451 #2 0x803c5112 in sleepq_timedwait (wchan=0x8083e080, pri=68) at /usr/src/sys/kern/subr_sleepqueue.c:644 #3 0x80391efb in _sleep (ident=0x8083e080, lock=0x0, priority=Variable "priority" is not available.) at /usr/src/sys/kern/kern_synch.c:230 #4 0x8053ebc9 in scheduler (dummy=Variable "dummy" is not available.) at /usr/src/sys/vm/vm_glue.c:807 #5 0x80341767 in mi_startup () at /usr/src/sys/kern/init_main.c:254 #6 0x8016efdc in btext () at /usr/src/sys/amd64/amd64/locore.S:81 #7 0x80863dc8 in sleepq_chains () #8 0x80848ae0 in cpu_top () #9 0x in ?? () #10 0x8083e4e0 in proc0 () #11 0x80bb3b90 in ?? () #12 0x80bb3b38 in ?? () #13 0xff0012d838c0 in ?? () #14 0x803aeb19 in sched_switch (td=0x0, newtd=0x0, flags=Variable "flags" is not available.) at /usr/src/sys/kern/sched_ule.c:1852 Previous frame inner to this frame (corrupt stack?) There are some indications that stopping jails could be the cause of the panics so on one test box I've added in invariants to see if we get anything shows up from that. Regards Steve This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. 
In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Rick Macklem"

> Just a random thought that is probably not relevant, but...
> Is it possible that some change for the upgrade is making the machines
> run hotter and they're failing when they overheat?

The machines have full HW monitoring and we've not seen reports of temperature issues. Add to that, quite a few are L series so run really cool anyway, so I very much doubt it.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
Steven Hartland wrote:
> - Original Message -
> From: "Andriy Gapon"
>
> >>> I would really appreciate if you could try to reproduce the
> >>> problem with the patch that I sent earlier.
> >>
> >> Hi Andriy, what's the risk of this patch causing other issues?
> >
> > I can not estimate.
> > The code is supposed to affect only things that happen after panic,
> > so make your guess.
>
> So in theory should be good.
>
> >> I ask as to get results from this we're going to have to roll it
> >> out to over 130+ production machines, so I'd like to be clear on
> >> the risks before I sign that off.
> >
> > I will be happy if you try the patch on a single machine
> > provided the problem is that reproducible.
>
> Unfortunately, although it's happening a lot, it's taking the
> large number of machines to make it that way.
>
> Over the 130+ machines we're seeing between 3 and 8 panics
> a day, so based on that we could be waiting quite some time
> for a specific machine to panic :(
>
> Don't think we're going to make any progress on this in the current
> state so I think we'll give it a shot.

Just a random thought that is probably not relevant, but...
Is it possible that some change for the upgrade is making the machines run hotter and they're failing when they overheat?

rick
Re: debugging frequent kernel panics on 8.2-RELEASE
on 11/08/2011 20:14 Steven Hartland said the following:
> - Original Message - From: "Andriy Gapon"
>
>>>> I would really appreciate if you could try to reproduce the problem
>>>> with the patch that I sent earlier.
>>>
>>> Hi Andriy, what's the risk of this patch causing other issues?
>>
>> I can not estimate.
>> The code is supposed to affect only things that happen after panic,
>> so make your guess.
>
> So in theory should be good.
>
>>> I ask as to get results from this we're going to have to roll it
>>> out to over 130+ production machines, so I'd like to be clear on
>>> the risks before I sign that off.
>>
>> I will be happy if you try the patch on a single machine
>> provided the problem is that reproducible.
>
> Unfortunately, although it's happening a lot, it's taking the
> large number of machines to make it that way.
>
> Over the 130+ machines we're seeing between 3 and 8 panics
> a day, so based on that we could be waiting quite some time
> for a specific machine to panic :(
>
> Don't think we're going to make any progress on this in the current
> state so I think we'll give it a shot.

Maybe test it on a couple of machines first just in case I overlooked something essential, although I have a report from another user that the patch didn't break anything for him (it was tested for an unrelated issue).

--
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

>>> I would really appreciate if you could try to reproduce the problem
>>> with the patch that I sent earlier.
>>
>> Hi Andriy, what's the risk of this patch causing other issues?
>
> I can not estimate.
> The code is supposed to affect only things that happen after panic,
> so make your guess.

So in theory should be good.

>> I ask as to get results from this we're going to have to roll it
>> out to over 130+ production machines, so I'd like to be clear on
>> the risks before I sign that off.
>
> I will be happy if you try the patch on a single machine
> provided the problem is that reproducible.

Unfortunately, although it's happening a lot, it's taking the large number of machines to make it that way.

Over the 130+ machines we're seeing between 3 and 8 panics a day, so based on that we could be waiting quite some time for a specific machine to panic :(

Don't think we're going to make any progress on this in the current state so I think we'll give it a shot.

Regards
Steve
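To put a rough number on "quite some time": a back-of-the-envelope Python sketch, assuming exactly 130 machines (the message says 130+) and assuming panics are independent and equally likely on every machine, which the thread does not establish. The function names are hypothetical.

```python
import math


def mean_days_between_panics(machines, panics_per_day):
    """Average wait for one specific machine, given a fleet-wide panic rate."""
    return machines / panics_per_day


def prob_panic_within(days, machines, panics_per_day):
    """Chance that one specific machine panics within `days` (Poisson assumption)."""
    rate_per_machine = panics_per_day / machines
    return 1.0 - math.exp(-rate_per_machine * days)


MACHINES = 130  # assumption: the lower bound of "130+"
for fleet_rate in (3, 8):
    wait = mean_days_between_panics(MACHINES, fleet_rate)
    week = prob_panic_within(7, MACHINES, fleet_rate)
    print(f"{fleet_rate} panics/day: ~{wait:.1f} days per machine, "
          f"{week:.0%} chance within a week")
```

Under those assumptions a single test machine would wait somewhere between roughly 16 and 43 days on average, which is why testing on one box is slow and a wider rollout gives results sooner.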
Re: debugging frequent kernel panics on 8.2-RELEASE
on 11/08/2011 19:37 Steven Hartland said the following:
> - Original Message - From: "Andriy Gapon"
>>
>> I would really appreciate if you could try to reproduce the problem with the
>> patch that I sent earlier.
>
> Hi Andriy, what's the risk of this patch causing other issues?

I can not estimate.
The code is supposed to affect only things that happen after panic, so make your guess.

> I ask as to get results from this we're going to have to roll it
> out to over 130+ production machines, so I'd like to be clear on
> the risks before I sign that off.

I will be happy if you try the patch on a single machine provided the problem is that reproducible.

--
Andriy Gapon
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon"

> I would really appreciate if you could try to reproduce the problem
> with the patch that I sent earlier.

Hi Andriy, what's the risk of this patch causing other issues?

I ask as to get results from this we're going to have to roll it out to over 130+ production machines, so I'd like to be clear on the risks before I sign that off.

Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
on 11/08/2011 14:39 Steven Hartland said the following: > The trimmed down output, removed the 10,000's of ?? lines here:- > http://blog.multiplay.co.uk/dropzone/freebsd/panic-2011-08-11-1402.txt > > The raw output is here:- > http://blog.multiplay.co.uk/dropzone/freebsd/panic-full-2011-08-11-1402.txt.bz2 > > I'm not sure how useful its going to be as pretty much all of it seems > to be just:- > #0 sched_tswitch (td=0xff00194d4460, newtd=0xff000a74a000, > flags=Variable > "flags" is not available. > #1 0x80385c86 in mi_switch (flags=260, newtd=0x0) at > /usr/src/sys/kern/kern_synch.c:449 > #2 0x803b8a0c in sleepq_catch_signals (wchan=0xff02f27c48c0, > pri=92) > at /usr/src/sys/kern/subr_sleepqueue.c:418 > #3 0x803b9326 in sleepq_wait_sig (wchan=Variable "wchan" is not > available. > #4 0x80386149 in _sleep (ident=0xff02f27c48c0, > lock=0xff02f27c49b8, priority=Variable "priority" is not available. > #5 0x8035079d in kern_wait (td=0xff00194d4460, pid=91362, > status=0xff86cdbffabc, options=Variable "options" is not available. > #6 0x80350e95 in wait4 (td=Variable "td" is not available. > #7 0x803bb8e5 in syscallenter (td=0xff00194d4460, > sa=0xff86cdbffba0) at /usr/src/sys/kern/subr_trap.c:315 > #8 0x80574a0b in syscall (frame=0xff86cdbffc40) at > /usr/src/sys/amd64/amd64/trap.c:888 > #9 0x8055d242 in Xfast_syscall () at > /usr/src/sys/amd64/amd64/exception.S:377 > > On one machine we had a little more info on console which may indicate > java as the problem. > > http://blog.multiplay.co.uk/dropzone/freebsd/panic-java.jpg I would really appreciate if you could try to reproduce the problem with the patch that I sent earlier. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Andriy Gapon" on 10/08/2011 18:35 Steven Hartland said the following: Fatal double fault ... #14 0x803a2cc9 in sched_switch (td=0x0, newtd=0x0, flags=Variable "flags" is not available. ) at /usr/src/sys/kern/sched_ule.c:1852 Previous frame inner to this frame (corrupt stack?) (kgdb) Looks like this is just the first thread in the kernel. Perhaps 'thread apply all bt' could help to find the culprit. The trimmed down output, removed the 10,000's of ?? lines here:- http://blog.multiplay.co.uk/dropzone/freebsd/panic-2011-08-11-1402.txt The raw output is here:- http://blog.multiplay.co.uk/dropzone/freebsd/panic-full-2011-08-11-1402.txt.bz2 I'm not sure how useful its going to be as pretty much all of it seems to be just:- #0 sched_tswitch (td=0xff00194d4460, newtd=0xff000a74a000, flags=Variable "flags" is not available. #1 0x80385c86 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449 #2 0x803b8a0c in sleepq_catch_signals (wchan=0xff02f27c48c0, pri=92) at /usr/src/sys/kern/subr_sleepqueue.c:418 #3 0x803b9326 in sleepq_wait_sig (wchan=Variable "wchan" is not available. #4 0x80386149 in _sleep (ident=0xff02f27c48c0, lock=0xff02f27c49b8, priority=Variable "priority" is not available. #5 0x8035079d in kern_wait (td=0xff00194d4460, pid=91362, status=0xff86cdbffabc, options=Variable "options" is not available. #6 0x80350e95 in wait4 (td=Variable "td" is not available. #7 0x803bb8e5 in syscallenter (td=0xff00194d4460, sa=0xff86cdbffba0) at /usr/src/sys/kern/subr_trap.c:315 #8 0x80574a0b in syscall (frame=0xff86cdbffc40) at /usr/src/sys/amd64/amd64/trap.c:888 #9 0x8055d242 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:377 On one machine we had a little more info on console which may indicate java as the problem. http://blog.multiplay.co.uk/dropzone/freebsd/panic-java.jpg Regards Steve This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. 
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Jeremy Chadwick"
> On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote:
>> That's not the issue as its happening across board over 130 machines :(
> Agreed, bad hardware sounds unlikely here. I could believe some strange incompatibility (e.g. BIOS quirk or the like[1]) that might cause problems en masse across many servers, but hardware issues are unlikely in this situation.
It's affecting a range of hardware from Supermicro blades / 2U's & Dell blades, so it seems more like a software bug.
> [1]: I mention this because we had something similar happen at my workplace. For months we used a specific model of system from our vendor which worked reliably, zero issues. Then we got a new shipment of boxes (same model as prior) which started acting very odd (often AHCI timeout issues or MCEs which when decoded would usually turn out to be nonsensical). It took weeks to determine the cause given how slow the vendor was to respond: root cause turned out to be that the vendor decided, on a whim, to start shipping a newer BIOS version which wasn't "as compatible" with Solaris as previous BIOSes. Downgrading all the systems to the older BIOS fixed the problem.
The machines have been working fine for months; the panics only started last week. We've been looking at the changes made last week to see if we can identify the cause.
The only change made in that time frame was the rollout of the change to kern.ipc.nmbclusters to work around the TCP reassembly issue. In this case we raised the value from the default of 25600 to 262144. We've used this value for a long time on our core webservers, which are also running 8.2, so I'd be very surprised if this was the cause. That said, we're looking to roll out kern.ipc.nmbclusters=51200 to try and rule it out.
Prior to this, 1-2 weeks previous, we rolled out a significant update which included:-
1. Adding IPv6 to the kernel (although no machines are configured with it yet)
2. Adding the ipmi module to the kernel, although not loaded
3. Rebuilding ALL ports to the latest version
4. Restructuring the server layout to be one jail per java server (~60 servers per machine)
5. Restructuring the filesystem to be a base nullfs mount + devfs + zfs volume per server
This update had been in testing for 2 weeks prior to that, so in total 3-4 weeks passed before any panics were seen, but that doesn't mean the issue didn't exist at that time.
Currently we're seeing 1-4 panics a day across all machines.
So currently the most likely suspects are:-
1. kern.ipc.nmbclusters
2. nullfs
3. ipv6
4. a package update, most likely being openjdk6-b23
5. jail
> In Steve's case this is unlikely to be the situation, but I thought I'd share the story anyway. "SKU ABCXYZ-1" from August 2009 is not necessarily the same thing as "SKU ABCXYZ-1" from May 2010. ;-) This is also why I prefer to buy/build my own systems, since I cannot trust vendors to not mess about with settings w/out changing SKUs, P/Ns, or revision numbers.
This caused us much scratching of heads when looking for that TCP issue the other day. As it seemed to be affecting the newer machines more than the old, we even found two machines with the same "version" of the BIOS that were clearly different builds, as the date and available options were different. Quite frustrating!
Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
2011/8/11 Jeremy Chadwick : > On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote: >> That's not the issue as its happening across board over 130 machines :( > > Agreed, bad hardware sounds unlikely here. I could believe some strange > incompatibility (e.g. BIOS quirk or the like[1]) that might cause problems > en masse across many servers, but hardware issues are unlikely in this > situation. > > [1]: I mention this because we had something similar happen at my > workplace. For months we used a specific model of system from our > vendor which worked reliably, zero issues. Then we got a new shipment > of boxes (same model as prior) which started acting very odd (often AHCI > timeout issues or MCEs which when decoded would usually turn out to be > nonsensical). It took weeks to determine the cause given how slow the > vendor was to respond: root cause turned out to be that the vendor > decided, on a whim, to start shipping a newer BIOS version which wasn't > "as compatible" with Solaris as previous BIOSes. Downgrading all the > systems to the older BIOS fixed the problem.
That falls in the "hw problem" category for me. Anyway, we would really need much more information in order to take proactive action. Would it be possible to get access to one of the panicking machines? Is it always the same panic, or does it vary (e.g. a page fault one time, a fatal double fault another, a fatal trap another)? Whatever information you can provide may be valuable here.
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: debugging frequent kernel panics on 8.2-RELEASE
On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote: > That's not the issue as its happening across board over 130 machines :( Agreed, bad hardware sounds unlikely here. I could believe some strange incompatibility (e.g. BIOS quirk or the like[1]) that might cause problems en masse across many servers, but hardware issues are unlikely in this situation. [1]: I mention this because we had something similar happen at my workplace. For months we used a specific model of system from our vendor which worked reliably, zero issues. Then we got a new shipment of boxes (same model as prior) which started acting very odd (often AHCI timeout issues or MCEs which when decoded would usually turn out to be nonsensical). It took weeks to determine the cause given how slow the vendor was to respond: root cause turned out to be that the vendor decided, on a whim, to start shipping a newer BIOS version which wasn't "as compatible" with Solaris as previous BIOSes. Downgrading all the systems to the older BIOS fixed the problem. In Steve's case this is unlikely to be the situation, but I thought I'd share the story anyway. "SKU ABCXYZ-1" from August 2009 is not necessarily the same thing as "SKU ABCXYZ-1" from May 2010. ;-) This is also why I prefer to buy/build my own systems, since I cannot trust vendors to not mess about with settings w/out changing SKUs, P/Ns, or revision numbers. -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: debugging frequent kernel panics on 8.2-RELEASE
That's not the issue, as it's happening across the board on over 130 machines :(
Regards
Steve
- Original Message - From: "Attilio Rao"
> I'd really point the finger to faulty hw. Please run all the necessary diagnostic tools for catching it.
> Attilio
Re: debugging frequent kernel panics on 8.2-RELEASE
I'd really point the finger to faulty hw. Please run all the necessary diagnostic tools for catching it. Attilio 2011/8/11 Andriy Gapon : > on 10/08/2011 18:35 Steven Hartland said the following: >> Fatal double fault >> rip = 0x8052f6f1 >> rsp = 0xff86ce600fb0 >> rbp = 0xff86ce601210 >> cpuid = 0; apic id = 00 >> panic: double fault >> cpuid = 0 >> KDB: stack backtrace: >> #0 0x803af91e at kdb_backtrace+0x5e >> #1 0x8037d817 at panic+0x187 >> #2 0x80574316 at dblfault_handler+0x96 >> #3 0x8055d06d at Xdblfault+0xad > [snip] >> #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, >> flags=Variable >> "flags" is not available.) >> at /usr/src/sys/kern/sched_ule.c:1858 >> 1858 cpuid = PCPU_GET(cpuid); >> (kgdb) >> #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, >> flags=Variable >> "flags" is not available.) >> at /usr/src/sys/kern/sched_ule.c:1858 >> #1 0x80385c86 in mi_switch (flags=260, newtd=0x0) >> at /usr/src/sys/kern/kern_synch.c:449 >> #2 0x803b92d2 in sleepq_timedwait (wchan=0x80830760, pri=68) >> at /usr/src/sys/kern/subr_sleepqueue.c:644 >> #3 0x803861e1 in _sleep (ident=0x80830760, lock=0x0, >> priority=Variable "priority" is not available. >> ) at /usr/src/sys/kern/kern_synch.c:230 >> #4 0x80532c29 in scheduler (dummy=Variable "dummy" is not available. >> ) at /usr/src/sys/vm/vm_glue.c:807 >> #5 0x80335d67 in mi_startup () at /usr/src/sys/kern/init_main.c:254 >> #6 0x8016efac in btext () at /usr/src/sys/amd64/amd64/locore.S:81 >> #7 0x808556e0 in sleepq_chains () >> #8 0x8083b1e0 in cpu_top () >> #9 0x in ?? () >> #10 0x80830bc0 in proc0 () >> #11 0x80ba4b90 in ?? () >> #12 0x80ba4b38 in ?? () >> #13 0xff000a73f8c0 in ?? () >> #14 0x803a2cc9 in sched_switch (td=0x0, newtd=0x0, flags=Variable >> "flags" >> is not available. >> ) >> at /usr/src/sys/kern/sched_ule.c:1852 >> Previous frame inner to this frame (corrupt stack?) >> (kgdb) > > Looks like this is just the first thread in the kernel. 
> Perhaps 'thread apply all bt' could help to find the culprit. > > -- > Andriy Gapon -- Peace can only be achieved by understanding - A. Einstein
Re: debugging frequent kernel panics on 8.2-RELEASE
on 10/08/2011 18:35 Steven Hartland said the following: > Fatal double fault > rip = 0x8052f6f1 > rsp = 0xff86ce600fb0 > rbp = 0xff86ce601210 > cpuid = 0; apic id = 00 > panic: double fault > cpuid = 0 > KDB: stack backtrace: > #0 0x803af91e at kdb_backtrace+0x5e > #1 0x8037d817 at panic+0x187 > #2 0x80574316 at dblfault_handler+0x96 > #3 0x8055d06d at Xdblfault+0xad [snip] > #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, > flags=Variable > "flags" is not available.) >at /usr/src/sys/kern/sched_ule.c:1858 > 1858cpuid = PCPU_GET(cpuid); > (kgdb) > #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, > flags=Variable > "flags" is not available.) >at /usr/src/sys/kern/sched_ule.c:1858 > #1 0x80385c86 in mi_switch (flags=260, newtd=0x0) >at /usr/src/sys/kern/kern_synch.c:449 > #2 0x803b92d2 in sleepq_timedwait (wchan=0x80830760, pri=68) >at /usr/src/sys/kern/subr_sleepqueue.c:644 > #3 0x803861e1 in _sleep (ident=0x80830760, lock=0x0, >priority=Variable "priority" is not available. > ) at /usr/src/sys/kern/kern_synch.c:230 > #4 0x80532c29 in scheduler (dummy=Variable "dummy" is not available. > ) at /usr/src/sys/vm/vm_glue.c:807 > #5 0x80335d67 in mi_startup () at /usr/src/sys/kern/init_main.c:254 > #6 0x8016efac in btext () at /usr/src/sys/amd64/amd64/locore.S:81 > #7 0x808556e0 in sleepq_chains () > #8 0x8083b1e0 in cpu_top () > #9 0x in ?? () > #10 0x80830bc0 in proc0 () > #11 0x80ba4b90 in ?? () > #12 0x80ba4b38 in ?? () > #13 0xff000a73f8c0 in ?? () > #14 0x803a2cc9 in sched_switch (td=0x0, newtd=0x0, flags=Variable > "flags" > is not available. > ) >at /usr/src/sys/kern/sched_ule.c:1852 > Previous frame inner to this frame (corrupt stack?) > (kgdb) Looks like this is just the first thread in the kernel. Perhaps 'thread apply all bt' could help to find the culprit. -- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
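A minimal sketch of how the suggestion above might be followed, assuming a saved dump at /var/crash/vmcore.0 (the path reported elsewhere in this thread) and a kernel with symbols at /boot/kernel/kernel. The interactive steps are shown as comments because they need the crashed host; `trim_unresolved_frames` is a hypothetical helper name, used only to drop unresolved frames like #9 and #11 in the backtrace above.

```shell
# Interactive steps on the FreeBSD host (paths are assumptions):
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
#   (kgdb) thread apply all bt
#
# Once that output has been saved to a file, remove the frames gdb could
# not resolve, which print as "in ?? ()", so the named frames stand out.
trim_unresolved_frames() {
    # -F treats the pattern as a fixed string, -v inverts the match.
    grep -F -v ' in ?? ()'
}

# Example: trim_unresolved_frames < all-threads.txt > all-threads-trimmed.txt
```

Filtering after the fact keeps the raw output intact in case one of the unresolved addresses turns out to matter later.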
Re: debugging frequent kernel panics on 8.2-RELEASE
On Wed, Aug 10, 2011 at 05:26:27PM +0100, Steven Hartland wrote: > - Original Message - From: "Jeremy Chadwick" > free...@jdc.parodius.com > > >>>In combination with this, we use the following in /etc/rc.conf (the > >>>dumpdev line is important, else savecore won't pick up anything): > >>> > >>>dumpdev="auto" > >> > >>I thought this was ment to be the default from back in the 6.x days but > >>it didnt seem to work, so I added the gptid device from /etc/fstab > > > >/etc/defaults/rc.conf has dumpdev="NO", which affects two things: both > >/etc/rc.d/dumpon (this script is a little tricky, you really have to > >read it slowly/pay close attention to what's going on), and > >/etc/rc.d/savecore. > > Hmm, someone might want to correct the docs then:- > http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html > > "AUTO is the default as of FreeBSD 6.0" It used to be "auto", and was changed to "no" in this commit back in September 2009, and was reviewed by two separate people: http://www.freebsd.org/cgi/cvsweb.cgi/src/etc/defaults/rc.conf#rev1.358.2.2 Prior to that, it was "auto", as confirmed here (circa June 2005): http://www.freebsd.org/cgi/cvsweb.cgi/src/etc/defaults/rc.conf#rev1.250 So basically the documentation is both correct and incorrect. For anyone running FreeBSD later than September 2009 (I would need to spend some time figuring out what releases that was), dumpdev will not be enabled by default. Prior to that (which includes 6.x), it will be. The documentation needs to be updated to reflect reality (specifically the commit that was done in September 2009). I'll file a PR for this, but won't have the PR number until later today. -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Jeremy Chadwick" free...@jdc.parodius.com
>>> In combination with this, we use the following in /etc/rc.conf (the dumpdev line is important, else savecore won't pick up anything):
>>>
>>> dumpdev="auto"
>> I thought this was meant to be the default from back in the 6.x days but it didn't seem to work, so I added the gptid device from /etc/fstab
> /etc/defaults/rc.conf has dumpdev="NO", which affects two things: both /etc/rc.d/dumpon (this script is a little tricky, you really have to read it slowly/pay close attention to what's going on), and /etc/rc.d/savecore.
Hmm, someone might want to correct the docs then:-
http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
"AUTO is the default as of FreeBSD 6.0"
Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
On Wed, Aug 10, 2011 at 04:46:17PM +0100, Steven Hartland wrote: > >On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote: > >>The base stack reported is a double fault with no additional > >>details and CTRL+ALT+ESC fails to break to the debugger as > >>does and NMI, even though it at least tries printing the > >>following many times some quite jumbled:- > >>NMI ... going to debugger > > >If you're generating the NMI yourself (possibly via the KVM, etc.) then > >okay, that's different. I'm trying to discern whether or not *you're* > >generating the NMI, or if the NMI just happens and causes a panic for > >you and that's what you're worried about. > > Yer generating it after panic in order to try and get to the debugger :) Understood, thanks for clarifying. > >Now to discuss the "jumbled console output": > ... > >The default (assuming your kernel configs are based off of GENERIC > >within the past 4-5 years) is 128. However, the same developers stated > >that they have great reservations over increasing this number > >dramatically (meaning, something like 256 will probably work, but larger > >"may have repercussions which are unknown at this time"). > > Might try that if it will help but with so many production machines to > action I'd like to try and avoid if possible. I've used PRINTF_BUFR_SIZE=256 with success on our systems, but since it doesn't actually *solve* the problem, I just use the default 128 and just grit my teeth when we experience it. It's larger values (e.g. 512/1024, etc.) which there is concern over. 
> >In combination with this, we use the following in /etc/rc.conf (the > >dumpdev line is important, else savecore won't pick up anything): > > > >dumpdev="auto" > > I thought this was ment to be the default from back in the 6.x days but > it didnt seem to work, so I added the gptid device from /etc/fstab /etc/defaults/rc.conf has dumpdev="NO", which affects two things: both /etc/rc.d/dumpon (this script is a little tricky, you really have to read it slowly/pay close attention to what's going on), and /etc/rc.d/savecore. I've always wondered why dumpdev="NO" is the default, not "auto", since on a system with no swap devices in /etc/fstab dumpdev="auto" should behave the same. Possibly the idea of the default is to ensure that savecore(8) never gets run (e.g. there's no guarantee someone has /var/crash, or a /var that's big enough to hold a crash dump; possibly embedded systems or NFS-only systems, for example). Touchy subject I guess. -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
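As a sketch of the rc.conf settings discussed above: `dumpdev` is taken from the thread, while the `dumpdir` line is an addition that only restates what is believed to be the stock default. Whether "AUTO" finds a usable device depends on a swap entry being present in /etc/fstab, which is an assumption about the host.

```shell
# /etc/rc.conf fragment (rc.conf is read as shell variable assignments)

# Let /etc/rc.d/dumpon pick a swap device listed in /etc/fstab.
# The stock default in /etc/defaults/rc.conf is "NO", as noted above.
dumpdev="AUTO"

# Directory where savecore(8) writes vmcore.N on the next boot.
# It needs enough free space to hold the dump.
dumpdir="/var/crash"
```

After a reboot, `dumpon -v` followed by a device name can be used to set the dump device by hand if the automatic choice is not the one wanted.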
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Jeremy Chadwick"
> On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote:
>> The base stack reported is a double fault with no additional details and CTRL+ALT+ESC fails to break to the debugger, as does an NMI, even though it at least tries printing the following many times, some quite jumbled:-
>> NMI ... going to debugger
> If you're generating the NMI yourself (possibly via the KVM, etc.) then okay, that's different. I'm trying to discern whether or not *you're* generating the NMI, or if the NMI just happens and causes a panic for you and that's what you're worried about.
Yes, we're generating it after the panic in order to try and get to the debugger :)
> Now to discuss the "jumbled console output":
> ...
> The default (assuming your kernel configs are based off of GENERIC within the past 4-5 years) is 128. However, the same developers stated that they have great reservations over increasing this number dramatically (meaning, something like 256 will probably work, but larger "may have repercussions which are unknown at this time").
Might try that if it will help, but with so many production machines to action I'd like to avoid it if possible.
>> The machines are single disk ZFS root install and the dump device is configured using the gptid, could this be what's preventing the dump happening?
> I can tell you that others have reported this problem where the kernel panic/dump begins but either locks up after showing the first progress metre/amount, or during the dumping itself.
Ahh, so possibly not a gptid issue.
> I give everyone the same advice: please make sure that you have a swap partition that's large enough to fit your entire memory contents (preferably a swap that's 2x or 1.5x the amount of physical RAM), and please make sure it's on a dedicated slice (e.g. ada0s1b). I do not advise any sort of "abstraction" layer between swap and the rest of the system. It might seem like a great/fun/awesome idea followed by "whatever jdc, it works!" but when a crash happens -- which is when you need it most -- and it doesn't work, I won't sympathise. :-)
> As for the GPT aspects of things: I'm still not familiar with GPT (as a technology I am, but when it comes to actual usability I am not).
Just managed to get a crash dump from one machine, so hopefully we will be able to make some progress if someone can point me in the right direction.
> # Debugging options
> options BREAK_TO_DEBUGGER # Sending a serial BREAK drops to DDB
> options ALT_BREAK_TO_DEBUGGER # Permit ~ to drop to DDB
> options KDB # Enable kernel debugger support
> options KDB_TRACE # Print stack trace automatically on panic
> options DDB # Support DDB
> options GDB # Support remote GDB
Cheers
> In combination with this, we use the following in /etc/rc.conf (the dumpdev line is important, else savecore won't pick up anything):
> dumpdev="auto"
I thought this was meant to be the default from back in the 6.x days but it didn't seem to work, so I added the gptid device from /etc/fstab
> ddb_enable="yes"
Thanks :)
Regards
Steve
Re: debugging frequent kernel panics on 8.2-RELEASE
- Original Message - From: "Steven Hartland" To: Sent: Wednesday, August 10, 2011 3:22 PM Subject: debugging frequent kernel panics on 8.2-RELEASE
We're currently experiencing a large number of kernel panics on FreeBSD 8.2-RELEASE across a large number of machines here.
The base stack reported is a double fault with no additional details and CTRL+ALT+ESC fails to break to the debugger, as does an NMI, even though it at least tries printing the following many times, some quite jumbled:-
NMI ... going to debugger
We've configured the dump device but that also seems to fail to capture any details, just sitting there after panic with Dumping 4465MB:
The machines are single disk ZFS root install and the dump device is configured using the gptid, could this be what's preventing the dump happening?
The kernel is compiled with:-
options KDB # Kernel debugger related code
options KDB_TRACE # Print a stack trace for a panic
We have remote KVM but not remote serial on most of the machines.
Any advice on how to debug this issue?
ldn32.multiplay.co.uk dumped core - see /var/crash/vmcore.0 Wed Aug 10 14:02:07 UTC 2011 FreeBSD crash 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Jul 21 11:05:52 BST 2011 root@crash:/usr/obj/usr/src/sys/MULTIPLAY amd64 panic: double fault GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer: Fatal double fault rip = 0x8052f6f1 rsp = 0xff86ce600fb0 rbp = 0xff86ce601210 cpuid = 0; apic id = 00 panic: double fault cpuid = 0 KDB: stack backtrace: #0 0x803af91e at kdb_backtrace+0x5e #1 0x8037d817 at panic+0x187 #2 0x80574316 at dblfault_handler+0x96 #3 0x8055d06d at Xdblfault+0xad Uptime: 13d20h53m31s Physical memory: 24555 MB Dumping 3283 MB: 3268 3252 3236 3220 3204 3188 3172 3156 3140 3124 3108 3092 3076 3060 3044 3028 3012 2996 2980 2964 2948 2932 2916 2900 2884 2868 2852 2836 2820 2804 2788 2772 2756 2740 272 4 2708 2692 2676 2660 2644 2628 2612 2596 2580 2564 2548 2532 2516 2500 2484 2468 2452 2436 2420 2404 2388 2372 2356 2340 2324 2308 2292 2276 2260 2244 2228 2212 2196 2180 2164 2148 2132 211 6 2100 2084 2068 2052 2036 2020 2004 1988 1972 1956 1940 1924 1908 1892 1876 1860 1844 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652 1636 1620 1604 1588 1572 1556 1540 1524 150 8 1492 1476 1460 1444 1428 1412 1396 1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204 1188 1172 1156 1140 1124 1108 1092 1076 1060 1044 1028 1012 996 980 964 948 932 916 900 884 8 68 852 836 820 804 788 772 756 740 724 708 692 676 660 644 628 612 596 580 564 548 532 516 500 484 468 452 436 420 404 388 372 356 340 324 308 292 276 260 244 228 212 196 180 164 148 132 116 100 84 68 52 36 20 4 Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done. done. Loaded symbols for /boot/kernel/opensolaris.ko Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from /boot/kernel/linprocfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/linprocfs.ko Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done. done. 
Loaded symbols for /boot/kernel/nullfs.ko One of the machines has managed to dump where all the others have failed to do so here's the stack from core.txt.0 #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, flags=Variable "flags" is not available.) at /usr/src/sys/kern/sched_ule.c:1858 1858cpuid = PCPU_GET(cpuid); (kgdb) #0 sched_switch (td=0x80830bc0, newtd=0xff000a73f8c0, flags=Variable "flags" is not available.) at /usr/src/sys/kern/sched_ule.c:1858 #1 0x80385c86 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449 #2 0x803b92d2 in sleepq_timedwait (wchan=0x80830760, pri=68) at /usr/src/sys/kern/subr_sleepqueue.c:644 #3 0x803861e1 in _sleep (ident=0x80830760, lock=0x0, priority=Variable "priority" is not available. ) at /usr/src/sys/kern/kern_synch.c:230 #4 0x80532c29 in scheduler (dummy=Variable "dummy" is not available. ) at /usr/src/sys/vm/vm_glue.c:807 #5 0x80335d67 in mi_startup () at /usr/src/sys/kern/init_main.c:254 #6 0x8016efac in btext () at /usr/src/sys/amd64/amd64/locore.S:81 #7
Re: debugging frequent kernel panics on 8.2-RELEASE
on 10/08/2011 17:22 Steven Hartland said the following:
> The kernel is compiled with:-
> options KDB # Kernel debugger related code
> options KDB_TRACE # Print a stack trace for a panic

You also have to provide an actual debugger backend like built-in DDB or a stub for remote GDB to get online debugging. No guarantees that that would help you to get the debugging information, but without that the chances are even slimmer.

You may also try this patch and see if it provides any improvements for post-panic environment (dumping etc):
http://people.freebsd.org/~avg/stop_scheduler_on_panic.8.x.diff

It might also be a good idea to at least capture a screenshot of whatever information you get on console when the panic happens.

--
Andriy Gapon
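For illustration, a kernel configuration fragment along the lines Andriy describes might look like the sketch below. KDB and KDB_TRACE come from the quoted config; DDB and GDB are the standard option names for the two backends he mentions. Whether either backend helps with this particular double fault is not established in the thread.

```
# Debugger framework (already present in the quoted config)
options 	KDB		# Kernel debugger related code
options 	KDB_TRACE	# Print a stack trace for a panic

# Debugger backends: KDB needs at least one of these to break into
options 	DDB		# Built-in interactive debugger
options 	GDB		# Stub for remote GDB (typically over serial)
```

After rebuilding and rebooting, the debug.kdb.available sysctl should list the backends that were compiled in, and debug.kdb.current shows which one is selected.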
Re: debugging frequent kernel panics on 8.2-RELEASE
On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote:
> The base stack reported is a double fault with no additional
> details and CTRL+ALT+ESC fails to break to the debugger as
> does an NMI, even though it at least tries printing the
> following many times some quite jumbled:-
> NMI ... going to debugger

You may be interested in these system tunables (not sysctls). These come from sys/amd64/amd64/trap.c (i386 has the same):

machdep.kdb_on_nmi (defaults to 1)
machdep.panic_on_nmi (defaults to 1)

If what you're seeing is a hardware NMI that fires, followed by the machine panicking, the above tunables are probably doing that. A hardware NMI could indicate an actual hardware issue of sorts, depending on how the motherboard vendor implemented what they did. For example, on a series of mainboards we have at my workplace, the BIOS can be configured to generate either an NMI or SMI# when different kinds of ECC RAM errors happen (either single-bit or multi-bit parity errors). I don't know if that's what you're seeing.

If you're generating the NMI yourself (possibly via the KVM, etc.) then okay, that's different. I'm trying to discern whether or not *you're* generating the NMI, or if the NMI just happens and causes a panic for you and that's what you're worried about.

Now to discuss the "jumbled console output":

The interspersing of kernel text output has plagued FreeBSD for a very long time (since approximately 6.x). There have been statements from kernel coders that you can decrease the likelihood of it happening by increasing the PRINTF_BUFR_SIZE (not a typo) option in your kernel configuration. The issue is exacerbated by use of SMP (either multi-core or multi-CPU). The default (assuming your kernel configs are based off of GENERIC within the past 4-5 years) is 128.
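As a rough sketch of where the settings named in this message would live, assuming a custom kernel config and /boot/loader.conf. The values shown are simply the defaults stated above, not recommendations, and setting a tunable to 0 to disable the behaviour is an assumption based on the defaults being 1:

```
# Kernel configuration file (e.g. a copy of GENERIC)
options 	PRINTF_BUFR_SIZE=128	# Prevent printf output being interspersed

# /boot/loader.conf
machdep.kdb_on_nmi="1"		# enter the kernel debugger on NMI
machdep.panic_on_nmi="1"	# panic on NMI
```

The kernel option takes effect only after a rebuild; the loader tunables are read at the next boot.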
However, the same developers stated that they have great reservations over increasing this number dramatically (meaning, something like 256 will probably work, but larger "may have repercussions which are unknown at this time").

I have stated publicly then, and will do so again now, that this option does not solve the problem. I acknowledge it may make it "less likely to happen" or may decrease the amount of interspersed output, but in my experience neither of those proves true; and more importantly, said option does not solve the problem.

I've talked on-list with John Baldwin about this problem in the past, who had some pretty good ideas of how to solve it. I should point out that Solaris 10 and OpenSolaris (not sure about present-day releases) both have this problem as well, especially during kernel panics or MCEs.

Linux addressed this issue by implementing a ring-based cyclic buffer for its kernel messages (syslog/klogd), and the model is extremely well-documented (quite clever too):
http://www.mjmwired.net/kernel/Documentation/trace/ring-buffer-design.txt

I'm still surprised not a single GSoC project has attempted to solve this for FreeBSD. It really is a serious matter, as it makes getting kernel backtraces and crash data a serious pain in the butt. It can also impact real-time debugging. These are the *worst* times to have to tolerate something like this.

I can point you to old threads about this, and my old FreeBSD wiki page ("Commonly reported issues") touches on this as well. The point I want to get across is that PRINTF_BUFR_SIZE does not solve the problem.

> We've configured the dump device but that also seems to fail
> to capture any details just sitting there after panic with
> Dumping 4465MB:
>
> The machines are single disk ZFS root install and the dump
> device is configured using the gptid, could this be what's
> preventing the dump happening?
I can tell you that others have reported this problem where the kernel panic/dump begins but either locks up after showing the first progress metre/amount, or during the dumping itself.

I give everyone the same advice: please make sure that you have a swap partition that's large enough to fit your entire memory contents (preferably a swap that's 2x or 1.5x the amount of physical RAM), and please make sure it's on a dedicated slice (e.g. ada0s1b). I do not advise any sort of "abstraction" layer between swap and the rest of the system. It might seem like a great/fun/awesome idea followed by "whatever jdc, it works!" but when a crash happens -- which is when you need it most -- and it doesn't work, I won't sympathise. :-)

As for the GPT aspects of things: I'm still not familiar with GPT (as a technology I am, but when it comes to actual usability I am not).

> The kernel is compiled with:-
> options KDB # Kernel debugger related code
> options KDB_TRACE # Print a stack trace for a panic
>
> We have remote KVM but not remote serial on most of the
> machines.

As long as remote KVM provides actual VGA-level redirection, then that's sufficient (though makes copy-pasting output basically impossible). We use serial console a
debugging frequent kernel panics on 8.2-RELEASE
We're currently experiencing a large number of kernel panics on FreeBSD 8.2-RELEASE across a large number of machines here.

The base stack reported is a double fault with no additional details, and CTRL+ALT+ESC fails to break to the debugger, as does an NMI, even though it at least tries printing the following many times, some quite jumbled:-
NMI ... going to debugger

We've configured the dump device but that also seems to fail to capture any details, just sitting there after panic with
Dumping 4465MB:

The machines are single disk ZFS root install and the dump device is configured using the gptid; could this be what's preventing the dump happening?

The kernel is compiled with:-
options KDB # Kernel debugger related code
options KDB_TRACE # Print a stack trace for a panic

We have remote KVM but not remote serial on most of the machines.

Any advice on how to debug this issue?

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
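For reference, a dump device of the kind discussed in this thread is normally set in /etc/rc.conf. The fragment below is only a sketch: the device name ada0s1b is the example given in the reply about dedicated swap slices, not the gptid-based device used on these machines (whose exact path is not given), and /var/crash is the usual default location rather than something stated here.

```
# /etc/rc.conf
dumpdev="/dev/ada0s1b"	# dedicated swap partition used as the dump device (example name)
dumpdir="/var/crash"	# where savecore(8) stores vmcore.N on the next boot
```

With this in place, dumpon(8) is run against the device at boot, and a core.txt.N summary such as the one quoted elsewhere in the thread is generated alongside the saved dump.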