Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-05 Thread Kostik Belousov
On Thu, May 05, 2011 at 12:49:48PM -0700, Garrett Cooper wrote:
> Things look ok with that patch and the one that Jeff provided for
> the LOR, taking into account your style change with the flag list.
> Thanks!

I do not understand your response. Jeff's patch was included in the
cumulative change I sent you, with a slight modification.

Which 'style change with the flag list' are you referring to?




Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-05 Thread Garrett Cooper
On Thu, May 5, 2011 at 10:36 AM, Kostik Belousov  wrote:
> On Thu, May 05, 2011 at 10:23:47AM -0700, Garrett Cooper wrote:
>> On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:
>>
>> > On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> >> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  
>> >> wrote:
>> >>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  
>> >>> wrote:
>> >>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>> >>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>> >>>>>  partition when filesystem full
>> >>>>> From: Garrett Cooper 
>> >>>>> To: Jeff Roberson ,
>> >>>>>         Marshall Kirk McKusick 
>> >>>>> Cc: FreeBSD Current 
>> >>>>>
>> >>>>> Hi Jeff and Dr. McKusick,
>> >>>>>     Ran into this panic when /usr ran out of space doing a make
>> >>>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>> >>>>> after the filesystem ran out of space -- wasn't quite sure what it was
>> >>>>> doing at the time):
>> >>>>>
>> >>>>> ...
>> >>>>>
>> >>>>>     Let me know what other commands you would like for me to run in 
>> >>>>> kgdb.
>> >>>>> Thanks,
>> >>>>> -Garrett
>> >>>>
>> >>>> You did not indicate whether you are running an 8.X system or a 
>> >>>> 9-current
>> >>>> system. It would be helpful to know that.
>> >>>
>> >>> I've actually been running CURRENT for a few years now, but you're right 
>> >>> --
>> >>> I didn't mention that part.
>> >>>
>> >>>> Jeff thinks that there may be a potential race in the locking code for
>> >>>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>> >>>>
>> >>>> Index: ffs_softdep.c
>> >>>> ===
>> >>>> --- ffs_softdep.c       (revision 221385)
>> >>>> +++ ffs_softdep.c       (working copy)
>> >>>> @@ -11380,7 +11380,8 @@
>> >>>>                                continue;
>> >>>>                        }
>> >>>>                        MNT_IUNLOCK(mp);
>> >>>> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
>> >>>> curthread)) {
>> >>>> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
>> >>>> LK_INTERLOCK,
>> >>>> +                           curthread)) {
>> >>>>                                MNT_ILOCK(mp);
>> >>>>                                continue;
>> >>>>                        }
>> >>>>
>> >>>> If you are running an 8.X system, hopefully you will be able to apply 
>> >>>> it.
>> >>>
>> >>>    I've applied it, rebuilt and installed the kernel, and trying to
>> >>> repro the case again. Will let you know how things go!
>> >>
>> >>    Happened again with the change. It's really easy to repro:
>> >>
>> >> 1. Get a filesystem with UFS+SU
>> >> 2. Execute something that does a large number of small writes to a 
>> >> partition.
>> >> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>> >>
>> >>    The kernel will panic with the issue I discussed above.
>> >> Thanks!
>> >
>> > Jeff' change is required to avoid LORs, but it is not sufficient to
>> > prevent recursion. We must skip the vnode supplied as a parameter to
>> > softdep_request_cleanup(). Theoretically, other vnodes might be also
>> > locked by curthread, thus I think the change below is needed. Try this.
>> >
>> > diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
>> > index a6d4441..25fa5d6 100644
>> > --- a/sys/ufs/ffs/ffs_softdep.c
>> > +++ b/sys/ufs/ffs/ffs_softdep.c
>> > @@ -11380,7 +11380,9 @@ retry:
>> >                             continue;
>> >                     }
>> >                     MNT_IUNLOCK(mp);
>> > -                  

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-05 Thread Kostik Belousov
On Thu, May 05, 2011 at 10:23:47AM -0700, Garrett Cooper wrote:
> On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:
> 
> > On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
> >> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  wrote:
> >>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  
> >>> wrote:
> >>>>> Date: Tue, 3 May 2011 22:40:26 -0700
> >>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
> >>>>>  partition when filesystem full
> >>>>> From: Garrett Cooper 
> >>>>> To: Jeff Roberson ,
> >>>>> Marshall Kirk McKusick 
> >>>>> Cc: FreeBSD Current 
> >>>>> 
> >>>>> Hi Jeff and Dr. McKusick,
> >>>>> Ran into this panic when /usr ran out of space doing a make
> >>>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
> >>>>> after the filesystem ran out of space -- wasn't quite sure what it was
> >>>>> doing at the time):
> >>>>> 
> >>>>> ...
> >>>>> 
> >>>>> Let me know what other commands you would like for me to run in 
> >>>>> kgdb.
> >>>>> Thanks,
> >>>>> -Garrett
> >>>> 
> >>>> You did not indicate whether you are running an 8.X system or a 9-current
> >>>> system. It would be helpful to know that.
> >>> 
> >>> I've actually been running CURRENT for a few years now, but you're right 
> >>> --
> >>> I didn't mention that part.
> >>> 
> >>>> Jeff thinks that there may be a potential race in the locking code for
> >>>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
> >>>> 
> >>>> Index: ffs_softdep.c
> >>>> ===
> >>>> --- ffs_softdep.c   (revision 221385)
> >>>> +++ ffs_softdep.c   (working copy)
> >>>> @@ -11380,7 +11380,8 @@
> >>>>                                 continue;
> >>>>                         }
> >>>>                         MNT_IUNLOCK(mp);
> >>>> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> >>>> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
> >>>> +                           curthread)) {
> >>>>                                 MNT_ILOCK(mp);
> >>>>                                 continue;
> >>>>                         }
> >>>> 
> >>>> If you are running an 8.X system, hopefully you will be able to apply it.
> >>> 
> >>>I've applied it, rebuilt and installed the kernel, and trying to
> >>> repro the case again. Will let you know how things go!
> >> 
> >>Happened again with the change. It's really easy to repro:
> >> 
> >> 1. Get a filesystem with UFS+SU
> >> 2. Execute something that does a large number of small writes to a 
> >> partition.
> >> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
> >> 
> >>The kernel will panic with the issue I discussed above.
> >> Thanks!
> > 
> > Jeff' change is required to avoid LORs, but it is not sufficient to
> > prevent recursion. We must skip the vnode supplied as a parameter to
> > softdep_request_cleanup(). Theoretically, other vnodes might be also
> > locked by curthread, thus I think the change below is needed. Try this.
> > 
> > diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> > index a6d4441..25fa5d6 100644
> > --- a/sys/ufs/ffs/ffs_softdep.c
> > +++ b/sys/ufs/ffs/ffs_softdep.c
> > @@ -11380,7 +11380,9 @@ retry:
> >                                 continue;
> >                         }
> >                         MNT_IUNLOCK(mp);
> > -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> > +                       if (VOP_ISLOCKED(lvp) ||
> > +                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> > +                           curthread)) {
> >                                 MNT_ILOCK(mp);
> >                                 continue;
> >                         }
> 
>   Ran into the same panic after I applied the 

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-05 Thread Garrett Cooper
On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:

> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  wrote:
>>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  
>>> wrote:
>>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>>>>  partition when filesystem full
>>>>> From: Garrett Cooper 
>>>>> To: Jeff Roberson ,
>>>>> Marshall Kirk McKusick 
>>>>> Cc: FreeBSD Current 
>>>>> 
>>>>> Hi Jeff and Dr. McKusick,
>>>>> Ran into this panic when /usr ran out of space doing a make
>>>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>>>>> after the filesystem ran out of space -- wasn't quite sure what it was
>>>>> doing at the time):
>>>>> 
>>>>> ...
>>>>> 
>>>>> Let me know what other commands you would like for me to run in kgdb.
>>>>> Thanks,
>>>>> -Garrett
>>>> 
>>>> You did not indicate whether you are running an 8.X system or a 9-current
>>>> system. It would be helpful to know that.
>>> 
>>> I've actually been running CURRENT for a few years now, but you're right --
>>> I didn't mention that part.
>>> 
>>>> Jeff thinks that there may be a potential race in the locking code for
>>>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>>> 
>>>> Index: ffs_softdep.c
>>>> ===
>>>> --- ffs_softdep.c   (revision 221385)
>>>> +++ ffs_softdep.c   (working copy)
>>>> @@ -11380,7 +11380,8 @@
>>>>                                 continue;
>>>>                         }
>>>>                         MNT_IUNLOCK(mp);
>>>> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>>> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>>> +                           curthread)) {
>>>>                                 MNT_ILOCK(mp);
>>>>                                 continue;
>>>>                         }
>>>> 
>>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>> 
>>>I've applied it, rebuilt and installed the kernel, and trying to
>>> repro the case again. Will let you know how things go!
>> 
>>Happened again with the change. It's really easy to repro:
>> 
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>> 
>>The kernel will panic with the issue I discussed above.
>> Thanks!
> 
> Jeff' change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup(). Theoretically, other vnodes might be also
> locked by curthread, thus I think the change below is needed. Try this.
> 
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>                                 continue;
>                         }
>                         MNT_IUNLOCK(mp);
> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +                       if (VOP_ISLOCKED(lvp) ||
> +                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +                           curthread)) {
>                                 MNT_ILOCK(mp);
>                                 continue;
>                         }

Ran into the same panic after I applied the patch above with the repro 
steps I described before. One thing that I noticed is that the issue isn't as 
easy to reproduce unless you add the dd in parallel with the make operation.
Thanks,
-Garrett
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Garrett Cooper
2011/5/4 Kostik Belousov :
> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  wrote:
>> > On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  
>> > wrote:
>> >>> Date: Tue, 3 May 2011 22:40:26 -0700
>> >>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>> >>>  partition when filesystem full
>> >>> From: Garrett Cooper 
>> >>> To: Jeff Roberson ,
>> >>>         Marshall Kirk McKusick 
>> >>> Cc: FreeBSD Current 
>> >>>
>> >>> Hi Jeff and Dr. McKusick,
>> >>>     Ran into this panic when /usr ran out of space doing a make
>> >>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>> >>> after the filesystem ran out of space -- wasn't quite sure what it was
>> >>> doing at the time):
>> >>>
>> >>> ...
>> >>>
>> >>>     Let me know what other commands you would like for me to run in kgdb.
>> >>> Thanks,
>> >>> -Garrett
>> >>
>> >> You did not indicate whether you are running an 8.X system or a 9-current
>> >> system. It would be helpful to know that.
>> >
>> > I've actually been running CURRENT for a few years now, but you're right --
>> > I didn't mention that part.
>> >
>> >> Jeff thinks that there may be a potential race in the locking code for
>> >> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>> >>
>> >> Index: ffs_softdep.c
>> >> ===
>> >> --- ffs_softdep.c       (revision 221385)
>> >> +++ ffs_softdep.c       (working copy)
>> >> @@ -11380,7 +11380,8 @@
>> >>                                continue;
>> >>                        }
>> >>                        MNT_IUNLOCK(mp);
>> >> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
>> >> curthread)) {
>> >> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
>> >> LK_INTERLOCK,
>> >> +                           curthread)) {
>> >>                                MNT_ILOCK(mp);
>> >>                                continue;
>> >>                        }
>> >>
>> >> If you are running an 8.X system, hopefully you will be able to apply it.
>> >
>> >    I've applied it, rebuilt and installed the kernel, and trying to
>> > repro the case again. Will let you know how things go!
>>
>>     Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>>     The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff' change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup(). Theoretically, other vnodes might be also
> locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>                                continue;
>                        }
>                        MNT_IUNLOCK(mp);
> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
> curthread)) {
> +                       if (VOP_ISLOCKED(lvp) ||
> +                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +                           curthread)) {
>                                MNT_ILOCK(mp);
>                                continue;
>                        }

Ok. I'll let the make universe I have going run to completion, and
once I get back home later on, I'll take a look at repro'ing this
again with the above patch applied.
Thanks!
-Garrett


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Kirk McKusick
> Date: Tue, 3 May 2011 22:40:26 -0700
> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>  partition when filesystem full
> From: Garrett Cooper 
> To: Jeff Roberson ,
> Marshall Kirk McKusick 
> Cc: FreeBSD Current 
> 
> Hi Jeff and Dr. McKusick,
> Ran into this panic when /usr ran out of space doing a make
> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
> after the filesystem ran out of space -- wasn't quite sure what it was
> doing at the time):
> 
> ...
> 
> Let me know what other commands you would like for me to run in kgdb.
> Thanks,
> -Garrett

You did not indicate whether you are running an 8.X system or a 9-current
system. It would be helpful to know that.

Jeff thinks that there may be a potential race in the locking code for
softdep_request_cleanup. If so, this patch for 9-current should fix it:

Index: ffs_softdep.c
===
--- ffs_softdep.c   (revision 221385)
+++ ffs_softdep.c   (working copy)
@@ -11380,7 +11380,8 @@
                                continue;
                        }
                        MNT_IUNLOCK(mp);
-                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
+                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }

If you are running an 8.X system, hopefully you will be able to apply it.

Kirk McKusick


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Kostik Belousov
On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  wrote:
> > On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  
> > wrote:
> >>> Date: Tue, 3 May 2011 22:40:26 -0700
> >>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
> >>>  partition when filesystem full
> >>> From: Garrett Cooper 
> >>> To: Jeff Roberson ,
> >>>         Marshall Kirk McKusick 
> >>> Cc: FreeBSD Current 
> >>>
> >>> Hi Jeff and Dr. McKusick,
> >>>     Ran into this panic when /usr ran out of space doing a make
> >>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
> >>> after the filesystem ran out of space -- wasn't quite sure what it was
> >>> doing at the time):
> >>>
> >>> ...
> >>>
> >>>     Let me know what other commands you would like for me to run in kgdb.
> >>> Thanks,
> >>> -Garrett
> >>
> >> You did not indicate whether you are running an 8.X system or a 9-current
> >> system. It would be helpful to know that.
> >
> > I've actually been running CURRENT for a few years now, but you're right --
> > I didn't mention that part.
> >
> >> Jeff thinks that there may be a potential race in the locking code for
> >> softdep_request_cleanup. If so, this patch for 9-current should fix it:
> >>
> >> Index: ffs_softdep.c
> >> ===
> >> --- ffs_softdep.c       (revision 221385)
> >> +++ ffs_softdep.c       (working copy)
> >> @@ -11380,7 +11380,8 @@
> >>                                continue;
> >>                        }
> >>                        MNT_IUNLOCK(mp);
> >> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
> >> curthread)) {
> >> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
> >> LK_INTERLOCK,
> >> +                           curthread)) {
> >>                                MNT_ILOCK(mp);
> >>                                continue;
> >>                        }
> >>
> >> If you are running an 8.X system, hopefully you will be able to apply it.
> >
> >    I've applied it, rebuilt and installed the kernel, and trying to
> > repro the case again. Will let you know how things go!
> 
> Happened again with the change. It's really easy to repro:
> 
> 1. Get a filesystem with UFS+SU
> 2. Execute something that does a large number of small writes to a partition.
> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
> 
> The kernel will panic with the issue I discussed above.
> Thanks!

Jeff's change is required to avoid LORs, but it is not sufficient to
prevent recursion. We must skip the vnode supplied as a parameter to
softdep_request_cleanup(). Theoretically, other vnodes might also be
locked by curthread, thus I think the change below is needed. Try this.

diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index a6d4441..25fa5d6 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -11380,7 +11380,9 @@ retry:
                                continue;
                        }
                        MNT_IUNLOCK(mp);
-                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+                       if (VOP_ISLOCKED(lvp) ||
+                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
+                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }
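
For readers unfamiliar with the pattern, here is a rough userland analogy
of "skip what is already locked instead of sleeping on it". It is only a
sketch under stated assumptions: struct node, cleanup_pass() and demo()
are hypothetical names, a default-type pthread mutex stands in for the
non-recursive vnode lock, and pthread_mutex_trylock() plays the role of
LK_NOWAIT. It is not the kernel code and says nothing about whether the
patch above fixes the panic.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: a non-recursive lock plus a counter. */
struct node {
	pthread_mutex_t	lock;
	int		cleaned;
};

/*
 * Visit every node and "clean" those whose lock can be taken without
 * sleeping. A node that is already locked, by the caller or by anyone
 * else, is skipped rather than waited for, so the caller can never
 * recurse on (or deadlock against) a lock it already holds.
 * Returns the number of nodes cleaned.
 */
static size_t
cleanup_pass(struct node *nodes, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		/* Analogue of LK_NOWAIT: return at once if the lock is busy. */
		if (pthread_mutex_trylock(&nodes[i].lock) != 0)
			continue;
		nodes[i].cleaned++;
		pthread_mutex_unlock(&nodes[i].lock);
		done++;
	}
	return (done);
}

/*
 * Small driver: the caller holds node 1 while running the pass, much as
 * the thread entering the cleanup path already holds one vnode lock.
 * Assumes the default mutex type is not recursive.
 */
static size_t
demo(void)
{
	struct node nodes[3];
	size_t done;

	for (size_t i = 0; i < 3; i++) {
		pthread_mutex_init(&nodes[i].lock, NULL);
		nodes[i].cleaned = 0;
	}

	pthread_mutex_lock(&nodes[1].lock);	/* already held by the caller */
	done = cleanup_pass(nodes, 3);		/* node 1 is skipped */
	pthread_mutex_unlock(&nodes[1].lock);

	for (size_t i = 0; i < 3; i++)
		pthread_mutex_destroy(&nodes[i].lock);
	return (done);
}
```

Calling demo() runs one pass over three nodes while the caller holds the
second one; only the two unheld nodes are cleaned. The trade-off is the
same as in the patch: work on a busy object is deferred, not guaranteed.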




Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Sergey Kandaurov
On 4 May 2011 10:42, Garrett Cooper  wrote:
> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  wrote:
>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>>  partition when filesystem full
>>> From: Garrett Cooper 
>>> To: Jeff Roberson ,
>>>         Marshall Kirk McKusick 
>>> Cc: FreeBSD Current 
>>>
>>> Hi Jeff and Dr. McKusick,
>>>     Ran into this panic when /usr ran out of space doing a make
>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>>> after the filesystem ran out of space -- wasn't quite sure what it was
>>> doing at the time):
>>>
>>> ...
>>>
>>>     Let me know what other commands you would like for me to run in kgdb.
>>> Thanks,
>>> -Garrett
>>
>> You did not indicate whether you are running an 8.X system or a 9-current
>> system. It would be helpful to know that.
>
> I've actually been running CURRENT for a few years now, but you're right --
> I didn't mention that part.
>
>> Jeff thinks that there may be a potential race in the locking code for
>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>
>> Index: ffs_softdep.c
>> ===
>> --- ffs_softdep.c       (revision 221385)
>> +++ ffs_softdep.c       (working copy)
>> @@ -11380,7 +11380,8 @@
>>                                continue;
>>                        }
>>                        MNT_IUNLOCK(mp);
>> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
>> curthread)) {
>> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
>> LK_INTERLOCK,
>> +                           curthread)) {
>>                                MNT_ILOCK(mp);
>>                                continue;
>>                        }
>>

FYI,
I was playing with head (without the above patch) to reproduce the panic and
got this LOR once the filesystem eventually filled up.
I'm not sure the patch would fix the panic, but I think it should at
least fix the LOR.

kernel: pid 66153 (dd), uid 0 inumber 4 on /mnt: filesystem full
lock order reversal:
 1st 0xfe001d7d3310 ufs (ufs) @ /usr/src/sys/kern/vfs_vnops.c:614
 2nd 0xff807ba8a800 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 3rd 0xfe001ade7588 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2126
KDB: stack backtrace:
db_trace_self_wrapper() at 0x802d9eba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0x80475d17 = kdb_backtrace+0x37
_witness_debugger() at 0x8048b4fe = _witness_debugger+0x2e
witness_checkorder() at 0x8048c7a7 = witness_checkorder+0x807
__lockmgr_args() at 0x80427553 = __lockmgr_args+0xd63
ffs_lock() at 0x806578fc = ffs_lock+0x9c
VOP_LOCK1_APV() at 0x806f285f = VOP_LOCK1_APV+0xbf
_vn_lock() at 0x804e87c7 = _vn_lock+0x57
vget() at 0x804dbb5b = vget+0x7b
softdep_request_cleanup() at 0x80649f31 = softdep_request_cleanup+0x311
ffs_alloc() at 0x80630b64 = ffs_alloc+0x134
ffs_balloc_ufs2() at 0x8063426c = ffs_balloc_ufs2+0x11ac
ffs_write() at 0x8065889f = ffs_write+0x22f
VOP_WRITE_APV() at 0x806f33dd = VOP_WRITE_APV+0x14d
vn_write() at 0x804e9a42 = vn_write+0x2a2
dofilewrite() at 0x8048df25 = dofilewrite+0x85
kern_writev() at 0x8048f740 = kern_writev+0x60
write() at 0x8048f845 = write+0x55
syscallenter() at 0x80483cbb = syscallenter+0x1cb
syscall() at 0x806abaf0 = syscall+0x60
Xfast_syscall() at 0x8069670d = Xfast_syscall+0xdd
--- syscall (4, FreeBSD ELF64, write), rip = 0x8009438fc, rsp =
0x7fffda68, rbp = 0xa0 ---

-- 
wbr,
pluknet


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-03 Thread Garrett Cooper
On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper  wrote:
> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  wrote:
>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>>  partition when filesystem full
>>> From: Garrett Cooper 
>>> To: Jeff Roberson ,
>>>         Marshall Kirk McKusick 
>>> Cc: FreeBSD Current 
>>>
>>> Hi Jeff and Dr. McKusick,
>>>     Ran into this panic when /usr ran out of space doing a make
>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>>> after the filesystem ran out of space -- wasn't quite sure what it was
>>> doing at the time):
>>>
>>> ...
>>>
>>>     Let me know what other commands you would like for me to run in kgdb.
>>> Thanks,
>>> -Garrett
>>
>> You did not indicate whether you are running an 8.X system or a 9-current
>> system. It would be helpful to know that.
>
> I've actually been running CURRENT for a few years now, but you're right --
> I didn't mention that part.
>
>> Jeff thinks that there may be a potential race in the locking code for
>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>
>> Index: ffs_softdep.c
>> ===
>> --- ffs_softdep.c       (revision 221385)
>> +++ ffs_softdep.c       (working copy)
>> @@ -11380,7 +11380,8 @@
>>                                continue;
>>                        }
>>                        MNT_IUNLOCK(mp);
>> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
>> curthread)) {
>> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
>> LK_INTERLOCK,
>> +                           curthread)) {
>>                                MNT_ILOCK(mp);
>>                                continue;
>>                        }
>>
>> If you are running an 8.X system, hopefully you will be able to apply it.
>
>    I've applied it, rebuilt and installed the kernel, and trying to
> repro the case again. Will let you know how things go!

Happened again with the change. It's really easy to repro:

1. Get a filesystem with UFS+SU
2. Execute something that does a large number of small writes to a partition.
3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition

The kernel will panic with the issue I discussed above.
Thanks!
-Garrett


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-03 Thread Garrett Cooper
On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick  wrote:
>> Date: Tue, 3 May 2011 22:40:26 -0700
>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>  partition when filesystem full
>> From: Garrett Cooper 
>> To: Jeff Roberson ,
>>         Marshall Kirk McKusick 
>> Cc: FreeBSD Current 
>>
>> Hi Jeff and Dr. McKusick,
>>     Ran into this panic when /usr ran out of space doing a make
>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>> after the filesystem ran out of space -- wasn't quite sure what it was
>> doing at the time):
>>
>> ...
>>
>>     Let me know what other commands you would like for me to run in kgdb.
>> Thanks,
>> -Garrett
>
> You did not indicate whether you are running an 8.X system or a 9-current
> system. It would be helpful to know that.

I've actually been running CURRENT for a few years now, but you're right --
I didn't mention that part.

> Jeff thinks that there may be a potential race in the locking code for
> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>
> Index: ffs_softdep.c
> ===
> --- ffs_softdep.c       (revision 221385)
> +++ ffs_softdep.c       (working copy)
> @@ -11380,7 +11380,8 @@
>                                continue;
>                        }
>                        MNT_IUNLOCK(mp);
> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
> curthread)) {
> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
> +                           curthread)) {
>                                MNT_ILOCK(mp);
>                                continue;
>                        }
>
> If you are running an 8.X system, hopefully you will be able to apply it.

I've applied it, rebuilt and installed the kernel, and am trying to
repro the case again. Will let you know how things go!
Thanks!
-Garrett


Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-03 Thread Garrett Cooper
Hi Jeff and Dr. McKusick,
Ran into this panic when /usr ran out of space doing a make
universe on amd64/r221219 (it took ~15 minutes for the panic to occur
after the filesystem ran out of space -- wasn't quite sure what it was
doing at the time):

pid 24486 (ld), uid 0 inumber 9993 on /usr: filesystem full
pid 24511 (config), uid 0 inumber 361082 on /usr: filesystem full
pid 24494 (make), uid 0 inumber 1886295 on /usr: filesystem full
panic: __lockmgr_args: recursing on non recursive lockmgr bufwait @
/usr/src/sys/ufs/ffs/ffs_softdep.c:11025
(kgdb) #0  doadump () at pcpu.h:224
#1  0x802af22c in db_fncall (dummy1=Variable "dummy1" is not available.
)
at /usr/src/sys/ddb/db_command.c:548
#2  0x802af561 in db_command (last_cmdp=0x808f93c0,
cmd_table=Variable "cmd_table" is not available.

) at /usr/src/sys/ddb/db_command.c:445
#3  0x802af7a9 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:498
#4  0x802b1737 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#5  0x803f7d48 in kdb_trap (type=3, code=0, tf=0xff834e4c8ef0)
at /usr/src/sys/kern/subr_kdb.c:533
#6  0x80599da5 in trap (frame=0xff834e4c8ef0)
at /usr/src/sys/amd64/amd64/trap.c:590
#7  0x80584ef3 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:228
#8  0x803f7baf in kdb_enter (why=0x806178cf "panic",
msg=0xa ) at cpufunc.h:63
#9  0x803c4b6f in panic (fmt=Variable "fmt" is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:584
#10 0x803af3ac in __lockmgr_args (lk=0x100, flags=0,
ilk=0xfe00b95766c0, wmesg=Variable "wmesg" is not available.
) at /usr/src/sys/kern/kern_lock.c:720
#11 0x8054240b in softdep_sync_metadata (vp=0xfe017fe5d000)
at lockmgr.h:97
#12 0x80548e90 in ffs_syncvnode (vp=0xfe017fe5d000,
waitfor=Variable "waitfor" is not available.
)
at /usr/src/sys/ufs/ffs/ffs_vnops.c:331
#13 0x8053be23 in softdep_request_cleanup (fs=0xfe00086ef000,
vp=0xfe00b95765a0, cred=Variable "cred" is not available.
) at /usr/src/sys/ufs/ffs/ffs_softdep.c:11392
#14 0x80523895 in ffs_realloccg (ip=0xfe00b9285bd0, lbprev=0,
bprev=10092847, bpref=10723304, osize=2048, nsize=4096, flags=65536,
cred=0xfe026e905a00, bpp=0xff834e4c95f0)
at /usr/src/sys/ufs/ffs/ffs_alloc.c:423
#15 0x805266de in ffs_balloc_ufs2 (vp=0xfe00b95765a0,
startoffset=Variable "startoffset" is not available.

) at /usr/src/sys/ufs/ffs/ffs_balloc.c:699
#16 0x8054fbfb in ufs_direnter (dvp=0xfe00b95765a0,
tvp=0xfe00701c4000, dirp=0xff834e4c97b0, cnp=Variable
"cnp" is not available.
)
at /usr/src/sys/ufs/ufs/ufs_lookup.c:910
#17 0x80557af8 in ufs_mkdir (ap=0xff834e4c9a90)
at /usr/src/sys/ufs/ufs/ufs_vnops.c:1961
#18 0x805d666b in VOP_MKDIR_APV (vop=0x808c2a40,
a=0xff834e4c9a90) at vnode_if.c:1534
#19 0x80457eb8 in kern_mkdirat (td=0xfe0149df4000, fd=-100,
path=0x6096e0 , segflg=Variable
"segflg" is not available.
) at vnode_if.h:665
#20 0x80404cd1 in syscallenter (td=0xfe0149df4000,
sa=0xff834e4c9bb0) at /usr/src/sys/kern/subr_trap.c:344
#21 0x8059996e in syscall (frame=0xff834e4c9c50)
at /usr/src/sys/amd64/amd64/trap.c:910
#22 0x805851bd in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:384
#23 0x000800b3798c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
$ sudo tunefs -p /usr
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  2048
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

Let me know what other commands you would like for me to run in kgdb.
Thanks,
-Garrett