FreeBSD-HEAD gets stuck on vnode operations

2013-05-13 Thread Roger Pau Monné
Hello,

I've set up a FreeBSD-HEAD VM on Xen, and compiled the XENHVM kernel, 
last commit in the repository is:

Date: Tue, 7 May 2013 12:39:14 +
Subject: [PATCH] By request, add an arrow from NetBSD-0.8 to FreeBSD-1.0.

While here, add a few more NetBSD versions to the tree itself.

Submitted by:   Alan Barrett 
Submitted by:   Thomas Klausner 

And I've started stressing the VM using the following tight loop:

while [ 1 ]; do make -j12 buildkernel; done

From time to time, I see the VM getting stuck, and when breaking into 
KDB this is the output of ps:

  pid  ppid  pgrp   uid   state   wmesg wchancmd
32343   670   67025  L  *Name Cac 0xfe0017183780 sendmail
32342 32256  5273 0  L+ *Name Cac 0xfe0017183780 sh
32341 32335  5273 0  L+ *Name Cac 0xfe0017183780 cc
32340 32284  5273 0  D+  ufs  0xfe000dc76068 ctfconvert
32339 32293  5273 0  L+ *Name Cac 0xfe0017183780 ctfconvert
32337 32332  5273 0  L+ *vnode_fr 0xfe00872c23c0 cc
32335 32334  5273 0  S+  wait 0xfe010537d000 cc
32334 30655  5273 0  S+  wait 0xfe010ac974b8 sh
32333 32329  5273 0  L+ *vnode in 0xfe0017183180 cc
32332 32331  5273 0  S+  wait 0xfe00416a14b8 cc
32331 30655  5273 0  S+  wait 0xfe00b9846970 sh
32329 32328  5273 0  S+  wait 0xfe0049ace000 cc
32328 30655  5273 0  S+  wait 0xfe004167 sh
32324 32320  5273 0  L+ *Name Cac 0xfe0017183780 cc
32320 32318  5273 0  S+  wait 0xfe00416184b8 cc
32318 30655  5273 0  S+  wait 0xfe00879fd970 sh
32314 32313  5273 0  L+ *Name Cac 0xfe0017183780 cc
32313 32311  5273 0  S+  wait 0xfe00b9846000 cc
32312 32309  5273 0  L+ *Name Cac 0xfe0017183780 cc
32311 30655  5273 0  S+  wait 0xfe00415844b8 sh
32310 32305  5273 0  L+ *Name Cac 0xfe0017183780 cc
32309 32307  5273 0  S+  wait 0xfe00499624b8 cc
32307 30655  5273 0  S+  wait 0xfe010537c970 sh
32305 32304  5273 0  S+  wait 0xfe0041669000 cc
32304 30655  5273 0  S+  wait 0xfe00173d9970 sh
32303 32298  5273 0  L+ *Name Cac 0xfe0017183780 cc
32298 32295  5273 0  S+  wait 0xfe0049924000 cc
32295 30655  5273 0  S+  wait 0xfe0041631970 sh
32293 30655  5273 0  S+  wait 0xfe000d15b000 sh
32284 30655  5273 0  S+  wait 0xfe00416684b8 sh
32256 31391  5273 0  S+  select   0xfe000d965840 make
32022 30655  5273 0  L+ *Name Cac 0xfe0017183780 sh
31391 31386  5273 0  S+  wait 0xfe0041680970 sh
31386 30664  5273 0  S+  select   0xfe0017942cc0 make
30664 30663  5273 0  S+  wait 0xfe004169f4b8 sh
30663 30662  5273 0  S+  select   0xfe000d1c0d40 make
30662 30655  5273 0  S+  wait 0xfe00b9ddb4b8 sh
30655  5287  5273 0  S+  select   0xfe000d1c0040 make
 5287  5280  5273 0  S+  wait 0xfe004148f970 sh
 5280  5278  5273 0  S+  select   0xfe000d964540 make
 5278  5273  5273 0  S+  wait 0xfe000de3a970 sh
 5273   736  5273 0  S+  select   0xfe000bc81740 make
91658   735 91658 0  Ss+ ttyin0xfe000d6caca8 bash
  736   735   736 0  Ss+ wait 0xfe000d5084b8 bash
  735   734   735 0  Ss  select   0xfe000bc817c0 screen
  734   730   734 0  S+  pause0xfe000d6330a8 screen
  730   729   730 0  S+  wait 0xfe000d58b000 bash
  729   728   729  1001  S+  wait 0xfe000d7ab970 su
  728   727   728  1001  Ss+ wait 0xfe000de3a4b8 sh
  727   724   724  1001  S   select   0xfe000d98f340 sshd
  724   664   724 0  Ss  select   0xfe000d4180c0 sshd
  722   721   722 0  L+ *Name Cac 0xfe0017183780 bash
  721 1   721 0  Ss+ wait 0xfe000d7ab000 login
  720 1   720 0  Ss+ ttyin0xfe000d0470a8 getty
  719 1   719 0  Ss+ ttyin0xfe000d0474a8 getty
  718 1   718 0  Ss+ ttyin0xfe000bd914a8 getty
  717 1   717 0  Ss+ ttyin0xfe000bd918a8 getty
  716 1   716 0  Ss+ ttyin0xfe000bd91ca8 getty
  715 1   715 0  Ss+ ttyin0xfe000d0440a8 getty
  714 1   714 0  Ss+ ttyin0xfe000d0444a8 getty
  713 1   713 0  Ss+ ttyin0xfe000d0448a8 getty
  674 1   674 0  Ls *Name Cac 0xfe0017183780 cron
  670 1   67025  Ss  pause0xfe000d5090a8 sendmail
  667 1   667 0  Ls *Name Cac 0xfe0017183780 sendmail
  664 1   664 0  Ss  select   0xfe000d964840 sshd
  566 1   566 0  Ss  select   0xfe000d98f4c0 syslogd
  472 1   472 0  Ss  select 

Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-13 Thread Roger Pau Monné
On 13/05/13 13:18, Roger Pau Monné wrote:
> The VM can be stuck in this state for quite some time, it generally

I would like to explain this a little bit more: the syncer process
doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
looping forever in what seems to be an endless loop around
mnt_vnode_next_active/ffs_sync. Also, while in this state there is no
noticeable disk activity, so I'm unsure of what is happening.

> varies between a couple of minutes (5-10min) to an hour or two, after
> this the VM recovers itself and resumes normal operation. I still have
> to test this on a bare metal FreeBSD install, but I would like to ask
> if someone has seen a similar behaviour, or if someone is suspicious of
> a change that could cause this.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-13 Thread Konstantin Belousov
On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> On 13/05/13 13:18, Roger Pau Monné wrote:
> > The VM can be stuck in this state for quite some time, it generally
> 
> I would like to explain this a little bit more, the syncer process
> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> looping forever in what seems to be an endless loop around
> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> noticeable disk activity, so I'm unsure of what is happening.
How many CPUs does your VM have ?

The loop you are describing means that another thread owns the vnode
interlock. Can you track what that thread does ? E.g. look at
vp->v_interlock.mtx_lock, which is basically a pointer to the struct
thread owning the mutex; clear the low bits as needed. Then you can
inspect the thread and get a backtrace.

Is the loop you described stuck on the same vnode during the whole
lock-step time, or is progress made, possibly slowly ?

I suppose that your HEAD is recent.

> 
> > varies between a couple of minutes (5-10min) to an hour or two, after
> > this the VM recovers itself and resumes normal operation. I still have
> > to test this on a bare metal FreeBSD install, but I would like to ask
> > if someone has seen a similar behaviour, or if someone is suspicious of
> > a change that could cause this.




Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-14 Thread Roger Pau Monné
On 13/05/13 17:00, Konstantin Belousov wrote:
> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>> On 13/05/13 13:18, Roger Pau Monné wrote:

Thanks for taking a look,

>> I would like to explain this a little bit more, the syncer process
>> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
>> looping forever in what seems to be an endless loop around
>> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
>> noticeable disk activity, so I'm unsure of what is happening.
> How many CPUs does your VM have ?

7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.

> 
> The loop you describing means that other thread owns the vnode
> interlock. Can you track what this thread does ? E.g. look at the
> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> thread owning the mutex, clear low bits as needed. Then you can
> inspect the thread and get a backtrace.

There are no other threads running, only syncer is running on CPU 1 (see
ps in previous email). All other CPUs are idle, and as seen from the ps
quite a lot of threads are blocked in vnode related operations, either
"*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
of alllocks in the previous email.

> 
> Does the loop you described stuck on the same vnode during the whole
> lock-step time, or is the progress made, possibly slowly ?

I'm not sure how to measure "progress", but indeed the syncer process is
not locked, it is iterating over mnt_vnode_next_active.

> 
> I suppose that your HEAD is recent.

Last commit in my local repository is:

Date: Tue, 7 May 2013 12:39:14 +
Subject: [PATCH] By request, add an arrow from NetBSD-0.8 to FreeBSD-1.0.

While here, add a few more NetBSD versions to the tree itself.

Submitted by:   Alan Barrett 
Submitted by:   Thomas Klausner 


Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-14 Thread Konstantin Belousov
On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> On 13/05/13 17:00, Konstantin Belousov wrote:
> > On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> >> On 13/05/13 13:18, Roger Pau Monné wrote:
> 
> Thanks for taking a look,
> 
> >> I would like to explain this a little bit more, the syncer process
> >> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> >> looping forever in what seems to be an endless loop around
> >> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> >> noticeable disk activity, so I'm unsure of what is happening.
> > How many CPUs does your VM have ?
> 
> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
> 
> > 
> > The loop you describing means that other thread owns the vnode
> > interlock. Can you track what this thread does ? E.g. look at the
> > vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> > thread owning the mutex, clear low bits as needed. Then you can
> > inspect the thread and get a backtrace.
> 
> There are no other threads running, only syncer is running on CPU 1 (see
> ps in previous email). All other CPUs are idle, and as seen from the ps
> quite a lot of threads are blocked in vnode related operations, either
> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> of alllocks in the previous email.
This is not useful.  You need to look at the mutex which fails the
trylock operation in the mnt_vnode_next_active(), see who owns it,
and then 'unwind' the locking dependencies from there.

I described the procedure above.

> 
> > 
> > Does the loop you described stuck on the same vnode during the whole
> > lock-step time, or is the progress made, possibly slowly ?
> 
> I'm not sure how to measure "progress", but indeed the syncer process is
> not locked, it is iterating over mnt_vnode_next_active.

Progress means that the iteration moves from vnode to vnode, instead of
looping over the same vnode continuously.  I did read what you said about
the system becoming un-stuck after some time, but I am asking whether the
iterator changes during the stuck time.

> 
> > 
> > I suppose that your HEAD is recent.
> 
> Last commit in my local repository is:
> 
> Date: Tue, 7 May 2013 12:39:14 +
> Subject: [PATCH] By request, add an arrow from NetBSD-0.8 to FreeBSD-1.0.
> 
> While here, add a few more NetBSD versions to the tree itself.
> 
> Submitted by:   Alan Barrett 
> Submitted by:   Thomas Klausner 




Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-14 Thread Roger Pau Monné
On 14/05/13 18:31, Konstantin Belousov wrote:
> On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
>> On 13/05/13 17:00, Konstantin Belousov wrote:
>>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
 On 13/05/13 13:18, Roger Pau Monné wrote:
>>
>> Thanks for taking a look,
>>
 I would like to explain this a little bit more, the syncer process
 doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
 looping forever in what seems to be an endless loop around
 mnt_vnode_next_active/ffs_sync. Also while in this state there is no
 noticeable disk activity, so I'm unsure of what is happening.
>>> How many CPUs does your VM have ?
>>
>> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>>
>>>
>>> The loop you describing means that other thread owns the vnode
>>> interlock. Can you track what this thread does ? E.g. look at the
>>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
>>> thread owning the mutex, clear low bits as needed. Then you can
>>> inspect the thread and get a backtrace.
>>
>> There are no other threads running, only syncer is running on CPU 1 (see
>> ps in previous email). All other CPUs are idle, and as seen from the ps
>> quite a lot of threads are blocked in vnode related operations, either
>> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
>> of alllocks in the previous email.
> This is not useful.  You need to look at the mutex which fails the
> trylock operation in the mnt_vnode_next_active(), see who owns it,
> and then 'unwind' the locking dependencies from there.

Sorry, now I get it, let's see if I can find the locked vnodes and the
thread that owns them...


Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-20 Thread John Baldwin
On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
> On 14/05/13 18:31, Konstantin Belousov wrote:
> > On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> >> On 13/05/13 17:00, Konstantin Belousov wrote:
> >>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>  On 13/05/13 13:18, Roger Pau Monné wrote:
> >>
> >> Thanks for taking a look,
> >>
>  I would like to explain this a little bit more, the syncer process
>  doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
>  looping forever in what seems to be an endless loop around
>  mnt_vnode_next_active/ffs_sync. Also while in this state there is no
>  noticeable disk activity, so I'm unsure of what is happening.
> >>> How many CPUs does your VM have ?
> >>
> >> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
> >>
> >>>
> >>> The loop you describing means that other thread owns the vnode
> >>> interlock. Can you track what this thread does ? E.g. look at the
> >>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> >>> thread owning the mutex, clear low bits as needed. Then you can
> >>> inspect the thread and get a backtrace.
> >>
> >> There are no other threads running, only syncer is running on CPU 1 (see
> >> ps in previous email). All other CPUs are idle, and as seen from the ps
> >> quite a lot of threads are blocked in vnode related operations, either
> >> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> >> of alllocks in the previous email.
> > This is not useful.  You need to look at the mutex which fails the
> > trylock operation in the mnt_vnode_next_active(), see who owns it,
> > and then 'unwind' the locking dependencies from there.
> 
> Sorry, now I get it, let's see if I can find the locked vnodes and the
> thread that owns them...

You can use 'show lock <address of vp->v_interlock>' to find the owning
thread and then use 'show sleepchain <thread>'.  If you are using kgdb on the 
live system (probably easier) then you can grab my scripts at 
www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You can 
then find the offending thread and do 'mtx_owner &vp->v_interlock' and then
'sleepchain <thread>'

-- 
John Baldwin


Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-25 Thread Roger Pau Monné
On 20/05/13 20:34, John Baldwin wrote:
> On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
>> On 14/05/13 18:31, Konstantin Belousov wrote:
>>> On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
 On 13/05/13 17:00, Konstantin Belousov wrote:
> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>> On 13/05/13 13:18, Roger Pau Monné wrote:

 Thanks for taking a look,

>> I would like to explain this a little bit more, the syncer process
>> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
>> looping forever in what seems to be an endless loop around
>> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
>> noticeable disk activity, so I'm unsure of what is happening.
> How many CPUs does your VM have ?

 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.

>
> The loop you describing means that other thread owns the vnode
> interlock. Can you track what this thread does ? E.g. look at the
> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> thread owning the mutex, clear low bits as needed. Then you can
> inspect the thread and get a backtrace.

 There are no other threads running, only syncer is running on CPU 1 (see
 ps in previous email). All other CPUs are idle, and as seen from the ps
 quite a lot of threads are blocked in vnode related operations, either
 "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
 of alllocks in the previous email.
>>> This is not useful.  You need to look at the mutex which fails the
>>> trylock operation in the mnt_vnode_next_active(), see who owns it,
>>> and then 'unwind' the locking dependencies from there.
>>
>> Sorry, now I get it, let's see if I can find the locked vnodes and the
>> thread that owns them...
> 
> You can use 'show lock v_interlock>' to find an owning
> thread and then use 'show sleepchain '.  If you are using kgdb on the 
> live system (probably easier) then you can grab my scripts at 
> www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You can 
> then find the offending thread and do 'mtx_owner &vp->v_interlock' and then
> 'sleepchain '

Hello,

Sorry for the delay in debugging this: the issue itself is hard to 
reproduce, and I didn't have much time to dig into it. It seems like the 
locked vnode is always the same one, or at least I haven't been able to 
see this loop with more than one vnode locked (so far):

db> show lock 0xfe0030cdf448
 class: sleep mutex
 name: vnode interlock
 flags: {DEF}
 state: {OWNED}
 owner: 0xfe008d1e9000 (tid 101020, pid 66630, "cc")
db> show sleepchain 0xfe008d1e9000
thread 101020 (pid 66630, cc) inhibited
db> tr 66630
Tracing pid 66630 tid 101020 td 0xfe008d1e9000
sched_switch() at sched_switch+0x482/frame 0xff8120ff3630
mi_switch() at mi_switch+0x179/frame 0xff8120ff3670
turnstile_wait() at turnstile_wait+0x3ac/frame 0xff8120ff36c0
__mtx_lock_sleep() at __mtx_lock_sleep+0x255/frame 0xff8120ff3740
__mtx_lock_flags() at __mtx_lock_flags+0xda/frame 0xff8120ff3780
vdropl() at vdropl+0x255/frame 0xff8120ff37b0
vputx() at vputx+0x27c/frame 0xff8120ff3810
namei() at namei+0x3dd/frame 0xff8120ff38c0
kern_statat_vnhook() at kern_statat_vnhook+0x99/frame 0xff8120ff3a40
sys_stat() at sys_stat+0x2d/frame 0xff8120ff3ae0
amd64_syscall() at amd64_syscall+0x265/frame 0xff8120ff3bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xff8120ff3bf0
--- syscall (188, FreeBSD ELF64, sys_stat), rip = 0x18280ea, rsp = 
0x7fffa148, rbp = 0x7fffa170 ---
db> ps
  pid  ppid  pgrp   uid   state   wmesg wchancmd
66630 66591 66588 0  L+ *vnode_fr 0xfe000bc9fa80 cc
66591 66590 66588 0  S+  wait 0xfe0030dbd4b8 cc
66590 66588 66588 0  S+  wait 0xfe008de2 sh
66588 66572 66588 0  S+  wait 0xfe008d8a8970 sh
66572 49649 49649 0  S+  select   0xfe003029dac0 make
49649 49631 49649 0  S+  wait 0xfe000d801970 sh
49631 49629 49629 0  S+  select   0xfe00302b3cc0 make
49629 49614 49629 0  S+  wait 0xfe008dbfa970 sh
49614 45214 45214 0  S+  select   0xfe00303e3640 make
45214 45207 45214 0  S+  wait 0xfe008dbfa000 sh
45207 45205 45200 0  S+  select   0xfe000d66d3c0 make
45205 45200 45200 0  S+  wait 0xfe008da1b4b8 sh
45200   757 45200 0  S+  select   0xfe000d6093c0 make
30611   737 30611 0  S+  ttyin0xfe000bd70ca8 bash
 2325   756  2325 0  Ss+ ttyin0xfe008dd6eca8 bash
  757   756   757 0  Ss+ wait 0xfe000d7fd4b8 bash
  756   755   756 0  Ss  select   0xfe000d783a40 screen
  755   744   755 0  S+  pause0xfe000d9560a8 screen
  744   743   744  

Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-26 Thread Roger Pau Monné
On 25/05/13 19:52, Roger Pau Monné wrote:
> On 20/05/13 20:34, John Baldwin wrote:
>> On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
>>> On 14/05/13 18:31, Konstantin Belousov wrote:
 On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> On 13/05/13 17:00, Konstantin Belousov wrote:
>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>>> On 13/05/13 13:18, Roger Pau Monné wrote:
>
> Thanks for taking a look,
>
>>> I would like to explain this a little bit more, the syncer process
>>> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
>>> looping forever in what seems to be an endless loop around
>>> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
>>> noticeable disk activity, so I'm unsure of what is happening.
>> How many CPUs does your VM have ?
>
> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>
>>
>> The loop you describing means that other thread owns the vnode
>> interlock. Can you track what this thread does ? E.g. look at the
>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
>> thread owning the mutex, clear low bits as needed. Then you can
>> inspect the thread and get a backtrace.
>
> There are no other threads running, only syncer is running on CPU 1 (see
> ps in previous email). All other CPUs are idle, and as seen from the ps
> quite a lot of threads are blocked in vnode related operations, either
> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> of alllocks in the previous email.
 This is not useful.  You need to look at the mutex which fails the
 trylock operation in the mnt_vnode_next_active(), see who owns it,
 and then 'unwind' the locking dependencies from there.
>>>
>>> Sorry, now I get it, let's see if I can find the locked vnodes and the
>>> thread that owns them...
>>
>> You can use 'show lock v_interlock>' to find an owning
>> thread and then use 'show sleepchain '.  If you are using kgdb on 
>> the 
>> live system (probably easier) then you can grab my scripts at 
>> www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You can 
>> then find the offending thread and do 'mtx_owner &vp->v_interlock' and then
>> 'sleepchain '

Hello,

I've been looking into this issue a little bit more, and the lock
dependencies look right to me. The lockup happens when the thread owning
the v_interlock mutex tries to acquire the vnode_free_list_mtx mutex,
which is already owned by the syncer thread. At this point the thread
owning the v_interlock mutex goes to sleep, and the syncer process
starts doing a sequence of:

VI_TRYLOCK -> mtx_unlock vnode_free_list_mtx -> kern_yield -> mtx_lock
vnode_free_list_mtx ...

It seems like kern_yield, which I assume is placed there to allow the
thread owning v_interlock to also lock vnode_free_list_mtx, doesn't get
a window big enough to wake up the waiting thread and let it take the
vnode_free_list_mtx mutex. Since the syncer is the only runnable process
on its CPU there is no context switch, and the syncer continues to run.

Relying on kern_yield to provide a window big enough for any other
thread waiting on vnode_free_list_mtx to run doesn't seem like a good
idea on SMP systems. I've not tested this on bare metal, but waking up
an idle CPU in a virtualized environment might be more expensive than
doing so on bare metal.

Bear in mind that I'm not familiar with either the scheduler or the UFS
code. My proposed naive fix is to replace the kern_yield call with a
pause, which will allow any other threads waiting on vnode_free_list_mtx
to acquire it, finish whatever they are doing, and release the
v_interlock mutex, so the syncer thread can also finish its work. I've
tested the patch for a couple of hours and it seems to be fine; I
haven't been able to reproduce the issue anymore.

From fec90f7bb9cdf05b49d11dbe4930d3c595c147f5 Mon Sep 17 00:00:00 2001
From: Roger Pau Monne 
Date: Sun, 26 May 2013 19:55:43 +0200
Subject: [PATCH] mnt_vnode_next_active: replace kern_yield with pause

On SMP systems there is no way to assure that a kern_yield will allow
any other threads waiting on the vnode_free_list_mtx to be able to
acquire it. The syncer process can get stuck in a loop trying to lock
the v_interlock mutex, without allowing other threads waiting on
vnode_free_list_mtx to run. Replace the kern_yield with a pause, which
should allow any thread owning v_interlock and waiting on
vnode_free_list_mtx to finish its work and release v_interlock.
---
 sys/kern/vfs_subr.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 0da6764..597f4b7 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -4703,7 +4703,15 @@ restart:
if (mp_ncpus == 1

Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-26 Thread Jilles Tjoelker
On Sun, May 26, 2013 at 09:28:05PM +0200, Roger Pau Monné wrote:
> On 25/05/13 19:52, Roger Pau Monné wrote:
> > On 20/05/13 20:34, John Baldwin wrote:
> >> On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
> >>> On 14/05/13 18:31, Konstantin Belousov wrote:
>  On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> > On 13/05/13 17:00, Konstantin Belousov wrote:
> >> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> >>> On 13/05/13 13:18, Roger Pau Monné wrote:

> > Thanks for taking a look,

> >>> I would like to explain this a little bit more, the syncer process
> >>> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> >>> looping forever in what seems to be an endless loop around
> >>> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> >>> noticeable disk activity, so I'm unsure of what is happening.
> >> How many CPUs does your VM have ?

> > 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.

> >> The loop you describing means that other thread owns the vnode
> >> interlock. Can you track what this thread does ? E.g. look at the
> >> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> >> thread owning the mutex, clear low bits as needed. Then you can
> >> inspect the thread and get a backtrace.

> > There are no other threads running, only syncer is running on CPU 1 (see
> > ps in previous email). All other CPUs are idle, and as seen from the ps
> > quite a lot of threads are blocked in vnode related operations, either
> > "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> > of alllocks in the previous email.
>  This is not useful.  You need to look at the mutex which fails the
>  trylock operation in the mnt_vnode_next_active(), see who owns it,
>  and then 'unwind' the locking dependencies from there.

> >>> Sorry, now I get it, let's see if I can find the locked vnodes and the
> >>> thread that owns them...

> >> You can use 'show lock v_interlock>' to find an owning
> >> thread and then use 'show sleepchain '.  If you are using kgdb on 
> >> the 
> >> live system (probably easier) then you can grab my scripts at 
> >> www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You 
> >> can 
> >> then find the offending thread and do 'mtx_owner &vp->v_interlock' and then
> >> 'sleepchain '

> I've been looking into this issue a little bit more, and the lock
> dependencies look right to me, the lockup happens when the thread owning
> the v_interlock mutex tries to acquire the vnode_free_list_mtx mutex
> which is already owned by the syncer thread, at this point, the thread
> owning the v_interlock mutex goes to sleep, and the syncer process will
> start doing a sequence of:

> VI_TRYLOCK -> mtx_unlock vnode_free_list_mtx -> kern_yield -> mtx_lock
> vnode_free_list_mtx ...

> It seems like kern_yield, which I assume is placed there in order to
> allow the thread owning v_interlock to be able to also lock
> vnode_free_list_mtx, doesn't get a window big enough to wake up the
> waiting thread and get the vnode_free_list_mtx mutex. Since the syncer
> is the only process runnable on the CPU there is no context switch, and
> the syncer process continues to run.

> Relying on kern_yield to provide a window big enough that allows any
> other thread waiting on vnode_free_list_mtx to run doesn't seem like a
> good idea on SMP systems. I've not tested this on bare metal, but waking
> up an idle CPU in a virtualized environment might be more expensive than
> doing it on bare metal.

> Bear in mind that I'm not familiar with either the scheduler or the ufs
> code, my proposed naive fix is to replace the kern_yield call with a
> pause, that will allow any other threads waiting on vnode_free_list_mtx
> to lock the vnode_free_list_mtx mutex and finish whatever they are doing
> and release the v_interlock mutex, so the syncer thread can also finish
> it's work. I've tested the patch for a couple of hours and seems to be
> fine, I haven't been able to reproduce the issue anymore.

Instead of a pause() that may be too short or too long, how about
waiting for the necessary lock? In other words, replace the kern_yield()
call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
to acquire two locks without imposing an order between them.

I expect blocking on a mutex to be safe enough; a mutex may not be held
across waiting for hardware or other events.

> From fec90f7bb9cdf05b49d11dbe4930d3c595c147f5 Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne 
> Date: Sun, 26 May 2013 19:55:43 +0200
> Subject: [PATCH] mnt_vnode_next_active: replace kern_yield with pause
> 
> On SMP systems there is no way to assure that a kern_yield will allow
> any other threads waiting on the vnode_free_list_mtx to be able to
> acquire it. The syncer process can get stuck in a loop trying to lock
> 

Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-26 Thread Roger Pau Monné
On 26/05/13 22:20, Jilles Tjoelker wrote:
> Instead of a pause() that may be too short or too long, how about
> waiting for the necessary lock? In other words, replace the kern_yield()
> call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
> to acquire two locks without imposing an order between them.

Since there might be more than one locked vnode, waiting on a specific
locked vnode seemed rather arbitrary, but I agree that the pause is also
rather arbitrary.

Also, can we be sure that the v_interlock mutex will not be destroyed
while the syncer process is waiting for it to be unlocked?

> I expect blocking on a mutex to be safe enough; a mutex may not be held
> across waiting for hardware or other events.
> 



Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-26 Thread Jilles Tjoelker
On Sun, May 26, 2013 at 10:52:07PM +0200, Roger Pau Monné wrote:
> On 26/05/13 22:20, Jilles Tjoelker wrote:
> > Instead of a pause() that may be too short or too long, how about
> > waiting for the necessary lock? In other words, replace the kern_yield()
> > call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
> > to acquire two locks without imposing an order between them.

> Since there might be more than one locked vnode, waiting on a specific
> locked vnode seemed rather arbitrary, but I agree that the pause is also
> rather arbitrary.

> Also, can we be sure that the v_interlock mutex will not be destroyed
> while the syncer process is waiting for it to be unlocked?

I think this is a major problem. My idea was too easy and will not work.

That said, the code in mnt_vnode_next_active() appears to implement some
sort of adaptive spinning for SMP. It tries VI_TRYLOCK for 200ms
(default value of hogticks) and then yields. This is far too long for a
mutex lock and if it takes that long it means that either the thread
owning the lock is blocked by us somehow or someone is abusing a mutex
to work like a sleepable lock such as by spinning or DELAY.

Given that it has been spinning for 200ms, it is not so bad to pause for
one additional microsecond.

The adaptive spinning was added fairly recently, so apparently it
happens fairly frequently that VI_TRYLOCK fails transiently.
Unfortunately, the real adaptive spinning code cannot be used because it
will spin forever as long as the thread owning v_interlock is running,
including when that is because it is spinning for vnode_free_list_mtx.
Perhaps we can try to VI_TRYLOCK a certain number of times. It is also
possible to check the contested bit of vnode_free_list_mtx
(sys/netgraph/netflow/netflow.c does something similar) and stop
spinning in that case.

A cpu_spinwait() invocation should also be added to the spin loop.

-- 
Jilles Tjoelker


Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-26 Thread Konstantin Belousov
On Mon, May 27, 2013 at 12:22:54AM +0200, Jilles Tjoelker wrote:
> On Sun, May 26, 2013 at 10:52:07PM +0200, Roger Pau Monné wrote:
> > On 26/05/13 22:20, Jilles Tjoelker wrote:
> > > Instead of a pause() that may be too short or too long, how about
> > > waiting for the necessary lock? In other words, replace the kern_yield()
> > > call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
> > > to acquire two locks without imposing an order between them.
> 
> > Since there might be more than one locked vnode, waiting on a specific
> > locked vnode seemed rather arbitrary, but I agree that the pause is also
> > rather arbitrary.
> 
> > Also, can we be sure that the v_interlock mutex will not be destroyed
> > while the syncer process is waiting for it to be unlocked?
> 
> I think this is a major problem. My idea was too easy and will not work.
> 
> That said, the code in mnt_vnode_next_active() appears to implement some
> sort of adaptive spinning for SMP. It tries VI_TRYLOCK for 200ms
> (default value of hogticks) and then yields. This is far too long for a
> mutex lock and if it takes that long it means that either the thread
> owning the lock is blocked by us somehow or someone is abusing a mutex
> to work like a sleepable lock such as by spinning or DELAY.
> 
> Given that it has been spinning for 200ms, it is not so bad to pause for
> one additional microsecond.
> 
> The adaptive spinning was added fairly recently, so apparently it
> happens fairly frequently that VI_TRYLOCK fails transiently.
> Unfortunately, the real adaptive spinning code cannot be used because it
> will spin forever as long as the thread owning v_interlock is running,
> including when that is because it is spinning for vnode_free_list_mtx.
> Perhaps we can try to VI_TRYLOCK a certain number of times. It is also
> possible to check the contested bit of vnode_free_list_mtx
> (sys/netgraph/netflow/netflow.c does something similar) and stop
> spinning in that case.
> 
> A cpu_spinwait() invocation should also be added to the spin loop.

There are two 'proper' solutions for this issue:

One is to change the handling of the vnode lifecycle to allow safe
blocking for the vnode interlock acquisition. In particular, the
change would add some stability to the vnode memory while the vnode
is on the free list. For example, the vnode zone could be marked as
type-stable again, and then the vnode interlock could be obtained
with the free list lock dropped. Arguably, marking the zone as
non-freeable would be a regression, especially for the zone that
accounts for the largest allocation of kernel memory.

Another is to somehow ensure that priority is properly propagated
from the spinning thread to the vnode interlock owner. I think it
is desirable to donate some amount of priority from the spinning
thread.  Unfortunately, I was unable to come up with an elegant
solution for this that would also be contained and would not
require a revamp of the mutex interfaces.

BTW, if anybody comes up with an idea for restructuring the free list
handling to avoid the free list/vnode interlock LOR altogether, that
would be best.

I do not have objections to the pause() addition, but I would argue
that should_yield() should then be removed, switching the code to
unconditionally pause when a collision is detected.





Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-27 Thread Roger Pau Monné
On 27/05/13 08:07, Konstantin Belousov wrote:
> On Mon, May 27, 2013 at 12:22:54AM +0200, Jilles Tjoelker wrote:
>> On Sun, May 26, 2013 at 10:52:07PM +0200, Roger Pau Monné wrote:
>>> On 26/05/13 22:20, Jilles Tjoelker wrote:
 Instead of a pause() that may be too short or too long, how about
 waiting for the necessary lock? In other words, replace the kern_yield()
 call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
 to acquire two locks without imposing an order between them.
>>
>>> Since there might be more than one locked vnode, waiting on a specific
>>> locked vnode seemed rather arbitrary, but I agree that the pause is also
>>> rather arbitrary.
>>
>>> Also, can we be sure that the v_interlock mutex will not be destroyed
>>> while the syncer process is waiting for it to be unlocked?
>>
>> I think this is a major problem. My idea was too easy and will not work.
>>
>> That said, the code in mnt_vnode_next_active() appears to implement some
>> sort of adaptive spinning for SMP. It tries VI_TRYLOCK for 200ms
>> (default value of hogticks) and then yields. This is far too long for a
>> mutex lock and if it takes that long it means that either the thread
>> owning the lock is blocked by us somehow or someone is abusing a mutex
>> to work like a sleepable lock such as by spinning or DELAY.
>>
>> Given that it has been spinning for 200ms, it is not so bad to pause for
>> one additional microsecond.
>>
>> The adaptive spinning was added fairly recently, so apparently it
>> happens fairly frequently that VI_TRYLOCK fails transiently.
>> Unfortunately, the real adaptive spinning code cannot be used because it
>> will spin forever as long as the thread owning v_interlock is running,
>> including when that is because it is spinning for vnode_free_list_mtx.
>> Perhaps we can try to VI_TRYLOCK a certain number of times. It is also
>> possible to check the contested bit of vnode_free_list_mtx
>> (sys/netgraph/netflow/netflow.c does something similar) and stop
>> spinning in that case.
>>
>> A cpu_spinwait() invocation should also be added to the spin loop.
> 
> There are two 'proper' solutions for this issue:
> 
> One is to change the handling of the vnode lifecycle to allow safe
> blocking for the vnode interlock acquisition. In particular, the
> change would add some stability to the vnode memory while the vnode
> is on the free list. For example, the vnode zone could be marked as
> type-stable again, and then the vnode interlock could be obtained
> with the free list lock dropped. Arguably, marking the zone as
> non-freeable would be a regression, especially for the zone that
> accounts for the largest allocation of kernel memory.
> 
> Another is to somehow ensure that priority is properly propagated
> from the spinning thread to the vnode interlock owner. I think it
> is desirable to donate some amount of priority from the spinning
> thread.  Unfortunately, I was unable to come up with an elegant
> solution for this that would also be contained and would not
> require a revamp of the mutex interfaces.
> 
> BTW, if anybody comes up with an idea for restructuring the free list
> handling to avoid the free list/vnode interlock LOR altogether, that
> would be best.
> 
> I do not have objections to the pause() addition, but I would argue
> that should_yield() should then be removed, switching the code to
> unconditionally pause when a collision is detected.

Taking Jilles's idea, what about replacing should_yield with a check
to see whether the vnode_free_list_mtx mutex is contested?

That would prevent unnecessary pauses: we would only release the
vnode_free_list_mtx mutex when someone else actually needs it.



Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-27 Thread Konstantin Belousov
On Mon, May 27, 2013 at 10:19:51AM +0200, Roger Pau Monné wrote:
> On 27/05/13 08:07, Konstantin Belousov wrote:
> > On Mon, May 27, 2013 at 12:22:54AM +0200, Jilles Tjoelker wrote:
> >>> On Sun, May 26, 2013 at 10:52:07PM +0200, Roger Pau Monné wrote:
> >>> On 26/05/13 22:20, Jilles Tjoelker wrote:
>  Instead of a pause() that may be too short or too long, how about
>  waiting for the necessary lock? In other words, replace the kern_yield()
>  call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
>  to acquire two locks without imposing an order between them.
> >>
> >>> Since there might be more than one locked vnode, waiting on a specific
> >>> locked vnode seemed rather arbitrary, but I agree that the pause is also
> >>> rather arbitrary.
> >>
> >>> Also, can we be sure that the v_interlock mutex will not be destroyed
> >>> while the syncer process is waiting for it to be unlocked?
> >>
> >> I think this is a major problem. My idea was too easy and will not work.
> >>
> >> That said, the code in mnt_vnode_next_active() appears to implement some
> >> sort of adaptive spinning for SMP. It tries VI_TRYLOCK for 200ms
> >> (default value of hogticks) and then yields. This is far too long for a
> >> mutex lock and if it takes that long it means that either the thread
> >> owning the lock is blocked by us somehow or someone is abusing a mutex
> >> to work like a sleepable lock such as by spinning or DELAY.
> >>
> >> Given that it has been spinning for 200ms, it is not so bad to pause for
> >> one additional microsecond.
> >>
> >> The adaptive spinning was added fairly recently, so apparently it
> >> happens fairly frequently that VI_TRYLOCK fails transiently.
> >> Unfortunately, the real adaptive spinning code cannot be used because it
> >> will spin forever as long as the thread owning v_interlock is running,
> >> including when that is because it is spinning for vnode_free_list_mtx.
> >> Perhaps we can try to VI_TRYLOCK a certain number of times. It is also
> >> possible to check the contested bit of vnode_free_list_mtx
> >> (sys/netgraph/netflow/netflow.c does something similar) and stop
> >> spinning in that case.
> >>
> >> A cpu_spinwait() invocation should also be added to the spin loop.
> > 
> > There are two 'proper' solutions for this issue:
> > 
> > One is to change the handling of the vnode lifecycle to allow safe
> > blocking for the vnode interlock acquisition. In particular, the
> > change would add some stability to the vnode memory while the vnode
> > is on the free list. For example, the vnode zone could be marked as
> > type-stable again, and then the vnode interlock could be obtained
> > with the free list lock dropped. Arguably, marking the zone as
> > non-freeable would be a regression, especially for the zone that
> > accounts for the largest allocation of kernel memory.
> > 
> > Another is to somehow ensure that priority is properly propagated
> > from the spinning thread to the vnode interlock owner. I think it
> > is desirable to donate some amount of priority from the spinning
> > thread.  Unfortunately, I was unable to come up with an elegant
> > solution for this that would also be contained and would not
> > require a revamp of the mutex interfaces.
> > 
> > BTW, if anybody comes up with an idea for restructuring the free list
> > handling to avoid the free list/vnode interlock LOR altogether, that
> > would be best.
> > 
> > I do not have objections to the pause() addition, but I would argue
> > that should_yield() should then be removed, switching the code to
> > unconditionally pause when a collision is detected.
> 
> Taking Jilles's idea, what about replacing should_yield with a check
> to see whether the vnode_free_list_mtx mutex is contested?
> 
> That would prevent unnecessary pauses: we would only release the
> vnode_free_list_mtx mutex when someone else actually needs it.
This would still be racy, and could allow a lock convoy in the same
manner as the current code.  Also, AFAIR, the real problem was that
when two iterators started synchronized, it usually ended in livelock.

If you are willing, try it, of course, but I tend to agree with just
adding pause() for now.




Re: FreeBSD-HEAD gets stuck on vnode operations

2013-05-27 Thread Roger Pau Monné
On 27/05/13 12:23, Konstantin Belousov wrote:
> On Mon, May 27, 2013 at 10:19:51AM +0200, Roger Pau Monné wrote:
>> On 27/05/13 08:07, Konstantin Belousov wrote:
>>> On Mon, May 27, 2013 at 12:22:54AM +0200, Jilles Tjoelker wrote:
 On Sun, May 26, 2013 at 10:52:07PM +0200, Roger Pau Monné wrote:
> On 26/05/13 22:20, Jilles Tjoelker wrote:
>> Instead of a pause() that may be too short or too long, how about
>> waiting for the necessary lock? In other words, replace the kern_yield()
>> call with VI_LOCK(vp); VI_UNLOCK(vp);. This is also the usual approach
>> to acquire two locks without imposing an order between them.

> Since there might be more than one locked vnode, waiting on a specific
> locked vnode seemed rather arbitrary, but I agree that the pause is also
> rather arbitrary.

> Also, can we be sure that the v_interlock mutex will not be destroyed
> while the syncer process is waiting for it to be unlocked?

 I think this is a major problem. My idea was too easy and will not work.

 That said, the code in mnt_vnode_next_active() appears to implement some
 sort of adaptive spinning for SMP. It tries VI_TRYLOCK for 200ms
 (default value of hogticks) and then yields. This is far too long for a
 mutex lock and if it takes that long it means that either the thread
 owning the lock is blocked by us somehow or someone is abusing a mutex
 to work like a sleepable lock such as by spinning or DELAY.

 Given that it has been spinning for 200ms, it is not so bad to pause for
 one additional microsecond.

 The adaptive spinning was added fairly recently, so apparently it
 happens fairly frequently that VI_TRYLOCK fails transiently.
 Unfortunately, the real adaptive spinning code cannot be used because it
 will spin forever as long as the thread owning v_interlock is running,
 including when that is because it is spinning for vnode_free_list_mtx.
 Perhaps we can try to VI_TRYLOCK a certain number of times. It is also
 possible to check the contested bit of vnode_free_list_mtx
 (sys/netgraph/netflow/netflow.c does something similar) and stop
 spinning in that case.

 A cpu_spinwait() invocation should also be added to the spin loop.
>>>
>>> There are two 'proper' solutions for this issue:
>>>
>>> One is to change the handling of the vnode lifecycle to allow safe
>>> blocking for the vnode interlock acquisition. In particular, the
>>> change would add some stability to the vnode memory while the vnode
>>> is on the free list. For example, the vnode zone could be marked as
>>> type-stable again, and then the vnode interlock could be obtained
>>> with the free list lock dropped. Arguably, marking the zone as
>>> non-freeable would be a regression, especially for the zone that
>>> accounts for the largest allocation of kernel memory.
>>>
>>> Another is to somehow ensure that priority is properly propagated
>>> from the spinning thread to the vnode interlock owner. I think it
>>> is desirable to donate some amount of priority from the spinning
>>> thread.  Unfortunately, I was unable to come up with an elegant
>>> solution for this that would also be contained and would not
>>> require a revamp of the mutex interfaces.
>>>
>>> BTW, if anybody comes up with an idea for restructuring the free list
>>> handling to avoid the free list/vnode interlock LOR altogether, that
>>> would be best.
>>>
>>> I do not have objections to the pause() addition, but I would argue
>>> that should_yield() should then be removed, switching the code to
>>> unconditionally pause when a collision is detected.
>>
>> Taking Jilles's idea, what about replacing should_yield with a check
>> to see whether the vnode_free_list_mtx mutex is contested?
>>
>> That would prevent unnecessary pauses: we would only release the
>> vnode_free_list_mtx mutex when someone else actually needs it.
> This would still be racy, and could allow a lock convoy in the same
> manner as the current code.  Also, AFAIR, the real problem was that
> when two iterators started synchronized, it usually ended in livelock.
> 
> If you are willing, try it, of course, but I tend to agree with just
> adding pause() for now.

OK, I've been testing the kern_yield-to-pause replacement overnight and
it seems to work as expected: I no longer see any lockups, whereas an
overnight run previously showed at least 3 or 4.

If you (and others) are happy with the replacement, you can commit it,
and we can replace the should_yield call later if needed.

Thanks, Roger.



Re: FreeBSD-HEAD gets stuck on vnode operations

2013-07-02 Thread Andriy Gapon

In addition, should_yield() seems to have a problem:
http://article.gmane.org/gmane.os.freebsd.devel.cvs.src/167287

-- 
Andriy Gapon