Re: more on Re: Please review: bugfix for vinvalbuf()
Unfortunately, the recent patch to vinvalbuf() hasn't solved all of our
problems. We had another, different panic today. The process that caused
it was a 'tail' of a growing logfile over NFS.

I have actually had this problem before, with FreeBSD 3.4, and reported
it then. I believed this PR to be relevant at the time; however, I do
not believe this client was writing to the file.

  [1998/06/23] kern/7028  http://www.freebsd.org/cgi/query-pr.cgi?pr=7028
  panic in vinvalbuf when appending/looking at tail of NFS file

The system is running 4.4-RELEASE with the vinvalbuf() patch. Debugging
information is below. If I can provide any additional information,
let me know.

Thanks for any help,
Doug

-- panic message --

SMP 2 cpus
IdlePTD 3555328
initial pcb at 2cf300
panicstr: vinvalbuf: flush failed
panic messages:
---
panic: vinvalbuf: flush failed
mp_lock = 0101; cpuid = 1; lapic.id =
boot() called on cpu#1

syncing disks... 8 done
Uptime: 19d20h13m23s

-- gdb session --

(kgdb) back
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:473
#1  0xc016cf8f in boot (howto=0x100) at /usr/src/sys/kern/kern_shutdown.c:313
#2  0xc016d3a9 in panic (fmt=0xc028745a "vinvalbuf: flush failed")
    at /usr/src/sys/kern/kern_shutdown.c:581
#3  0xc019a719 in vinvalbuf (vp=0xd7dde8c0, flags=0x1, cred=0xc2c60780,
    p=0xd79a0680, slpflag=0x100, slptimeo=0x0)
    at /usr/src/sys/kern/vfs_subr.c:753
#4  0xc01d0b30 in nfs_vinvalbuf (vp=0xd7dde8c0, flags=0x1, cred=0xc2c60780,
    p=0xd79a0680, intrflg=0x1) at /usr/src/sys/nfs/nfs_bio.c:1190
#5  0xc01cf668 in nfs_bioread (vp=0xd7dde8c0, uio=0xd7a42ed4, ioflag=0x7f,
    cred=0xc2c60780) at /usr/src/sys/nfs/nfs_bio.c:401
#6  0xc01f68d2 in nfs_read (ap=0xd7a42e64) at /usr/src/sys/nfs/nfs_vnops.c:1008
#7  0xc01a235c in vn_read (fp=0xc254cd40, uio=0xd7a42ed4, cred=0xc2c60780,
    flags=0x0, p=0xd79a0680) at vnode_if.h:334
#8  0xc017b690 in dofileread (p=0xd79a0680, fp=0xc254cd40, fd=0x3,
    buf=0x804d000, nbyte=0x200, offset=0x, flags=0x0)
    at /usr/src/sys/sys/file.h:146
#9  0xc017b556 in read (p=0xd79a0680, uap=0xd7a42f80)
    at /usr/src/sys/kern/sys_generic.c:117
#10 0xc025d7b5 in syscall2 (frame={tf_fs = 0x2f, tf_es = 0x2f,
    tf_ds = 0xbfbf002f, tf_edi = 0x4, tf_esi = 0x280fc3a0,
    tf_ebp = 0xbfbff8c0, tf_isp = 0xd7a42fd4, tf_ebx = 0x280ea424,
    tf_edx = 0x37, tf_ecx = 0x37, tf_eax = 0x3, tf_trapno = 0x7,
    tf_err = 0x2, tf_eip = 0x280defcc, tf_cs = 0x1f, tf_eflags = 0x293,
    tf_esp = 0xbfbff894, tf_ss = 0x2f}) at /usr/src/sys/i386/i386/trap.c:1155
#11 0xc024a62b in Xint0x80_syscall ()
cannot read proc at 0
(kgdb) up
#1  0xc016cf8f in boot (howto=0x100) at /usr/src/sys/kern/kern_shutdown.c:313
313             dumpsys();
(kgdb) up
#2  0xc016d3a9 in panic (fmt=0xc028745a "vinvalbuf: flush failed")
    at /usr/src/sys/kern/kern_shutdown.c:581
581             boot(bootopt);
(kgdb) up
#3  0xc019a719 in vinvalbuf (vp=0xd7dde8c0, flags=0x1, cred=0xc2c60780,
    p=0xd79a0680, slpflag=0x100, slptimeo=0x0)
    at /usr/src/sys/kern/vfs_subr.c:753
753                     panic("vinvalbuf: flush failed");
(kgdb) print vp->v_dirtyblkhd
$1 = {tqh_first = 0x0, tqh_last = 0xd7dde8f4}
(kgdb) print vp->v_cleanblkhd
$2 = {tqh_first = 0xcc5fa5ec, tqh_last = 0xcc5fa5f4}
(kgdb) print *(vp->v_cleanblkhd->tqh_first)
$3 = {b_hash = {le_next = 0xcc607e80, le_prev = 0xcc666e9c}, b_vnbufs = {
    tqe_next = 0x0, tqe_prev = 0xd7dde8ec}, b_freelist = {
    tqe_next = 0xcc5f8bfc, tqe_prev = 0xcc6060bc}, b_act = {tqe_next = 0x0,
    tqe_prev = 0xc2001d90}, b_flags = 537919520, b_qindex = 2,
  b_xflags = 2 '\002', b_lock = {lk_interlock = {lock_data = 0},
    lk_flags = 0, lk_sharecount = 0, lk_waitcount = 0,
    lk_exclusivecount = 0, lk_prio = 20, lk_wmesg = 0xc02860b0 "bufwait",
    lk_timo = 0, lk_lockholder = -1}, b_error = 0, b_bufsize = 3584,
  b_runningbufspace = 0, b_bcount = 3147, b_resid = 0, b_dev = 0x,
  b_data = 0xceeec000 ..., b_kvabase = 0xceeec000 ..., b_kvasize = 16384,
  b_lblkno = 6949, b_blkno = 84, b_offset = 56926208, b_iodone = 0,
  b_iodone_chain = 0x0, b_vp = 0xd7dde8c0, b_dirtyoff = 0,
  b_dirtyend = 0, b_rcred = 0x0, b_wcred = 0x0, b_pblkno = 1771566,
  b_saveaddr = 0x0, b_driver1 = 0x0, b_driver2 = 0x0, b_caller1 = 0x0,
  b_caller2 = 0x0, b_pager = {pg_spc = 0x0, pg_reqpage = 0},
  b_cluster = {cluster_head = {tqh_first = 0xcc5c66a0,
      tqh_last = 0xcc640720}, cluster_entry = {tqe_next = 0xcc5c66a0,
      tqe_prev = 0xcc640720}}, b_pages = {0xc0afb4ac,
    0x0 <repeats 31 times>}, b_npages = 1, b_dep = {lh_first = 0x0},
  b_chain = {parent = 0x0, count = 0}}
(kgdb) up
#4  0xc01d0b30 in nfs_vinvalbuf (vp=0xd7dde8c0, flags=0x1, cred=0xc2c60780,
    p=0xd79a0680, intrflg=0x1) at /usr/src/sys/nfs/nfs_bio.c:1190
1190            error = vinvalbuf(vp, flags, cred, p, 0, slptimeo);
(kgdb) print p->p_pid
$4 = 0x14594
(kgdb) btp 83348
frame 0 at 0xd7a42cb4: ebp d7a42cd8, eip 0xc016cf8f
Re: more on Re: Please review: bugfix for vinvalbuf()
On Fri, Oct 19, 2001 at 09:51:10PM -0700, Matthew Dillon wrote:
:
:Unfortunately, the recent patch to vinvalbuf() hasn't solved all of
:our problems. We had another, different panic today. The process that
:caused it was a 'tail' of a growing logfile over NFS.
:
:I have actually had this problem before, with FreeBSD 3.4, and reported
:it then. I believed this PR to be relevant at the time; however, I do
:not believe this client was writing to the file.
:
:  [1998/06/23] kern/7028  http://www.freebsd.org/cgi/query-pr.cgi?pr=7028
:  panic in vinvalbuf when appending/looking at tail of NFS file
:
:The system is running 4.4-RELEASE with the vinvalbuf() patch. Debugging
:information is below. If I can provide any additional information,
:let me know.
:
:Thanks for any help,
:Doug

    How easily can you reproduce this?  How often does it occur if you
    leave a tail running?

                                        -Matt

I'm not able to reproduce this at will at the moment. The PR I mention
has a program which it claims can cause the crash, which I will try
running. I'll also try tailing various logfiles, including the one which
caused this crash, which is being written to on the machine that is the
NFS server.

Doug

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message
Re: more on Re: Please review: bugfix for vinvalbuf()
:I'm not able to reproduce this at will at the moment. The PR I mention
:has a program which it claims can cause the crash, which I will try
:running. I'll also try tailing various logfiles, including the one which
:caused this crash, which is being written to on the machine that is the
:NFS server.
:
:Doug

    In looking at the code, the solution may simply be to loop back up
    to the top of vinvalbuf() instead of panicking if we wind up with
    buffer cache buffers.  If V_SAVE is set, the act of writing dirty
    VM pages can cause new buffers to instantiate and result in the
    panic.  But I would really like to get the bug reproducible to make
    sure that the solution really does fix it.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
Re: more on Re: Please review: bugfix for vinvalbuf()
:@@ -721,9 +721,9 @@
: 	}
:     }
: 
:-    while (vp->v_numoutput > 0) {
:-	    vp->v_flag |= VBWAIT;
:-	    tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
:+    if (VOP_GETVOBJECT(vp, &object) == 0) {
:+	    while (object->paging_in_progress)
:+		    vm_object_pip_sleep(object, "vnvlbv");
:     }
: 
:     splx(s);

    Hey Douglas, try the patch fragment below and see if you can
    reproduce the problem.

                                        -Matt

@@ -721,10 +746,21 @@
 	}
     }
 
-    while (vp->v_numoutput > 0) {
-	vp->v_flag |= VBWAIT;
-	tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
-    }
+    /*
+     * Wait for I/O to complete.  XXX needs cleaning up.  The vnode can
+     * have write I/O in-progress but if there is a VM object then the
+     * VM object can also have read-I/O in-progress.
+     */
+    do {
+	while (vp->v_numoutput > 0) {
+	    vp->v_flag |= VBWAIT;
+	    tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
+	}
+	if (VOP_GETVOBJECT(vp, &object) == 0) {
+	    while (object->paging_in_progress)
+		vm_object_pip_sleep(object, "vnvlbx");
+	}
+    } while (vp->v_numoutput > 0);
 
     splx(s);
Re: more on Re: Please review: bugfix for vinvalbuf()
    I totally forgot about that one.  Your fix looks good, I'll start
    testing it.

    The bigger picture here is that vinvalbuf() is not typically called
    while a vnode is still active.  NFS calls it on active vnodes in
    order to invalidate the cache when the client detects that the file
    mtime has been changed by someone else (ugly ugly ugly).  So this
    sort of crash can occur if one client is mmap()ing a file while
    another client (or the server) writes to the file.

                                        -Matt

:I recently mentioned on freebsd-stable in message
:
:    [EMAIL PROTECTED]
:
:a recurring rslock panic which I believe has been caused by the below
:mentioned problem in vinvalbuf(). I have worked up a patch for -STABLE
:(which should also apply to -CURRENT if there have not been major changes
:to this code). The patch is appended to this message for review.
:
:Data from the crash dump revealed the following:
:
:SMP 2 cpus
:IdlePTD 3555328
:initial pcb at 2cf280
:panicstr: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x0001
:panic messages:
:---
:panic: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x0001
:mp_lock = 0001; cpuid = 0; lapic.id = 0100
:boot() called on cpu#0
:
:---
:
:#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:473
:#1  0xc016cf8f in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:313
:#2  0xc016d3a9 in panic (
:    fmt=0xc025bcce "rslock: cpu: %d, addr: 0x%08x, lock: 0x%08x")
:    at /usr/src/sys/kern/kern_shutdown.c:581
:#3  0xc025bcce in bsl1 ()
:#4  0xc021eeac in _unlock_things (fs=0xd7a6dec4, dealloc=0)
:    at /usr/src/sys/vm/vm_fault.c:147
:#5  0xc021f8c7 in vm_fault (map=0xd7a6bf40, vaddr=673968128,
:    fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:826
:#6  0xc025d016 in trap_pfault (frame=0xd7a6dfa8, usermode=1, eva=673972223)
:    at /usr/src/sys/i386/i386/trap.c:829
:#7  0xc025ca8b in trap (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
:    tf_edi = 99049, tf_esi = 0, tf_ebp = -1077937884, tf_isp = -676929580,
:    tf_ebx = 48110729, tf_edx = 0, tf_ecx = 1835007, tf_eax = 672137216,
:    tf_trapno = 12, tf_err = 4, tf_eip = 134517190, tf_cs = 31,
:    tf_eflags = 66070, tf_esp = -1077937940, tf_ss = 47})
:    at /usr/src/sys/i386/i386/trap.c:359
:#8  0x80491c6 in ?? ()
:#9  0x8048d9e in ?? ()
:#10 0x804a078 in ?? ()
:#11 0x8048b45 in ?? ()
:
:---
:
:A quick review of processes revealed a process stuck in vmopar:
:
:(kgdb) ps
:...
:46479 d7ffc560 d806e000 235886   1 46394 004006  3  tail  vmopar c09120c8
:...
:
:which was sleeping in vm_object_page_remove() in vinvalbuf():
:
:(kgdb) btp 46479
: frame 0 at 0xd806fc8c: ebp d806fcb8, eip 0xc0170051 <tsleep+429>:
:    mov 0x141(%ebx),%al
: frame 1 at 0xd806fcb8: ebp d806fce0, eip 0xc02262e8
:    <vm_object_page_remove+268>: add $0x10,%esp
: frame 2 at 0xd806fce0: ebp d806fd2c, eip 0xc019a667 <vinvalbuf+803>:
:    add $0x10,%esp
: frame 3 at 0xd806fd2c: ebp d806fd60, eip 0xc01d0aa0 <nfs_vinvalbuf+264>:
:    add $0x18,%esp
: frame 4 at 0xd806fd60: ebp d806fe2c, eip 0xc01cf5d8 <nfs_bioread+540>:
:    mov %eax,0xff74(%ebp)
: frame 5 at 0xd806fe2c: ebp d806fe44, eip 0xc01f6842 <nfs_read+30>:
:    jmp 0xc01f6849 <nfs_read+37>
: frame 6 at 0xd806fe44: ebp d806fe78, eip 0xc01a22cc <vn_read+280>:
:    mov %eax,0xffe8(%ebp)
: frame 7 at 0xd806fe78: ebp d806fef4, eip 0xc017b690 <dofileread+176>:
:    mov %eax,%esi
: frame 8 at 0xd806fef4: ebp d806ff28, eip 0xc017b556 <read+54>:
:    mov %eax,%esi
: frame 9 at 0xd806ff28: ebp d806ffa0, eip 0xc025d745 <syscall2+537>:
:    mov %eax,0xffb8(%ebp)
:
:
:The patch is below, against vfs_subr.c 1.249.2.11 2001/09/11
:
:--- vfs_subr.c      Tue Sep 11 04:49:53 2001
:+++ vfs_subr.c.new  Wed Sep 26 15:23:09 2001
:@@ -721,9 +721,9 @@
: 	}
:     }
: 
:-    while (vp->v_numoutput > 0) {
:-	    vp->v_flag |= VBWAIT;
:-	    tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
:+    if (VOP_GETVOBJECT(vp, &object) == 0) {
:+	    while (object->paging_in_progress)
:+		    vm_object_pip_sleep(object, "vnvlbv");
:     }
: 
:     splx(s);
Re: more on Re: Please review: bugfix for vinvalbuf()
On Tue, Jul 10, 2001 at 11:43:29PM -0700, Matt Dillon wrote:
:Hi,
:
:I've just tripped over an obviously long-standing (since about
:Jan. 1998) bug in vinvalbuf while looking into PR kern/26224.  The
:problematic code looks like (on -CURRENT):
:
:	/*
:	 * Destroy the copy in the VM cache, too.
:	 */
:	mtx_lock(&vp->v_interlock);
:	if (VOP_GETVOBJECT(vp, &object) == 0) {
:		vm_object_page_remove(object, 0, 0,
:		    (flags & V_SAVE) ? TRUE : FALSE);
:	}
:	mtx_unlock(&vp->v_interlock);
:
:The locks seem to be needed for file systems that don't perform real
:locking (on -STABLE, they are simplelocks).
:This, however, is incorrect because vm_object_page_remove may sleep.
:I've attached a patch that passes the interlock to
:vm_object_page_remove, which in turn passes it to a modified version
:of vm_page_sleep, which unlocks it around the sleep.
:I think that this is correct, because the object should be in a valid
:state when we sleep (and all checks are reexecuted in that case).
:
:Since I'm not very experienced with vfs and vm stuff, I'd be glad if
:this patch could get some review.  In particular, is the lock/unlock
:pair really still needed, and are there notable differences in -STABLE
:(because the fix would need to be MFC'ed)?
:
:Thanks,
:
:	- thomas

    Ok, I've looked at this more.  What is supposed to happen is that
    the 'while (vp->v_numoutput) ...' code just above the section you
    cited is supposed to prevent the deadlock.  The while code looks
    like this:

	while (vp->v_numoutput > 0) {
	    vp->v_flag |= VBWAIT;
	    tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
	}

    However, as Ian points out in his followup, it doesn't work:

:...
:I've seen a related deadlock here in 4.x with NFS; vm_fault locks
:a page and then calls vput which acquires the v_interlock.  This code
:in vinvalbuf locks the v_interlock first, and then vm_object_page_remove()
:locks the page.  That's a simple lock-order reversal deadlock which I
:guess would still exist even with this patch.
    It doesn't work for the simple reason that vm_fault isn't busying
    the page for writing, it's busying it for reading, so v_numoutput
    will be 0.

    I think the best solution is to change vinvalbuf's wait loop to
    wait on vm_object->paging_in_progress instead of vp->v_numoutput,
    or perhaps to wait for both conditions to be satisfied.
    vm_object->paging_in_progress applies to reads and writes, while
    vp->v_numoutput only applies to writes.

    I know this isn't the most correct solution... there is still the
    lock reversal if vm_object_page_remove() were ever to sleep.  The
    idea is that it won't sleep if there is no paging in progress,
    because nobody will have any of the object's pages locked.  I think
    it is the best we can do for the moment.  There are several places
    in vm/vm_object.c where the same assumption is made (testing
    against vm_object->paging_in_progress, though, which works, not
    vp->v_numoutput).

    Thomas, if you are interested this could be a little project for
    you.  See if you can code up another while() loop to also wait for
    the object's paging_in_progress count to become 0 in the
    vinvalbuf() code.  Look in vm/vm_object.c for examples of how to
    construct such a loop.  I will be happy to review and test whatever
    you come up with, and I'm sure Ian will too!

                                        -Matt

I recently mentioned on freebsd-stable in message

    [EMAIL PROTECTED]

a recurring rslock panic which I believe has been caused by the below
mentioned problem in vinvalbuf().  I have worked up a patch for -STABLE
(which should also apply to -CURRENT if there have not been major changes
to this code).  The patch is appended to this message for review.
Data from the crash dump revealed the following:

SMP 2 cpus
IdlePTD 3555328
initial pcb at 2cf280
panicstr: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x0001
panic messages:
---
panic: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x0001
mp_lock = 0001; cpuid = 0; lapic.id = 0100
boot() called on cpu#0

---

#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:473
#1  0xc016cf8f in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:313
#2  0xc016d3a9 in panic (
    fmt=0xc025bcce "rslock: cpu: %d, addr: 0x%08x, lock: 0x%08x")
    at /usr/src/sys/kern/kern_shutdown.c:581
#3  0xc025bcce in bsl1 ()
#4  0xc021eeac in _unlock_things (fs=0xd7a6dec4, dealloc=0)
    at /usr/src/sys/vm/vm_fault.c:147
#5  0xc021f8c7 in vm_fault (map=0xd7a6bf40, vaddr=673968128,
    fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:826
#6  0xc025d016 in trap_pfault (frame=0xd7a6dfa8, usermode=1, eva=673972223)
    at /usr/src/sys/i386/i386/trap.c:829
#7  0xc025ca8b in trap (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
    tf_edi = 99049, tf_esi =