Re: problem for the VM gurus
: VM lookup the page again. Always. vm_fault already does this,
: in fact. We would clean up the code and document it to this effect.
:
: This change would allow us to immediately fix the self-referential
: deadlocks and I think it would also allow me to fix a similar bug
: in NFS trivially.
:
: I should point out here that the process of looking up the pages is a
:significant amount of the overhead of the routines involved. Although
:doing this for just one page is probably sufficiently in the noise as to
:not be a concern.

It would be for only one page and, besides, it *already* re-looks up the page in vm_fault (to see if the page was ripped out from under the caller), so the overhead of the change would be very near zero.

: The easiest interim solution is to break write atomicity. That is,
: unlock the vnode if the backing store of the uio being written is
: (A) vnode-pager-backed and (B) not all in-core.
:
: Uh, I don't think you can safely do that. I thought one of the reasons
:for locking a vnode for writes is so that the file metadata doesn't change
:underneath you while the write is in progress, but perhaps I'm wrong about
:that.
:
:-DG
:
:David Greenman

The problem can be distilled into the fact that we currently hold an exclusive lock *through* a uiomove that might possibly incur read I/O due to pages not being entirely in core. The problem does *not* occur when we are blocked on metadata I/O (such as a BMAP operation), since metadata cannot be mmapped.

Under the current circumstances we already lose read atomicity on the source during the write(), but we do not lose write() atomicity. The simple solution is to give up or downgrade the lock on the destination when blocked within the uiomove. We can pre-fault the first two pages of the uio to guarantee a minimum write-atomicity I/O size.
I suppose this could be extended to pre-faulting the first N pages of the uio, where N is chosen to be reasonably large (like 64K), but we could not guarantee arbitrary write atomicity because the user might decide to write a very large mmap'd buffer (e.g. megabytes or gigabytes), and obviously wiring that many pages just won't work.

The more complex solution is to implement a separate range lock for I/O that is independent of the vnode lock. This solution would also require deadlock detection and restart handling. Atomicity would be maintained from the point of view of the processes running on the machine, but not from the point of view of the physical storage. Since write atomicity is already not maintained from the point of view of the physical storage, I don't think this would present a problem.

Due to the complexity, however, it could not be used as an interim solution. It would have to be a permanent solution for the programming time to be worth it. Doing range-based deadlock detection and restart handling properly is not trivial; it is something that usually only databases need to do.

-Matt
Matthew Dillon
dil...@backplane.com

To Unsubscribe: send mail to majord...@freebsd.org with unsubscribe freebsd-hackers in the body of the message
Re: problem for the VM gurus
: A permanent vnode locking fix is many months away because core
: decided to ask Kirk to fix it, which was news to me at the time.
: However, I agree with the idea of having Kirk fix VNode locking.
:
: Actually, core did no such thing. Kirk told me a month or so ago that he
:intended to fix the vnode locking. Not that this is particularly important,
:but people shouldn't get the idea that Kirk's involvement had anything to
:do with core since it did not.
:
:-DG
:
:David Greenman

Let me put it this way: you didn't bother to inform anyone else who might have had reason to be interested until it came up as an offhand comment at USENIX. Perhaps you should consider not keeping such important events to yourself, eh?

Frankly, I am rather miffed. If I had known that Kirk had expressed an interest a month ago, I would have been able to pool our interests earlier. Instead I've been working in a vacuum for a month because I didn't know that someone else was considering trying to solve the problem. This does not fill me with rosy feelings.

-Matt
Matthew Dillon
dil...@backplane.com

:Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org
:Creator of high-performance Internet servers - http://www.terasolutions.com
Re: problem for the VM gurus
Matthew Dillon dil...@apollo.backplane.com writes:

: :A permanent vnode locking fix is many months away because core
: :decided to ask Kirk to fix it, which was news to me at the time.
: :However, I agree with the idea of having Kirk fix VNode locking.
: :
: : Actually, core did no such thing. Kirk told me a month or so ago that he
: :intended to fix the vnode locking. Not that this is particularly important,
: :but people shouldn't get the idea that Kirk's involvement had anything to
: :do with core since it did not.
:
: Let me put it this way: you didn't bother to inform anyone else who might
: have had reason to be interested until it came up as an offhand comment at
: USENIX. Perhaps you should consider not keeping such important events to
: yourself, eh?
:
: Frankly, I am rather miffed. If I had known that Kirk had expressed an
: interest a month ago, I would have been able to pool our interests earlier.
: Instead I've been working in a vacuum for a month because I didn't know
: that someone else was considering trying to solve the problem. This does
: not fill me with rosy feelings.

Eivind Eklund has also been working on this. It is my understanding that he has a working Perl version of vnode_if.sh, and is about halfway through adding invariants to the locking code to track down locking errors. He stopped working on it about a month or two ago for lack of time; I seem to recall that he had managed to get the kernel to boot and was working on panics (from violated invariants) which occurred during fsck.

DES
-- 
Dag-Erling Smorgrav - d...@flood.ping.uio.no
Re: problem for the VM gurus
: * We hack a fix to deal with the mmap/write case. A permanent vnode
:   locking fix is many months away because core decided to ask Kirk to fix
:   it, which was news to me at the time. However, I agree with the idea of
:   having Kirk fix VNode locking. But since this sort of permanent fix is
:   months away, we really need an interim solution to the mmap/write
:   deadlock case.
:
:   The easiest interim solution is to break write atomicity. That is,
:   unlock the vnode if the backing store of the uio being written is (A)
:   vnode-pager-backed and (B) not all in-core. This will generally fix all
:   known deadlock situations, but at the cost of write atomicity in certain
:   cases. We can use the same hack that the pipe code uses and only
:   guarantee write atomicity for small block sizes. We would do this by
:   wiring (and faulting, if necessary) the first N pages of the uio prior
:   to locking the vnode. We cannot wire all the pages of the uio since the
:   user may specify a very large buffer - megabytes or gigabytes.
:
: * Stage 3: Permanent fix is committed by generally fixing vnode locks and
:   VFS layering. ... which may be 6 months if Kirk agrees to do a complete
:   rewrite of the vnode locking algorithms.

Regarding atomicity: remember that you cannot assume that the mappings stay the same during almost any I/O mechanism anymore. The issue of wiring pages and assuming constant mapping has to be resolved. A careful definition is needed of whether one is doing I/O to an address or I/O to a specific piece of memory. I know that this is an end condition, but it has consequences as to the effects on the design. (I suspect that a punt to do I/O to a virtual address is correct, but those change, and also disappear.)

John
Re: problem for the VM gurus
A permanent vnode locking fix is many months away because core
decided to ask Kirk to fix it, which was news to me at the time.
However, I agree with the idea of having Kirk fix VNode locking.

Actually, core did no such thing. Kirk told me a month or so ago that he intended to fix the vnode locking. Not that this is particularly important, but people shouldn't get the idea that Kirk's involvement had anything to do with core since it did not.

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org
Creator of high-performance Internet servers - http://www.terasolutions.com
Re: problem for the VM gurus
On Sun, 13 Jun 1999, John S. Dyson wrote:

: Remember that you cannot assume that the mappings stay the same during
: almost any I/O mechanism anymore. The issue of wiring pages and assuming
: constant mapping has to be resolved. A careful definition is needed of
: whether one is doing I/O to an address or I/O to a specific piece of
: memory. I know that this is an end condition, but it has consequences as
: to the effects on the design. (I suspect that a punt to do I/O to a
: virtual address is correct, but those change, and also disappear.)

Which brings up the fact that some of us have been talking about making all I/O operations refer to PHYSICAL pages at the strategy layer. Consider, for raw I/O:

  read()     ... user address.
  physio()   ... user pages are faulted to ensure they are present, then
                 physical addresses are extracted and remapped to KV
                 addresses.
  strategy() ... for DMA devices (most of the ones we really care about),
                 the KV addresses are converted to PHYSICAL addresses again.

If we changed the interface so that the UIO passed from physio to the strategy routine held an array of physical addresses, we could save quite a bit of work. Also, it wouldn't matter whether or not the pages were mapped, as long as they are locked in RAM. For dumb devices that don't do DMA, the pages would be mapped by some other special scheme.

For pages coming from the buffer cache/VM system, the physical page addresses should already be known somewhere, and the physical UIO addresses should be pretty trivially collected for the strategy call. This sounds like a project that could be bitten off and completed pretty quickly.

1/ Redefine UIO correctly to include UIO_PHYSSPACE, with an appropriate change to iovec to allow physical addresses (they may be different from virtual addresses on some architectures). Maybe define a phys_iovec[] and make the pointer in the UIO a pointer to a union. (?)

2/ Change drivers to be able to handle getting a UIO_PHYSSPACE request.
This would require adding a routine to map such requests into KV space, for use by the dumb drivers. (All drivers would still know how to handle old-style requests.)

3/ Change the callers at our leisure (e.g. physio, buffer cache, etc.).

Anyone have comments? I know I discussed this with the NetBSD guys (e.g. chuq, chuck and jason) and they said they were looking at similar things.

Possible gotchas: you would have to be careful about blocking when allocating the iovec of physical addresses. Maybe they would be allocated as part of allocating the UIO (maybe you'd have to specify how many pages the UIO should hold when you allocate it, to optimize the allocation). How this fits into the proposed rewrite of the I/O request (struct buf) that has been rumbling in the background needs evaluation.

julian

p.s. David: you didn't comment on Matt's submission of a plan for attacking the deadlock problem. It sounded reasonable to me, but I'm only marginal in this area.
Re: problem for the VM gurus
:   Interesting. It's an overlapping same-process deadlock with mmap/write.
: This bug also hits NFS, though in a slightly different way, and also occurs
: with mmap/write when two processes are mmap'ing two files and write()ing
: the other descriptor using the map as a buffer. I see a three-stage
: solution:
:
: * We change the API for the VM pager *getpages() code. At the moment the
:   caller busies all pages being passed to getpages() and expects the
:   primary page (but not any of the others) to be returned busied. I also
:   believe that some of the code assumes that the page will not be unbusied
:   at all for the duration of the operation (though vm_fault was hacked to
:   handle the situation where it might have been).
:
:   This API is screwing up NFS and would also make it very difficult for
:   general VFS deadlock avoidance to be implemented properly, and for a fix
:   to the specific case being discussed in this thread to be implemented
:   properly.
:
:   I recommend changing the API such that *ALL* passed pages are unbusied
:   prior to return. The caller of getpages() must then VM lookup the page
:   again. Always. vm_fault already does this, in fact. We would clean up
:   the code and document it to this effect.
:
:   This change would allow us to immediately fix the self-referential
:   deadlocks, and I think it would also allow me to fix a similar bug in
:   NFS trivially.

I should point out here that the process of looking up the pages is a significant amount of the overhead of the routines involved, although doing this for just one page is probably sufficiently in the noise as to not be a concern.

: The easiest interim solution is to break write atomicity. That is, unlock
: the vnode if the backing store of the uio being written is (A)
: vnode-pager-backed and (B) not all in-core.

Uh, I don't think you can safely do that. I thought one of the reasons for locking a vnode for writes is so that the file metadata doesn't change underneath you while the write is in progress, but perhaps I'm wrong about that.
-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org
Creator of high-performance Internet servers - http://www.terasolutions.com
Re: problem for the VM gurus
Interesting. It's an overlapping same-process deadlock with mmap/write. This bug also hits NFS, though in a slightly different way, and also occurs with mmap/write when two processes are mmap'ing two files and write()ing the other descriptor using the map as a buffer. I see a three-stage solution:

* We change the API for the VM pager *getpages() code.

  At the moment the caller busies all pages being passed to getpages() and expects the primary page (but not any of the others) to be returned busied. I also believe that some of the code assumes that the page will not be unbusied at all for the duration of the operation (though vm_fault was hacked to handle the situation where it might have been).

  This API is screwing up NFS and would also make it very difficult for general VFS deadlock avoidance to be implemented properly, and for a fix to the specific case being discussed in this thread to be implemented properly.

  I recommend changing the API such that *ALL* passed pages are unbusied prior to return. The caller of getpages() must then VM lookup the page again. Always. vm_fault already does this, in fact. We would clean up the code and document it to this effect.

  This change would allow us to immediately fix the self-referential deadlocks, and I think it would also allow me to fix a similar bug in NFS trivially.

* We hack a fix to deal with the mmap/write case.

  A permanent vnode locking fix is many months away because core decided to ask Kirk to fix it, which was news to me at the time. However, I agree with the idea of having Kirk fix VNode locking. But since this sort of permanent fix is months away, we really need an interim solution to the mmap/write deadlock case.

  The easiest interim solution is to break write atomicity. That is, unlock the vnode if the backing store of the uio being written is (A) vnode-pager-backed and (B) not all in-core. This will generally fix all known deadlock situations, but at the cost of write atomicity in certain cases.
  We can use the same hack that the pipe code uses and only guarantee write atomicity for small block sizes. We would do this by wiring (and faulting, if necessary) the first N pages of the uio prior to locking the vnode. We cannot wire all the pages of the uio, since the user may specify a very large buffer - megabytes or gigabytes.

* Stage 3: Permanent fix is committed by generally fixing vnode locks and VFS layering.

  ... which may be 6 months if Kirk agrees to do a complete rewrite of the vnode locking algorithms.

-Matt
Matthew Dillon
dil...@backplane.com
Re: problem for the VM gurus
Howard Goldstein said:

: On Mon, 7 Jun 1999 18:38:51 -0400 (EDT), Brian Feldman gr...@unixhelp.org wrote:
: : On Mon, 7 Jun 1999, Matthew Dillon wrote:
: : ... what version of the operating system?
: : 4.0-CURRENT
:
: 3.2R too...

I just checked the source (CVS) tree, and something bad happened between 1.27 and 1.29 of ufs_readwrite.c. Unless other things had been changed to make the problem go away, the recursive vnode handling was broken then. I am surprised that it was changed that long ago. (The breakage is an example of someone making a change and either not understanding why the code was there, or forgetting to put the alternative into the code.)

-- 
John             | Never try to teach a pig to sing,
dy...@iquest.net | it makes one look stupid
jdy...@nc.com    | and it irritates the pig.
Re: problem for the VM gurus
On Wed, 9 Jun 1999, John S. Dyson wrote:

: Howard Goldstein said:
: : On Mon, 7 Jun 1999 18:38:51 -0400 (EDT), Brian Feldman gr...@unixhelp.org wrote:
: : : On Mon, 7 Jun 1999, Matthew Dillon wrote:
: : : ... what version of the operating system?
: : : 4.0-CURRENT
: : 3.2R too...
:
: I just checked the source (CVS) tree, and something bad happened between
: 1.27 and 1.29 of ufs_readwrite.c. Unless other things had been changed to
: make the problem go away, the recursive vnode handling was broken then. I
: am surprised that it was changed that long ago. (The breakage is an
: example of someone making a change and either not understanding why the
: code was there, or forgetting to put the alternative into the code.)

Is that the limit to Bruce's fu*kup, or did he break it elsewhere, too? It'd be nice to get this reversed since it's been found. And FWIW, semenu seems to be the only one to have anything to handle IN_RECURSE, probably because his NTFS code was recently committed and not mangled.

Brian Feldman
gr...@unixhelp.org
FreeBSD: The Power to Serve!
http://www.freebsd.org
Re: problem for the VM gurus
: On Wed, 9 Jun 1999, John S. Dyson wrote:
:
: : I just checked the source (CVS) tree, and something bad happened between
: : 1.27 and 1.29 of ufs_readwrite.c. Unless other things had been changed
: : to make the problem go away, the recursive vnode handling was broken
: : then. I am surprised that it was changed that long ago. (The breakage is
: : an example of someone making a change and either not understanding why
: : the code was there, or forgetting to put the alternative into the code.)
:
: Is that the limit to Bruce's fu*kup, or did he break it elsewhere, too?
: It'd be nice to get this reversed since it's been found. And FWIW, semenu
: seems to be the only one to have anything to handle IN_RECURSE, probably
: because his NTFS code was recently committed and not mangled.

I think that I had most of the filesystems fixed somewhere (in my private tree or in the standard one). It is easy to make mistakes, but he was also right that there is probably a better way to do it. I suggest putting the recurse stuff back in for a quick fix, and working the problem in more detail in the future. (I could even be wrong about whether this is where the problem came in -- so much has happened since then :-)).

John
Re: problem for the VM gurus
John S. Dyson writes:

: Howard Goldstein said:
: : On Mon, 7 Jun 1999 18:38:51 -0400 (EDT), Brian Feldman gr...@unixhelp.org wrote:
: : : 4.0-CURRENT
: : 3.2R too...
:
: I just checked the source (CVS) tree, and something bad happened between
: 1.27 and 1.29 of ufs_readwrite.c. Unless other things had been changed to
: make the problem go away, the recursive vnode handling was broken then.

I can pretty easily test patches and try other stuff out on a couple of dozen brand-new, architecturally (sp) stressed-out (memory-wise (zero swap, 16 MB RAM, mfsroot) and CPU-bandwidth-wise (386sx40)) 3.1-R (switchable to 3.2R) systems, if it'd be helpful. Should it bring out clues leading to the fix for 'the' golden page-not-present instability, it'd be awesome karma. This very limited environment is especially fragile and highly susceptible to consistently reproducing the popular >= 3.1R page-not-present panics.
Re: problem for the VM gurus
: John S. Dyson writes:
: : Howard Goldstein said:
: : : On Mon, 7 Jun 1999 18:38:51 -0400 (EDT), Brian Feldman gr...@unixhelp.org wrote:
: : : : 4.0-CURRENT
: : : 3.2R too...
: :
: : I just checked the source (CVS) tree, and something bad happened between
: : 1.27 and 1.29 of ufs_readwrite.c. Unless other things had been changed
: : to make the problem go away, the recursive vnode handling was broken
: : then.
:
: I can pretty easily test patches and try other stuff out on a couple of
: dozen brand-new, architecturally (sp) stressed-out (memory-wise (zero
: swap, 16 MB RAM, mfsroot) and CPU-bandwidth-wise (386sx40)) 3.1-R
: (switchable to 3.2R) systems, if it'd be helpful. Should it bring out
: clues leading to the fix for 'the' golden page-not-present instability,
: it'd be awesome karma. This very limited environment is especially fragile
: and highly susceptible to consistently reproducing the popular >= 3.1R
: page-not-present panics.

BTW, one more thing that is useful for testing limited-memory situations is setting the MAXMEM config variable. Last time that I looked, it allows you to set the number of K of available memory. If you try to run with less than MAXMEM=4096 or MAXMEM=5120, you'll have trouble, though.

John
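[Editor's illustration] As a concrete example of the MAXMEM suggestion, the kernel configuration line would look roughly like this. The value 8192 is an arbitrary example (the unit is kilobytes, per John's description, so this caps the kernel at 8 MB), and the exact quoting syntax can vary between config(8) versions; per John's warning, values much below 4096-5120 will not run.

```
# Kernel configuration fragment (hypothetical example value):
# cap recognized memory at 8 MB for low-memory testing.
options  MAXMEM=8192    # value in kilobytes
```

After adding the line, rebuild and install the kernel as usual for the config-file change to take effect.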
Re: problem for the VM gurus
Arun Sharma said:

:   bread
:   ffs_read
:   ffs_getpages
:   vnode_pager_getpages
:   vm_fault
:   --- slow_copyin
:   ffs_write
:   vn_write
:   dofilewrite
:   write
:   syscall
:
: getblk finds that the buffer is marked B_BUSY and sleeps on it. But I
: can't figure out who marked it busy.

This looks like the historical problem of doing I/O to an mmap'ed region. There are two facets to the problem: one where the I/O is to the same vn, and the other where the I/O is to a different vn. The case where the I/O is to the same vn had a (short-term) fix previously in the code, by allowing for recursive usage of a vn under certain circumstances. The problem of different vn's can be fixed by proper resource handling in vfs_bio (and perhaps other places). (My memory isn't 100% clear on the code anymore, but you have shown a lot of info with your backtrace.)

-- 
John             | Never try to teach a pig to sing,
dy...@iquest.net | it makes one look stupid
jdy...@nc.com    | and it irritates the pig.
Re: problem for the VM gurus
Is this one of the problems that would make it sensible to do a complete rewrite of vfs_bio.c?

Brian Feldman
gr...@unixhelp.org
FreeBSD: The Power to Serve!
http://www.freebsd.org
Re: problem for the VM gurus
: Is this one of the problems that would make it sensible to do a complete
: rewrite of vfs_bio.c?

Specifically for that reason, probably not. However, if the effort were taken as an entire and encompassing effort, with an understanding of what is really happening in the code regarding policy (and there is a lot more than the original vfs_bio-type things), then it would certainly be best. Note that some of the policy might even be marginalized, given a restructuring, by eliminating the conventional struct bufs for everything except I/O.

In the case of I/O, it would be good to talk to those who work on block drivers and collect info on what they need. The new definition could replace the struct bufs for the block I/O subsystems, but in many ways could be similar to struct bufs (for backwards compatibility). In the current vfs_bio, the continual remapping is problematical, and was one of the very negative side effects of the backwards-compatibility choice. The original vfs_bio merged-cache design actually (mostly) eliminated the struct bufs for the buffer cache interfacing, and the temporary mappings thrashed much less often.

It would also be good to design in the ability to use physical addressing (for those architectures that don't incur significant additional cost for physically mapping all of memory). Along with proper design, fully mapped physical memory would eliminate the need for remapping entirely. Uiomove in this case wouldn't need virtually mapped I/O buffers, and this would be ideal. However, it is unlikely that x86 machines would ever support this option. PPCs, R(X) and Alpha can support mapping all of memory by their various means, though.

In a sense, the deadlock issue is an example of the initially unforeseen problems when hacking on that part of the code. I suggest a carefully orchestrated and organized migration towards the more efficient and perhaps architecturally cleaner approach.
The deadlock was an after-the-fact bug that we found very early on, and there was a temporary fix for part of it, and a mitigation of the other part. Issues like that can be very, very nasty to deal with.

John
Re: problem for the VM gurus
Brian Feldman said:

: In the long-standing tradition of deadlocks, I present to you all a new
: one. This one locks in getblk, and causes other processes to lock in
: inode. It's easy to induce, but I have no idea how I'd go about fixing it
: myself (being very new to that part of the kernel.) Here's the program
: which induces the deadlock:
:
:	tmp = mmap(NULL, psize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
:	if (tmp == MAP_FAILED) {
:		perror("mmap");
:		exit(1);
:	}
:	printf("write retval == %lld\n", write(fd, tmp, psize));

I responded earlier to a reply to this message :-). This did work about the time that I left, and it appears likely that code which mitigated this problem has since been removed. It is important either to modify the way that read or write operations occur (perhaps prefault before letting the uiomove operation occur (yuck, and it still doesn't close all the windows)), or to reinstate the handling of recursive operations on a vnode by the same process. Handling the vnode locking in a more sophisticated way would be better, but reinstating (or fixing) the already-existent code that used to handle this would be a good fix that will mitigate the problem for now.

-- 
John             | Never try to teach a pig to sing,
dy...@iquest.net | it makes one look stupid
jdy...@nc.com    | and it irritates the pig.
Re: problem for the VM gurus
... what version of the operating system?

-Matt

: In the long-standing tradition of deadlocks, I present to you all a new one.
:This one locks in getblk, and causes other processes to lock in inode. It's
:easy to induce, but I have no idea how I'd go about fixing it myself
:(being very new to that part of the kernel.)
: Here's the program which induces the deadlock:
:
:#include <sys/types.h>
:#include <sys/mman.h>
:...
Re: problem for the VM gurus
On Mon, 7 Jun 1999, Matthew Dillon wrote:

: ... what version of the operating system?
:
: -Matt

4.0-CURRENT

: : In the long-standing tradition of deadlocks, I present to you all a new one.
: :This one locks in getblk, and causes other processes to lock in inode. It's
: :easy to induce, but I have no idea how I'd go about fixing it myself
: :(being very new to that part of the kernel.)
: : Here's the program which induces the deadlock:
: :
: :#include <sys/types.h>
: :#include <sys/mman.h>
: :...

Brian Feldman
gr...@unixhelp.org
FreeBSD: The Power to Serve!
http://www.freebsd.org
Re: problem for the VM gurus
On Mon, 7 Jun 1999 18:38:51 -0400 (EDT), Brian Feldman gr...@unixhelp.org wrote:

: On Mon, 7 Jun 1999, Matthew Dillon wrote:
: ... what version of the operating system?
: 4.0-CURRENT

3.2R too...
problem for the VM gurus
In the long-standing tradition of deadlocks, I present to you all a new one. This one locks in getblk, and causes other processes to lock in inode. It's easy to induce, but I have no idea how I'd go about fixing it myself (being very new to that part of the kernel.) Here's the program which induces the deadlock:

#include <sys/types.h>
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int psize = getpagesize() * 2;
	void *tmp;
	char name[] = "m.XX";
	int fd = mkstemp(name);

	if (fd == -1) {
		perror("open");
		exit(1);
	}
	tmp = mmap(NULL, psize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (tmp == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}
	printf("write retval == %lld\n", (long long)write(fd, tmp, psize));
	unlink(name);
	exit(0);
}

Brian Feldman
gr...@unixhelp.org
FreeBSD: The Power to Serve!
http://www.freebsd.org
Re: problem for the VM gurus
Brian Feldman gr...@unixhelp.org writes:

: In the long-standing tradition of deadlocks, I present to you all a new
: one. This one locks in getblk, and causes other processes to lock in
: inode. It's easy to induce, but I have no idea how I'd go about fixing it
: myself (being very new to that part of the kernel.) Here's the program
: which induces the deadlock:

I could reproduce it with 4.0-current. The stack trace was:

  tsleep
  getblk
  bread
  ffs_read
  ffs_getpages
  vnode_pager_getpages
  vm_fault
  --- slow_copyin
  ffs_write
  vn_write
  dofilewrite
  write
  syscall

getblk finds that the buffer is marked B_BUSY and sleeps on it. But I can't figure out who marked it busy.

-Arun

PS: Does anyone know how to get the stack trace by pid in ddb? I can manually type "trace p->p_addr", but is there an easier way?