Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
In article <[EMAIL PROTECTED]>, Sergey Babkin <[EMAIL PROTECTED]> writes
>
>By the way, journaling filesystems don't necessarily guarantee that
>you won't need fsck: for example, if VXFS crashes at a particularly
>bad moment, it will require you to do "fsck -o full", which is as slow
>as the fsck on traditional UFS.

JFS still scores against traditional Unix file systems on large volumes
(e.g. terabytes), as it requires very small amounts of virtual memory
during a full fsck.

ttfn,
Tony

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Tue, 18 Dec 2001, Brandon D. Valentine wrote:

> On Tue, 18 Dec 2001, Mike Bristow wrote:
>
> >I suspect that the background fsck[1] that's available in FreeBSD-current
> >fits the bill just as well as JFS or XFS - and I'll also bet that it'll
> >be available in a FreeBSD-release before I'd trust data to a port of
> >JFS or XFS.
>
> This is a killer feature. Has anyone decided whether snapshots and
> background fsck will ever be backported to the RELENG_4 branch or are
> they destined for 5.0?

In a word (or two): highly unlikely. This code has been considered
experimental for a while now, and I expect that it will remain so. While
its stability has been gradually improving (it no longer toasts your
system when you send a kill signal to fsck_ffs in the background), a
number of usability factors are still being addressed. Kirk recently
committed several performance improvements that (apparently) result in a
far more usable system during the background fsck. Previously, my system
was available, but largely unusable, during the background fsck.

This code relies on the FFS snapshot feature, which is also not as widely
tested, and has some compatibility concerns. If the support for snapshots
hasn't yet been MFC'd to -STABLE fsck, we may want to consider doing so;
last time I checked, if a snapshot was found by RELENG_4's fsck, it would
be rather sadly removed with some unhappiness from fsck.

As such, I'd probably resist efforts to MFC this code, and just go for
inclusion in 5.0-RELEASE. We'll need to give it a lot of testing,
however. :-)

Robert N M Watson       FreeBSD Core Team, TrustedBSD Project
[EMAIL PROTECTED]       NAI Labs, Safeport Network Services
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
Alfred Perlstein wrote:
> > By the way, journaling filesystems don't necessarily guarantee that
> > you won't need fsck: for example, if VXFS crashes at a particularly
> > bad moment, it will require you to do "fsck -o full", which is as slow
> > as the fsck on traditional UFS.
>
> Yeah, but that's not mentioned in the whitepaper! :)

Your insane humor quotient is very high today...

Actually, this is mentioned in the white papers of all journalling FSs,
but is generally glossed over with application-specific hardware that is
missing on PCs, which will record the cause of the failure across a
reboot, and will throw a chock in front of the wheels before a bad write
on a power failure... something IDE drives fail to do, but SCSI drives do
not (or did not, until recently).

Of course, you can't just use PC CMOS for this because of the lack of DC
hold-up time and AC fail notification in standard PC power supplies.

You owe the Oracle your first born child, and, because of the GPL, anyone
who marries your first born child owes the Oracle their first born child,
and so on, recursively and eternally, forever after.

-- Terry
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
* Sergey Babkin <[EMAIL PROTECTED]> [011218 19:45] wrote:
> Dan Nelson wrote:
> >
> > In the last episode (Dec 18), Mike Bristow said:
> > > I suspect that the background fsck[1] that's available in FreeBSD-current
> > > fits the bill just as well as JFS or XFS - and I'll also bet that it'll
> > > be available in a FreeBSD-release before I'd trust data to a port of
> > > JFS or XFS.
> >
> > The problem with a background fsck is you still have to run fsck,
> > which can take 10 minutes on a large volume when it's idle, and who
>
> By the way, journaling filesystems don't necessarily guarantee that
> you won't need fsck: for example, if VXFS crashes at a particularly
> bad moment, it will require you to do "fsck -o full", which is as slow
> as the fsck on traditional UFS.

Yeah, but that's not mentioned in the whitepaper! :)

--
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
http://www.morons.org/rants/gpl-harmful.php3
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
Dan Nelson wrote:
>
> In the last episode (Dec 18), Mike Bristow said:
> > I suspect that the background fsck[1] that's available in FreeBSD-current
> > fits the bill just as well as JFS or XFS - and I'll also bet that it'll
> > be available in a FreeBSD-release before I'd trust data to a port of
> > JFS or XFS.
>
> The problem with a background fsck is you still have to run fsck,
> which can take 10 minutes on a large volume when it's idle, and who

By the way, journaling filesystems don't necessarily guarantee that
you won't need fsck: for example, if VXFS crashes at a particularly
bad moment, it will require you to do "fsck -o full", which is as slow
as the fsck on traditional UFS.

-SB
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
In the last episode (Dec 18), Mike Bristow said:
> I suspect that the background fsck[1] that's available in FreeBSD-current
> fits the bill just as well as JFS or XFS - and I'll also bet that it'll
> be available in a FreeBSD-release before I'd trust data to a port of
> JFS or XFS.

The problem with a background fsck is you still have to run fsck,
which can take 10 minutes on a large volume when it's idle, and who
knows how long as a background process when the system's up. It might
not even finish at all if a user starts modifying a large file, causing
the snapshot file that the background fsck is using to grow and fill up
the filesystem. Unlikely, but possible if your disk is almost full
already.

--
Dan Nelson
[EMAIL PROTECTED]
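To make the snapshot-growth concern concrete, here is a toy copy-on-write model in Python. It is only a sketch under loud assumptions: every written block is assumed to have been in use when the snapshot was taken, one preserved block costs exactly one free block, and the function names are made up for illustration. It says nothing about how FFS actually lays out its snapshot file.

```python
def snapshot_growth(writes):
    """Blocks held by the snapshot after each write, in a toy copy-on-write model.

    `writes` is a sequence of block numbers being overwritten. The first
    write to a block forces its old contents to be preserved in the
    snapshot; later writes to the same block cost nothing extra.
    """
    preserved = set()
    growth = []
    for block in writes:
        preserved.add(block)           # no-op if the block was already preserved
        growth.append(len(preserved))  # snapshot size after this write
    return growth


def fsck_outlives_writes(free_blocks, writes):
    """True if the snapshot never needs more blocks than the filesystem has free."""
    return all(used <= free_blocks for used in snapshot_growth(writes))
```

In this model, rewriting the same few blocks over and over is harmless, while rewriting a large file end to end makes the snapshot grow by roughly the size of that file, which is the case that can fill an almost-full disk.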
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Thu, Dec 13, 2001 at 04:39:58AM -0500, Brandon D. Valentine wrote:
> On Wed, 12 Dec 2001, Matthew Dillon wrote:
>
> >All I can say is... holy shit!
>
> Dude, you kick ass. At work I've been dealing with Linux's crappy NFS
> implementation for years, while FreeBSD has always been pretty damn good
> by comparison. Linux finally got a decent amount of performance under
> 2.4 (which finally does NFSv3 to hosts other than other Linux boxen),
> but it still can't touch the FreeBSD NFS implementation. The more
> robust you make it the easier it is for me to argue for deployment of
> more FreeBSD systems in NFS server roles. The only advantage Linux has
> got right now is XFS, which is admittedly a pretty large advantage on
> multi terabyte filesystems where fsck is impossible.

I'm guessing that the real requirement here is "when the system is
turned on after an unclean shutdown (e.g., power failure), it should be
able to export its NFS filesystems quickly".

I suspect that the background fsck[1] that's available in FreeBSD-current
fits the bill just as well as JFS or XFS - and I'll also bet that it'll
be available in a FreeBSD-release before I'd trust data to a port of
JFS or XFS.

[1] If you've missed it, the basic idea is:

    for fs in $all_filesystems ; do
        if is_a_softupdate_filesystem "$fs" ; then
            fsck "$fs" &
        else
            fsck "$fs"
        fi
    done

except it happens in fsck itself, rather than a shell script.

--
Mike Bristow, embonpointful, but not managerial, damnit.
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
I'm trying to get the license issue clarified, then it can go in
/usr/src/tools/regression.

- Jordan

> Jordan Hubbard <[EMAIL PROTECTED]> writes:
>
> > Guy Harris of NetApp sent me a whole mess-o-changes to it and when I
> > went to forward them to you, I found that I must have been in
> > delete-o-matic mode at some point earlier in my inbox since it was
> > gone. I've requested that he send them to me again and will forward
> > them to you once I get a copy again. Whoops!
>
> Would it be worth making a port for this tool? It sounds like it's
> too important to get lost in a mailing list archive. There's a
> precedence set by having /usr/ports/sysutils/crashme. :-)
>
> -Dom
>
> --
> | Semantico: creators of major online resources |
> | URL: http://www.semantico.com/ |
> | Tel: +44 (1273) 72 |
> | Address: 33 Bond St., Brighton, Sussex, BN1 1RD, UK. |
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
In article <[EMAIL PROTECTED]>, Brandon D. Valentine <[EMAIL PROTECTED]> writes
>
[snip]
>but it still can't touch the FreeBSD NFS implementation. The more
>robust you make it the easier it is for me to argue for deployment of
>more FreeBSD systems in NFS server roles. The only advantage Linux has
>got right now is XFS, which is admittedly a pretty large advantage on
>multi terabyte filesystems where fsck is impossible.

That is what I wanted to hear, an unambiguous argument that a solid
implementation of JFS would be useful to some user segment.

Tony
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
JFWIW, you can build fsx with minimal or no changes on Windows with
David Korn's UWIN kit. All of the other posix-y kits have internal
problems that will cause spurious failures. If you want to use Windows
boxes as test clients (probably a good idea) this is fairly important...

> > I gave out fsx source code at the recent CIFS (SMB) plugfest. If I make
> > the 2002 Connectathon I'll give it out there too. I don't test it on
> > Windows so those defines may be in need of repair. Please send me any
> > patches or cool additions.
>
> Guy Harris of NetApp sent me a whole mess-o-changes to it and when I
> went to forward them to you, I found that I must have been in
> delete-o-matic mode at some point earlier in my inbox since it was
> gone. I've requested that he send them to me again and will forward
> them to you once I get a copy again. Whoops!
>
> - Jordan

--
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also. But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view. [Dr. Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
> I gave out fsx source code at the recent CIFS (SMB) plugfest. If I make
> the 2002 Connectathon I'll give it out there too. I don't test it on
> Windows so those defines may be in need of repair. Please send me any
> patches or cool additions.

Guy Harris of NetApp sent me a whole mess-o-changes to it and when I
went to forward them to you, I found that I must have been in
delete-o-matic mode at some point earlier in my inbox since it was
gone. I've requested that he send them to me again and will forward
them to you once I get a copy again. Whoops!

- Jordan
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Thu, Dec 13, 2001 at 01:40:46PM -0800, Matthew Dillon wrote:
>
> :Matt,
> :
> :what the hell, this seems to be very close to a problem I have wanted
> :to report for a week:
> :
> :In a data acquisition I have a writer process writing to a file-backed
> :shared mmapped ringbuffer. There can be several reader processes on
> :this ringbuffer. Now once I killed the writer for resizing of the
> :ringbuffer and forgot about the readers. The writer truncated the
> :database without unlinking it before. This led the readers to be
> :running forever, it seemed so at least. After attaching with gdb I
> :saw that they were only page faulting, nothing more, forever.
> :
> :Something similar I saw with netscape going mad.
> :
> :cheers, Thomas
>
> That's something else. There's no OS bug there. When you mmap()
> a file only those pages that are within the file's boundaries are
> valid. So if you ftruncate() the file then all the pages occurring
> after the (new) file EOF will become invalid and BUSfault if the
> process touches them.
>
> You touched upon the correct solution... remove() the file instead
> of ftruncate()ing it. The file's data then remains intact for the
> processes still referencing it.
>
> The readers must be catching SIGBUS and retrying (not exiting),
> causing them to run in a signal loop forever. This is a case of
> bad programming. I've seen it before... there was a popular IRC
> bot back in my BEST days which constantly got itself into infinite
> loops because the guy who wrote it installed a signal handler for
> SIGBUS.
>
> -Matt
> Matthew Dillon
> <[EMAIL PROTECTED]>

Well, I know that this was a bug in my software: truncating the file
without unlinking it first :-). But SIGBUS was not caught in the
readers. I will try to reproduce it.

Thomas
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
:Matt,
:
:what the hell, this seems to be very close to a problem I have wanted
:to report for a week:
:
:In a data acquisition I have a writer process writing to a file-backed
:shared mmapped ringbuffer. There can be several reader processes on
:this ringbuffer. Now once I killed the writer for resizing of the
:ringbuffer and forgot about the readers. The writer truncated the
:database without unlinking it before. This led the readers to be
:running forever, it seemed so at least. After attaching with gdb I
:saw that they were only page faulting, nothing more, forever.
:
:Something similar I saw with netscape going mad.
:
:cheers, Thomas

    That's something else. There's no OS bug there. When you mmap()
    a file only those pages that are within the file's boundaries are
    valid. So if you ftruncate() the file then all the pages occurring
    after the (new) file EOF will become invalid and BUSfault if the
    process touches them.

    You touched upon the correct solution... remove() the file instead
    of ftruncate()ing it. The file's data then remains intact for the
    processes still referencing it.

    The readers must be catching SIGBUS and retrying (not exiting),
    causing them to run in a signal loop forever. This is a case of
    bad programming. I've seen it before... there was a popular IRC
    bot back in my BEST days which constantly got itself into infinite
    loops because the guy who wrote it installed a signal handler for
    SIGBUS.

                                        -Matt
                                        Matthew Dillon
                                        <[EMAIL PROTECTED]>
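The remove()-instead-of-ftruncate() advice can be illustrated with a small Python sketch for POSIX systems. The helper name and the use of a temporary file are assumptions made for the example, not anything from Thomas's ringbuffer code. It maps a file as a "reader" would, removes the file's name as the "writer" should, and shows that the mapped data is still readable afterwards. It deliberately does not demonstrate the ftruncate() case, since touching a page past the new EOF would kill the process with SIGBUS.

```python
import mmap
import os
import tempfile


def read_after_unlink(payload):
    """Map a file shared/read-only, unlink it, then read through the mapping.

    `payload` must be non-empty bytes. The returned bytes come from the
    mapping *after* the file's name has been removed.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # The "reader": a shared, read-only mapping of the whole file.
        view = mmap.mmap(fd, len(payload),
                         flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
    finally:
        os.close(fd)
        # The "writer": remove the name instead of truncating the file.
        os.unlink(path)
    try:
        # The inode and its pages stay alive while the mapping exists.
        return view[:]
    finally:
        view.close()
```

A writer that needs a bigger ring would then create a fresh file under the old name; readers holding the old mapping keep seeing the old, intact data until they remap.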
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Thu, Dec 13, 2001 at 02:58:28AM -0800, Matthew Dillon wrote:
>
> @#$@#$ crap. I think I found a dirty-mmap edge case with truncation.
> It requires a change to vm_page_set_validclean(), which of course is
> one of the core routines in the VM system.
>
> Basically what happens is that ftruncate() calls vnode_pager_setsize()
> which eventually calls vm_page_set_validclean().
>
> If you happened to mmap() the truncation point shared R+W and
> dirty it, then truncate to something that isn't a multiple of DEV_BSIZE..
> for example, if you were to truncate to an offset of '10', and a buffer

Matt,

what the hell, this seems to be very close to a problem I have wanted
to report for a week:

In a data acquisition I have a writer process writing to a file-backed
shared mmapped ringbuffer. There can be several reader processes on
this ringbuffer. Now once I killed the writer for resizing of the
ringbuffer and forgot about the readers. The writer truncated the
database without unlinking it before. This led the readers to be
running forever, it seemed so at least. After attaching with gdb I
saw that they were only page faulting, nothing more, forever.

Something similar I saw with netscape going mad.

cheers, Thomas
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
    @#$@#$ crap. I think I found a dirty-mmap edge case with truncation.
    It requires a change to vm_page_set_validclean(), which of course is
    one of the core routines in the VM system.

    Basically what happens is that ftruncate() calls vnode_pager_setsize()
    which eventually calls vm_page_set_validclean().

    If you happened to mmap() the truncation point shared R+W and
    dirty it, then truncate to something that isn't a multiple of
    DEV_BSIZE.. for example, if you were to truncate to an offset of
    '10', and a buffer has not been instantiated or marked dirty for the
    block yet, then the truncate operation will clear the dirty bit on
    the page and your 10 bytes of dirty data will never get synced and
    will disappear if the page is freed.

    vm_page_set_validclean() needs to set the valid bits and clear the
    dirty bits associated with (base,size) within the page. If base
    and/or size is unaligned then the valid and dirty bits encompass
    the bits associated with any overlapping DEV_BSIZEd chunks. This is
    fine for setting valid, but not correct when clearing dirty. Only
    dirty bits for DEV_BSIZE chunks that are fully enclosed in the range
    can be cleared.

    The fix is easy, but a little scary due to being right smack in the
    middle of the VM system.

    --

    In any case, I think I got it licked. I'm going to run this nfs
    tester program overnight on a local filesystem, NFSv2, and NFSv3
    mount. Cross your fingers! If it survives I'll start committing to
    -current tomorrow. I give it about a 70% chance of surviving.

                                        -Matt
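The valid/dirty distinction described above can be sketched as plain bit arithmetic. The following Python is a model only, not the kernel code: the function names, the 4096-byte page size, and the assumption that the truncation path hands over the range from the new EOF to the end of the page are all illustrative. Each bit in a mask stands for one DEV_BSIZE chunk of the page.

```python
DEV_BSIZE = 512    # bytes covered by one valid/dirty bit
PAGE_SIZE = 4096   # assumed page size: 8 chunks, so 8-bit masks


def overlapping_bits(base, size):
    """Mask of every DEV_BSIZE chunk the byte range [base, base+size) touches."""
    if size == 0:
        return 0
    first = base // DEV_BSIZE
    last = (base + size - 1) // DEV_BSIZE
    return ((1 << (last - first + 1)) - 1) << first


def enclosed_bits(base, size):
    """Mask of only those chunks lying entirely inside [base, base+size)."""
    first = (base + DEV_BSIZE - 1) // DEV_BSIZE  # round the start up
    end = (base + size) // DEV_BSIZE             # round the end down (exclusive)
    if end <= first:
        return 0
    return ((1 << (end - first)) - 1) << first


def set_validclean(valid, dirty, base, size):
    """Set valid for overlapping chunks; clear dirty only for enclosed ones."""
    valid |= overlapping_bits(base, size)
    dirty &= ~enclosed_bits(base, size)
    return valid, dirty
```

For the truncate-to-offset-10 example, with the range taken as (10, PAGE_SIZE - 10) and a fully dirty page, the overlapping mask is 0xff but the enclosed mask is 0xfe, so chunk 0 stays dirty and its 10 bytes can still be synced. Clearing dirty with the overlapping mask instead reproduces the bug.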
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
> Thanks! I'm slowly whacking the bugs. I just fixed another one...

That's awesome... I'd hoped this program might help you find a few
things, but I never expected you to find so many bugs in NFS so...
quickly! I certainly didn't expect you to tickle any local filesystem
problems either. :)

> I think I can make it perfect. I'll post another patch tomorrow.

Thanks. With 4.5 imminent these improvements are, to state the
flagrantly obvious, very timely indeed.

- Jordan
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
:
: Very cool. Good job!
:
:-DG
:
:David Greenman
:Co-founder, The FreeBSD Project - http://www.freebsd.org

    Thanks! I'm slowly whacking the bugs. I just fixed another one...
    vtruncbuf() handles the buffers beyond the file EOF but doesn't
    handle the buffer straddling the truncation point, so I had to
    augment the NFS client's truncation code to deal with that.

    With that fixed the tester program got to 34483 operations before
    finding a problem. Hopefully I'm in the home stretch now :-)

    What I really love about this program is that the problems are so
    repeatable. So far the same failure occurs at exactly the same
    place, every time. It makes it unbelievably easy to track the bugs
    down.

    I think I can make it perfect. I'll post another patch tomorrow.

                                        -Matt
                                        Matthew Dillon
                                        <[EMAIL PROTECTED]>
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
Very cool. Good job!

-DG

David Greenman
Co-founder, The FreeBSD Project - http://www.freebsd.org
President, TeraSolutions, Inc. - http://www.terasolutions.com
Pave the road of life with opportunities.