Re: Patch to allow a driver to report unrecoverable write errors to the buf layer
On Tue, Oct 29, 2002 at 06:34:36PM +0100, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], Hanspeter Roth writes:
> On Oct 18 at 20:45, Maxim Sobolev spoke:
>
> again, then again ad infinitum. The same effect if you mount a
> write-protected floppy in read/write mode.
>
> This is just lame, but I'm not willing to take a shouting match
> with the person who committed this brain-damage.
>
> As for a write-protected floppy, why is it allowed to be mounted as
> writeable? The mount should be degraded to readonly or rejected.
>
> That's a slightly more involved issue because you would have to
> actually try to write to it before you find out that you can't.

For stable I have a patch which checks for write protection of the
floppy during open if the FWRITE bit is set, and fails with EPERM if
the floppy is write-protected. This works reliably for me. The reason
I haven't sent this patch in is that there is a possible conflict with
accesses to a second floppy disk drive at the same time. Anyway,
better than panic'ing the machine...

Actually, all accesses to the controller hardware are serialized
through a state machine, fdstate. The Bad Thing is that this state
machine is bound too tightly to the strategy (i.e. you get some job
done via a buffer or nothing). The best example is the interfacing of
formatting via the B_FORMAT/B_XXX kludge. The Right Thing would be to
redesign the interface to the state machine so that it gets jobs done
from any source (with or without a buffer) and maintains the state of
write protection.
Index: sys/isa/fd.c
===================================================================
RCS file: /usr/cvs/FreeBSD/src/sys/isa/fd.c,v
retrieving revision 1.176.2.8
diff -u -r1.176.2.8 fd.c
--- sys/isa/fd.c	15 May 2002 21:56:14 -0000	1.176.2.8
+++ sys/isa/fd.c	31 Oct 2002 13:06:05 -0000
@@ -1448,6 +1448,21 @@
 		}
 	}
 	fd->ft = fd_types + type - 1;
+	if (flags & FWRITE) {	/* check for write protection */
+		int r, s, st3;
+		s = splbio();
+		set_motor(fdc, fd->fdsu, TURNON);	/* select drive */
+		r = fd_sense_drive_status(fdc, &st3);
+		set_motor(fdc, fd->fdsu, TURNOFF);
+		fdc->state = RESETCTLR;
+		splx(s);
+		if(r != 0)
+			return(ENXIO);
+		if (st3 & NE7_ST3_WP) {
+			device_printf(fd->dev, "write protected\n");
+			return(EPERM);
+		}
+	}
 	fd->flags |= FD_OPEN;
 	/*
 	 * Clearing the DMA overrun counter at open time is a bit messy.

Cheers,

-- 
Thomas Zenker
c/o Lennartz electronic GmbH
Bismarckstrasse 136, D-72072 Tuebingen, Germany
Phone: +49-(0)7071-93550
Email: [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message
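Seen from userland, a check like this means that open(2) for writing on a
write-protected floppy fails up front with EPERM instead of failing later
on the first write. A minimal sketch of how a caller might degrade to
read-only in that case follows. The helper name open_rw_or_ro() is
hypothetical and not part of the patch, and treating EROFS and EACCES the
same way as EPERM is an assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): try to open `path`
 * read/write and fall back to read-only if write access is refused.
 *
 * Returns the descriptor on success and stores 1 in *readonly if the
 * fallback was taken, 0 otherwise. Returns -1 with errno set on failure.
 */
static int
open_rw_or_ro(const char *path, int *readonly)
{
	int fd;

	assert(path != NULL && readonly != NULL);

	fd = open(path, O_RDWR);
	if (fd >= 0) {
		*readonly = 0;
		return (fd);
	}

	/*
	 * EPERM is what the open-time check above reports. EROFS and
	 * EACCES are included as an assumption about other "cannot
	 * write" cases; any other error is passed through unchanged.
	 */
	if (errno != EPERM && errno != EROFS && errno != EACCES)
		return (-1);

	fd = open(path, O_RDONLY);
	if (fd >= 0)
		*readonly = 1;
	return (fd);
}
```

A caller would pass the device node it wants to use and inspect *readonly
afterwards to decide whether to continue in read-only mode or give up.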
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Thu, Dec 13, 2001 at 02:58:28AM -0800, Matthew Dillon wrote:
> @#$@#$ crap. I think I found a dirty-mmap edge case with truncation.
> It requires a change to vm_page_set_validclean(), which of course is
> one of the core routines in the VM system.
>
> Basically what happens is that ftruncate() calls vnode_pager_setsize()
> which eventually calls vm_page_set_validclean(). If you happened to
> mmap() the truncation point shared R+W and dirty it, then truncate to
> something that isn't a multiple of DEV_BSIZE.. for example, if you
> were to truncate to an offset of '10', and a buffer

Matt,

what the hell, this seems to be very close to a problem I have wanted
to report for a week:

In a data acquisition setup I have a writer process writing to a
file-backed, shared mmapped ringbuffer. There can be several reader
processes on this ringbuffer. Once I killed the writer in order to
resize the ringbuffer and forgot about the readers. The writer
truncated the database without unlinking it first. This led the
readers to run forever, or so it seemed at least. After attaching
with gdb I saw that they were only page faulting, nothing more,
forever.

I saw something similar with netscape going mad.

cheers,
Thomas
Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD fall on its face in one easy step )
On Thu, Dec 13, 2001 at 01:40:46PM -0800, Matthew Dillon wrote:
> :Matt,
> :
> :what the hell, this seems to very near by a problem I wanted to
> :report since a week:
> :
> :in a data acquisition I have a write process writing to a file
> :backed shared mmapped ringbuffer. There can be several reader
> :processes on this this ringbuffer. Now once i killed the writer for
> :resizing of the ringbuffer and forgot about the readers. The writer
> :truncated the database without unlinking it before. This lead the
> :readers to be running for ever, it seemed so at least. After
> :attaching with gdb I saw, that they were only page faulting nothing
> :more, for ever
> :
> :Something similar I saw with netscape going mad.
> :
> :cheers, Thomas
>
> That's something else. There's no OS bug there. When you mmap() a
> file only those pages that are within the file's boundaries are
> valid. So if you ftruncate() the file then all the pages occurring
> after the (new) file EOF will become invalid and bus-fault if the
> process touches them.
>
> You touched upon the correct solution... remove() the file instead
> of ftruncate()ing it. The file's data then remains intact for the
> processes still referencing it.
>
> The readers must be catching SIGBUS and retrying (not exiting),
> causing them to run in a signal loop forever. This is a case of bad
> programming. I've seen it before... there was a popular IRC bot back
> in my BEST days which constantly got itself into infinite loops
> because the guy who wrote it installed a signal handler for SIGBUS.
>
> -Matt
> Matthew Dillon [EMAIL PROTECTED]

Well, I know that this was a bug in my software: not unlinking the
file first, and truncating it instead :-). But SIGBUS was not caught
in the readers. I will try to reproduce it.

Thomas