On Mon, 11 Jul 2016, Mark Hounschell wrote:

> Well, all that was specified in my original post. I can no longer open the
> floppy drive with no floppy media inserted. Worse, I can also no longer open a
> floppy with media inserted that is not a "linux" recognized format. A floppy
> drive is a removable media device and should be treated as such. The original
> implementation of the O_NDELAY flag allowed it to be.
>
> Any removable media device should be capable of being opened with no, or even
> unrecognizable media installed. The kernel and its utilities should not
> "assume" too much when it comes to removable media. Consider a SCSI tape drive
> or even a removable media SCSI disk drive. How would you explain an open
> failure to someone trying to open a SCSI tape drive that had no tape or even a
> "non-tar" formatted tape media in it???
> Or better yet, trying to open a removable media device that was write protected
> but didn't include O_RDONLY in the open?
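For reference, the userspace pattern under discussion amounts to roughly the following. This is only a minimal sketch: the helper name open_nodelay() is hypothetical and not taken from any real tool, and whether the open succeeds without media depends on the kernel version in question.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* O_NDELAY is the historical spelling of O_NONBLOCK on Linux. */
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif

/*
 * Hypothetical helper (name is an assumption for illustration):
 * open a device node read-only with O_NDELAY, so that the caller
 * asks the driver not to wait for, or insist on, usable media.
 * Returns a file descriptor >= 0 on success, or -errno on failure.
 */
static int open_nodelay(const char *path)
{
    int fd = open(path, O_RDONLY | O_NDELAY);

    if (fd < 0)
        return -errno;
    return fd;
}
```

A caller would pass the device node, e.g. open_nodelay("/dev/fd0"), check for a negative return value, and close() the descriptor when done.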
Alright, so you are basically supplying the O_NDELAY flag in order to avoid
check_disk_change() being called. It's rather a coincidence that it has
worked this way, but I agree with you that we can't ignore the fact that
there is userspace relying on this behavior.

> The original behavior of the floppy driver was correct. I have no idea
> what BUG these changes were supposed to fix but the "fix" obviously
> broke user land. Was this bug reported by some new ROBOT test or
> something? The kernel floppy driver has been stable for years now

That's not really true; the code is a racy mess, and this has been uncovered
only since virtualized floppy devices started to exist (because they are much
faster than real hardware, and the different timing reveals bugs that were
not visible before).

This particular fix was made because syzkaller found a way to easily corrupt
kernel memory using O_NDELAY on the floppy driver; see

	https://lkml.org/lkml/2016/2/2/848

> so I am really confused as to why these changes were induced.

The floppy driver is in an orphan mode; no new "features" are added "just
because". Everything that's happening there is to fix real bugs in the
kernel.

I'll look into ways to fix this, but I am afraid this is going to be
really tricky. Therefore we'd very likely have to proceed asap with a revert
of 09954bad448 and then come up with a workaround that'd still avoid the bug
reported by syzkaller.

-- 
Jiri Kosina
SUSE Labs