Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
Alexander Viro wrote: ioctls are evil, period. At least with these names you can use normal scripting and don't need any special tools. Every ioctl means a binary that has no business to exist. Special names are butt-ugly. ioctl's can be replaced with games on /proc or whatever, which are better than special names. What about partition editing on other OSs? There's no reason why fdisk/parted/etc. should be Linux only. Why should the kernel need to know how to write partition tables? It needs to read them. Writing doesn't add much. Wrong. When you read, you throw out 90% of the useless crap. When you write, you need to know about it, and provide interfaces for it. I'd rather see trivial partitioning tools that consist only of UI code in case of Linux. Some stuff friendly partition tools should have, IMHO: (1) ability to predict what's going to happen. That way, you can play around until it looks nice, and hit the friendly commit button. (2) ability to do data recovery (eg: probe for signatures where it expects the start of partitions to occur. You can be intelligent/quick about it, by knowing about alignment stuff, for example) (3) ability to convert between partition table types (and even LVM ;-) This can be tricky because of alignment stuff. So: (1) could be done in-kernel by being able to discard changes, and re-reading, I guess. (2) and (3) really only need alignment stuff. Also, you need to be able to deal with legacy stuff, like setting magic flags for booting. Also, different partition table formats have different alignment constraints (which is relevant for creating partitions). These mainly need to be respected for other braindead OS's and/or BIOSes. Communicating those between user/kernel space doesn't excite me. So don't communicate them. So, what do you do? 
Sometimes, you want to force alignment violations (eg: recovering an accidently deleted partition) The real problem happens when you want to resize file systems, and you need to simultaneously satisfy resizer and partition table constraints. (there are currently no resizers like this, but an ext2-resize-the-start and NTFS-resize-the-start would definitely be like this... when I get time to write them. It's pure luck that you don't need this for FAT, but this causes all sorts of headaches for Linux...) Anyway, you have one constraint in user space, and one in the kernel... how do you find the intersection? Libtool friends deals with version skew (ugly, but it works...) With statically linked binaries? How? Why do we need them? You can write wrappers for libraries. Uh-huh. And you can write them for ioctls. We had been busily doing that for years. Results are not pretty, to put it very mildly. If you can get everything into a nice file system interface, then you've convinced me. BTW, most of the code can very well sit in the userland, but that's another story (userland filesystems). Anyway, there's only one way to settle such stuff - sit down and write the patch. Which is what I'm going to do. Have fun. So, my patch will be about 50 lines in parted, to call blkpg, and provide a kernelread command... But, philosophy essay to write... :-( (you have to wait until Monday) Then you can rm -r fs/partitions But, I don't see how patches will settle anything, when we're arguing over interfaces stuff needed for partition tools. Or are you writing patches for Parted as well? Andrew Clausen - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED]
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On Sat, 19 May 2001, Andrew Morton wrote:
> Alexander Viro wrote:
> > > (2) what about bootstrapping? how do you find the root device?
> > > Do you do root=/dev/hda/offset=63,limit=1235823? Bit nasty.
> >
> > Ben's patch makes initrd mandatory.
>
> Can this be fixed? I've *never* had to futz with initrd.
> Probably most systems are the same. It seems a step
> backward to make it necessary.

Well, if you remove partition table parsing from the kernel - you've got
to boot with root on unpartitioned device (e.g. /dev/ram0) and either stay
that way or bring the userland code that understands partitioning on that
device...
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Here's a dumb question, and I apologize if I am questioning computer
science dogma...

Why are LVM and EVMS (competing LVM project) needed at all?

Surely the same can be accomplished with
* md
* snapshot blkdev (attached in previous e-mail)
* giving partitions and blkdevs the ability to grow and shrink
* giving filesystems the ability to grow and shrink

On-line optimization (defrag, etc) shouldn't be hard once you have the
ability to move blocks and files around, which would come with the
ability to grow and shrink blkdevs and fs's.

--
Jeff Garzik      | "Do you have to make light of everything?!"
Building 1024    | "I'm extremely serious about nailing your
MandrakeSoft     |  step-daughter, but other than that, yes."
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Alexander Viro wrote:
>
> Folks, before you get all excited about cramming side effects into
> open(2), consider the following case:
>
> 1) opening /dev/zero/start_nuclear_war has a certain side effect.
>
> 2) Local user does the following:
>         ln -sf /dev/zero/start_nuclear_war bar
>         while true; do
>                 mkdir foo
>                 rmdir foo
>                 ln -sf bar foo
>                 rm foo
>         done
>
> 3) Comes the night and root runs (from crontab) updatedb(8). Said beast
> includes find(1). With sufficiently bad timing find _will_ be tricked
> into attempt to open foo. It will honestly lstat() it, all right. But
> there's no way to make sure that subsequent open() on the found
> directory will get the same object.
>
> 4) Side effect happens...
>
> Similar scenarios can be found for other programs run by/as root, but I
> think that the point is obvious - side effects on open() are not a good
> idea. Yes, we can play with checking for O_DIRECTORY, yodda, yodda, but
> I wouldn't bet a dime on security of a system with such side effects.
>
> A lot of stuff relies on the fact that close(open(foo, O_RDONLY)) is a
> no-op. Breaking that assumption is a Bad Thing(tm).

Can't this be easily avoided if the needed action is not
    < /dev/zero/start_nuclear_war
or
    > /dev/zero/start_nuclear_war
but
    echo "I'm evil" > /dev/zero/start_nuclear_war
?

--
Abramo Bagnara                       mailto:[EMAIL PROTECTED]

Opera Unica                          Phone: +39.546.656023
Via Emilia Interna, 140
48014 Castel Bolognese (RA) - Italy

ALSA project                         http://www.alsa-project.org
It sounds good!
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
[EMAIL PROTECTED] wrote:
>
> Hmm. You know that I wrote this long ago?

Well, let's not get too hung up on the disk thing (yeah, I
started it...). Ben's intent here is to *demonstrate* how
argv-style info can be passed into device nodes. It seems
neat, and nice.

We can also make use of a strong argument parsing library
in the kernel - there are a great number of open-coded
string bashing functions which could be rationalised and
regularised.

So. When am I going to be able to:

        open("/bin/ls,-l,/etc/passwd", O_RDONLY);

?
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On Sat, 19 May 2001, Ben LaHaise wrote:
>
> 1. Generic lookup method and argument parsing (fs/lookupargs.c)

Looks sane.

> 2. Restricted block device (drivers/block/blkrestrict.c)

This is not very user-friendly, but along with symlinks this makes perfect
sense. It would make partition handling a _lot_ simpler.

Note, however, that I think the restricted block device is a much more
generic issue than just block devices. I've already discussed with Alan
the possibility of making _all_ file descriptors have the notion of
restrictions, notably the "start, end" kind of things.

It is very useful for other things too - imagine opening /dev/mem, and
wanting to pass a restricted portion of it to other processes with the
standard file descriptor passing facilities (think secure DGA for the X
server, but also think untrusted users that can read parts of shared files
etc - a suid program that opens a file, restricts it, drops privileges and
knows that the program can only access a specific part of the file)

> 3. Userspace partition code proposal

Yes and no.

I absolutely think the idea that users actually _using_ these names is a
horrible one, and fraught with potential for much too easy mistakes that
end up being disastrous.

But having symlinks that are created by a special program would be ok.

[ Also, note how symlinks would make the point of initrd completely moot.
  You don't have to have initrd to initialize the thing, you can
  initialize the thing at installation time and when doing fdisk, and the
  symlinks would act as the permanent markers. ]

HOWEVER, you have to realize that there are serious security and
maintenance issues here, and I think your idea breaks down completely
because of that.

The thing is, you only have permissions on a per-object basis, and it's
common practice to have different permissions for different partitions.
Your scheme does not allow this. Which means that it is fundamentally
broken.

Sorry.

So don't go overboard. The name-based thing is useful, but it's useful for
only certain things. And you must _never_ forget the security and
management issues.

For example, if you can open a serial port in the first place, you can set
its baud-rate. So it's ok to make baud-rate part of the name.

And once you have permission to read /dev/fd0 it doesn't make sense to
limit you to one particular format. So it's ok to have the disk format be
part of the name.

But it's not possible to make the partition be a name issue. Because while
you obviously need different names, you _also_ need different permissions.

		Linus
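The "start, end" restriction on a descriptor can be approximated in userspace to show the intended semantics. This is only a sketch under stated assumptions: struct fd_window and window_pread are hypothetical names, and the clamping here is voluntary, whereas the proposal above is for the kernel to enforce it on the descriptor itself so that an untrusted recipient cannot bypass it.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace view of a descriptor limited to [start, end). */
struct fd_window {
    int   fd;
    off_t start;   /* first byte visible through the window */
    off_t end;     /* one past the last visible byte */
};

/* Read at `off` relative to the window start; never touches bytes
 * outside the window. Returns bytes read, 0 at window end, -1 on error. */
ssize_t window_pread(const struct fd_window *w, void *buf, size_t len, off_t off)
{
    off_t size;

    if (off < 0 || w->end < w->start) {
        errno = EINVAL;
        return -1;
    }
    size = w->end - w->start;
    if (off >= size)
        return 0;                       /* behaves like EOF */
    if (len > (size_t)(size - off))
        len = (size_t)(size - off);     /* clamp to the window */
    return pread(w->fd, buf, len, w->start + off);
}
```

With a window of start=2, end=6 over a file containing "0123456789", a read of any length at offset 0 yields the four bytes "2345".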
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
Alexander Viro writes:
> On Sat, 19 May 2001, Richard Gooch wrote:
> > The transaction(2) syscall can be just as easily abused as ioctl(2) in
> > this respect. People can pass pointers to ill-designed structures very
>
> Right. Moreover, it's not needed. The same functionality can be
> trivially implemented by write() and read(). As the matter of fact, had
> been done in userland context for decades. Go and buy Stevens. Read it.
> Then come back.

I don't need to read it. Don't be insulting. Sure, you *can* use a
write(2)/read(2) cycle. But that's two syscalls compared to one with
ioctl(2) or transaction(2). That can matter to some applications.

				Regards,

					Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, May 19, 2001 at 12:51:23PM -0600, Richard Gooch wrote:
> Al, if you really want to kill ioctl(2), then perhaps you should
> implement a transaction(2) syscall. Something like:
>     int transaction (int fd, void *rbuf, size_t rlen,
>                      void *wbuf, size_t wlen);
>
> Of course, there wouldn't be any practical gain, since we already have
> ioctl(2). Any gain would be aesthetic.

I can tell you haven't had to write any 32-bit ioctl emulation code for
a 64-bit kernel recently.

--
Revolutions do not require corporate support.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
> On Sun, 20 May 2001, Ingo Oeser wrote:
> > PS: English is neither mine, nor Linus native language. Why do
> >     the English natives complain instead of us? ;-)
>
> Because we had some experience with, erm, localized systems and for
> Alan it's most likely pure theory? ;-)

I think it's important it's considered. I do like the idea of a sensible
ioctl encoding (including ascii potentially) and being able to ship ioctls
over the network.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, May 19, 2001 at 10:22:55PM -0400, Richard Gooch wrote:
> The transaction(2) syscall can be just as easily abused as ioctl(2) in
> this respect.

But read() and write() cannot.

--
Revolutions do not require corporate support.
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Alexander Viro writes:

> Folks, before you get all excited about cramming side effects into
> open(2), consider ...

I agree completely.

> A lot of stuff relies on the fact that close(open(foo, O_RDONLY)) is a
> no-op. Breaking that assumption is a Bad Thing(tm).

Also here I would like to agree.
Unfortunately this is false.
Opening device files often has interesting side effects.

Andries
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
On Sat, 19 May 2001, Jeff Garzik wrote:

> Are we talking about device arguments just for chrdevs and blkdevs?
> (ie. drivers) or for regular files too?

Let's distinguish between per-fd effects (that's what name in
open(name, flags) is for - you are asking for descriptor and telling
what behaviour do you want for IO on it) and system-wide side effects.

IMO encoding the former into name is perfectly fine, and no write on
another file can be sanely used for that purpose. For the latter, though,
we need to write commands into files and here your miscdevices (or procfs
files, or /dev/foo/ctl - whatever) is needed.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
Aaron Lehmann wrote:
> On Sat, May 19, 2001 at 08:05:02PM +0200, [EMAIL PROTECTED] wrote:
> > > initrd is an unnecessary pain in the ass for most people.
> > > It had better not become mandatory.
> >
> > You would not notice the difference, only your kernel would be
> > a bit smaller and the RRPART ioctl disappears.
>
> Would I not notice the difference as a user, as a sysadmin, as a
> kernel builder, as a kernel hacker, or all of the above?

If I understand the status of stuff correctly, I think this would
make it a lot more painful to admin if it became a requirement to
use initrd on everything just to be able to boot. If you've ever
seen the way some of the bootloaders for alternate platforms (like
ppc and 68k) handle booting, you'd see what a pain it can be to
have anything more than a simple string of options getting passed
to the kernel. It's particularly bad on some of the embedded ppc
platforms. I suspect that if this happened, it would never be
allowed into many people's trees, because it would be worth their
effort to maintain different code so they don't have to squeeze
an initrd on flash along with their kernel and root filesystem.

If I'm missing the boat here, please tell me, but it sure seems
like a bad idea to me.

	Brad Boyer
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Abramo Bagnara wrote:
> Alexander Viro wrote:
> > Folks, before you get all excited about cramming side effects into
> > open(2), consider the following case:
> >
> > 1) opening /dev/zero/start_nuclear_war has a certain side effect.
> [...]
>
> Can't this be easily avoided if the needed action is not
>     < /dev/zero/start_nuclear_war
> or
>     > /dev/zero/start_nuclear_war
> but
>     echo "I'm evil" > /dev/zero/start_nuclear_war
> ?

Yes, and that is exactly the difference between having a side effect
on the open(2), versus having the effect as a result of a write(2).

Unfortunately, there are already some cases where an open on a device
can have unexpected results. If you don't want to get blocked waiting
for the carrier-detect signal from the modem when opening a tty device,
you had better specify the O_NONBLOCK option on the open. If you don't
want this flag to be active during the actual I/O operations, then you
would have to do an fcntl to clear the O_NONBLOCK again after the open.

So I guess things have already been a bit messy in this area for many
years, even before linux even existed, and in some cases you can't
really do anything about it because the behaviour is mandated by the
applicable standards, like POSIX, SUS, or whatever.
(The blocking of the open on a tty device is explicitly documented
in my copy of the X/Open specification.)

Fortunately, blocking the nightly backup program by making it
accidentally open a tty is not quite as catastrophic as having it
start a nuclear war, or format the disks, or something, just because
a user was playing games with symlinks.

--
Willem Konynenberg [EMAIL PROTECTED]
I am not able rightly to apprehend the kind of confusion of ideas
that could provoke such a question -- Charles Babbage
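The open-then-fcntl sequence described above can be sketched as follows. The helper name open_then_block is hypothetical; open(2), fcntl(2), F_GETFL, F_SETFL and O_NONBLOCK are the standard POSIX interfaces. A real tty program would probably also pass O_NOCTTY, which is left out here to keep the sketch short.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: avoid blocking inside open(2) itself (e.g. on
 * carrier detect), then restore blocking behaviour for later I/O.
 * Returns a descriptor, or -1 on error. */
int open_then_block(const char *path, int flags)
{
    int fl;
    int fd = open(path, flags | O_NONBLOCK);

    if (fd < 0)
        return -1;

    /* Read the current file status flags and clear only O_NONBLOCK. */
    fl = fcntl(fd, F_GETFL);
    if (fl < 0 || fcntl(fd, F_SETFL, fl & ~O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For example, open_then_block("/dev/ttyS0", O_RDWR) would return without waiting for the modem, and subsequent read() calls on the result would block as usual.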
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On Sat, 19 May 2001, Ben LaHaise wrote:

> It's not done yet, but similar techniques would be applied. I envision
> that a raid device would support operations such as
> open("/dev/md0/slot=5,hot-add=/dev/sda")

Think for a moment and you'll see why it's not only ugly as hell, but
simply won't work.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, May 19, 2001 at 06:48:19PM +0200, Erik Mouw wrote:
> One of the fundamentals of Unix is that everything is a file and that
> you can do everything by reading or writing that file.

But /dev/sda/offset=234234,limit=626737537 isn't a file! ls it and see
if it's there. writing to files that aren't shown in directory listings
is plain evil. I really don't want to explain why. It's extremely messy
and unintuitive.

It would be better to do this with a file that does exist, for example
writing something to /proc/disks/sda/arguments. Then again, I don't even
think much of dynamic file systems in the first place.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
Matthew Wilcox writes:
> On Sat, May 19, 2001 at 12:51:23PM -0600, Richard Gooch wrote:
> > Al, if you really want to kill ioctl(2), then perhaps you should
> > implement a transaction(2) syscall. Something like:
> >     int transaction (int fd, void *rbuf, size_t rlen,
> >                      void *wbuf, size_t wlen);
> >
> > Of course, there wouldn't be any practical gain, since we already have
> > ioctl(2). Any gain would be aesthetic.
>
> I can tell you haven't had to write any 32-bit ioctl emulation code for
> a 64-bit kernel recently.

The transaction(2) syscall can be just as easily abused as ioctl(2) in
this respect. People can pass pointers to ill-designed structures very
easily. The main advantage of transaction(2) is that hopefully, people
will not be so bone-headed as to forget to pass sizeof *structptr as
the size field. So perhaps some error trapping is possible.

				Regards,

					Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
On Sat, 19 May 2001, Abramo Bagnara wrote:

> Can't this be easily avoided if the needed action is not
>     < /dev/zero/start_nuclear_war
> or
>     > /dev/zero/start_nuclear_war
> but
>     echo "I'm evil" > /dev/zero/start_nuclear_war

Sure. And that's the right thing to do (not the implied action, that
is - _that_ would be too messy).
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
> ioctls are evil, period. At least with these names you can use normal
> scripting and don't need any special tools. Every ioctl means a binary
> that has no business to exist.

That is not IMHO a rational argument. It isn't my fault that your shell
does not support ioctls usefully. If you used perl as your login shell
you would have no problem there.

Alan
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
On Sat, May 19, 2001 at 12:51:07PM -0400, Alexander Viro wrote:
> clone(), walk(), clunk(), stat() and open() ;-) Basically, we can add
> unopened descriptors. I.e. no IO until you open it (turning the thing
> into opened one), but we can do lookups (move to child), we can clone
> and kill them and we can stat them.

Those who would like a more detailed explanation can find one at
http://plan9.bell-labs.com/sys/man/5/INDEX.html

--
Revolutions do not require corporate support.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, 19 May 2001, Richard Gooch wrote:

> The transaction(2) syscall can be just as easily abused as ioctl(2) in
> this respect. People can pass pointers to ill-designed structures very

Right. Moreover, it's not needed. The same functionality can be trivially
implemented by write() and read(). As the matter of fact, had been done
in userland context for decades. Go and buy Stevens. Read it. Then come
back.
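The write()-then-read() exchange mentioned above can be sketched in a few lines. The function name transaction_rw is hypothetical and only mirrors the argument order of the proposed transaction(2) prototype; it is not an existing call. The demo stands in for a driver or server by queueing the reply on the other end of a socketpair before the request is sent.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: one request/response exchange built from
 * write(2) and read(2). Two syscalls where ioctl(2) would use one. */
ssize_t transaction_rw(int fd, void *rbuf, size_t rlen,
                       const void *wbuf, size_t wlen)
{
    if (write(fd, wbuf, wlen) != (ssize_t)wlen)
        return -1;                  /* short write or error */
    return read(fd, rbuf, rlen);    /* bytes of reply, or -1 */
}

/* Usage sketch: returns 0 if the round trip behaved as expected. */
int demo_roundtrip(void)
{
    int sv[2];
    char reply[16] = {0}, request[16] = {0};
    ssize_t n, m;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Pretend to be the other side: queue its answer ahead of time. */
    if (write(sv[1], "ok", 2) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = transaction_rw(sv[0], reply, sizeof reply - 1, "get-size", 8);
    m = read(sv[1], request, sizeof request - 1);   /* what the peer saw */
    close(sv[0]);
    close(sv[1]);

    return (n == 2 && m == 8 &&
            strcmp(reply, "ok") == 0 &&
            strcmp(request, "get-size") == 0) ? 0 : -1;
}
```

Because the request and reply are plain byte streams, nothing here passes a pointer to a structure across the interface, which is the property argued for in this part of the thread.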
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
On Sat, 19 May 2001, Alexander Viro wrote:
>
> Folks, before you get all excited about cramming side effects into
> open(2), consider the following case:

Your argument is stupid, imnsho.

Side-effects are perfectly fine if they are _local_ to the file
descriptor. Your example is contrived and idiotic.

Filename extensions would not replace ioctl's. But they are wonderful ways
to avoid unnecessary binary name-spaces, like the ones we have with
callout TTY names, and the one that the fb people had.

For example, do a "ls -l /dev/fd0*", and ponder. Also, realize that we
have these hard-coded names in _addition_ to the magic ioctl to set even
more parameters.

These are all stupid and bad, and it would have been a _lot_ cleaner to be
able to do

	open("/dev/fd0/H1440", O_RDWR)..

or

	open("/dev/fd0/HD,18,85", O_RDWR)

to open special non-standard high-density modes.

We already did this, in a very limited and stupid way, by encoding the
minor number and generating a standard naming scheme. We can do the same
thing in a _much_ more generic way by just realizing that we wanted the
open to be name-based in the first place.

These are _not_ "side effects". They are very much naming conventions. If
I want to open the floppy in one of the special extended modes, it makes
a LOT more sense to just open it with the naming, than to open a "generic"
floppy device only to then use a magic and very unreadable ioctl to set
the mode of the device.

In short, I don't buy your arguments for one single second.

		Linus
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On Sat, 19 May 2001, Andrew Morton wrote:

> So. When am I going to be able to:
>
>         open("/bin/ls,-l,/etc/passwd", O_RDONLY);

You are not. Think for a minute and you'll see why.

Linus' idea of /dev/tty/parameters is marginally sane - it makes sense
to consider that as configuring-upon-open. You _are_ going to do IO on
that file.

Ben's /dev/md0/living_horror is ugly - it's open just for side effects,
with no IO supposed to happen. His idea of passing file descriptor instead
of name makes these side effects even messier.

The stuff you've proposed is a perversion worthy of Albert. You've
introduced additional metacharacter into filenames, you will need some
form of quoting to be able to pass literal commas and you will need to
quote slashes. It's way past ugly.
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Are we talking about device arguments just for chrdevs and blkdevs?
(ie. drivers) or for regular files too?

Speaking about drivers specifically, a controlling miscdev, one per
device or one per group of devices depending on your needs, is a much
more clean solution for passing ioctl-type data. You are free to come
up with whatever method of communication with the driver is most
efficient for your needs -- without perverting open(2).

Notice also a metadata miscdev solves the problem of passing options on
open -- just pass those options to the miscdev before you open it...

metadata miscdevs are a clean solution to what procfs hacks and ioctls
are trying to accomplish.

	Jeff

--
Jeff Garzik
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sun, 20 May 2001, Ingo Oeser wrote:

> PS: English is neither mine, nor Linus native language. Why do
>     the English natives complain instead of us? ;-)

Because we had some experience with, erm, localized systems and for
Alan it's most likely pure theory? ;-)

				Al, still shuddering at the memories
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, May 19, 2001 at 05:25:22PM +0100, Alan Cox wrote:
> Only to an English speaker. I suspect Quebec City canadians would prefer
> a different command set.

Should we support `pas387' as well as `no387' as a kernel boot parameter
then? Face it, a sysadmin has to know the limited subset of english which
is used to configure a kernel.

--
Revolutions do not require corporate support.
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
Linus Torvalds wrote:
> There are some strong arguments that we should have filesystem
> "backdoors" for maintenance purposes, including backup.

I think I agree with something Al said over IRC, that fs-level snapshots
are preferred over block level snapshots. fs-level snapshots should
become easy if you have a generic transaction layer. The OS spits out
file ops, which get processed into a set of fs transactions. (remember
that fs-level stuff like "change this block bitmap" is also a
transaction, just like the more generic "update this inode's mtime")

Also, I think there should be generic block allocation strategies that
fs's can use. Implementing fs-specific strategies such as ext2's
readahead or XFS's delayed allocation is not a solution, IMHO, but
working towards solving the real problem.

</ramble>

> You can, of
> course, do parts of this on a LVM level, and doing backups with "disk
> snapshots" may be a valid approach. However, even that is debatable: there
> is very little that says that the disk image has to be up-to-date at any
> particular point in time, so even with a disk snapshot capability (which
> is not necessarily reasonable under all circumstances) there are arguments
> for maintenance interfaces.

I've been hacking on the attached, a snapshot block device driver, which
doesn't require LVM at all. (warning: compiled and updated per outside
review, but very alpha... do not apply)

The point of the driver is to provide a sync point at snapshot time, at
which all metadata and data is flushed to the block device.

My question... is there a fundamental flaw in this plan? Ideally when
userspace says "start snapshot", the fsync_dev occurs [a
simplification]. At that point, userspace can safely run dump or tar or
whatever on the virtual snapshot device.

--
Jeff Garzik
Index: linux_2_4/drivers/block/Config.in
diff -u linux_2_4/drivers/block/Config.in:1.1.1.44 linux_2_4/drivers/block/Config.in:1.1.1.44.4.1
--- linux_2_4/drivers/block/Config.in:1.1.1.44	Tue May 15 04:43:24 2001
+++ linux_2_4/drivers/block/Config.in	Wed May 16 15:44:59 2001
@@ -46,4 +46,6 @@
 fi
 dep_bool ' Initial RAM disk (initrd) support' CONFIG_BLK_DEV_INITRD $CONFIG_BLK_DEV_RAM
 
+tristate 'Snapshot device support' CONFIG_BLK_DEV_SNAP
+
 endmenu
Index: linux_2_4/drivers/block/Makefile
diff -u linux_2_4/drivers/block/Makefile:1.1.1.46 linux_2_4/drivers/block/Makefile:1.1.1.46.4.1
--- linux_2_4/drivers/block/Makefile:1.1.1.46	Tue May 15 04:43:24 2001
+++ linux_2_4/drivers/block/Makefile	Wed May 16 15:44:59 2001
@@ -31,6 +31,7 @@
 obj-$(CONFIG_BLK_DEV_DAC960) += DAC960.o
 
 obj-$(CONFIG_BLK_DEV_NBD) += nbd.o
+obj-$(CONFIG_BLK_DEV_SNAP) += snap.o
 
 subdir-$(CONFIG_PARIDE) += paride
Index: linux_2_4/drivers/block/snap.c
diff -u /dev/null linux_2_4/drivers/block/snap.c:1.1.6.10
--- /dev/null	Sat May 19 17:36:30 2001
+++ linux_2_4/drivers/block/snap.c	Thu May 17 11:48:54 2001
@@ -0,0 +1,1055 @@
+/*
+  Copyright 2001 Jeff Garzik [EMAIL PROTECTED]
+  Copyright (C) 2000 Jens Axboe [EMAIL PROTECTED]
+
+  May be copied or modified under the terms of the GNU General Public
+  License. See linux/COPYING for more information.
+
+  Several ideas and some code taken from Jens Axboe's pktcdvd.c 0.0.2j.
+
+  To-Do list:
+  * Write support. It's easy, and might be useful in isolated circumstances.
+  * Convert MAX_SNAPDEVS to a module parameter.
+  * Wrap use of % operator, to prepare for 64-bit-sized blockdevs on
+    32-bit processors
+
+ */
+
+#define VERSION_CODE "v0.5.0-take6 17 May 2001 Jeff Garzik
+[EMAIL PROTECTED]"
+#define MODNAME "snap"
+#define PFX MODNAME ": "
+#define MAX_SNAPDEVS 16
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/spinlock.h>
+#include <linux/interrupt.h>
+#include <linux/file.h>
+#include <linux/blk.h>
+#include <linux/blkpg.h>
+#include <linux/init.h>
+#include <linux/snap.h>
+#include <asm/uaccess.h>
+
+static int *snap_sizes;
+static int *snap_blksize;
+static int *snap_readahead;
+static struct snap_device *snap_devs;
+static int snap_major = -1;
+static spinlock_t snap_lock = SPIN_LOCK_UNLOCKED;
+
+
+/*
+ * a bit of a kludge, but we want to be able to pass source, log,
+ * or snap dev and get the right one.
+ */
+static struct snap_device *snap_find_dev(kdev_t dev)
+{
+        int i, j;
+        struct snap_device *sd;
+
+        spin_lock(&snap_lock);
+
+        for (i = 0; i < MAX_SNAPDEVS; i++) {
+                sd = &snap_devs[i];
+                if ((sd->src.dev == dev) || (sd->snap_dev == dev))
+                        goto out;
+                for (j = 0; j < sd->n_logs; j++)
+                        if (sd->logs[j].dev == dev)
+                                goto out;
+        }
+        sd = NULL;
+
+out:
+
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
On Sat, 19 May 2001, Alan Cox wrote:

> > Now that I'm awake and refreshed, yeah, that's awful. But
> > echo hot-add,slot=5,device=/dev/sda > /dev/md0/control
> > *is* sane. Heck, the system can even send back result codes that way.
>
> Only to an English speaker. I suspect Quebec City Canadians would prefer
> a different command set.

Well... Around here we've been used to Microsoft translations like:

	ETES-VOUS CERTAIN [O/N] ?

... and of course pressing 'o' doesn't work while 'y' does. :-)

Wanting to localize such low-level keywords is utopian. Otherwise you'll want to translate command names like free, rm, mv, etc., and then programming-language keywords such as C's as well. And then you come to a point where nothing is interoperable any more.

Nicolas
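For illustration only, here is a rough sketch of how the receiving end of such a control file might split a command like "hot-add,slot=5,device=/dev/sda" into a verb and key=value options. Nothing here comes from any posted patch: struct md_command, parse_md_command(), the field sizes and the accepted keys (slot, device) are all made up for the example, and it is plain userspace C rather than kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of "hot-add,slot=5,device=/dev/sda". */
struct md_command {
    char verb[32];      /* e.g. "hot-add" */
    int  slot;          /* -1 when no slot= option was given */
    char device[128];   /* empty when no device= option was given */
};

/*
 * Split "verb,key=value,key=value" into *out.
 * Returns 0 on success and -1 on malformed or unknown input, so the
 * caller can hand an error code back to whoever wrote the command.
 */
int parse_md_command(const char *text, struct md_command *out)
{
    char buf[256];
    char *p, *next, *eq, *end;
    long v;

    assert(text != NULL && out != NULL);
    if (strlen(text) >= sizeof(buf))
        return -1;
    strcpy(buf, text);
    buf[strcspn(buf, "\n")] = '\0';     /* echo appends a newline */

    out->slot = -1;
    out->device[0] = '\0';

    /* The first comma-separated field is the verb. */
    p = buf;
    next = strchr(p, ',');
    if (next)
        *next++ = '\0';
    if (*p == '\0' || strlen(p) >= sizeof(out->verb))
        return -1;
    strcpy(out->verb, p);

    /* Every remaining field must look like key=value. */
    for (p = next; p != NULL; p = next) {
        next = strchr(p, ',');
        if (next)
            *next++ = '\0';
        eq = strchr(p, '=');
        if (eq == NULL)
            return -1;
        *eq++ = '\0';

        if (strcmp(p, "slot") == 0) {
            v = strtol(eq, &end, 10);
            if (*eq == '\0' || *end != '\0' || v < 0 || v > INT_MAX)
                return -1;
            out->slot = (int)v;
        } else if (strcmp(p, "device") == 0) {
            if (strlen(eq) >= sizeof(out->device))
                return -1;
            strcpy(out->device, eq);
        } else {
            return -1;                  /* unknown key */
        }
    }
    return 0;
}
```

The keywords stay fixed ASCII tokens, which is the point made above: a script written on one system keeps working on another, whatever the locale.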
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
On Sat, 19 May 2001 [EMAIL PROTECTED] wrote:

> > A lot of stuff relies on the fact that close(open(foo, O_RDONLY))
> > is a no-op. Breaking that assumption is a Bad Thing(tm).
>
> Also here I would like to agree. Unfortunately this is false.
> Opening device files often has interesting side effects.

Too bad. They can be triggered by similar races between attacker changing the type of object (file->symlink) and backup.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
> initrd is an unnecessary pain in the ass for most people.
> It had better not become mandatory.

You would not notice the difference, only your kernel would be a bit smaller and the RRPART ioctl disappears.

[Besides: we have lived with DOS-type partition tables for ten years, but they will not last another ten years. Very soon disk partitions will look very different. It will be good to move knowledge about these things out of the kernel before this happens.]

Andries
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On Sat, 19 May 2001, Ben LaHaise wrote:

> On Sat, 19 May 2001, Alexander Viro wrote:
>
> > On Sat, 19 May 2001, Ben LaHaise wrote:
> >
> > > It's not done yet, but similar techniques would be applied. I envision
> > > that a raid device would support operations such as
> > > open("/dev/md0/slot=5,hot-add=/dev/sda")
> >
> > Think for a moment and you'll see why it's not only ugly as hell, but
> > simply won't work.
>
> Yeah, I shouldn't be replying to email anymore in my bleary-eyed state. =)
> Of course slash-separated data doesn't work, so it would have to be
> hot-add=filedescriptor or somesuch. Gah, that's why the options are all
> parsed from a single lookup name anyways...

That's why you want to use write(2) to pass that information instead of encoding it into open(2).
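To make the write(2) variant concrete, here is a minimal userspace sketch. It assumes the driver exposes some control file (the /dev/md0/control path floated in this thread is only a proposal, not an existing interface), and send_control_command() is a name invented for the example. The command goes out in a single write(2), so the receiver sees it as one unit, and the return value is the place where a result code would come back.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Send one command string to a control file using a single write(2).
 * Returns 0 if the whole command was accepted, -1 otherwise (errno is
 * left as set by the failing call).
 */
int send_control_command(const char *ctl_path, const char *cmd)
{
    size_t len;
    ssize_t n;
    int fd;

    assert(ctl_path != NULL && cmd != NULL);
    len = strlen(cmd);

    /* The open itself carries no arguments; it only names the object. */
    fd = open(ctl_path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* All of the information travels in the data, not in the name. */
    n = write(fd, cmd, len);

    if (close(fd) < 0)
        return -1;
    return (n == (ssize_t)len) ? 0 : -1;
}
```

From a shell the same thing is just an echo redirected into the control file, which is why no special binary is needed.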
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
Alexander Viro wrote:

> > (2) what about bootstrapping? how do you find the root device?
> > Do you do root=/dev/hda/offset=63,limit=1235823? Bit nasty.
>
> Ben's patch makes initrd mandatory.

Can this be fixed? I've *never* had to futz with initrd. Probably most systems are the same. It seems a step backward to make it necessary.
Re: Why side-effects on open(2) are evil. (was Re: [RFD w/info-PATCH] device arguments from lookup)
> > Opening device files often has interesting side effects.
>
> Too bad. They can be triggered by similar races between attacker
> changing the type of object (file->symlink) and backup.

Yes. This is a well-known security problem. Doing

	stat(file, &s);
	if (action desired) {
		action(file);
	}

is no good because there is a race. But doing

	fd = open(file, flags);
	fstat(fd, &s);
	if (action desired) {
		f_action(fd);
	}

is no good either because the open() has unknown side effects. It helps to add flags like O_NONBLOCK and perhaps O_NOCTTY, but that is not quite good enough.

One would like to have a version of the open() call that was guaranteed free of side effects, and gave a fd only - perhaps for stat(), perhaps for ioctl(). This guarantee could perhaps be obtained by omitting the

	f->f_op->open(inode,f);

call in dentry_open() when the open call is

	open(file, O_FDONLY);

Of course it may be that we afterwards decide that fd must be used, and then it needs upgrading:

	fd = f_open(fd, O_RDWR);

Andries

[Such a construction allows various cleanups. But no doubt it has problems that I have not yet thought of.]
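As a concrete version of the second pattern, here is a small sketch of open-then-fstat() with the flags mentioned above. The helper name open_if_regular() is invented for the example. Note what it does and does not buy: the type check is made on the object that was actually opened, so the stat/open race is gone, but the open() has already happened by then, so any side effects a device node's open routine has have already been triggered. O_NONBLOCK and O_NOCTTY only avoid blocking and acquiring a controlling tty.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Open path and return the descriptor only if it refers to a regular
 * file; return -1 otherwise.  The check uses fstat() on the open fd,
 * so it cannot be fooled by the name being re-pointed afterwards.
 */
int open_if_regular(const char *path)
{
    struct stat st;
    int fd;

    assert(path != NULL);

    /* Reduce, but not eliminate, the side effects of open(). */
    fd = open(path, O_RDONLY | O_NONBLOCK | O_NOCTTY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
        close(fd);      /* wrong type: too late to undo the open */
        return -1;
    }
    return fd;
}
```

A backup tool would call open_if_regular() and read from the returned fd, never reopening the path by name.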
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
From [EMAIL PROTECTED] Sat May 19 20:07:23 2001

> > > initrd is an unnecessary pain in the ass for most people.
> > > It had better not become mandatory.
> >
> > You would not notice the difference, only your kernel would be a bit
> > smaller and the RRPART ioctl disappears.
>
> Would I not notice the difference as a user, as a sysadmin, as a kernel
> builder, as a kernel hacker, or all of the above?

All of the above.
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
Alexander Viro wrote:

> It's way past ugly.

I knew you'd like it. It kind of makes sense, because it puts the two primary stream-of-bytes objects in Unix into the same namespace, with the same accessors. So if some random application is expecting a filename, well heck, you just give it a path-to-executable with args. It won't care, although it may have trouble lseek()ing on it.

It wasn't very serious at all.