On Thu, Dec 30, 2010 at 4:20 PM, Henrique de Moraes Holschuh <h...@debian.org> wrote:
>> What if the target name is actually a symlink? To a different volume?
>
> Indeed. You have to check that first, of course :-( This is about safe
> handling of such functions, symlinks always have to be dereferenced and
> their target checked. After that, you operate on the target, if the symlink
> changes, your operations will not.

That's not really atomic.

>> What if you're not allowed to create a file in that dir?
>
> You fail the write.

That's a regression from the non-atomic case.

> Or the user has to request the unsafe handling
> (truncate + write). Or you have to detect it will happen and switch modes
> if you're allowed to.
>
>> > If we could use some syscall to make [1] into a simple barrier request
>> > (guaranteed to degrade to fsync if barriers are not operating), it would
>> > be better performance-wise. This is what one should request of libc and
>> > the kernels with a non-zero chance of getting it implemented (in fact,
>> > it might even already exist).
>>
>> My proposal was O_ATOMIC:
>> // begin transaction
>> open(fname, O_ATOMIC | O_TRUNC);
>> write; // 0+ times
>> close;
>>
>> Seems like the ideal API from the app's point of view.
>
> POSIX filesystems do not support it, so you'd need glibc to do everything

Not yet, but I assume it'll be added when there's enough demand.

> your application would have to get that atomicity. I.e. it should go in a
> separate lib, anyway, and you will have to code for it in the app :(

Why would it have to go in a separate lib?

> It is not transparent. It cannot be. What about mmap()? What about
> read+write patterns?

They either happen before or after this atomic transaction.
Comparable to the rename workaround.

> At most you could have an "open+write+close" function that encapsulates most
> of the crap, with a few options to tell it what to do if it finds a symlink
> or mismatched owner, what to do if it cannot do it in an atomic way, etc.
>
> I suppose one could actually ask for a non-posix interface to do all those
> three operations in one syscall, but I don't think the kernel people will

There's no need for a single syscall.

> want to implement it. It would make sense only if object stores become
> commonplace (where this thing is likely an object store primitive, anyway).

Nah. Tons of files are written in one go.
All could use this atomic flag.

>> >> I've brought this up on linux-fsdevel and linux-ext4 but they (Ted)
>> >> claim those exceptions aren't really a problem.
>> >
>> > Indeed they are not. Code has been dealing with them for years. You
>>
>> Code has been wrong for years too, based on the recent reports about
>> file corruption with ext4.
>
> Code written to *deal with files safely* by people who wanted to get it
> right and actually checked what needs to be done, has been right for years.
>

And has piss-poor performance. Isn't fixing / improving that a good thing?

> Code written by random joe which has no clue about the braindamages of POSIX
> and Unix, well... this thread shows how much crap is really needed.

So you agree that this should be improved?

> One can, obviously, have most filesystems be super-safe, and create a new
> fadvise or something to say "this is crap, be unsafe if you can".
> Performance will be poor, everything will be safe, and the extra fsyncs()
> will not hurt much because the fs would do it anyway.

I actually think this can be done with better performance than the
rename workaround.

>> > name the temp file properly, and teach your program to clean old ones up
>> > *safely* (see vim swap file handling for an example) when it starts.
>>
>> What about restoring meta-data? File-owner?
>
> Hmm, yes, more steps if you want to do something like that, as you must do
> it with the target open in exclusive mode. Close target only after the
> rename went ok.
>
> But if the file owner is not yourself, you really should change it, not to
> mention you might not want to complete the operation in the first place.

Why? Of course write access to the file is required.

>> I'll ask glibc.
>
> This really should be in a separate lib. You want it to be usable outside
> of glibc systems, and you CAN implement it (slow that it will be) on
> anything POSIX.
> You need only some help of the kernel to speed it up, and
> that has to be detected at compile time (support) and runtime (availability
> of the feature) anyway.

Olaf
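
For reference, the "rename workaround" this thread keeps comparing O_ATOMIC against can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not anyone's actual implementation: `replace_file` and `write_all` are hypothetical names, the temp file is created with mkstemp() next to the target, and none of the harder cases raised above (symlinks, restoring owner and mode, cleaning up stale temp files after a crash) are handled. Note that mkstemp() creates the file with mode 0600, which is exactly the meta-data problem mentioned above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write all len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: replace the contents of `path` with buf[0..len)
 * via temp file + fsync + rename. Returns 0 on success, -1 on failure
 * with errno set. Symlinks and ownership are NOT handled here. */
int replace_file(const char *path, const void *buf, size_t len)
{
    char tmp[PATH_MAX], dircopy[PATH_MAX];
    int fd, dfd, saved;

    /* The temp file lives in the target's directory so that rename()
     * never has to cross a filesystem boundary. */
    if (snprintf(tmp, sizeof tmp, "%s.XXXXXX", path) >= (int)sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = mkstemp(tmp);
    if (fd < 0)
        return -1;  /* e.g. not allowed to create a file in that dir */

    /* Data must be on disk before the new name becomes visible. */
    if (write_all(fd, buf, len) < 0 || fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;
    if (rename(tmp, path) < 0)
        goto fail;

    /* Best effort: fsync the directory so the rename itself is durable.
     * dirname() may modify its argument, so work on a copy (the length
     * check above guarantees that path fits). */
    strcpy(dircopy, path);
    dfd = open(dirname(dircopy), O_RDONLY);
    if (dfd >= 0) {
        fsync(dfd);
        close(dfd);
    }
    return 0;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp);
    errno = saved;
    return -1;
}
```

A caller would use something like `replace_file("/etc/foo.conf", data, strlen(data))` and check for -1. The two fsync() calls are where the performance cost discussed above comes from.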