Re: [sqlite] Apple announces new File System with better ACID support
On Tue, 14 Jun 2016 10:49:05 +0900 ?? wrote:

> > On 13 Jun 2016, at 10:13pm, Richard Hipp wrote:
> >
> > The rename-is-atomic assumption is so wide-spread in the Linux
> > world, that the linux kernel was modified to make renames closer to
> > being atomic on common filesystems such as EXT4.
>
> http://man7.org/linux/man-pages/man2/rename.2.html

rename(2) *is* atomic. That doesn't mean it's synchronous with respect
to external storage. It only means that no two processes will ever see
the file "in flight" in two places. If process A calls rename(N,M), at
no point will process B have access to both N and M. Once M is
available, N is extinguished.

That's a useful property for a process that succeeds, and for which the
OS successfully flushes the data to disk.

When Richard says rename isn't atomic, he means that it's not
synchronous with respect to the disk. It makes no guarantee that the
directory entries were updated on disk. The rename happens in the
kernel's filesystem memory structures, which *eventually* are persisted
to disk. I have heard that that time lag may be measured in seconds.

> I am interested to know what it would take to make linux renames
> fully atomic. Reading it as is it feels like the action of rename
> would be the most important piece to making rename atomic. The docs
> claim this is atomic. What other aspects would be necessary?

To make Linux rename fully synchronous is technically infeasible and
politically impossible.

On the political side, the preference in Linux is invariably for
performance, often by way of ever-finer divisions of responsibility. As
an example, Unix fsync(2) traditionally updated both the file and its
metadata; Linux divided those into fsync and fdatasync, and added the
requirement to call fsync on the directory. What was once a single call
became 2 or 3.

As a technical matter, it's really infeasible because there are too
many moving parts: kernel, filesystem driver, and hardware. It is
possible for a human being to know what kind of disk is installed and
how it is configured, and to know the semantics of a given filesystem.
It is not possible for the kernel to patrol all those things, and hence
the kernel cannot make any guarantees about them. (To take an extreme
example: NFS.)

By the way, every DBMS I know anything about (SQLite is no exception)
tends to eschew OS services except at the most minimal level. The
internals of a DBMS carry a lot of state information unavailable to the
kernel that the DBMS uses to prioritize how memory is used and when and
where I/O is required. That's why every DBMS has its own logging
mechanism, and some bypass the filesystem altogether.

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
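A note on the fsync point in the message above: the sequence it
describes (sync the file, rename, then sync the directory) can be
sketched roughly as follows. This is a minimal Python sketch, assuming
a POSIX system where a directory can be opened read-only and passed to
fsync. The function name durable_replace and the ".tmp-" prefix are
made up for illustration; they are not anything SQLite or the kernel
provides.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data`, then ask the OS to persist the result.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory as the target so
    # the rename never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Call 1: flush the new file's contents to storage.
            os.fsync(tmp.fileno())
        # Call 2: rename(2). Other processes see either the old file
        # or the new one at `path`, never a missing or partial file.
        os.replace(tmp_path, path)
    except BaseException:
        # If anything failed before the rename completed, remove the
        # leftover temporary file and re-raise.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Call 3: fsync the containing directory so the updated directory
    # entry itself is written out, not just held in kernel memory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even with all three calls, the program is only asking the OS to flush.
As the message says, what the drive or a network filesystem does
afterwards is outside the program's control.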
Re: [sqlite] Apple announces new File System with better ACID support
> On 13 Jun 2016, at 10:13pm, Richard Hipp wrote:
>
> The rename-is-atomic assumption is so
> wide-spread in the Linux world, that the linux kernel was modified to
> make renames closer to being atomic on common filesystems such as
> EXT4.

http://man7.org/linux/man-pages/man2/rename.2.html

I am interested to know what it would take to make linux renames fully
atomic. Reading it as is it feels like the action of rename would be
the most important piece to making rename atomic. The docs claim this
is atomic. What other aspects would be necessary?

Maybe the issue is simply that although there "is no point at which
another process attempting to access newpath will find it missing", the
"another process" doesn't know when the file is fully written to disk?

Apologies if this is too off topic or obvious to everybody else.
Re: [sqlite] Apple announces new File System with better ACID support
On 13 Jun 2016, at 10:13pm, Richard Hipp wrote:

> On 6/13/16, Simon Slavin wrote:
>
>> The relevance to this list is mostly in the last item above: atomic
>> safe-save primitives.
>
> The documentation indicates that safe-save only does file rename
> operations atomically.

Aaah you're right. I was hoping for better support at the file writing
or locking/unlocking level. Disappointed.

Simon.
Re: [sqlite] Apple announces new File System with better ACID support
On 6/13/16, Simon Slavin wrote:
>
> The relevance to this list is mostly in the last item above: atomic
> safe-save primitives.

The documentation indicates that safe-save only does file rename
operations atomically. This is of no help in making SQLite transactions
atomic. SQLite cannot use file renaming because SQLite databases are
used concurrently by multiple processes, and so if one process moves
the file, it would move the file out from under other processes.

The safe-save feature appears to be an effort to aid grow-your-own
style atomicity that is commonly implemented by writing new content
into a new file, then renaming the new file over top of the old. This
comes up a lot for applications that treat the filesystem as a
key/value database.

Many programmers have assumed that rename is atomic on unix. It is
not. The rename-is-atomic assumption is so wide-spread in the Linux
world, that the linux kernel was modified to make renames closer to
being atomic on common filesystems such as EXT4. I think this new
feature of HFS+ is likely a similar effort.

--
D. Richard Hipp
d...@sqlite.org
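The "move the file out from under other processes" problem in the
message above can be illustrated with a small experiment. This is only
a sketch, assuming POSIX rename semantics (on Windows the replace step
would likely fail while the file is open). The function name and the
file name data.db are hypothetical, and the two "processes" are
simulated by two handles inside one process.

```python
import os
import tempfile


def rename_under_open_handle() -> tuple:
    """Return (text read via a handle opened before the rename,
    text read via a fresh open after the rename)."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "data.db")  # hypothetical file name
    with open(path, "w") as f:
        f.write("old")

    # Stand-in for "process B": it opens the file before the writer
    # replaces it and keeps the handle open.
    reader = open(path)

    # Stand-in for "process A": the grow-your-own safe-save pattern.
    # Write the new content to a new file, then rename it over the old.
    new_path = path + ".new"
    with open(new_path, "w") as f:
        f.write("new")
    os.replace(new_path, path)

    # The old handle still refers to the original, now unlinked, file.
    with reader:
        seen_by_old_handle = reader.read()

    # A fresh open by name finds the replacement file.
    with open(path) as f:
        seen_by_fresh_open = f.read()

    return seen_by_old_handle, seen_by_fresh_open
```

On a POSIX system, rename_under_open_handle() should return
("old", "new"): the handle opened before the rename keeps reading the
replaced file. For a single-writer key/value file that is harmless. For
a database held open by several processes it would leave each of them
working on a different file, which is the objection raised above.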