Re: Atomic file data replace API

2011-01-26 Thread Olaf van der Spek
On Wed, Jan 26, 2011 at 8:30 PM, Chris Mason wrote:
> My answer hasn't really changed ;)  Replacing file data is a common
> operation, but it is still surprisingly complex.  Again, the truncate is
> O(size of the file) and it is actually impossible to do this atomically
> in most filesystems. Unf

Re: Atomic file data replace API

2011-01-26 Thread Chris Mason
Excerpts from Olaf van der Spek's message of 2011-01-26 13:30:08 -0500:
> On Sat, Jan 8, 2011 at 3:40 PM, Olaf van der Spek wrote:
> > On Fri, Jan 7, 2011 at 8:29 PM, Chris Mason wrote:
> >> The exact amount of tracking is going to vary.  The reason why is that
> >> actually doing the truncate

Re: Atomic file data replace API

2011-01-26 Thread Olaf van der Spek
On Sat, Jan 8, 2011 at 3:40 PM, Olaf van der Spek wrote:
> On Fri, Jan 7, 2011 at 8:29 PM, Chris Mason wrote:
>> The exact amount of tracking is going to vary.  The reason why is that
>> actually doing the truncate is an O(size of the file) operation and so
>> you can't just flip a switch when th

Re: Atomic file data replace API

2011-01-09 Thread Phillip Susi
On 01/09/2011 01:56 PM, Thomas Bellman wrote: That particular problem was solved with the introduction of the rename(2) system call in 4.2BSD a bit more than a quarter of a century ago. There is no need to introduce another, less flexible, API for doing the same thing. I'm curious if there are

Re: Atomic file data replace API

2011-01-09 Thread Olaf van der Spek
On Sun, Jan 9, 2011 at 7:56 PM, Thomas Bellman wrote:
>> True, that's why this feature request is here.
>> Note that it's (ATM) only about single file data replace.
>
> That particular problem was solved with the introduction of the
> rename(2) system call in 4.2BSD a bit more than a quarter of a

Re: Atomic file data replace API

2011-01-09 Thread Thomas Bellman
Olaf van der Spek wrote: On Sat, Jan 8, 2011 at 10:43 PM, Thomas Bellman wrote: So, basically database transactions with an isolation level of "committed read", for file operations. That's something I have wanted for a long time, especially if I also get a rollback() operation, but have never

Re: Atomic file data replace API

2011-01-09 Thread Olaf van der Spek
On Sat, Jan 8, 2011 at 10:43 PM, Thomas Bellman wrote:
> So, basically database transactions with an isolation level of
> "committed read", for file operations.  That's something I have
> wanted for a long time, especially if I also get a rollback()
> operation, but have never heard of any Unix th

Re: Atomic file data replace API

2011-01-08 Thread Thomas Bellman
Olaf van der Spek wrote: On Fri, Jan 7, 2011 at 8:29 PM, Thomas Bellman wrote: What is the visibility of the changes for other processes supposed to be in the meantime? I.e., if things happen in this order: Should be atomic too, at close time. 1. Process A does fda = open("foo.txt", O_TRU

Re: Atomic file data replace API

2011-01-08 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 8:29 PM, Chris Mason wrote:
> The exact amount of tracking is going to vary.  The reason why is that
> actually doing the truncate is an O(size of the file) operation and so
> you can't just flip a switch when the write or the close comes in.  You
> have to run through all t

Re: Atomic file data replace API

2011-01-08 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 8:29 PM, Thomas Bellman wrote:
> What is the visibility of the changes for other processes supposed
> to be in the meantime?  I.e., if things happen in this order:

Should be atomic too, at close time.

> 1. Process A does fda = open("foo.txt", O_TRUNC|O_ATOMIC)
> 2. Process
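For comparison, the visibility rules of the existing rename(2) approach can be checked directly. The sketch below is only an illustration under POSIX semantics and does not model the proposed O_ATOMIC flag, which does not exist. A reader that opened the file before the replacement keeps seeing the old data, and anyone opening it afterwards sees the new data. The file names and the variables `old_view` and `new_view` are made up for the example.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "foo.txt")
with open(target, "wb") as f:
    f.write(b"old state")

# "Process B": opens the file before the replacement happens.
reader = open(target, "rb")

# "Process A": stages the new contents in a second file, then renames it
# over the target. os.replace() maps to rename(2) on POSIX systems.
staged = os.path.join(workdir, "foo.txt.tmp")
with open(staged, "wb") as f:
    f.write(b"new state")
    f.flush()
    os.fsync(f.fileno())
os.replace(staged, target)

# The already-open descriptor still refers to the old inode.
old_view = reader.read()
reader.close()

# A fresh open resolves the name again and finds the new inode.
with open(target, "rb") as f:
    new_view = f.read()
```

Under these assumptions no reader ever observes a partially written file, which is roughly the "committed read" behaviour discussed elsewhere in the thread.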

Re: Atomic file data replace API

2011-01-07 Thread Phillip Susi
On 01/07/2011 09:58 AM, Chris Mason wrote: Yes and no. We have a best effort mechanism where we try to guess that since you've done this truncate and the write that you want the writes to show up quickly. But it's a guess. It is a pretty good guess, and one that the NT kernel has been making

Re: Atomic file data replace API

2011-01-07 Thread Thomas Bellman
Olaf van der Spek wrote: On Fri, Jan 7, 2011 at 5:32 PM, Massimo Maggi wrote: Are you suggesting to do: 1)fopen with O_TRUNC, O_ATOMIC: returns fd to a temporary file 2)application writes to that fd, with one or more system calls, in a short time or in long time, at his will. 3)at fclose (or e

Re: Atomic file data replace API

2011-01-07 Thread Chris Mason
Excerpts from Hubert Kario's message of 2011-01-07 11:26:02 -0500:
> On Friday, January 07, 2011 17:12:11 Chris Mason wrote:
> > Excerpts from Olaf van der Spek's message of 2011-01-07 10:17:31 -0500:
> > > On Fri, Jan 7, 2011 at 4:13 PM, Chris Mason wrote:
> > > >> That's not what I asked. ;)

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 5:32 PM, Massimo Maggi wrote:
> Are you suggesting to do:
> 1)fopen with O_TRUNC, O_ATOMIC: returns fd to a temporary file
> 2)application writes to that fd, with one or more system calls, in a
> short time or in long time, at his will.
> 3)at fclose (or even at fsync ) atom

Re: Atomic file data replace API

2011-01-07 Thread Massimo Maggi
Are you suggesting to do:
1) fopen with O_TRUNC, O_ATOMIC: returns fd to a temporary file
2) application writes to that fd, with one or more system calls, in a short time or in long time, at his will.
3) at fclose (or even at fsync) atomically swap "data pointer" of "real file" with "temp file", then
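The three steps above can be approximated in user space today. The sketch below is only an emulation under stated assumptions, not the proposed kernel interface. The name `atomic_open` and the `.staged-` prefix are hypothetical. It uses rename(2) through `os.replace()` rather than swapping a "data pointer", so the target gets a new inode. The staged file is discarded if the writer raises, which gives a crude rollback.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_open(path):
    """Yield a writable binary file; publish it over `path` only on clean exit."""
    directory = os.path.dirname(os.path.abspath(path))
    # Step 1: hand the caller a temporary file in the same directory, so
    # the final rename stays within one filesystem.
    fd, staged = tempfile.mkstemp(dir=directory, prefix=".staged-")
    try:
        with os.fdopen(fd, "wb") as f:
            # Step 2: the caller writes as often and as slowly as it likes.
            yield f
            f.flush()
            os.fsync(f.fileno())
        # Step 3: at close time, atomically make the new data visible.
        os.replace(staged, path)
    except BaseException:
        # Rollback: drop the staged data and leave `path` untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(staged)
        raise
```

A caller would write `with atomic_open("foo.txt") as f: f.write(b"...")`. Until the `with` block exits cleanly, other processes keep seeing the previous contents of the file.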

Re: Atomic file data replace API

2011-01-07 Thread Hubert Kario
On Friday, January 07, 2011 17:12:11 Chris Mason wrote:
> Excerpts from Olaf van der Spek's message of 2011-01-07 10:17:31 -0500:
> > On Fri, Jan 7, 2011 at 4:13 PM, Chris Mason wrote:
> > >> That's not what I asked. ;)
> > >> I asked to wait until the first write (or close). That way, you don't

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 5:12 PM, Chris Mason wrote:
>> I'm not sure why you would run out of memory in that case.
>
> Well, let's make sure I've got a good handle on the proposed interface:
>
> 1) fd = open(some_file, O_ATOMIC)

No, O_TRUNC should be used in open. Maybe it works with a separate trun

Re: Atomic file data replace API

2011-01-07 Thread Chris Mason
Excerpts from Olaf van der Spek's message of 2011-01-07 10:17:31 -0500:
> On Fri, Jan 7, 2011 at 4:13 PM, Chris Mason wrote:
> >> That's not what I asked. ;)
> >> I asked to wait until the first write (or close). That way, you don't
> >> get unintentional empty files.
> >> One step further, you do

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 4:13 PM, Chris Mason wrote:
>> That's not what I asked. ;)
>> I asked to wait until the first write (or close). That way, you don't
>> get unintentional empty files.
>> One step further, you don't have to keep the data in memory, you're
>> free to write them to disk. You jus

Re: Atomic file data replace API

2011-01-07 Thread Chris Mason
Excerpts from Olaf van der Spek's message of 2011-01-07 10:08:24 -0500:
> On Fri, Jan 7, 2011 at 4:05 PM, Chris Mason wrote:
> >> > The problem is the write() // 0+ times.  The kernel has no idea what
> >> > new result you want the file to contain because the application isn't
> >> > telling us.
>

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 4:05 PM, Chris Mason wrote:
>> > The problem is the write() // 0+ times.  The kernel has no idea what
>> > new result you want the file to contain because the application isn't
>> > telling us.
>>
>> Isn't it safe for the kernel to wait until the first write or close
>> befo

Re: Atomic file data replace API

2011-01-07 Thread Chris Mason
Excerpts from Olaf van der Spek's message of 2011-01-07 10:01:59 -0500:
> On Fri, Jan 7, 2011 at 3:58 PM, Chris Mason wrote:
> > Excerpts from Olaf van der Spek's message of 2011-01-06 15:01:15 -0500:
> >> Hi,
> >>
> >> Does btrfs support atomic file data replaces? Basically, the atomic
> >> varia

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 3:58 PM, Chris Mason wrote:
> Excerpts from Olaf van der Spek's message of 2011-01-06 15:01:15 -0500:
>> Hi,
>>
>> Does btrfs support atomic file data replaces? Basically, the atomic
>> variant of this:
>> // old stage
>> open(O_TRUNC)
>> write() // 0+ times
>> close()
>> //

Re: Atomic file data replace API

2011-01-07 Thread Chris Mason
Excerpts from Olaf van der Spek's message of 2011-01-06 15:01:15 -0500:
> Hi,
>
> Does btrfs support atomic file data replaces? Basically, the atomic
> variant of this:
> // old stage
> open(O_TRUNC)
> write() // 0+ times
> close()
> // new state

Yes and no. We have a best effort mechanism where

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 3:01 PM, Olaf van der Spek wrote:
> According to Ted, via-truncate and via-rename are unsafe. Only fsync,
> rename is safe.
> Disadvantage of rename is resetting file owner (if non-root), having
> issues with meta-data and other stuff.
>
> My proposal was for an open flag, O
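For reference, the "fsync, rename" sequence mentioned above can be sketched as follows. This is a minimal illustration assuming a POSIX system. The function name `atomic_replace` and the `.tmp-` prefix are hypothetical, and the final directory fsync is a common extra precaution rather than something stated in the thread.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace the contents of `path` with `data` via write, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise rename(2) cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new data to stable storage before publishing it.
            os.fsync(tmp.fileno())
        # rename(2): readers see either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also sync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The sketch also shows the drawback raised in the message: the replacement is a new inode created by the calling user with mkstemp's default 0600 mode, so the owner, permissions and other metadata of the old file are not carried over unless the caller copies them explicitly.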

Re: Atomic file data replace API

2011-01-07 Thread Olaf van der Spek
On Fri, Jan 7, 2011 at 2:55 PM, Mike Fleetwood wrote:
> On 6 January 2011 20:01, Olaf van der Spek wrote:
>> Hi,
>>
>> Does btrfs support atomic file data replaces?
>
> Hi Olaf,
>
> Yes btrfs does support atomic replace, since kernel 2.6.30 circa June 2009.
> [1]
>
> Special handling was added t

Re: Atomic file data replace API

2011-01-07 Thread Mike Fleetwood
On 6 January 2011 20:01, Olaf van der Spek wrote:
> Hi,
>
> Does btrfs support atomic file data replaces?

Hi Olaf,

Yes btrfs does support atomic replace, since kernel 2.6.30 circa June 2009. [1]

Special handling was added to ext3, ext4, btrfs (and probably other Linux FSs) for your replace-via-

Atomic file data replace API

2011-01-06 Thread Olaf van der Spek
Hi,

Does btrfs support atomic file data replaces? Basically, the atomic variant of this:
// old stage
open(O_TRUNC)
write() // 0+ times
close()
// new state

-- 
Olaf
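To make the problem concrete, here is a small sketch of the non-atomic sequence from the question. It is an illustration only, and the file name and variable names are invented. The point is the window between open(O_TRUNC) and the first write(): during it the old contents are already gone, so a concurrent reader finds an empty file, and a crash in that window can leave one behind.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "foo.txt")
with open(path, "wb") as f:
    f.write(b"old state")

# open(O_TRUNC): the old contents are discarded immediately.
fd = os.open(path, os.O_WRONLY | os.O_TRUNC)

# Another reader arriving now sees neither the old nor the new state.
with open(path, "rb") as f:
    seen_between = f.read()

os.write(fd, b"new state")  # write() // 0+ times
os.close(fd)                # close()

with open(path, "rb") as f:
    seen_after = f.read()
```

An atomic variant would guarantee that `seen_between` is either the complete old state or the complete new state, never the empty file observed here.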