Re: [PATCH 0/5] fallocate system call
On May 02, 2007 18:23 +0530, Amit K. Arora wrote:
> On Sun, Apr 29, 2007 at 10:25:59PM -0700, Chris Wedgwood wrote:
> > On Mon, Apr 30, 2007 at 10:47:02AM +1000, David Chinner wrote:
> >
> > > For FA_ALLOCATE, it's supposed to change the file size if we
> > > allocate past EOF, right?
> >
> > I would argue no. Use truncate for that.
>
> The patch I posted for ext4 *does* change the filesize after
> preallocation, if required (i.e. when preallocation is after EOF).
> I may have to change that, if we decide on not doing this.

I think I'd agree - it may be useful to allow preallocation beyond EOF
for some kinds of applications (e.g. PVR preallocating live TV in 10
minute segments or something, but not knowing in advance how long the
show will actually be recorded or the final encoded size).

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] fallocate system call
On Sun, Apr 29, 2007 at 10:25:59PM -0700, Chris Wedgwood wrote:
> On Mon, Apr 30, 2007 at 10:47:02AM +1000, David Chinner wrote:
>
> > For FA_ALLOCATE, it's supposed to change the file size if we
> > allocate past EOF, right?
>
> I would argue no. Use truncate for that.

The patch I posted for ext4 *does* change the filesize after
preallocation, if required (i.e. when preallocation is after EOF).
I may have to change that, if we decide on not doing this.

--
Regards,
Amit Arora
Re: [PATCH 0/5] fallocate system call
On Mon, Apr 30, 2007 at 03:56:32PM +1000, David Chinner wrote:

> On Sun, Apr 29, 2007 at 10:25:59PM -0700, Chris Wedgwood wrote:

> IIRC, the argument for FA_ALLOCATE changing file size is that
> posix_fallocate() is supposed to change the file size.

But it's not posix_fallocate; it's something more generic. glibc can
do posix_fallocate using truncate + fallocate.

> Note that the way XFS implements growing the file size after the
> allocation is via a truncate

What's wrong with that? That seems very reasonable.

> That's would what I did because otherwise you'd use ftruncate64().
> Without documented behaviour or an ext4 implementation, I have to
> ask what it's supposed to do, though ;)

How many *real* users are there for ext4? Why does 'what ext4 does'
define 'the semantics'? Surely semantics should be decided either by
precedent (if there is an existing relevant userbase) or sensible
thought and some debate?
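The "truncate + fallocate" composition suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the fallocate(2) wrapper and the FALLOC_FL_KEEP_SIZE flag as they exist in current Linux and glibc, which postdate this thread and do not use the FA_* names discussed here, and the function name alloc_and_extend() is made up for the example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patchset): reserve blocks first,
 * then grow the file size in a separate step, only if needed.
 * Returns 0 on success, -1 with errno set on failure.
 * The caller is assumed to pass a range where offset + len does not
 * overflow off_t.
 */
static int alloc_and_extend(int fd, off_t offset, off_t len)
{
    struct stat st;

    /* Step 1: preallocate without changing the file size. */
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, len) != 0)
        return -1;

    /* Step 2: look at the current size. */
    if (fstat(fd, &st) != 0)
        return -1;

    /* Step 3: extend the size only when the range ends past EOF. */
    if (st.st_size < offset + len)
        return ftruncate(fd, offset + len);

    return 0;
}
```

A caller that wants pure preallocation would stop after step 1; a posix_fallocate()-style caller would run all three steps. The filesystem must support preallocation, otherwise step 1 fails and the helper returns -1.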
Re: [PATCH 0/5] fallocate system call
On Sun, Apr 29, 2007 at 10:25:59PM -0700, Chris Wedgwood wrote:
> On Mon, Apr 30, 2007 at 10:47:02AM +1000, David Chinner wrote:
>
> > For FA_ALLOCATE, it's supposed to change the file size if we
> > allocate past EOF, right?
>
> I would argue no. Use truncate for that.

I'm going from the ext4 implementation because the semantics have not
been documented yet.

IIRC, the argument for FA_ALLOCATE changing file size is that
posix_fallocate() is supposed to change the file size. I think that
having a mode for real preallocation and another for posix_fallocate
is a valid thing to do...

Note that the way XFS implements growing the file size after the
allocation is via a truncate

> > For FA_DEALLOCATE, does it change the filesize at all?
>
> Same as above.
>
> > Or does
> > it just punch a hole in the file?
>
> Yes.

That's would what I did because otherwise you'd use ftruncate64().
Without documented behaviour or an ext4 implementation, I have to ask
what it's supposed to do, though ;)

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [PATCH 0/5] fallocate system call
On Mon, Apr 30, 2007 at 10:47:02AM +1000, David Chinner wrote:

> For FA_ALLOCATE, it's supposed to change the file size if we
> allocate past EOF, right?

I would argue no. Use truncate for that.

> For FA_DEALLOCATE, does it change the filesize at all?

Same as above.

> Or does
> it just punch a hole in the file?

Yes.

> FWIW, we definitely need a FA_PREALLOCATE mode (FA_ALLOCATE but does
> not change file size) so we can preallocate beyond EOF for apps
> which use O_APPEND (i.e. changing file size would cause problems for
> them).

FA_ALLOCATE should be able to allocate past-EOF I would argue.
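The two behaviours being argued over here can be written down as a small model of the resulting file size. This is a sketch for discussion only, not code from the patchset: the enum, its values and size_after_alloc() are hypothetical names, and the "grow" rule follows what POSIX specifies for posix_fallocate().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names modelling the two semantics debated in the thread. */
enum size_policy {
    GROW_TO_FIT, /* size follows the allocation, as posix_fallocate() does */
    KEEP_SIZE    /* size untouched; blocks may sit beyond EOF */
};

/* Return the file size after allocating [offset, offset + len). */
static int64_t size_after_alloc(int64_t old_size, int64_t offset,
                                int64_t len, enum size_policy policy)
{
    int64_t end;

    assert(old_size >= 0 && offset >= 0 && len > 0);
    end = offset + len;

    /* Only the growing policy ever moves EOF, and only forwards. */
    if (policy == GROW_TO_FIT && end > old_size)
        return end;
    return old_size;
}
```

For a 4096-byte file and a 1 MiB allocation from offset 0, GROW_TO_FIT yields a size of 1048576 while KEEP_SIZE leaves it at 4096. The second case is the one that matters for O_APPEND writers, since their next write still lands at the old EOF.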
Re: [PATCH 0/5] fallocate system call
On Thu, Apr 26, 2007 at 11:20:56PM +0530, Amit K. Arora wrote:
> Based on the discussion, this new patchset uses following as the
> interface for fallocate() system call:
>
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len)

Ok, so now for the hard questions - what are the semantics of
FA_ALLOCATE and FA_DEALLOCATE?

For FA_ALLOCATE, it's supposed to change the file size if we
allocate past EOF, right? What's the return value supposed to
be? Zero for success, error otherwise? Does this update a/m/ctime
at all? How persistent is this preallocation? Should it be
there "forever" or for the lifetime of the currently open fd
that it was preallocated on?

For FA_DEALLOCATE, does it change the filesize at all? Or does
it just punch a hole in the file? If it does change file size,
what happens when you punch out preallocation beyond EOF?
What's the return value supposed to be?

> Currently we have two modes FA_ALLOCATE and FA_DEALLOCATE, for
> preallocation and deallocation of preallocated blocks respectively. More
> modes can be added, when required.

FWIW, we definitely need a FA_PREALLOCATE mode (FA_ALLOCATE but does
not change file size) so we can preallocate beyond EOF for apps
which use O_APPEND (i.e. changing file size would cause problems for
them).

> ToDos:
> =
> 1> Implementation on other architectures (other than i386, x86_64,
> ppc64 and s390(x))

I'll have ia64 soon.

> 2> A generic file system operation to handle fallocate
> (generic_fallocate), for filesystems that do _not_ have the fallocate
> inode operation implemented.
> 3> Changes to glibc,
> a) to support fallocate() system call
> b) so that posix_fallocate() and posix_fallocate64() call
> fallocate() system call
> 4> Changes to XFS to implement the fallocate inode operation

And that's what I'm doing now, hence all the questions ;)

BTW, do you have a test program for this, or will I need to write
one myself?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [PATCH 0/5] fallocate system call
On Fri, Apr 27, 2007 at 07:46:13PM +0200, Heiko Carstens wrote:

> If one insists to have fd at first argument, what is wrong with
> having u32 arguments only?

Well, I was one of those who objected as it seems *UGLY* to me.

> It's not that this syscall comes even close to what can be
> considered performance critical...

Right.

> It adds userspace overhead for one architecture. Every *trace and
> *libc needs special handling on s390 for this syscall. I would
> prefer to avoid this.

I'm not that bothered about it. I would prefer it did use clean
64-bit arguments, but given it's a non-critical syscall I don't
think the aesthetics are worth imposing crud on s390 for.
Re: [PATCH 0/5] fallocate system call
On Fri, Apr 27, 2007 at 04:43:28PM +0200, Jörn Engel wrote:
> On Fri, 27 April 2007 14:10:03 +0200, Heiko Carstens wrote:
> >
> > After long discussions where at least two possible implementations
> > were suggested that would work on _all_ architectures you chose one
> > which doesn't and causes extra effort.
>
> I believe the long discussion also showed that every possible
> implementation has drawbacks. To me this one appeared to be the best of
> many bad choices.

If one insists to have fd at first argument, what is wrong with having
u32 arguments only? It's not that this syscall comes even close to what
can be considered performance critical...

> Is this implementation worse than we thought?

It adds userspace overhead for one architecture. Every *trace and *libc
needs special handling on s390 for this syscall. I would prefer to avoid
this.
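One reading of the "u32 arguments only" idea is that each 64-bit offset and length would travel as a pair of 32-bit halves and be reassembled on the kernel side. The sketch below shows only that arithmetic; the helper names are invented for illustration and nothing here is taken from the s390 patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: split_u64() and join_u64() are not kernel APIs. */

/* Caller side: break a 64-bit value into high and low 32-bit halves. */
static void split_u64(uint64_t value, uint32_t *hi, uint32_t *lo)
{
    assert(hi != NULL && lo != NULL);
    *hi = (uint32_t)(value >> 32);
    *lo = (uint32_t)(value & 0xffffffffu);
}

/* Receiver side: rebuild the 64-bit value from its two halves. */
static uint64_t join_u64(uint32_t hi, uint32_t lo)
{
    /* Widen before shifting so the high half is not lost. */
    return ((uint64_t)hi << 32) | lo;
}
```

With this scheme a call would carry fd, mode and four 32-bit values (two for the offset, two for the length) on every architecture, which is the uniformity being asked for, at the cost of the less clean signature objected to elsewhere in the thread.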
Re: [PATCH 0/5] fallocate system call
On Fri, 27 April 2007 14:10:03 +0200, Heiko Carstens wrote:
>
> After long discussions where at least two possible implementations
> were suggested that would work on _all_ architectures you chose one
> which doesn't and causes extra effort.

I believe the long discussion also showed that every possible
implementation has drawbacks. To me this one appeared to be the best of
many bad choices.

Is this implementation worse than we thought?

Jörn

--
The grand essentials of happiness are: something to do, something to
love, and something to hope for.
-- Allan K. Chalmers
Re: [PATCH 0/5] fallocate system call
On Thu, Apr 26, 2007 at 11:20:56PM +0530, Amit K. Arora wrote:
> Based on the discussion, this new patchset uses following as the
> interface for fallocate() system call:
>
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len)
>
> It seems that only s390 architecture has a problem with such a layout of
> arguments in fallocate(). Thus for s390, we plan to have a wrapper
> (say, sys_s390_fallocate()) for the sys_fallocate(), which will get
> called by glibc when an application issues a fallocate() system call
> on s390. The s390 arch specific changes will be part of a separate
> patch (PATCH 2/5). It will be great if some s390 expert can verify the
> patch, since I have not been able to test it on s390 so far.

After long discussions where at least two possible implementations
were suggested that would work on _all_ architectures you chose one
which doesn't and causes extra effort.

> It was also noted that minor changes might be required to strace code
> to take care of "different arguments on s390" issue.

This is not limited to strace...

Besides that the s390 backend looks ok.
[PATCH 0/5] fallocate system call
Based on the discussion, this new patchset uses following as the
interface for fallocate() system call:

 asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len)

It seems that only s390 architecture has a problem with such a layout of
arguments in fallocate(). Thus for s390, we plan to have a wrapper
(say, sys_s390_fallocate()) for the sys_fallocate(), which will get
called by glibc when an application issues a fallocate() system call
on s390. The s390 arch specific changes will be part of a separate
patch (PATCH 2/5). It will be great if some s390 expert can verify the
patch, since I have not been able to test it on s390 so far.

It was also noted that minor changes might be required to strace code
to take care of "different arguments on s390" issue.

Currently we have two modes FA_ALLOCATE and FA_DEALLOCATE, for
preallocation and deallocation of preallocated blocks respectively. More
modes can be added, when required.

ToDos:
=
1> Implementation on other architectures (other than i386, x86_64,
   ppc64 and s390(x))
2> A generic file system operation to handle fallocate
   (generic_fallocate), for filesystems that do _not_ have the fallocate
   inode operation implemented.
3> Changes to glibc,
   a) to support fallocate() system call
   b) so that posix_fallocate() and posix_fallocate64() call
      fallocate() system call
4> Changes to XFS to implement the fallocate inode operation

Following patches follow:
Patch 1/5 : fallocate() implementation in i386, x86_64 and powerpc
Patch 2/5 : fallocate() on s390
Patch 3/5 : ext4: Extent overlap bugfix
Patch 4/5 : ext4: fallocate support in ext4
Patch 5/5 : ext4: write support for preallocated blocks

--
Regards,
Amit Arora