Re: [RFC] Heads up on sys_fallocate()

2007-03-13 Thread David Chinner
On Tue, Mar 06, 2007 at 10:46:56AM -0600, Eric Sandeen wrote: Ulrich Drepper wrote: Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on an fd to do placement

Re: [RFC] Heads up on sys_fallocate()

2007-03-07 Thread Jan Kara
On Tue 06-03-07 12:23:22, Eric Sandeen wrote: Jan Kara wrote: On Tue 06-03-07 06:36:09, Ulrich Drepper wrote: Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on

Re: [RFC] Heads up on sys_fallocate()

2007-03-07 Thread Jörn Engel
On Wed, 7 March 2007 09:51:35 +0100, Jan Kara wrote: I'll probably first write some userspace fs-reorganizer to find out how much these changes in layout are able to give you in performance (i.e. whether it's worth the effort of more complicated kernel online defragmenter). Have tried

Re: [RFC] Heads up on sys_fallocate()

2007-03-06 Thread Ulrich Drepper
Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on an fd to do placement decisions for any further allocations (prealloc, write, etc) Yes, posix_fallocate shouldn't

Re: [RFC] Heads up on sys_fallocate()

2007-03-06 Thread Jan Kara
On Tue 06-03-07 06:36:09, Ulrich Drepper wrote: Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on an fd to do placement decisions for any further allocations

Re: [RFC] Heads up on sys_fallocate()

2007-03-06 Thread Christoph Hellwig
On Tue, Mar 06, 2007 at 06:36:09AM -0800, Ulrich Drepper wrote: Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on an fd to do placement decisions for any further

Re: [RFC] Heads up on sys_fallocate()

2007-03-06 Thread Eric Sandeen
Ulrich Drepper wrote: Christoph Hellwig wrote: fallocate with the whence argument and flags is already quite complicated, I'd rather have another call for placement decisions, that would be called on an fd to do placement decisions for any further allocations (prealloc, write, etc) Yes,

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Jörn Engel
On Mon, 5 March 2007 01:36:36 +0100, Arnd Bergmann wrote: Using the current glibc implementation on a compressed file system ideally should be a very expensive no-op because you won't actually allocate much space for a file when writing zeroes to it. You also don't benefit from a contiguous

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Christoph Hellwig
On Sat, Mar 03, 2007 at 11:45:32PM +0100, Arnd Bergmann wrote: I'd be more happy to have the write out zeroes loop in glibc. And glibc needs to have it anyway, for older kernels. A generic_fallocate makes sense to me iff we can do it in the kernel significantly more efficiently than

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Jörn Engel
On Mon, 5 March 2007 00:32:14 +0000, Anton Altaparmakov wrote: I don't know how your compression algorithm works [...] LogFS is designed for flash media, so it does not have to worry much about reducing disk seeks. It is log-structured, which simplifies compression further. When writing a

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Anton Altaparmakov
On 5 Mar 2007, at 14:37, Theodore Tso wrote: On Sun, Mar 04, 2007 at 11:22:06PM +0000, Anton Altaparmakov wrote: And I specifically did NOT update the initialized size in the inode thus it will remain at its old value thus all new allocated blocks will be considered as present but not

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Ulrich Drepper
Theodore Tso wrote: Given that glibc already has to support this for older kernels, I would argue that there's no point putting in generic support for filesystem that can't support a more advanced way of doing things. Well, I'm sure the kernel can do better than the code we have in libc now.

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Ulrich Drepper
Jörn Engel wrote: Does the allocation have to be persistent beyond lifetime of the file descriptor? Of course. You call posix_fallocate once for the lifetime of the file when it is created to ensure that all future uses will work. It seems your filesystem will not be able to support this

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Ulrich Drepper
Jörn Engel wrote: The bad news for posix_fallocate() is that even if libc is smart enough to write random data, mmap() can still cause problems. This is not smart, quite to the contrary. The standard guarantees that all not-yet-written-to places in the file are zero. And if a block has

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Christoph Hellwig
On Mon, Mar 05, 2007 at 07:15:33AM -0800, Ulrich Drepper wrote: Theodore Tso wrote: Given that glibc already has to support this for older kernels, I would argue that there's no point putting in generic support for filesystem that can't support a more advanced way of doing things. Well,

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Ulrich Drepper
Jörn Engel wrote: Of course. You call posix_fallocate once for the lifetime of the file when it is created to ensure that all future uses will work. That part is not quite clear from the manpage but I trust most people would assume the same. Not only that, it is what this function is for.

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Theodore Tso
On Mon, Mar 05, 2007 at 07:15:33AM -0800, Ulrich Drepper wrote: Well, I'm sure the kernel can do better than the code we have in libc now. The kernel has access to the bitmasks which say which blocks have already been allocated. The libc code does not and we have to be very simple-minded and

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Ulrich Drepper
Theodore Tso wrote: [...] although the libc implementation still wouldn't be able to go away for long time due to the need to be backwards compatible with older kernels that didn't have this support. It's better than that. If somebody compiles glibc to not run on older kernels at all (tested

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Mingming Cao
Jan Kara wrote: On Fri, 02 Mar 2007 09:40:54 +1100 Nathan Scott wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Eric Sandeen
Jan Kara wrote: I am wondering if it is useful to add another mode to advise block allocation policy? Something like indicating which physical block/block group to allocate from (goal), and whether to ask for strictly contiguous blocks. This will help preallocation or reservation to choose the

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Eric Sandeen
Jörn Engel wrote: Does the allocation have to be persistent beyond lifetime of the file descriptor? It would be fairly simple to support the write guarantee while the file is open (or rather the inode remains cached) and drop it afterwards. The posix_fallocate() function shall ensure that

Re: [RFC] Heads up on sys_fallocate()

2007-03-05 Thread Christoph Hellwig
On Mon, Mar 05, 2007 at 12:02:59PM -0800, Mingming Cao wrote: Yep, I think it makes sense to use preallocation for defragmentation. After all both preallocation and defragmentation shall call underlying filesystem multiple block allocator to try to allocate a chunk of contiguous blocks on

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Anton Altaparmakov
On 3 Mar 2007, at 22:45, Arnd Bergmann wrote: On Friday 02 March 2007 00:38:19 Christoph Hellwig wrote: Forgive me if I haven't put enough thought into it, but would it be useful to create a generic_fallocate() that writes zeroed pages for any non-existent pages in the range? I don't know

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Arnd Bergmann
On Sunday 04 March 2007, Anton Altaparmakov wrote: A generic_fallocate makes sense to me iff we can do it in the kernel significantly more efficiently than in glibc, e.g. by using only a single page in page cache instead of one for each page to be preallocated. If glibc is
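
For readers who have not seen a user-level "write out zeroes" loop of the kind discussed in this message, here is a minimal sketch in C. It is not glibc's code: the name emulate_fallocate, the one-byte-per-block strategy, and the use of st_blksize as the step are all assumptions made for illustration. It also shows why the approach is costly: every block in the range is touched from user space, one system call pair at a time.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a libc function): force allocation of every
 * block in [offset, offset + len). Returns 0 or an errno value.
 * Overflow of offset + len is not checked in this sketch. */
int emulate_fallocate(int fd, off_t offset, off_t len)
{
    struct stat st;

    if (offset < 0 || len <= 0)
        return EINVAL;
    if (fstat(fd, &st) != 0)
        return errno;

    off_t end = offset + len;

    /* Grow the file first so the whole range reads back as zeroes. */
    if (st.st_size < end && ftruncate(fd, end) != 0)
        return errno;

    off_t step = st.st_blksize > 0 ? (off_t)st.st_blksize : 512;

    /* Touch one byte in each block: read it and write the same value
     * back, so existing data is preserved and holes become real blocks.
     * After the first iteration, pos is aligned to a block boundary. */
    for (off_t pos = offset; pos < end; pos = (pos / step + 1) * step) {
        char byte = 0;

        if (pread(fd, &byte, 1, pos) < 0)
            return errno;
        if (pwrite(fd, &byte, 1, pos) < 0)
            return errno;
    }
    return 0;
}
```

Note that the read-then-write step is racy against concurrent writers, and on a compressing or copy-on-write filesystem the loop may not reserve anything at all, which is the limitation raised elsewhere in the thread.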

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Ulrich Drepper
Anton Altaparmakov wrote: And that is it. No zeroing needs to happen at all because we have not updated the initialized size of the inode! When you do it like this, how can the kernel/filesystem *guarantee* that when the data is written there actually is room on the harddrive? What you

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Anton Altaparmakov
On 5 Mar 2007, at 00:16, Jörn Engel wrote: On Sun, 4 March 2007 14:38:13 -0800, Ulrich Drepper wrote: When you do it like this, how can the kernel/filesystem *guarantee* that when the data is written there actually is room on the harddrive? What you described seems like using

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Anton Altaparmakov
On 5 Mar 2007, at 00:32, Anton Altaparmakov wrote: On 5 Mar 2007, at 00:16, Jörn Engel wrote: On Sun, 4 March 2007 14:38:13 -0800, Ulrich Drepper wrote: When you do it like this, how can the kernel/filesystem *guarantee* that when the data is written there actually is room on the

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Jörn Engel
On Sun, 4 March 2007 14:38:13 -0800, Ulrich Drepper wrote: When you do it like this, how can the kernel/filesystem *guarantee* that when the data is written there actually is room on the harddrive? What you described seems like using truncate/ftruncate to increase the file's size. That is

Re: [RFC] Heads up on sys_fallocate()

2007-03-04 Thread Christoph Hellwig
On Sun, Mar 04, 2007 at 08:11:17PM +0000, Anton Altaparmakov wrote: glibc cannot ever be smart enough because a file system driver will always know better and be able to do things in a much more optimized way. Please read the thread again. That is not what anyone proposed. The issues

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Andreas Dilger
On Mar 01, 2007 13:15 -0600, Eric Sandeen wrote: One thing I'd like to see is a cmd argument as well, to allow for example allocation vs. reservation (i.e. allocating blocks vs. simply reserving a number), as well as the inverse of those functions (un-reservation, de-allocation)? If the

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Dave Kleikamp
On Fri, 2007-03-02 at 18:45 +0800, Andreas Dilger wrote: On Mar 01, 2007 13:15 -0600, Eric Sandeen wrote: One thing I'd like to see is a cmd argument as well, to allow for example allocation vs. reservation (i.e. allocating blocks vs. simply reserving a number), as well as the inverse of

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Dave Kleikamp
Amit wrote: asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len); On Thu, 2007-03-01 at 22:16 -0800, Andrew Morton wrote: On Thu, 01 Mar 2007 22:03:55 -0800 Badari Pulavarty wrote: Just curious .. What does posix_fallocate() return ? bookmark this:

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Jan Engelhardt
On Mar 1 2007 23:09, Dave Kleikamp wrote: Given that glibc already implements fallocate for all filesystems, it will need to continue to do so for filesystems which don't implement this syscall - otherwise applications would start breaking. I didn't make it clear, but my point was to call

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Ulrich Drepper
On 3/2/07, Dave Kleikamp wrote: Then there's no need for sys_allocate to return a long. Every syscall must return a long. Otherwise you can have problems on 64-bit archs.

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Eric Sandeen
Badari Pulavarty wrote: Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as Andrew suggested,

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Badari Pulavarty
On Fri, 2007-03-02 at 09:16 -0600, Eric Sandeen wrote: Badari Pulavarty wrote: Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for

Re: [RFC] Heads up on sys_fallocate()

2007-03-02 Thread Mingming Cao
Dave Kleikamp wrote: On Thu, 2007-03-01 at 14:59 -0800, Andrew Morton wrote: On Thu, 01 Mar 2007 22:44:16 +0000 Dave Kleikamp wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote:

[RFC] Heads up on sys_fallocate()

2007-03-01 Thread Amit K. Arora
This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as Andrew suggested, will look like: asmlinkage long sys_fallocate(int

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Eric Sandeen
Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as Andrew suggested, will look like: asmlinkage

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Jeff Garzik
Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as Andrew suggested, will look like: asmlinkage

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Jeremy Fitzhardinge
Amit K. Arora wrote: + if (inode->i_op && inode->i_op->fallocate) + ret = inode->i_op->fallocate(inode, offset, len); + else + ret = -ENOTTY; You can only allocate space on typewriters? ;) J

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Alan
On Thu, 01 Mar 2007 13:14:32 -0800 Jeremy Fitzhardinge wrote: Amit K. Arora wrote: + if (inode->i_op && inode->i_op->fallocate) + ret = inode->i_op->fallocate(inode, offset, len); + else + ret = -ENOTTY; You can only allocate space on typewriters? ;)

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Jeremy Fitzhardinge
Alan wrote: A lot of people get confused about -ENOTTY, but it is the return for attempting to use an ioctl on the wrong type of object, so this appears to be quite correct. This is a syscall though; ENOSYS is probably a better match. J

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Alan
On Thu, 01 Mar 2007 14:05:36 -0800 Jeremy Fitzhardinge wrote: Alan wrote: A lot of people get confused about -ENOTTY, but it is the return for attempting to use an ioctl on the wrong type of object, so this appears to be quite correct. This is a syscall though; ENOSYS

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Jeremy Fitzhardinge
Alan wrote: ENOSYS indicates quite different things and ENOTTY is also used for syscalls. I still think ENOTTY is correct. Yes, ENOSYS tends to mean "operation flat out not supported" rather than "not on this object". I think we can do better than ENOTTY though - ENOTSUP for example (modulo the

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Andrew Morton
On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Nathan Scott
On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Eric Sandeen
Nathan Scott wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Anton Blanchard
That new argument might need to come after fd - ARM has funny requirements on syscall arg padding and layout. FYI the 32bit ppc ABI does too, from arch/powerpc/kernel/sys_ppc32.c: /* * long long munging: * The 32 bit ABI passes long longs in an odd even register pair. */ and the first

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Dave Kleikamp
On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: +asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len) +{ + struct file *file; + struct inode *inode; + long ret = -EINVAL; + file =

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Andrew Morton
On Fri, 02 Mar 2007 09:40:54 +1100 Nathan Scott wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with.

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Andrew Morton
On Thu, 01 Mar 2007 22:44:16 +0000 Dave Kleikamp wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: +asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len) +{ + struct

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Dave Kleikamp
On Thu, 2007-03-01 at 14:59 -0800, Andrew Morton wrote: On Thu, 01 Mar 2007 22:44:16 +0000 Dave Kleikamp wrote: On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote: On Fri, 2 Mar 2007 00:04:45 +0530 Amit K. Arora wrote: +asmlinkage long

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Eric Sandeen
Amit K. Arora wrote: Might want more error checking in there, something like (rough cut)... (or is some of this glibc's job?) +asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len) +{ + struct file *file; + struct inode *inode; + long ret; + + ret = -EINVAL; +

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Christoph Hellwig
On Fri, Mar 02, 2007 at 12:04:45AM +0530, Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Christoph Hellwig
On Thu, Mar 01, 2007 at 10:44:16PM +0000, Dave Kleikamp wrote: Would EINVAL (or whatever) make it back to the caller of posix_fallocate(), or would glibc fall back to its current implementation? Forgive me if I haven't put enough thought into it, but would it be useful to create a

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Christoph Hellwig
On Thu, Mar 01, 2007 at 05:29:15PM -0600, Eric Sandeen wrote: Amit K. Arora wrote: Might want more error checking in there, something like (rough cut)... (or is some of this glibc's job?) Yeah, we need to have these checks. We can't rely on userspace not passing arguments that might corrupt

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Badari Pulavarty
Amit K. Arora wrote: This is to give a heads up on few patches that we will be soon coming up with. These patches implement a new system call sys_fallocate() and a new inode operation fallocate, for persistent preallocation. The new system call, as Andrew suggested, will look like:

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Andrew Morton
On Thu, 01 Mar 2007 22:03:55 -0800 Badari Pulavarty wrote: Just curious .. What does posix_fallocate() return ? bookmark this: http://www.opengroup.org/onlinepubs/009695399/nfindex.html Upon successful completion, posix_fallocate() shall return zero; otherwise, an

Re: [RFC] Heads up on sys_fallocate()

2007-03-01 Thread Ulrich Drepper
Andrew Morton wrote: Perhaps Ulrich can comment. I was out of town, hence the delay. I think that if there is no support for the syscall the correct answer is to return ENOSYS. In this case the current userlevel code would be used and ENOSYS is also used to trigger the use of the compat code
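
The ENOSYS behaviour described in this message can be sketched in C as a small dispatcher. This is an illustration under stated assumptions, not glibc's implementation: the two callbacks are hypothetical stand-ins (native for a wrapper around the proposed system call, compat for the existing user-level code), and the stub functions exist only so the example is self-contained.

```c
#include <errno.h>
#include <sys/types.h>

/* Each callback returns 0 or an errno value. */
typedef int (*prealloc_fn)(int fd, off_t offset, off_t len);

/* Hypothetical dispatcher: try the native path, fall back on ENOSYS. */
int preallocate(prealloc_fn native, prealloc_fn compat,
                int fd, off_t offset, off_t len)
{
    int err = native(fd, offset, len);

    /* ENOSYS means the system call is not available at all, so the
     * compat code takes over. Every other result, success or failure,
     * is passed through unchanged. */
    if (err == ENOSYS)
        return compat(fd, offset, len);
    return err;
}

/* Stubs for illustration only. */
int native_unavailable(int fd, off_t offset, off_t len)
{
    (void)fd; (void)offset; (void)len;
    return ENOSYS;              /* kernel without the system call */
}

int native_disk_full(int fd, off_t offset, off_t len)
{
    (void)fd; (void)offset; (void)len;
    return ENOSPC;              /* a real failure: no fallback wanted */
}

int compat_ok(int fd, off_t offset, off_t len)
{
    (void)fd; (void)offset; (void)len;
    return 0;                   /* user-level loop succeeded */
}
```

With these stubs, preallocate(native_unavailable, compat_ok, fd, 0, 4096) yields 0 through the compat path, while preallocate(native_disk_full, compat_ok, fd, 0, 4096) yields ENOSPC without touching the fallback.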