Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-19 Thread Mark Fasheh
On Thu, Jul 19, 2007 at 03:10:52PM +1000, David Chinner wrote:
 % git-log 84e1e99f112dead8f9ba036c02d24a9f5ce7f544 |head -10
 commit 84e1e99f112dead8f9ba036c02d24a9f5ce7f544
 Author: David Chinner [EMAIL PROTECTED]
 Date:   Mon Jun 18 16:50:27 2007 +1000
 
 [XFS] Prevent ENOSPC from aborting transactions that need to succeed
 
 During delayed allocation extent conversion or unwritten extent
 conversion, we need to reserve some blocks for transactions reservations.
 We need to reserve these blocks in case a btree split occurs and we need
 to allocate some blocks.
 
 --
 
 IOWs, XFS didn't provide this guarantee until about a month ago

Ok, once again XFS is ahead of the curve ;)

Comment rescinded then...
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-19 Thread Andreas Dilger
On Jul 18, 2007  20:41 -0700, Mark Fasheh wrote:
 On Sat, Jul 14, 2007 at 12:16:25AM +0530, Amit K. Arora wrote:
  After a successful call, subsequent writes are guaranteed not to fail
  because of lack of disk space.
   
 If a write to an unwritten region requires a node split, that could result
 in the allocation of new meta data which obviously could fail if the disk is
 truly full.
 
 Granted that's unlikely to happen but maybe we should be conservative and
 say something like:
 
 After a successful call, subsequent writes are guaranteed to never require
 allocation of file data. ?
 --Mark

In the worst case, the unwritten extent could be zero-filled before the write
is done, so no extent split is needed.  We discussed this recently for the
ext4 fallocate, but didn't consider it important enough to hold the code.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-18 Thread Mark Fasheh
On Sat, Jul 14, 2007 at 12:16:25AM +0530, Amit K. Arora wrote:
 After a successful call, subsequent writes are guaranteed not to fail because
 of lack of disk space.
  
If a write to an unwritten region requires a node split, that could result
in the allocation of new meta data which obviously could fail if the disk is
truly full.

Granted that's unlikely to happen but maybe we should be conservative and
say something like:

After a successful call, subsequent writes are guaranteed to never require
allocation of file data. ?
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]


Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-18 Thread David Chinner
On Wed, Jul 18, 2007 at 08:41:55PM -0700, Mark Fasheh wrote:
 On Sat, Jul 14, 2007 at 12:16:25AM +0530, Amit K. Arora wrote:
  After a successful call, subsequent writes are guaranteed not to fail
  because of lack of disk space.
   
 If a write to an unwritten region requires a node split, that could result
 in the allocation of new meta data which obviously could fail if the disk is
 truly full.

% git-log 84e1e99f112dead8f9ba036c02d24a9f5ce7f544 |head -10
commit 84e1e99f112dead8f9ba036c02d24a9f5ce7f544
Author: David Chinner [EMAIL PROTECTED]
Date:   Mon Jun 18 16:50:27 2007 +1000

[XFS] Prevent ENOSPC from aborting transactions that need to succeed

During delayed allocation extent conversion or unwritten extent
conversion, we need to reserve some blocks for transactions reservations.
We need to reserve these blocks in case a btree split occurs and we need
to allocate some blocks.

--

IOWs, XFS didn't provide this guarantee until about a month ago

 Granted that's unlikely to happen but maybe we should be conservative and
 say something like:
 
 After a successful call, subsequent writes are guaranteed to never require
 allocation of file data. ?

Well, the above phrasing is taken directly from the posix_fallocate() man
page, and it is intended that sys_fallocate() is used to implement
posix_fallocate(). In that case, the semantics we have to provide are that
writes are guaranteed not to fail due to lack of disk space.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
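
The posix_fallocate() semantics referred to above can be sketched from user
space roughly as follows. This is only an illustration, not part of the
patch: reserve_range() is a hypothetical helper name, and the size check
relies on the POSIX rule that a successful call leaves the file at least
offset + len bytes long.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve [offset, offset + len) with
 * posix_fallocate(3).  Returns 0 on success or an error number.
 * Note that posix_fallocate() itself returns the error number
 * directly and does not set errno.
 */
int reserve_range(int fd, off_t offset, off_t len)
{
    struct stat st;
    int err = posix_fallocate(fd, offset, len);

    if (err != 0)
        return err;             /* e.g. ENOSPC: nothing was promised */

    /* POSIX: on success the file is at least offset + len bytes long. */
    if (fstat(fd, &st) != 0)
        return errno;
    assert(st.st_size >= offset + len);
    return 0;
}
```

A caller would typically invoke reserve_range(fd, 0, expected_size) once
after opening the file and treat a non-zero return as "do not start
writing". Whether later writes can still hit ENOSPC on metadata allocation
is exactly the filesystem-specific question debated in this thread.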


[PATCH 1/5][TAKE8] manpage for fallocate

2007-07-13 Thread Amit K. Arora
Following is the modified version of the manpage originally submitted by
David Chinner. Please use `nroff -man fallocate.2 | less` to view.

Following changed from TAKE7:
* Removed FALLOC_ALLOCATE and FALLOCATE_RESV_SPACE modes.
* Described only single flag for mode, i.e. FALLOC_FL_KEEP_SIZE.
* s/zero blocks/zeroed blocks/ as suggested by Dave.
* Included linux/falloc.h instead of fcntl.h.

Following changed from TAKE6 to TAKE7:
Included changes suggested by Heikki Orsila and Barry Naujok.


.TH fallocate 2
.SH NAME
fallocate \- manipulate file space
.SH SYNOPSIS
.nf
.B #include <linux/falloc.h>
.PP
.BI "long fallocate(int " fd ", int " mode ", loff_t " offset ", loff_t " len ");"
.fi
.SH DESCRIPTION
The
.B fallocate
syscall allows a user to directly manipulate the allocated disk space
for the file referred to by
.I fd
for the byte range starting at
.I offset
and continuing for
.I len
bytes.
The
.I mode
parameter determines the operation to be performed on the given range.
Currently there is only one flag supported for the mode argument.
.TP
.B FALLOC_FL_KEEP_SIZE
allocates and initialises to zero the disk space within the given range.
After a successful call, subsequent writes are guaranteed not to fail because
of lack of disk space.  Even if the size of the file is less than
.IR offset + len ,
the file size is not changed. This allows allocation of zeroed blocks beyond
the end of file and is useful for optimising append workloads.
.PP
If the
.B FALLOC_FL_KEEP_SIZE
flag is not specified in the mode argument, the default behavior of this system
call is almost the same as when this flag is passed. The only difference is that
on success, the file size will be changed if
.IR offset + len
is greater than the file size. This default behavior closely resembles
.BR posix_fallocate (3)
and is intended as a method of optimally implementing this function.
.PP
.B fallocate
may allocate a larger range than was specified.
.SH RETURN VALUE
.B fallocate
returns zero on success, or an error number on failure.
Note that
.I errno
is not set.
.SH ERRORS
.TP
.B EBADF
.I fd
is not a valid file descriptor, or is not opened for writing.
.TP
.B EFBIG
.IR offset + len
exceeds the maximum file size.
.TP
.B EINVAL
.I offset
was less than 0, or
.I len
was less than or equal to 0.
.TP
.B ENODEV
.I fd
does not refer to a regular file or a directory.
.TP
.B ENOSPC
There is not enough space left on the device containing the file
referred to by
.IR fd .
.TP
.B ESPIPE
.I fd
refers to a pipe or FIFO.
.TP
.B ENOSYS
The filesystem underlying the file descriptor does not support this
operation.
.TP
.B EINTR
A signal was caught during execution.
.TP
.B EIO
An I/O error occurred while reading from or writing to a file system.
.TP
.B EOPNOTSUPP
The mode is not supported on the file descriptor.
.SH AVAILABILITY
The
.B fallocate
system call is available since 2.6.XX
.SH SEE ALSO
.BR posix_fallocate (3),
.BR posix_fadvise (3),
.BR ftruncate (2).
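
The FALLOC_FL_KEEP_SIZE behaviour described in the page above can be
sketched in C as follows. This is a hedged illustration rather than part of
the patch: preallocate_keep_size() is a hypothetical helper, and it calls
the glibc fallocate() wrapper that became available later, which returns -1
and sets errno on failure instead of returning the error number as the
draft RETURN VALUE section describes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc declares fallocate() only with this */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_KEEP_SIZE */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate [offset, offset + len) without
 * changing the file size.  Returns 0 on success or an errno value
 * (for example EOPNOTSUPP if the filesystem lacks support).
 */
int preallocate_keep_size(int fd, off_t offset, off_t len)
{
    struct stat before, after;

    if (fstat(fd, &before) != 0)
        return errno;
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, len) != 0)
        return errno;
    if (fstat(fd, &after) != 0)
        return errno;

    /* With FALLOC_FL_KEEP_SIZE the visible file size must not move. */
    assert(after.st_size == before.st_size);
    return 0;
}
```

An append-heavy program might call preallocate_keep_size(fd, 0, 1 << 20) on
a freshly created file: st_size stays at 0, while the blocks for the first
megabyte are already reserved if the call succeeds. Passing 0 instead of
FALLOC_FL_KEEP_SIZE as the mode would give the default behaviour, in which
the file size grows to offset + len.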