Re: [PATCH] isofs: mounting to regular file may succeed
On Sat, Jul 14, 2007 at 09:16:51PM +0200, Jan Engelhardt wrote: On Jul 14 2007 03:47, Kirill Kuvaldin wrote: We then can mount it to a regular file: Wow, this is news to me. Since when is it possible to mount files to files? It is possible to mount a regular file to another one with --bind. The problem in question is that mounting a malformed ISO 9660 image to a directory fails, but to a regular file - succeeds. Kirill - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Fri, 2007-07-13 at 14:47 -0700, Zach Brown wrote: Peter, do you have any interest in seeing how far we can get at tracking lock_page()? I'm not holding my breath, but any little bit would probably help. I ran headfirst into the fact the unlock_page() need not be called by the same task that did lock_page(). Esp IO-completion interrupts love to unlock pages they did not lock themselves. Not at all sure that is fixable, it seems to be the nature of the async structure of the problem :-( - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Fri, 2007-07-13 at 14:47 -0700, Zach Brown wrote: Peter, do you have any interest in seeing how far we can get at tracking lock_page()? I'm not holding my breath, but any little bit would probably help. Would this be a valid report? ( /me goes hunt a x86_64 unwinder patch that will apply to this tree. These stacktraces are pain ) === [ INFO: possible circular locking dependency detected ] [ 2.6.22-rt3-dirty #34 --- mount/1296 is trying to acquire lock: (ei-truncate_mutex){--..}, at: [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 but task is already holding lock: (lock_page_0){--..}, at: [80267107] generic_file_buffered_write+0x1ee/0x646 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (lock_page_0){--..}: [80251b26] __lock_acquire+0xa72/0xc35 [802520c9] lock_acquire+0x48/0x61 [80265e22] add_to_page_cache_lru+0xe/0x23 [80265d31] add_to_page_cache+0x1de/0x2c1 [80265e22] add_to_page_cache_lru+0xe/0x23 [80266985] find_or_create_page+0x4c/0x73 [802ae716] __getblk+0x118/0x23c [802afa91] __bread+0x6/0x9c [802f382d] read_block_bitmap+0x34/0x65 [802f3e1b] ext3_free_blocks_sb+0xec/0x3d4 [802f4131] ext3_free_blocks+0x2e/0x61 [802f82bc] ext3_free_data+0xaa/0xda [802f8976] ext3_truncate+0x4d2/0x84e [8026df5a] pagevec_lookup+0x17/0x1e [8026e7b1] truncate_inode_pages_range+0x1f4/0x323 [802614b4] add_preempt_count+0x14/0xe4 [80304d13] journal_stop+0x1fe/0x21d [8027661a] vmtruncate+0xa2/0xc0 [802a292b] inode_setattr+0x22/0x10a [802f9b51] ext3_setattr+0x136/0x18f [802a2b1d] notify_change+0x10a/0x241 [802a2b3b] notify_change+0x128/0x241 [8028e35e] do_truncate+0x56/0x7f [8028e369] do_truncate+0x61/0x7f [80296278] get_write_access+0x3f/0x45 [802973c7] may_open+0x193/0x1af [80299869] open_namei+0x2cb/0x63e [8025718b] rt_up_read+0x53/0x5c [8056da59] do_page_fault+0x479/0x7cc [8028dce1] do_filp_open+0x1c/0x38 [8056a4f9] rt_spin_unlock+0x17/0x47 [8028da05] get_unused_fd+0xf9/0x107 [8028dd45] do_sys_open+0x48/0xd5 [8020950e] system_call+0x7e/0x83 [] 0x - #0 (ei-truncate_mutex){--..}: [802503b9] print_circular_bug_header+0xcc/0xd3 [80251a22] __lock_acquire+0x96e/0xc35 [802520c9] lock_acquire+0x48/0x61 [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 [8056a6d4] _mutex_lock+0x26/0x52 [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 [802504b2] find_usage_backwards+0xb0/0xd9 [802504b2] find_usage_backwards+0xb0/0xd9 [80250d7c] debug_check_no_locks_freed+0x11d/0x129 [80250c33] trace_hardirqs_on_caller+0x115/0x138 [8024efdc] lockdep_init_map+0xac/0x41f [802614b4] add_preempt_count+0x14/0xe4 [802f8035] ext3_get_block+0xc2/0xe4 [802aeed3] __block_prepare_write+0x195/0x442 [802f7f73] ext3_get_block+0x0/0xe4 [802af19a] block_prepare_write+0x1a/0x25 [802f93e9] ext3_prepare_write+0xb2/0x17b [802671b1] generic_file_buffered_write+0x298/0x646 [8023944e] current_fs_time+0x3b/0x40 [802614b4] add_preempt_count+0x14/0xe4 [802678ae] __generic_file_aio_write_nolock+0x34f/0x3b9 [8024ed3d] put_lock_stats+0xe/0x2a [80267964] generic_file_aio_write+0x4c/0xc4 [80267979] generic_file_aio_write+0x61/0xc4 [802fcf18] ext3_orphan_del+0x53/0x19f [802f5768] ext3_file_write+0x1c/0x9d [8028ef31] do_sync_write+0xcc/0x10f [80246f9c] autoremove_wake_function+0x0/0x2e [8024ecfe] get_lock_stats+0xe/0x3f [8024ed9a] lock_release_holdtime+0x41/0x4f [8024ed3d] put_lock_stats+0xe/0x2a [8028dfb1] sys_fchmod+0xa3/0xbd [8056a717] _mutex_unlock+0x17/0x20 [8028f6cd] vfs_write+0xb6/0x148 [8028fc61] sys_write+0x48/0x74 [8020950e] system_call+0x7e/0x83 [] 0x other info that might help us debug this: 2 locks held by mount/1296: #0: (inode-i_mutex){--..}, at: [80267964] generic_file_aio_write+0x4c/0xc4 #1: (lock_page_0){--..}, at: [80267107]
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Sun, 2007-07-15 at 15:02 +0200, Peter Zijlstra wrote: On Fri, 2007-07-13 at 14:47 -0700, Zach Brown wrote: Peter, do you have any interest in seeing how far we can get at tracking lock_page()? I'm not holding my breath, but any little bit would probably help. Would this be a valid report? === [ INFO: possible circular locking dependency detected ] [ 2.6.22-rt3-dirty #35 --- mkdir/1662 is trying to acquire lock: (lock_page_0){--..}, at: [80265df6] add_to_page_cache_lru+0xe/0x23 but task is already holding lock: (jbd_handle){--..}, at: [80305797] journal_start+0x108/0x12c which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (jbd_handle){--..}: [80251b16] __lock_acquire+0xa72/0xc35 [802520b9] lock_acquire+0x48/0x61 [80305797] journal_start+0x108/0x12c [803057b3] journal_start+0x124/0x12c [802f92e9] ext3_prepare_write+0x42/0x17b [80267185] generic_file_buffered_write+0x298/0x646 [8023943e] current_fs_time+0x3b/0x40 [802614a4] add_preempt_count+0x14/0xe4 [80267882] __generic_file_aio_write_nolock+0x34f/0x3b9 [8024ed2d] put_lock_stats+0xe/0x2a [80267938] generic_file_aio_write+0x4c/0xc4 [8026794d] generic_file_aio_write+0x61/0xc4 [802fce88] ext3_orphan_del+0x53/0x19f [802f56d8] ext3_file_write+0x1c/0x9d [8028eedd] do_sync_write+0xcc/0x10f [80246f8d] autoremove_wake_function+0x0/0x2e [8024ecee] get_lock_stats+0xe/0x3f [8024ed8a] lock_release_holdtime+0x41/0x4f [8024ed2d] put_lock_stats+0xe/0x2a [8028df5d] sys_fchmod+0xa3/0xbd [8056a6d7] _mutex_unlock+0x17/0x20 [8028f679] vfs_write+0xb6/0x148 [8028fc0d] sys_write+0x48/0x74 [8020950e] system_call+0x7e/0x83 [] 0x - #0 (lock_page_0){--..}: [802503a9] print_circular_bug_header+0xcc/0xd3 [80251a12] __lock_acquire+0x96e/0xc35 [802520b9] lock_acquire+0x48/0x61 [80265df6] add_to_page_cache_lru+0xe/0x23 [80265d05] add_to_page_cache+0x1de/0x2c1 [80265df6] add_to_page_cache_lru+0xe/0x23 [80266959] find_or_create_page+0x4c/0x73 [802ae6c2] __getblk+0x118/0x23c [802f7d9a] ext3_getblk+0xf2/0x23b [80306337] journal_dirty_metadata+0x1a8/0x1b3 [80301e3e] __ext3_journal_dirty_metadata+0x1e/0x46 [802f6c63] ext3_mark_iloc_dirty+0x293/0x30a [802f70a1] ext3_mark_inode_dirty+0x3f/0x48 [802f644e] ext3_new_inode+0x8ff/0x943 [802f8c9c] ext3_bread+0x11/0x84 [802fccc3] ext3_mkdir+0xdd/0x24f [80296893] vfs_mkdir+0x6d/0xb5 [8029902e] sys_mkdirat+0xa1/0xec [80569f4d] trace_hardirqs_on_thunk+0x3a/0x3c [80250c23] trace_hardirqs_on_caller+0x115/0x138 [80569f4d] trace_hardirqs_on_thunk+0x3a/0x3c [8020950e] system_call+0x7e/0x83 [] 0x other info that might help us debug this: 2 locks held by mkdir/1662: #0: (inode-i_mutex/1){--..}, at: [80297154] lookup_create+0x26/0x8b #1: (jbd_handle){--..}, at: [80305797] journal_start+0x108/0x12c stack backtrace: Call Trace: [8024ffad] print_circular_bug_tail+0x69/0x72 [802503a9] print_circular_bug_header+0xcc/0xd3 [80251a12] __lock_acquire+0x96e/0xc35 [802520b9] lock_acquire+0x48/0x61 [80265df6] add_to_page_cache_lru+0xe/0x23 [80265d05] add_to_page_cache+0x1de/0x2c1 [80265df6] add_to_page_cache_lru+0xe/0x23 [80266959] find_or_create_page+0x4c/0x73 [802ae6c2] __getblk+0x118/0x23c [802f7d9a] ext3_getblk+0xf2/0x23b [80306337] journal_dirty_metadata+0x1a8/0x1b3 [80301e3e] __ext3_journal_dirty_metadata+0x1e/0x46 [802f6c63] ext3_mark_iloc_dirty+0x293/0x30a [802f70a1] ext3_mark_inode_dirty+0x3f/0x48 [802f644e] ext3_new_inode+0x8ff/0x943 [802f8c9c] ext3_bread+0x11/0x84 [802fccc3] ext3_mkdir+0xdd/0x24f [80296893] vfs_mkdir+0x6d/0xb5 [8029902e] sys_mkdirat+0xa1/0xec [80569f4d] trace_hardirqs_on_thunk+0x3a/0x3c [80250c23] trace_hardirqs_on_caller+0x115/0x138 [80569f4d] trace_hardirqs_on_thunk+0x3a/0x3c [8020950e] system_call+0x7e/0x83 INFO: lockdep is turned off. --- | preempt count: ] | 0-level deep critical section nesting:
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Sun, 2007-07-15 at 11:11 -0700, Andrew Morton wrote: On Sun, 15 Jul 2007 15:02:23 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote: On Fri, 2007-07-13 at 14:47 -0700, Zach Brown wrote: Peter, do you have any interest in seeing how far we can get at tracking lock_page()? I'm not holding my breath, but any little bit would probably help. Would this be a valid report? ( /me goes hunt a x86_64 unwinder patch that will apply to this tree. These stacktraces are pain ) They are. lockdep reports are a pain too. It's still a struggle to understand wtf they're trying to tell you. Mabe it's just me. It got you confused alright,.. === [ INFO: possible circular locking dependency detected ] [ 2.6.22-rt3-dirty #34 --- mount/1296 is trying to acquire lock: (ei-truncate_mutex){--..}, at: [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 but task is already holding lock: (lock_page_0){--..}, at: [80267107] generic_file_buffered_write+0x1ee/0x646 which lock already depends on the new lock. So, the offence is trying to acquire ei-truncate_mutex while already holding lock_page_0. These traces show how the previous (reverse) dependancy came into being the existing dependency chain (in reverse order) is: - #1 (lock_page_0){--..}: [80251b26] __lock_acquire+0xa72/0xc35 [802520c9] lock_acquire+0x48/0x61 [80265e22] add_to_page_cache_lru+0xe/0x23 [80265d31] add_to_page_cache+0x1de/0x2c1 [80265e22] add_to_page_cache_lru+0xe/0x23 [80266985] find_or_create_page+0x4c/0x73 [802ae716] __getblk+0x118/0x23c [802afa91] __bread+0x6/0x9c [802f382d] read_block_bitmap+0x34/0x65 [802f3e1b] ext3_free_blocks_sb+0xec/0x3d4 [802f4131] ext3_free_blocks+0x2e/0x61 [802f82bc] ext3_free_data+0xaa/0xda [802f8976] ext3_truncate+0x4d2/0x84e [8026df5a] pagevec_lookup+0x17/0x1e [8026e7b1] truncate_inode_pages_range+0x1f4/0x323 [802614b4] add_preempt_count+0x14/0xe4 [80304d13] journal_stop+0x1fe/0x21d [8027661a] vmtruncate+0xa2/0xc0 [802a292b] inode_setattr+0x22/0x10a [802f9b51] ext3_setattr+0x136/0x18f [802a2b1d] notify_change+0x10a/0x241 [802a2b3b] notify_change+0x128/0x241 [8028e35e] do_truncate+0x56/0x7f [8028e369] do_truncate+0x61/0x7f [80296278] get_write_access+0x3f/0x45 [802973c7] may_open+0x193/0x1af [80299869] open_namei+0x2cb/0x63e [8025718b] rt_up_read+0x53/0x5c [8056da59] do_page_fault+0x479/0x7cc [8028dce1] do_filp_open+0x1c/0x38 [8056a4f9] rt_spin_unlock+0x17/0x47 [8028da05] get_unused_fd+0xf9/0x107 [8028dd45] do_sys_open+0x48/0xd5 [8020950e] system_call+0x7e/0x83 [] 0x I guess we're doing lock_page() against a blockdev pagecache page here while holding truncate_mutex against some S_ISREG file. So this trace ( - #1 ) shows how lock_page_0 became to depend on ei-truncate_mutex ( - #0 ). - #0 (ei-truncate_mutex){--..}: [802503b9] print_circular_bug_header+0xcc/0xd3 [80251a22] __lock_acquire+0x96e/0xc35 [802520c9] lock_acquire+0x48/0x61 [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 [8056a6d4] _mutex_lock+0x26/0x52 [802f75e5] ext3_get_blocks_handle+0x1a4/0x8f7 [802504b2] find_usage_backwards+0xb0/0xd9 [802504b2] find_usage_backwards+0xb0/0xd9 [80250d7c] debug_check_no_locks_freed+0x11d/0x129 [80250c33] trace_hardirqs_on_caller+0x115/0x138 [8024efdc] lockdep_init_map+0xac/0x41f [802614b4] add_preempt_count+0x14/0xe4 [802f8035] ext3_get_block+0xc2/0xe4 [802aeed3] __block_prepare_write+0x195/0x442 [802f7f73] ext3_get_block+0x0/0xe4 [802af19a] block_prepare_write+0x1a/0x25 [802f93e9] ext3_prepare_write+0xb2/0x17b [802671b1] generic_file_buffered_write+0x298/0x646 [8023944e] current_fs_time+0x3b/0x40 [802614b4] add_preempt_count+0x14/0xe4 [802678ae] __generic_file_aio_write_nolock+0x34f/0x3b9 [8024ed3d] put_lock_stats+0xe/0x2a [80267964] generic_file_aio_write+0x4c/0xc4 [80267979] generic_file_aio_write+0x61/0xc4 [802fcf18]
*at syscalls for xattrs?
Hi, recently, the family of *at() syscalls and functions (openat, fstatat, etc.) have been added to Linux and Glibc, respectively. In short: I am missing xattr at functions :) BTW, why is fstatat called fstatat and not statat? (Same goes for futimesat.) It does not take a file descriptor for the file argument. Otherwise we'd also need fopenat/funlinkat, etc. Any reasons? Thanks, Jan -- - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Sun, 15 Jul 2007 21:21:03 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote: Shows the current stacktrace where we violate the previously established locking order. yup, but the lock_page() which we did inside truncate_mutex was a lock_page() against a different address_space: the blockdev mapping. So this is OK - we'll never take truncate_mutex against the blockdev mapping (it doesn't have one, for a start ;)) This is similar to the quite common case where we take inode A's i_mutex inside inode B's i_mutex, which needs special lockdep annotations. I think. I haven't looked into this in detail. - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 5][PATCH 1/1] expand inode i_extra_isize to support features in larger inode
On Sun, 2007-07-15 at 12:59 -0700, Andrew Morton wrote: On Sun, 15 Jul 2007 21:21:03 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote: Shows the current stacktrace where we violate the previously established locking order. yup, but the lock_page() which we did inside truncate_mutex was a lock_page() against a different address_space: the blockdev mapping. So this is OK - we'll never take truncate_mutex against the blockdev mapping (it doesn't have one, for a start ;)) This is similar to the quite common case where we take inode A's i_mutex inside inode B's i_mutex, which needs special lockdep annotations. I think. I haven't looked into this in detail. Right, I can make lock_page classes per address space. Lets see if this one goes away. - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: *at syscalls for xattrs?
On Sun, Jul 15, 2007 at 09:46:27PM +0200, Jan Engelhardt wrote: Hi, recently, the family of *at() syscalls and functions (openat, fstatat, etc.) have been added to Linux and Glibc, respectively. In short: I am missing xattr at functions :) No. They are not fscking forks. They are almost as revolting, but not quite on the same level. BTW, why is fstatat called fstatat and not statat? (Same goes for futimesat.) It does not take a file descriptor for the file argument. Otherwise we'd also need fopenat/funlinkat, etc. Any reasons? Ulrich having an odd taste? - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: *at syscalls for xattrs?
On Sun, Jul 15, 2007 at 02:13:21PM -0700, Nicholas Miell wrote: I suspect he was asking for int getxattrat(int fd, const char *path, const char *name, void *value, size_t size, int flags) int setxattrat(int fd, const char *path, const char *name, void *value, size_t size, int xattrflags, int atflags) rather than the ability to access xattrs as files. Just one question: what the bleeding hell for? Not that the rest of ..at() family made any damn sense as an interface... BTW, why is fstatat called fstatat and not statat? (Same goes for futimesat.) It does not take a file descriptor for the file argument. Otherwise we'd also need fopenat/funlinkat, etc. Any reasons? Ulrich having an odd taste? Solaris compatibility. Sun having no taste whatsoever - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: *at syscalls for xattrs?
Al Viro wrote: BTW, why is fstatat called fstatat and not statat? (Same goes for futimesat.) It does not take a file descriptor for the file argument. Otherwise we'd also need fopenat/funlinkat, etc. Any reasons? Ulrich having an odd taste? Solaris compatibility. Sun having no taste whatsoever Yup. I filed an objection to this with the POSIX committee, but it was rejected :( -hpa - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6][TAKE7] manpage for fallocate
On Sat, Jul 14, 2007 at 10:23:42AM +0200, Michael Kerrisk wrote: [CC += [EMAIL PROTECTED] Amit, Hi Michael, Thanks for this page. I will endeavour to review it in the coming days. In the meantime, the better address to CC me on fot man pages stuff is [EMAIL PROTECTED] Sure. BTW, this man page has changed a bit and the one in TAKE8 of fallocate patches is the latest one. You are copied on that too. I will forward that mail to [EMAIL PROTECTED] id also, so that you do not miss it. Thanks! -- Regards, Amit Arora Cheers, Michael Following is the modified version of the manpage originally submitted by David Chinner. Please use `nroff -man fallocate.2 | less` to view. This includes changes suggested by Heikki Orsila and Barry Naujok. .TH fallocate 2 .SH NAME fallocate \- allocate or remove file space .SH SYNOPSIS .nf .B #include fcntl.h .PP .BI long fallocate(int fd , int mode , loff_t offset , loff_t len); .SH DESCRIPTION The .B fallocate syscall allows a user to directly manipulate the allocated disk space for the file referred to by .I fd for the byte range starting at .I offset and continuing for .I len bytes. The .I mode parameter determines the operation to be performed on the given range. Currently there are two modes: .TP .B FALLOC_ALLOCATE allocates and initialises to zero the disk space within the given range. After a successful call, subsequent writes are guaranteed not to fail because of lack of disk space. If the size of the file is less than .IR offset + len , then the file is increased to this size; otherwise the file size is left unchanged. .B FALLOC_ALLOCATE closely resembles .BR posix_fallocate (3) and is intended as a method of optimally implementing this function. .B FALLOC_ALLOCATE may allocate a larger range than that was specified. .TP .B FALLOC_RESV_SPACE provides the same functionality as .B FALLOC_ALLOCATE except it does not ever change the file size. This allows allocation of zero blocks beyond the end of file and is useful for optimising append workloads. .SH RETURN VALUE .B fallocate returns zero on success, or an error number on failure. Note that .I errno is not set. .SH ERRORS .TP .B EBADF .I fd is not a valid file descriptor, or is not opened for writing. .TP .B EFBIG .IR offset + len exceeds the maximum file size. .TP .B EINVAL .I offset was less than 0, or .I len was less than or equal to 0. .TP .B ENODEV .I fd does not refer to a regular file or a directory. .TP .B ENOSPC There is not enough space left on the device containing the file referred to by .IR fd . .TP .B ESPIPE .I fd refers to a pipe of file descriptor. .TP .B ENOSYS The filesystem underlying the file descriptor does not support this operation. .TP .B EINTR A signal was caught during execution .TP .B EIO An I/O error occurred while reading from or writing to a file system. .TP .B EOPNOTSUPP The mode is not supported on the file descriptor. .SH AVAILABILITY The .B fallocate system call is available since 2.6.XX .SH SEE ALSO .BR syscall (2), .BR posix_fadvise (3), .BR ftruncate (3). -- Ist Ihr Browser Vista-kompatibel? Jetzt die neuesten Browser-Versionen downloaden: http://www.gmx.net/de/go/browser - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ia64 fallocate system call
sys_fallocate for ia64. This uses the empty slot originally reserved for move_pages. Signed-Off-By: Dave Chinner [EMAIL PROTECTED] --- arch/ia64/kernel/entry.S |2 +- include/asm-ia64/unistd.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Index: 2.6.x-xfs-new/arch/ia64/kernel/entry.S === --- 2.6.x-xfs-new.orig/arch/ia64/kernel/entry.S 2007-07-16 14:18:51.432168485 +1000 +++ 2.6.x-xfs-new/arch/ia64/kernel/entry.S 2007-07-16 14:22:08.582454284 +1000 @@ -1581,7 +1581,7 @@ sys_call_table: data8 sys_sync_file_range // 1300 data8 sys_tee data8 sys_vmsplice - data8 sys_ni_syscall// reserved for move_pages + data8 sys_fallocate data8 sys_getcpu data8 sys_epoll_pwait // 1305 data8 sys_utimensat Index: 2.6.x-xfs-new/include/asm-ia64/unistd.h === --- 2.6.x-xfs-new.orig/include/asm-ia64/unistd.h2007-06-08 21:36:31.0 +1000 +++ 2.6.x-xfs-new/include/asm-ia64/unistd.h 2007-07-16 14:22:41.166204402 +1000 @@ -292,7 +292,7 @@ #define __NR_sync_file_range 1300 #define __NR_tee 1301 #define __NR_vmsplice 1302 -/* 1303 reserved for move_pages */ +#define __NR_fallocate 1303 #define __NR_getcpu1304 #define __NR_epoll_pwait 1305 #define __NR_utimensat 1306 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] xfs: implement fallocate V2
Initial implementation of -fallocate for XFS. Version 2: o Make allocation and setting the file size atomic. o Drop deallocate/punch functionality o use mode field appropriately to determine if size needs changing. --- fs/xfs/linux-2.6/xfs_iops.c | 47 1 file changed, 47 insertions(+) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_iops.c === --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_iops.c 2007-07-16 14:16:02.090255611 +1000 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_iops.c 2007-07-16 14:50:07.087885337 +1000 @@ -51,6 +51,7 @@ #include linux/xattr.h #include linux/namei.h #include linux/security.h +#include linux/falloc.h /* * Get a XFS inode from a given vnode. @@ -812,6 +813,51 @@ xfs_vn_removexattr( return namesp-attr_remove(vp, attr, xflags); } +/* + * generic space allocation vector. + * + * This should really through a bhv_vop before stuffing around + * with xfs_inodes and such. + */ +STATIC long +xfs_vn_fallocate( + struct inode*inode, + int mode, + loff_t offset, + loff_t len) +{ + longerror = -EOPNOTSUPP; + bhv_vnode_t *vp = vn_from_inode(inode); + bhv_desc_t *bdp; + loff_t new_size = 0; + xfs_flock64_t bf; + + bf.l_whence = 0; + bf.l_start = offset; + bf.l_len = len; + + bdp = bhv_lookup_range(VN_BHV_HEAD(vp), VNODE_POSITION_XFS, + VNODE_POSITION_XFS); + + xfs_ilock(xfs_vtoi(vp), XFS_IOLOCK_EXCL); + error = xfs_change_file_space(bdp, XFS_IOC_RESVSP, bf, 0, NULL, + ATTR_NOLOCK); + if (!error !(mode FALLOC_FL_KEEP_SIZE) + offset + len i_size_read(inode)) + new_size = offset + len; + + /* Change file size if needed */ + if (new_size) { + bhv_vattr_t va; + + va.va_mask = XFS_AT_SIZE; + va.va_size = new_size; + error = bhv_vop_setattr(vp, va, ATTR_NOLOCK, NULL); + } + + xfs_iunlock(xfs_vtoi(vp), XFS_IOLOCK_EXCL); + return error; +} const struct inode_operations xfs_inode_operations = { .permission = xfs_vn_permission, @@ -822,6 +868,7 @@ const struct inode_operations xfs_inode_ .getxattr = xfs_vn_getxattr, .listxattr = xfs_vn_listxattr, .removexattr= xfs_vn_removexattr, + .fallocate = xfs_vn_fallocate, }; const struct inode_operations xfs_dir_inode_operations = { -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] introduce fallocate support into xfs_io
FYI. Initial support for fallocate-based pre-allocation in xfs_io for testing. This currently only works on ia64 because of the hard coded syscall number and will require autoconf magic to conditionally compile in this support. This allows simple command-line based testing of fallocate based allocation such as: # ~/xfs_io -f -c falloc_resvsp 0 1024k -c bmap -vp -c stat /mnt/scratch/fred /mnt/scratch/fred: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSETTOTAL FLAGS 0: [0..2047]: 96..2143 0 (96..2143)2048 1 fd.path = /mnt/scratch/fred fd.flags = non-sync,non-direct,read-write stat.ino = 131 stat.type = regular file stat.size = 0 stat.blocks = 2048 fsxattr.xflags = 0x2 [-p] fsxattr.projid = 0 fsxattr.extsize = 0 fsxattr.nextents = 1 fsxattr.naextents = 0 dioattr.mem = 0x200 dioattr.miniosz = 512 dioattr.maxiosz = 2147483136 Or more complex cases: # ~/xfs_io -f \ -c falloc_allocsp 0 1024k \ -c unresvsp 32k 32k \ -c unresvsp 128k 64k \ -c unresvsp 512k 256k \ -cpwrite 0 16k \ -c pwrite 96k 128k \ -c pwrite 640k 384k \ -c bmap -vp \ -c falloc_resvsp 0 1024k \ -c bmap -vvp /mnt/scratch/fred wrote 16384/16384 bytes at offset 0 16 KiB, 4 ops; 0. sec (274.123 MiB/sec and 70175.4386 ops/sec) wrote 131072/131072 bytes at offset 98304 128 KiB, 32 ops; 0. sec (338.753 MiB/sec and 86720.8672 ops/sec) wrote 393216/393216 bytes at offset 655360 384 KiB, 96 ops; 0. sec (386.200 MiB/sec and 98867.1473 ops/sec) /mnt/scratch/fred: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSETTOTAL FLAGS 0: [0..31]: 96..127 0 (96..127) 32 1: [32..63]:128..159 0 (128..159) 32 1 2: [64..127]: hole64 3: [128..191]: 224..287 0 (224..287) 64 1 4: [192..447]: 288..543 0 (288..543) 256 5: [448..1023]: 544..1119 0 (544..1119)576 1 6: [1024..1279]:hole 256 7: [1280..2047]:1376..21430 (1376..2143) 768 /mnt/scratch/fred: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSETTOTAL FLAGS 0: [0..31]: 96..127 0 (96..127) 32 1: [32..191]: 128..287 0 (128..287) 160 1 2: [192..447]: 288..543 0 (288..543) 256 3: [448..1279]: 544..1375 0 (544..1375)832 1 4: [1280..2047]:1376..21430 (1376..2143) 768 FLAG Values: 01 Unwritten preallocated extent 001000 Doesn't begin on stripe unit 000100 Doesn't end on stripe unit 10 Doesn't begin on stripe width 01 Doesn't end on stripe width Yes, that looks like it filled all the holes properly, and the allocator allocated the right holes on disk to merge adjacent extents when hole filling. ;) --- xfsprogs/io/prealloc.c | 72 + 1 file changed, 72 insertions(+) Index: xfs-cmds/xfsprogs/io/prealloc.c === --- xfs-cmds.orig/xfsprogs/io/prealloc.c2006-11-15 19:00:31.0 +1100 +++ xfs-cmds/xfsprogs/io/prealloc.c 2007-07-16 15:25:44.041513574 +1000 @@ -26,6 +26,8 @@ static cmdinfo_t allocsp_cmd; static cmdinfo_t freesp_cmd; static cmdinfo_t resvsp_cmd; static cmdinfo_t unresvsp_cmd; +static cmdinfo_t falloc_allocsp_cmd; +static cmdinfo_t falloc_resvsp_cmd; static int offset_length( @@ -119,6 +121,56 @@ unresvsp_f( return 0; } +/* + * Hack, hack, hackety-hack-hack. + * + * This only works for ia64... + */ +#define __NR_fallocate1303 + +/* + * someday there'll be a real header file + */ +#define FALLOC_FL_KEEP_SIZE 0x01 +#define FALLOC_ALLOCATE 0x0 +#define FALLOC_RESV_SPACE FALLOC_FL_KEEP_SIZE + +static int +fallocate_allocsp_f( + int argc, + char**argv) +{ + xfs_flock64_t segment; + + if (!offset_length(argv[1], argv[2], segment)) + return 0; + + if (syscall(__NR_fallocate, file-fd, FALLOC_ALLOCATE, + segment.l_start, segment.l_len)) { + perror(FALLOC_ALLOCATE); + return 0; + } + return 0; +} + +static int +fallocate_resvsp_f( + int argc, + char**argv) +{ + xfs_flock64_t segment; + + if (!offset_length(argv[1], argv[2], segment)) + return 0; + + if (syscall(__NR_fallocate, file-fd, FALLOC_RESV_SPACE, + segment.l_start, segment.l_len)) { + perror(FALLOC_ALLOCATE); + return 0; + } + return 0; +} + void prealloc_init(void) { @@ -156,8 +208,28 @@ prealloc_init(void) unresvsp_cmd.oneline = _(frees reserved space associated with part of a file); +