Re: [RFC PATCH 0/7] Non-blockling buffered fs read (page cache only)

2014-09-15 Thread Andreas Dilger
On Sep 15, 2014, at 2:20 PM, Milosz Tanski wrote: > This patcheset introduces an ability to perform a non-blocking read > from regular files in buffered IO mode. This works by only for those > filesystems that have data in the page cache. > > It does this by introducing new syscalls new

Re: [RFC PATCH 0/7] Non-blockling buffered fs read (page cache only)

2014-09-15 Thread Andreas Dilger
On Sep 15, 2014, at 2:20 PM, Milosz Tanski mil...@adfin.com wrote: This patcheset introduces an ability to perform a non-blocking read from regular files in buffered IO mode. This works by only for those filesystems that have data in the page cache. It does this by introducing new syscalls

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-31 Thread Andreas Dilger
On Aug 1, 2014, at 1:53, Dave Chinner wrote: > On Thu, Jul 31, 2014 at 01:19:45PM +0200, Andreas Dilger wrote: >> None of these issues are relevant in the API that I'm thinking about. >> The syscall just passes the list of inode numbers to be prefetched >> into kernel

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-31 Thread Andreas Dilger
On Jul 31, 2014, at 6:49, Dave Chinner wrote: > >> On Mon, Jul 28, 2014 at 03:19:31PM -0600, Andreas Dilger wrote: >>> On Jul 28, 2014, at 6:52 AM, Abhijith Das wrote: >>> On July 26, 2014 12:27:19 AM "Andreas Dilger" wrote: >>>> Is there a time

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-31 Thread Andreas Dilger
On Jul 31, 2014, at 6:49, Dave Chinner da...@fromorbit.com wrote: On Mon, Jul 28, 2014 at 03:19:31PM -0600, Andreas Dilger wrote: On Jul 28, 2014, at 6:52 AM, Abhijith Das a...@redhat.com wrote: On July 26, 2014 12:27:19 AM Andreas Dilger adil...@dilger.ca wrote: Is there a time when

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-31 Thread Andreas Dilger
On Aug 1, 2014, at 1:53, Dave Chinner da...@fromorbit.com wrote: On Thu, Jul 31, 2014 at 01:19:45PM +0200, Andreas Dilger wrote: None of these issues are relevant in the API that I'm thinking about. The syscall just passes the list of inode numbers to be prefetched into kernel memory

Re: [RFC] readdirplus implementations: xgetdents vs dirreadahead syscalls

2014-07-28 Thread Andreas Dilger
On Jul 25, 2014, at 6:38 PM, Dave Chinner wrote: > On Fri, Jul 25, 2014 at 10:52:57AM -0700, Zach Brown wrote: >> On Fri, Jul 25, 2014 at 01:37:19PM -0400, Abhijith Das wrote: >>> Hi all, >>> >>> The topic of a readdirplus-like syscall had come up for discussion at last >>> year's >>> LSF/MM

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-28 Thread Andreas Dilger
On Jul 28, 2014, at 6:52 AM, Abhijith Das wrote: > On July 26, 2014 12:27:19 AM "Andreas Dilger" wrote: >> Is there a time when this doesn't get called to prefetch entries in >> readdir() order? It isn't clear to me what benefit there is of returning >> the entri

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-28 Thread Andreas Dilger
On Jul 28, 2014, at 6:52 AM, Abhijith Das a...@redhat.com wrote: On July 26, 2014 12:27:19 AM Andreas Dilger adil...@dilger.ca wrote: Is there a time when this doesn't get called to prefetch entries in readdir() order? It isn't clear to me what benefit there is of returning the entries

Re: [RFC] readdirplus implementations: xgetdents vs dirreadahead syscalls

2014-07-28 Thread Andreas Dilger
On Jul 25, 2014, at 6:38 PM, Dave Chinner da...@fromorbit.com wrote: On Fri, Jul 25, 2014 at 10:52:57AM -0700, Zach Brown wrote: On Fri, Jul 25, 2014 at 01:37:19PM -0400, Abhijith Das wrote: Hi all, The topic of a readdirplus-like syscall had come up for discussion at last year's LSF/MM

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-25 Thread Andreas Dilger
Is there a time when this doesn't get called to prefetch entries in readdir() order? It isn't clear to me what benefit there is of returning the entries to userspace instead of just doing the statahead implicitly in the kernel? The Lustre client has had what we call "statahead" for a while, and

Re: [RFC PATCH 0/2] dirreadahead system call

2014-07-25 Thread Andreas Dilger
Is there a time when this doesn't get called to prefetch entries in readdir() order? It isn't clear to me what benefit there is of returning the entries to userspace instead of just doing the statahead implicitly in the kernel? The Lustre client has had what we call statahead for a while, and

Re: [PATCH -V1 22/22] ext4: Add Ext4 compat richacl feature flag

2014-05-01 Thread Andreas Dilger
On May 1, 2014, at 9:48 AM, Aneesh Kumar K.V wrote: > Andreas Dilger writes: > >> On Apr 27, 2014, at 10:14 AM, Aneesh Kumar K.V >> wrote: >>> This feature flag can be used to enable richacl on >>> the file system. Once enabled the "acl" mount o

Re: [PATCH -V1 22/22] ext4: Add Ext4 compat richacl feature flag

2014-05-01 Thread Andreas Dilger
On May 1, 2014, at 9:48 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: Andreas Dilger adil...@dilger.ca writes: On Apr 27, 2014, at 10:14 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: This feature flag can be used to enable richacl on the file system. Once

Re: [PATCH -V1 22/22] ext4: Add Ext4 compat richacl feature flag

2014-04-28 Thread Andreas Dilger
On Apr 27, 2014, at 10:14 AM, Aneesh Kumar K.V wrote: > This feature flag can be used to enable richacl on > the file system. Once enabled the "acl" mount option > will enable richacl instead of posix acl I was going to complain about this patch, because re-using the "acl" mount option to

Re: [PATCH -V1 22/22] ext4: Add Ext4 compat richacl feature flag

2014-04-28 Thread Andreas Dilger
On Apr 27, 2014, at 10:14 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: This feature flag can be used to enable richacl on the file system. Once enabled the acl mount option will enable richacl instead of posix acl I was going to complain about this patch, because re-using the

Re: Ext4: deadlock occurs when running fsstress and ENOSPC errors are seen.

2014-04-15 Thread Andreas Dilger
On Apr 15, 2014, at 11:07 PM, Theodore Ts'o wrote: > On Wed, Apr 16, 2014 at 10:30:10AM +0530, Amit Sahrawat wrote: >> 4) Corrupt the block group ‘1’ by writing all ‘1’, we had one file >> with all 1’s, so using ‘dd’ – >> dd if=i_file of=/dev/sdb1 bs=4096 seek=17 count=1 >> After this mount

Re: Ext4: deadlock occurs when running fsstress and ENOSPC errors are seen.

2014-04-15 Thread Andreas Dilger
On Apr 15, 2014, at 11:07 PM, Theodore Ts'o ty...@mit.edu wrote: On Wed, Apr 16, 2014 at 10:30:10AM +0530, Amit Sahrawat wrote: 4) Corrupt the block group ‘1’ by writing all ‘1’, we had one file with all 1’s, so using ‘dd’ – dd if=i_file of=/dev/sdb1 bs=4096 seek=17 count=1 After this

Re: [PATCH] fs: ext4: Sign-extend tv_sec after ORing in epoch bits

2014-03-31 Thread Andreas Dilger
Hmm, I thought there was a separate patch to fix this a few months ago, that was "more correct" than this one? Did that not land? Cheers, Andreas > On Mar 30, 2014, at 8:58, Conrad Meyer wrote: > > Fixes kernel.org bug #23732. > > Background: ext4 stores time as a 34-bit quantity; 2 bits in

Re: [PATCH] fs: ext4: Sign-extend tv_sec after ORing in epoch bits

2014-03-31 Thread Andreas Dilger
Hmm, I thought there was a separate patch to fix this a few months ago, that was more correct than this one? Did that not land? Cheers, Andreas On Mar 30, 2014, at 8:58, Conrad Meyer ceme...@uw.edu wrote: Fixes kernel.org bug #23732. Background: ext4 stores time as a 34-bit quantity; 2

Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-02-12 Thread Andreas Dilger
On Feb 11, 2014, at 12:58 PM, Thavatchai Makphaibulchoke wrote: > On 01/24/2014 11:09 PM, Andreas Dilger wrote: >> I think the ext4 block groups are locked with the blockgroup_lock that has >> about the same number of locks as the number of cores, with a max of 128,

Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-02-12 Thread Andreas Dilger
On Feb 11, 2014, at 12:58 PM, Thavatchai Makphaibulchoke thavatchai.makpahibulch...@hp.com wrote: On 01/24/2014 11:09 PM, Andreas Dilger wrote: I think the ext4 block groups are locked with the blockgroup_lock that has about the same number of locks as the number of cores, with a max of 128

Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2014-02-10 Thread Andreas Dilger
On Feb 10, 2014, at 10:12 PM, David Turner wrote: > On Tue, 2014-01-21 at 22:22 -0800, Darrick J. Wong wrote: >> On Mon, Nov 11, 2013 at 07:30:18PM -0500, Theodore Ts'o wrote: >>> On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote: b. Use Andreas's encoding, which is incompatible

Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2014-02-10 Thread Andreas Dilger
On Feb 10, 2014, at 10:12 PM, David Turner nova...@novalis.org wrote: On Tue, 2014-01-21 at 22:22 -0800, Darrick J. Wong wrote: On Mon, Nov 11, 2013 at 07:30:18PM -0500, Theodore Ts'o wrote: On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote: b. Use Andreas's encoding, which is

Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-28 Thread Andreas Dilger
On Jan 28, 2014, at 5:26 AM, George Spelvin wrote: >> The third part of the patch further increases the scalablity of an ext4 >> filesystem by having each ext4 fielsystem allocate and use its own private >> mbcache structure, instead of sharing a single mcache structures across all >> ext4

Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-28 Thread Andreas Dilger
On Jan 28, 2014, at 5:26 AM, George Spelvin li...@horizon.com wrote: The third part of the patch further increases the scalablity of an ext4 filesystem by having each ext4 fielsystem allocate and use its own private mbcache structure, instead of sharing a single mcache structures across all

Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-24 Thread Andreas Dilger
I think the ext4 block groups are locked with the blockgroup_lock that has about the same number of locks as the number of cores, with a max of 128, IIRC. See blockgroup_lock.h. While there is some chance of contention, it is also unlikely that all of the cores are locking this area at the

Re: [PATCH 0/3] Fadvise: Directory level page cache cleaning support

2013-12-30 Thread Andreas Dilger
On Dec 30, 2013, at 12:18, Dave Hansen wrote: > > Why is this necessary to do in the kernel? Why not leave it to > userspace to walk the filesystem(s)? I would suspect that trying to do it in userspace would be quite bad. It would require traversing the whole directory tree to issue cache

Re: [PATCH 0/3] Fadvise: Directory level page cache cleaning support

2013-12-30 Thread Andreas Dilger
On Dec 30, 2013, at 12:18, Dave Hansen dave.han...@intel.com wrote: Why is this necessary to do in the kernel? Why not leave it to userspace to walk the filesystem(s)? I would suspect that trying to do it in userspace would be quite bad. It would require traversing the whole directory tree

Re: [PATCH v2] fuse: Fix IOC_[GS]ET{FLAGS,VERSION} argument size brokenness.

2013-12-20 Thread Andreas Dilger
To be honest, FS_IOC_SETVERSION isn't used by many/any users, so it might be better to avoid doing anything with that for now. In the past we even talked about adding a deprecation warning for that ioctl since it adds complexity for little value. Cheers, Andreas > On Dec 20, 2013, at 16:35,

Re: [PATCH v2] fuse: Fix IOC_[GS]ET{FLAGS,VERSION} argument size brokenness.

2013-12-20 Thread Andreas Dilger
To be honest, FS_IOC_SETVERSION isn't used by many/any users, so it might be better to avoid doing anything with that for now. In the past we even talked about adding a deprecation warning for that ioctl since it adds complexity for little value. Cheers, Andreas On Dec 20, 2013, at 16:35,

Re: [PATCH v7 1/2] e2fsck: Correct ext4 dates generated by old kernels.

2013-12-07 Thread Andreas Dilger
I suspect that any 32-bit systems running at that time will have been updated to have 64-bit time_t or otherwise have windowed the 32-bit time_t to have a new starting epoch. So I'm willing to punt on decoding the 64-bit value correctly to libc and just assign our time to the system time_t.

Re: [PATCH v6] e2fsck: Correct ext4 dates generated by old kernels.

2013-12-02 Thread Andreas Dilger
On Nov 29, 2013, at 2:54 PM, David Turner wrote: > Is this version good, or should I make some more improvements? The patch looks good to me (you could add a Reviewed-by: line for me if you want. What you need to do now is to add a new test case or two to verify that this is working correctly.

Re: [PATCH v6] e2fsck: Correct ext4 dates generated by old kernels.

2013-12-02 Thread Andreas Dilger
On Nov 29, 2013, at 2:54 PM, David Turner nova...@novalis.org wrote: Is this version good, or should I make some more improvements? The patch looks good to me (you could add a Reviewed-by: line for me if you want. What you need to do now is to add a new test case or two to verify that this is

Re: [PATCH v4 2/2] e2fsck: Correct ext4 dates generated by old kernels.

2013-11-12 Thread Andreas Dilger
On Nov 13, 2013, at 12:00 AM, David Turner wrote: > This patch is against e2fsprogs. > > --- > Older kernels on 64-bit machines would incorrectly encode pre-1970 > ext4 dates as post-2311 dates. Detect and correct this (assuming the > current date is before 2311). > > Signed-off-by: David

Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2013-11-12 Thread Andreas Dilger
On Nov 11, 2013, at 5:30 PM, Theodore Ts'o wrote: > On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote: >> b. Use Andreas's encoding, which is incompatible with pre-1970 files >> written on 64-bit systems. >> >> I don't care about currently-existing post-2038 files, because I believe

Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2013-11-12 Thread Andreas Dilger
On Nov 11, 2013, at 5:30 PM, Theodore Ts'o ty...@mit.edu wrote: On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote: b. Use Andreas's encoding, which is incompatible with pre-1970 files written on 64-bit systems. I don't care about currently-existing post-2038 files, because I

Re: [PATCH v4 2/2] e2fsck: Correct ext4 dates generated by old kernels.

2013-11-12 Thread Andreas Dilger
On Nov 13, 2013, at 12:00 AM, David Turner nova...@novalis.org wrote: This patch is against e2fsprogs. --- Older kernels on 64-bit machines would incorrectly encode pre-1970 ext4 dates as post-2311 dates. Detect and correct this (assuming the current date is before 2311). Signed-off-by:

Re: [PATCH v3] ext4: Fix reading of extended tv_sec (bug 23732)

2013-11-08 Thread Andreas Dilger
On Nov 7, 2013, at 4:26 PM, David Turner wrote: > On Fri, 2013-11-08 at 00:14 +0100, Jan Kara wrote: >> Still unnecessary type cast here (but that's a cosmetic issue). > ... >> Otherwise the patch looks good. You can add: >> Reviewed-by: Jan Kara > > Thanks. A version with this correction and

Re: [PATCH v3] ext4: Fix reading of extended tv_sec (bug 23732)

2013-11-08 Thread Andreas Dilger
On Nov 7, 2013, at 4:26 PM, David Turner nova...@novalis.org wrote: On Fri, 2013-11-08 at 00:14 +0100, Jan Kara wrote: Still unnecessary type cast here (but that's a cosmetic issue). ... Otherwise the patch looks good. You can add: Reviewed-by: Jan Kara j...@suse.cz Thanks. A version with

Re: Disabling in-memory write cache for x86-64 in Linux II

2013-11-04 Thread Andreas Dilger
On Oct 25, 2013, at 2:18 AM, Linus Torvalds wrote: > On Fri, Oct 25, 2013 at 8:25 AM, Artem S. Tashkinov wrote: >> >> On my x86-64 PC (Intel Core i5 2500, 16GB RAM), I have the same 3.11 >> kernel built for the i686 (with PAE) and x86-64 architectures. What’s >> really troubling me is that

Re: Disabling in-memory write cache for x86-64 in Linux II

2013-11-04 Thread Andreas Dilger
On Oct 25, 2013, at 2:18 AM, Linus Torvalds torva...@linux-foundation.org wrote: On Fri, Oct 25, 2013 at 8:25 AM, Artem S. Tashkinov t.ar...@lycos.com wrote: On my x86-64 PC (Intel Core i5 2500, 16GB RAM), I have the same 3.11 kernel built for the i686 (with PAE) and x86-64 architectures.

Re: 3.11.4: kernel BUG at fs/buffer.c:1268

2013-10-31 Thread Andreas Dilger
On Oct 17, 2013, at 4:14 PM, Al Viro wrote: > On Thu, Oct 17, 2013 at 05:11:43PM -0400, George Spelvin wrote: >> >> Well, it happened again (error appended). Can you please clarify what you >> mean >> by "such BUG_ON()"; I'm having a hard time following the RCU code and >> determining >> all

Re: 3.11.4: kernel BUG at fs/buffer.c:1268

2013-10-31 Thread Andreas Dilger
On Oct 17, 2013, at 4:14 PM, Al Viro v...@zeniv.linux.org.uk wrote: On Thu, Oct 17, 2013 at 05:11:43PM -0400, George Spelvin wrote: Well, it happened again (error appended). Can you please clarify what you mean by such BUG_ON(); I'm having a hard time following the RCU code and

Re: [RFC PATCH 0/5] locks: implement "filp-private" (aka UNPOSIX) locks

2013-10-11 Thread Andreas Dilger
On Fri, Oct 11, 2013 at 08:25:17AM -0400, Jeff Layton wrote: > At LSF this year, there was a discussion about the "wishlist" for > userland file servers. One of the things brought up was the goofy and > problematic behavior of POSIX locks when a file is closed. Boaz started > a thread on it here:

Re: [RFC PATCH 0/5] locks: implement filp-private (aka UNPOSIX) locks

2013-10-11 Thread Andreas Dilger
On Fri, Oct 11, 2013 at 08:25:17AM -0400, Jeff Layton wrote: At LSF this year, there was a discussion about the wishlist for userland file servers. One of the things brought up was the goofy and problematic behavior of POSIX locks when a file is closed. Boaz started a thread on it here:

Re: [PATCH 2/2] fs/ext4/namei.c: reducing contention on s_orphan_lock mmutex

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote: > Instead of using a single per super block mutex, s_orphan_lock, to serialize > all orphan list updates, a separate mutex and spinlock are used to > protect the on disk and in memory orphan lists respecvitely. > > At the same time, a per

Re: [PATCH 1/2] fs/ext4: adding and initalizing new members of ext4_inode_info and ext4_sb_info

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote: > Adding new members, i_prev_oprhan to help decoupling the ondisk from the > in memory orphan list and i_mutex_orphan_mutex to serialize orphan list > updates on a single inode, to the ext4_inode_info structure. What do these additional fields

Re: [PATCH 0/2] fs/ext4: increase parallelism in updating ext4 orphan list

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:38 AM, T Makphaibulchoke wrote: > Instead of allowing only a single atomic update (both in memory and on disk > orphan lists) of an ext4's orphan list via the s_orphan_lock mutex, this > patch allows multiple updates of the orphan list, while still maintaing the > integrity of

Re: [PATCH 0/2] fs/ext4: increase parallelism in updating ext4 orphan list

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:38 AM, T Makphaibulchoke wrote: Instead of allowing only a single atomic update (both in memory and on disk orphan lists) of an ext4's orphan list via the s_orphan_lock mutex, this patch allows multiple updates of the orphan list, while still maintaing the integrity of

Re: [PATCH 1/2] fs/ext4: adding and initalizing new members of ext4_inode_info and ext4_sb_info

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote: Adding new members, i_prev_oprhan to help decoupling the ondisk from the in memory orphan list and i_mutex_orphan_mutex to serialize orphan list updates on a single inode, to the ext4_inode_info structure. What do these additional fields do

Re: [PATCH 2/2] fs/ext4/namei.c: reducing contention on s_orphan_lock mmutex

2013-10-03 Thread Andreas Dilger
On 2013-10-02, at 9:36 AM, T Makphaibulchoke wrote: Instead of using a single per super block mutex, s_orphan_lock, to serialize all orphan list updates, a separate mutex and spinlock are used to protect the on disk and in memory orphan lists respecvitely. At the same time, a per inode

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-10 Thread Andreas Dilger
On 2013-09-06, at 6:23 AM, Thavatchai Makphaibulchoke wrote: > On 09/06/2013 05:10 AM, Andreas Dilger wrote: >> On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote: >>> No, I did not do anything special, including changing an inode's size. I >>> just used the pr

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-10 Thread Andreas Dilger
On 2013-09-06, at 6:23 AM, Thavatchai Makphaibulchoke wrote: On 09/06/2013 05:10 AM, Andreas Dilger wrote: On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote: No, I did not do anything special, including changing an inode's size. I just used the profile data, which indicated mb_cache

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-05 Thread Andreas Dilger
On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote: > On 09/05/2013 02:35 AM, Theodore Ts'o wrote: >> How did you gather these results? The mbcache is only used if you >> are using extended attributes, and only if the extended attributes don't fit >> in the inode's extra space. >> >> I

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-05 Thread Andreas Dilger
On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote: On 09/05/2013 02:35 AM, Theodore Ts'o wrote: How did you gather these results? The mbcache is only used if you are using extended attributes, and only if the extended attributes don't fit in the inode's extra space. I checked

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-04 Thread Andreas Dilger
On 2013-09-04, at 10:39 AM, T Makphaibulchoke wrote: > This patch intends to improve the scalability of an ext filesystem, > particularly ext4. In the past, I've raised the question of whether mbcache is even useful on real-world systems. Essentially, this is providing a "deduplication" service

Re: [PATCH v3 0/2] ext4: increase mbcache scalability

2013-09-04 Thread Andreas Dilger
On 2013-09-04, at 10:39 AM, T Makphaibulchoke wrote: This patch intends to improve the scalability of an ext filesystem, particularly ext4. In the past, I've raised the question of whether mbcache is even useful on real-world systems. Essentially, this is providing a deduplication service for

Re: [RFC/PATCH 0/2] ext4: Transparent Decompression Support

2013-08-07 Thread Andreas Dilger
On 2013-08-04, at 5:48 PM, Dave Chinner wrote: > On Sat, Aug 03, 2013 at 10:21:14PM -0400, Jörn Engel wrote: >> On Sat, 3 August 2013 20:33:16 -0400, Theodore Ts'o wrote: >>> >>> P.P.S. At least in theory, nothing of what I've described here has to be >>> ext4 specific. We could implement this

Re: [RFC/PATCH 0/2] ext4: Transparent Decompression Support

2013-08-07 Thread Andreas Dilger
On 2013-08-04, at 5:48 PM, Dave Chinner wrote: On Sat, Aug 03, 2013 at 10:21:14PM -0400, Jörn Engel wrote: On Sat, 3 August 2013 20:33:16 -0400, Theodore Ts'o wrote: P.P.S. At least in theory, nothing of what I've described here has to be ext4 specific. We could implement this in the VFS

Re: [lustre mess] is mgc_fs_setup() reachable at all?

2013-07-18 Thread Andreas Dilger
On 2013-07-18, at 1:07 PM, Al Viro wrote: > On Thu, Jul 18, 2013 at 11:40:16AM -0700, Nathan Rutman wrote: } RETURN(rc); } What is going on here? We cast something to struct super_block *? Where does it come from? The function it's in is

Re: [lustre mess] is mgc_fs_setup() reachable at all?

2013-07-18 Thread Andreas Dilger
On 2013-07-18, at 1:07 PM, Al Viro wrote: On Thu, Jul 18, 2013 at 11:40:16AM -0700, Nathan Rutman wrote: } RETURN(rc); } What is going on here? We cast something to struct super_block *? Where does it come from? The function it's in is Well, addressing

Re: [PATCH 1/3] ext4: Add EXT4_IOC_TRUNCATE_BLOCK_RANGE ioctl

2013-06-23 Thread Andreas Dilger
On 2013-06-23, at 0:07, Namjae Jeon wrote: > From: Namjae Jeon > The EXT4_IOC_TRUNCATE_BLOCK_RANGE removes the data blocks lying > between [start, "start + length") and updates the logical block numbers > of data blocks starting from "start + length" block to last block of file. > This will

Re: [PATCH 0/3] ext4: introduce two new ioctls

2013-06-23 Thread Andreas Dilger
On 2013-06-23, at 0:07, Namjae Jeon wrote: > From: Namjae Jeon > > This patch series introduces 2 new ioctls for ext4. > > Truncate_block_range ioctl truncates blocks from source file. How is this different from fallocate(FALLOC_FL_PUNCH_HOLE)? That is already in existing kernels, and

Re: [PATCH 0/3] ext4: introduce two new ioctls

2013-06-23 Thread Andreas Dilger
On 2013-06-23, at 0:07, Namjae Jeon linkinj...@gmail.com wrote: From: Namjae Jeon namjae.j...@samsung.com This patch series introduces 2 new ioctls for ext4. Truncate_block_range ioctl truncates blocks from source file. How is this different from fallocate(FALLOC_FL_PUNCH_HOLE)? That is

Re: [PATCH 1/3] ext4: Add EXT4_IOC_TRUNCATE_BLOCK_RANGE ioctl

2013-06-23 Thread Andreas Dilger
On 2013-06-23, at 0:07, Namjae Jeon linkinj...@gmail.com wrote: From: Namjae Jeon namjae.j...@samsung.com The EXT4_IOC_TRUNCATE_BLOCK_RANGE removes the data blocks lying between [start, start + length) and updates the logical block numbers of data blocks starting from start + length block to

Re: Tux3 Report: Faster than tmpfs, what?

2013-05-15 Thread Andreas Dilger
On 2013-05-14, at 0:25, Daniel Phillips wrote: > Interesting, Andreas. We don't do anything as heavyweight as > allocating an inode in this path, just mark the inode dirty (which > puts it on a list) and set a bit in the inode flags. The new inode allocation is only needed for the

Re: Tux3 Report: Faster than tmpfs, what?

2013-05-15 Thread Andreas Dilger
On 2013-05-14, at 0:25, Daniel Phillips daniel.raymond.phill...@gmail.com wrote: Interesting, Andreas. We don't do anything as heavyweight as allocating an inode in this path, just mark the inode dirty (which puts it on a list) and set a bit in the inode flags. The new inode allocation is

Re: New copyfile system call - discuss before LSF?

2013-03-30 Thread Andreas Dilger
On 2013-03-30, at 16:21, Ric Wheeler wrote: > On 03/30/2013 05:57 PM, Myklebust, Trond wrote: >> On Mar 30, 2013, at 5:45 PM, Pavel Machek >> wrote: >> >>> On Sat 2013-03-30 13:08:39, Andreas Dilger wrote: >>>> On 2013-03-30, at 12:49 PM, Pavel Machek

Re: New copyfile system call - discuss before LSF?

2013-03-30 Thread Andreas Dilger
On 2013-03-30, at 12:49 PM, Pavel Machek wrote: > Hmm, really? AFAICT it would be simple to provide an > open_deleted_file("directory") syscall. You'd open_deleted_file(), > copy source file into it, then fsync(), then link it into filesystem. > > That should have atomicity properties reflected.

Re: New copyfile system call - discuss before LSF?

2013-03-30 Thread Andreas Dilger
On 2013-03-30, at 12:49 PM, Pavel Machek wrote: Hmm, really? AFAICT it would be simple to provide an open_deleted_file(directory) syscall. You'd open_deleted_file(), copy source file into it, then fsync(), then link it into filesystem. That should have atomicity properties reflected.

Re: New copyfile system call - discuss before LSF?

2013-03-30 Thread Andreas Dilger
On 2013-03-30, at 16:21, Ric Wheeler rwhee...@redhat.com wrote: On 03/30/2013 05:57 PM, Myklebust, Trond wrote: On Mar 30, 2013, at 5:45 PM, Pavel Machek pa...@ucw.cz wrote: On Sat 2013-03-30 13:08:39, Andreas Dilger wrote: On 2013-03-30, at 12:49 PM, Pavel Machek wrote: Hmm, really

Re: [Nepomuk] Better support for (desktop) file search / indexing applications

2013-03-11 Thread Andreas Dilger
On 2013-03-10, at 6:06, Lijo Antony wrote: > On 03/10/2013 08:51 AM, Simeon Bird wrote: >> >> We (nepomuk) recently looked at using fanotify, and indeed we would >> need user watches, support for moves and recursive directory watches >> (we need to support the case where /home is not a separate

Re: [Nepomuk] Better support for (desktop) file search / indexing applications

2013-03-11 Thread Andreas Dilger
On 2013-03-10, at 6:06, Lijo Antony lijo.ker...@gmail.com wrote: On 03/10/2013 08:51 AM, Simeon Bird wrote: We (nepomuk) recently looked at using fanotify, and indeed we would need user watches, support for moves and recursive directory watches (we need to support the case where /home is not

Re: New copyfile system call - discuss before LSF?

2013-02-21 Thread Andreas Dilger
On 2013-02-21, at 7:57 AM, Ric Wheeler wrote: > On 02/21/2013 02:51 PM, Myklebust, Trond wrote: >> On Thu, 2013-02-21 at 12:37 +0100, Ric Wheeler wrote: >>> We have debated the need to have a system call to allow for offloading copy >>> operations, for example to an NFS server (part to the new NFS

Re: New copyfile system call - discuss before LSF?

2013-02-21 Thread Andreas Dilger
On 2013-02-21, at 7:57 AM, Ric Wheeler wrote: On 02/21/2013 02:51 PM, Myklebust, Trond wrote: On Thu, 2013-02-21 at 12:37 +0100, Ric Wheeler wrote: We have debated the need to have a system call to allow for offloading copy operations, for example to an NFS server (part to the new NFS 4.2

Re: [PATCH] vfs: update atimes over one day in the past or future

2012-12-18 Thread Andreas Dilger
fat-fingers a "touch". The future atime will never be fixed. >> >> Without relatime enabled, a future atime is updated to the current >> kernel time on access. Relatime is meant to reduce the frequency >> of atime updates, not decide if whether the system clock or the >

Re: [PATCH] vfs: update atimes over one day in the past or future

2012-12-18 Thread Andreas Dilger
-off-by: Andreas Dilger adil...@dilger.ca Acked-by: David Chinner da...@fromorbit.com No I didn't. Please don't add tags that someone has not added directly in a reply to the original patch. That's my fault. I thought you'd OK'd the patch with the revised commit comment. CC: sta

Re: [PATCH] Update atime from future.

2012-12-04 Thread Andreas Dilger
On 2012-12-04, at 13:24, Dave Chinner wrote: > On Tue, Dec 04, 2012 at 01:56:39AM +0800, yangsheng wrote: >> Relatime should update the inode atime if it is more than a day in the >> future. The original problem seen was a tarball that had a bad atime, >> but could also happen if someone

Re: [PATCH] Update atime from future.

2012-12-04 Thread Andreas Dilger
On 2012-12-04, at 13:24, Dave Chinner da...@fromorbit.com wrote: On Tue, Dec 04, 2012 at 01:56:39AM +0800, yangsheng wrote: Relatime should update the inode atime if it is more than a day in the future. The original problem seen was a tarball that had a bad atime, but could also happen if

Re: ext4 write performance regression in 3.6-rc1 on RAID0/5

2012-08-22 Thread Andreas Dilger
On 2012-08-22, at 12:00 AM, NeilBrown wrote: > On Wed, 22 Aug 2012 11:57:02 +0800 Yuanhan Liu > wrote: >> >> -#define NR_STRIPES 256 >> +#define NR_STRIPES 1024 > > Changing one magic number into another magic number might help your case, but > it not really a general

Re: ext4 write performance regression in 3.6-rc1 on RAID0/5

2012-08-22 Thread Andreas Dilger
On 2012-08-22, at 12:00 AM, NeilBrown wrote: On Wed, 22 Aug 2012 11:57:02 +0800 Yuanhan Liu yuanhan@linux.intel.com wrote: -#define NR_STRIPES 256 +#define NR_STRIPES 1024 Changing one magic number into another magic number might help your case, but it not really a

Re: [RFC] ext3 freeze feature ver 0.2

2008-02-26 Thread Andreas Dilger
On Feb 26, 2008 08:39 -0800, Eric Sandeen wrote: > Takashi Sato wrote: > > > o Elevate XFS ioctl numbers (XFS_IOC_FREEZE and XFS_IOC_THAW) to the VFS > > As Andreas Dilger and Christoph Hellwig advised me, I have elevated > > them to include/linux/fs.h as below.

Re: [RFC] ext3 freeze feature ver 0.2

2008-02-26 Thread Andreas Dilger
On Feb 26, 2008 08:39 -0800, Eric Sandeen wrote: Takashi Sato wrote: o Elevate XFS ioctl numbers (XFS_IOC_FREEZE and XFS_IOC_THAW) to the VFS As Andreas Dilger and Christoph Hellwig advised me, I have elevated them to include/linux/fs.h as below. #define FIFREEZE_IOWR

Re: [2.6 patch] fs/jbd/journal.c: cleanups

2008-02-17 Thread Andreas Dilger
eatures in the JBD superblock. Similarly, for 64-bit support in ext4 uses journal_set_features() to set a 64-bit feature flag in the journal superblock. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. -- To unsubscribe from this list: send the line

Re: [2.6 patch] fs/jbd/journal.c: cleanups

2008-02-17 Thread Andreas Dilger
. Similarly, for 64-bit support in ext4 uses journal_set_features() to set a 64-bit feature flag in the journal superblock. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [sample] mem_notify v6: usage example

2008-02-11 Thread Andreas Dilger
o complex, but hiding the details of /dev/mem_notify from applications is desirable. A simple wrapper (possibly part of glibc) to return the poll fd, or set up the signal is enough. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] ext3 freeze feature

2008-02-08 Thread Andreas Dilger
>fd:file descriptor of mountpoint >FITHAW:Request cord for unfreeze You may as well make the common ioctl the same as the XFS version, both by number and parameters, so that applications which already understand the XFS ioctl will work on other filesystems. Cheers, Andreas -- Andreas D

Re: [RFC] ext3 freeze feature

2008-02-08 Thread Andreas Dilger
cord for unfreeze You may as well make the common ioctl the same as the XFS version, both by number and parameters, so that applications which already understand the XFS ioctl will work on other filesystems. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-06 Thread Andreas Dilger
25 +0900. [PATCH] ext3,4:fdatasync should skip metadata writeout when overwriting It may be that we already have a solution in that patch for database workloads where the pages are already allocated by avoiding the need for ordered mode journal flushing in that case. Cheers, Andre

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-06 Thread Andreas Dilger
. [PATCH] ext3,4:fdatasync should skip metadata writeout when overwriting It may be that we already have a solution in that patch for database workloads where the pages are already allocated by avoiding the need for ordered mode journal flushing in that case. Cheers, Andreas -- Andreas Dilger Sr

Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

2008-01-31 Thread Andreas Dilger
mmitting_transaction-> > - t_outstanding_credits; > + t_nr_buffers; Same... Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-30 Thread Andreas Dilger
and while I'm not sure what kernel it is for the JBD code rarely changes much Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-25 Thread Andreas Dilger
journal on a separate disk and make it big enough that you don't block on it to flush the data to the filesystem (but not so big that it is consuming all of your RAM). That keeps your data guarantees without hurting performance. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lu

Re: [RFC] Parallelize IO for e2fsck

2008-01-25 Thread Andreas Dilger
y kind of process (and not just those that are event loop driven) can register a callback at some arbitrary point in the code and be notified. I don't object to the poll() interface, but it would be good to have a signal mechanism also. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Gro

Re: [patch 12/26] mount options: fix ext4

2008-01-25 Thread Andreas Dilger
On Jan 24, 2008 20:33 +0100, Miklos Szeredi wrote: > Add stripe= option to /proc/mounts for ext4 filesystems. > > Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]> Acked-by: Andreas Dilger <[EMAIL PROTECTED]> > Inde
