Re: [trivial] treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks

2013-06-23 Thread Theodore Ts'o
On Wed, Jun 19, 2013 at 12:15:53PM -0700, Joe Perches wrote: Don't emit OOM warnings when k.alloc calls fail when there is a v.alloc immediately afterwards. Signed-off-by: Joe Perches j...@perches.com For fs/ext4/super.c: Acked-by: Theodore Ts'o ty...@mit.edu

[PATCH] fs: push sync_filesystem() down to the file system's remount_fs()

2014-03-13 Thread Theodore Ts'o
, and there are some file systems where this is not needed at all (for example, for a pseudo-filesystem or something like romfs). Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: linux-fsde...@vger.kernel.org Cc: Christoph Hellwig h...@infradead.org Cc: Artem Bityutskiy dedeki...@gmail.com Cc: Adrian Hunter

Re: [Cluster-devel] [PATCH] fs: push sync_filesystem() down to the file system's remount_fs()

2014-03-13 Thread Theodore Ts'o
On Thu, Mar 13, 2014 at 04:28:23PM +, Steven Whitehouse wrote: I guess the same is true for other file systems which are mounted ro too. So maybe a check for MS_RDONLY before doing the sync in those cases? My original patch moved the sync_filesystem into the check for MS_RDONLY in the

Re: [ANNOUNCE] xfstests: new mailing list

2014-05-16 Thread Theodore Ts'o
On Fri, May 16, 2014 at 10:02:07AM -0400, Calvin Walton wrote: Instead of renaming the test suite, why not just backronym it to mean something different? The letter x is used to mean cross in many contexts, so xfstests could easily mean cross-filesystem tests - a name that fits perfectly!

No one seems to be using AOP_WRITEPAGE_ACTIVATE?

2010-04-24 Thread Theodore Ts'o
I happened to be going through the source code for write_cache_pages(), and I came across a reference to AOP_WRITEPAGE_ACTIVATE. I was curious what the heck that was, so I did search for it, and found this in Documentation/filesystems/vfs.txt: If wbc->sync_mode is WB_SYNC_NONE, ->writepage

Re: [e2fsprogs] ext2_dir_entry To ext2_dir_entry_2 Casting

2012-10-02 Thread Theodore Ts'o
On Tue, Oct 02, 2012 at 12:02:22PM -0700, Wade Cline wrote: Hello Theodore Ts'o, Is there a function similar to ext2fs_dir_iterate2() that will call a hook function on an ext2_dir_entry_2 structure and not an ext2_dir_entry structure? The reason I ask is because btrfs-convert currently

Re: [e2fsprogs] ext2_dir_entry To ext2_dir_entry_2 Casting

2012-10-03 Thread Theodore Ts'o
On Wed, Oct 03, 2012 at 10:39:55AM -0700, Wade Cline wrote: I would think that using (name_len & 0xFF) is a much simpler solution, and my suggestion is to not depend on the file type in the directory entry (since there might be some very old ext2 file systems that don't set the file type), and
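The masking suggested in this message can be sketched in C as follows. This is only an illustration under stated assumptions: `struct demo_dir_entry` and `demo_name_len()` are hypothetical names, not e2fsprogs definitions, and the struct only mirrors the idea that the legacy entry carries a 16-bit `name_len` whose low byte holds the name length while the high byte may hold a file type.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the fixed header of a legacy directory entry.
 * It is NOT the e2fsprogs definition; it only models a 16-bit name_len
 * field whose low byte is the name length and whose high byte may carry
 * a file type on newer file systems.
 */
struct demo_dir_entry {
	uint32_t inode;    /* inode number */
	uint16_t rec_len;  /* length of this directory record */
	uint16_t name_len; /* low byte: name length; high byte: file type, if set */
};

/* Return only the name length, ignoring whatever is in the high byte. */
static unsigned int demo_name_len(const struct demo_dir_entry *de)
{
	return de->name_len & 0xFF;
}
```

Because the mask discards the high byte, the helper returns the same length whether or not the file system ever stored a file type there, which is the property the message relies on.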

Re: [PATCH 2/9] uuid: use random32_get_bytes()

2012-10-29 Thread Theodore Ts'o
On Tue, Oct 30, 2012 at 09:49:58AM +0800, Huang Ying wrote: The uuid_le/be_gen() in lib/uuid.c has set UUID variants to be DCE, that is done in __uuid_gen_common() with b[8] = (b[8] & 0x3F) | 0x80. Oh, I see, I missed that. To deal with random number generation issue, how about use
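The variant-bit expression quoted in this message can be tried out in isolation. The sketch below is a user-space illustration, not the kernel's `__uuid_gen_common()`; the helper name `uuid_set_dce_variant()` is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel function): force the variant field
 * of a 16-byte UUID to the DCE variant by setting the top two bits of
 * byte 8 to binary 10 and leaving the low six bits untouched.
 */
static void uuid_set_dce_variant(uint8_t b[16])
{
	/* 0x3F keeps the low six bits; 0x80 sets bit 7 and leaves bit 6 clear. */
	b[8] = (uint8_t)((b[8] & 0x3F) | 0x80);
}
```

Whatever random value byte 8 started with, its top two bits end up as `10`, so the other 126 bits of the UUID can come from any random source without affecting the variant.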

Re: [PATCH 2/9] uuid: use random32_get_bytes()

2012-10-30 Thread Theodore Ts'o
On Wed, Oct 31, 2012 at 09:35:37AM +0800, Huang Ying wrote: The intention of lib/uuid.c is to unify various UUID related code, and put them in the same place. In addition to UUID generation, it provides some other utilities and may provide/collect more in the future. So do you think it is a good

Re: Work Queue for btrfs compression writes

2014-07-30 Thread Theodore Ts'o
On Wed, Jul 30, 2014 at 10:38:21AM +0100, Hugo Mills wrote: qemu/kvm is good for this, because it has a mode that bypasses the BIOS and bootloader emulation, and just directly runs a kernel from a file on the host machine. This is fast. You can pass large sparse files to the VM to act as

Re: Work Queue for btrfs compression writes

2014-07-31 Thread Theodore Ts'o
On Wed, Jul 30, 2014 at 10:36:57AM -0400, Peter Hurley wrote: Where is that git tree? I've been planning to set up a unit test and regression suite for tty/serial, and wouldn't mind cribbing the infrastructure from someone's existing work.

Re: [PATCH] Add support to check for FALLOC_FL_COLLAPSE_RANGE and FALLOC_FL_ZERO_RANGE crap modes

2014-08-01 Thread Theodore Ts'o
On Thu, Jul 31, 2014 at 08:09:10PM +0100, Hugo Mills wrote: On Thu, Jul 31, 2014 at 01:53:33PM -0400, Nicholas Krause wrote: This adds checks for the stated modes as if they are crap we will return error not supported. You've just enabled two options, but you haven't actually

Re: ext4 vs btrfs performance on SSD array

2014-09-02 Thread Theodore Ts'o
- the very small max readahead size For things like the readahead size, that's probably something that we should autotune, based on the time it takes to read N sectors. i.e., start N relatively small, such as 128k, and then bump it up based on how long it takes to do a sequential read of N

Re: ext4 vs btrfs performance on SSD array

2014-09-02 Thread Theodore Ts'o
On Tue, Sep 02, 2014 at 04:20:24PM +0200, Jan Kara wrote: On Tue 02-09-14 07:31:04, Ted Tso wrote: - the very small max readahead size For things like the readahead size, that's probably something that we should autotune, based on the time it takes to read N sectors. i.e., start N

[PATCH 1/4] fs: split update_time() into update_time() and write_time()

2014-11-21 Thread Theodore Ts'o
() check; otherwise we could drop the update_time() inode operation entirely. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: x...@oss.sgi.com Cc: linux-btrfs@vger.kernel.org --- fs/btrfs/inode.c | 10 ++ fs/inode.c | 29 ++--- fs/xfs/xfs_iops.c | 39

[PATCH 2/4] vfs: add support for a lazytime mount option

2014-11-21 Thread Theodore Ts'o
Track Interference (ATI) remediation latencies, which very negatively impact 99.9 percentile latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 38

[PATCH 3/4] vfs: don't let the dirty time inodes get more than a day stale

2014-11-21 Thread Theodore Ts'o
Guarantee that the on-disk timestamps will be no more than 24 hours stale. Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 1 + fs/inode.c | 7 ++- include/linux/fs.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/fs-writeback.c b/fs/fs

Re: [PATCH 3/4] vfs: don't let the dirty time inodes get more than a day stale

2014-11-21 Thread Theodore Ts'o
On Fri, Nov 21, 2014 at 02:19:07PM -0600, Andreas Dilger wrote: - if (inode->i_sb->s_flags & MS_LAZYTIME) { + if ((inode->i_sb->s_flags & MS_LAZYTIME) && + (!inode->i_ts_dirty_day || + inode->i_ts_dirty_day == days_since_boot)) { spin_lock(&inode->i_lock);

Re: [PATCH 1/4] fs: split update_time() into update_time() and write_time()

2014-11-21 Thread Theodore Ts'o
Out of curiosity, why does btrfs_update_time() need to call btrfs_root_readonly()? Why can't it just depend on the __mnt_want_write() call in touch_atime()? Surely if there are times when it's not OK to write into a btrfs file system and mnt_is_readonly() returns false, the VFS is going to get

[PATCH-v2 1/5] fs: split update_time() into update_time() and write_time()

2014-11-22 Thread Theodore Ts'o
() check; otherwise we could drop the update_time() inode operation entirely. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: x...@oss.sgi.com Cc: linux-btrfs@vger.kernel.org --- fs/btrfs/inode.c | 10 ++ fs/inode.c | 29 ++--- fs/xfs/xfs_iops.c | 39

[PATCH-v2 3/5] vfs: don't let the dirty time inodes get more than a day stale

2014-11-22 Thread Theodore Ts'o
Guarantee that the on-disk timestamps will be no more than 24 hours stale. Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 1 + fs/inode.c | 16 +++- include/linux/fs.h | 1 + 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/fs/fs

[PATCH-v2 4/5] vfs: add lazytime tracepoints for better debugging

2014-11-22 Thread Theodore Ts'o
Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 5 - fs/inode.c| 5 + include/trace/events/fs.h | 56 +++ 3 files changed, 65 insertions(+), 1 deletion(-) create mode 100644 include/trace/events/fs.h

[PATCH-v2 5/5] ext4: add support for a lazytime mount option

2014-11-22 Thread Theodore Ts'o
the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/ext4/inode.c | 48

[PATCH-v2 2/5] vfs: add support for a lazytime mount option

2014-11-22 Thread Theodore Ts'o
Track Interference (ATI) remediation latencies, which very negatively impact 99.9 percentile latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 38

[PATCH-v2 0/5] add support for a lazytime mount option

2014-11-22 Thread Theodore Ts'o
test and characterize how often and under what circumstances inodes have their timestamps lazily updated Theodore Ts'o (5): fs: split update_time() into update_time() and write_time() vfs: add support for a lazytime mount option vfs: don't let the dirty time inodes get more than a day

Re: [PATCH-v2 0/5] add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
On Mon, Nov 24, 2014 at 01:07:55AM -0800, Christoph Hellwig wrote: What's the test coverage for this? xfstest generic/192 tests that atime is persisted over remounts, which we had a bug with when XFS used to have a lazy atime implementation somewhat similar to the proposal. We should have

Re: [PATCH 1/4] fs: split update_time() into update_time() and write_time()

2014-11-24 Thread Theodore Ts'o
On Mon, Nov 24, 2014 at 07:21:01AM -0800, Christoph Hellwig wrote: On Fri, Nov 21, 2014 at 02:59:21PM -0500, Theodore Ts'o wrote: We needed to preserve update_time() because btrfs wants to have a special btrfs_root_readonly() check; otherwise we could drop the update_time() inode operation

Re: [PATCH-v2 3/5] vfs: don't let the dirty time inodes get more than a day stale

2014-11-24 Thread Theodore Ts'o
On Mon, Nov 24, 2014 at 01:27:21PM +0100, Rasmus Villemoes wrote: On Sat, Nov 22 2014, Theodore Ts'o ty...@mit.edu wrote: Guarantee that the on-disk timestamps will be no more than 24 hours stale. + unsigned short days_since_boot = jiffies / (HZ * 86400); This seems to wrap every

Re: [PATCH 1/4] fs: split update_time() into update_time() and write_time()

2014-11-24 Thread Theodore Ts'o
On Mon, Nov 24, 2014 at 05:38:30PM +0100, David Sterba wrote: It is necessary and the whole .update_time callback was added intentionally, see commits c3b2da314834499f34cba94f7053e55f6d6f92d8 fs: introduce inode operation ->update_time e41f941a23115e84a8550b3d901a13a14b2edc2f Btrfs:

Re: [PATCH-v2 0/5] add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
On Mon, Nov 24, 2014 at 05:11:45PM -0500, J. Bruce Fields wrote: On Mon, Nov 24, 2014 at 06:57:27AM -0500, Theodore Ts'o wrote: If we want to be paranoid, we handle i_version updates non-lazily; I can see arguments in favor of that. Ext4 only enables MS_I_VERSION if the user asks

Re: [PATCH 2/4] vfs: add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
On Tue, Nov 25, 2014 at 12:52:39PM +1100, Dave Chinner wrote: +static void flush_sb_dirty_time(struct super_block *sb) +{ ... +} This just seems wrong to me, not to mention extremely expensive when we have millions of cached inodes on the superblock. #1, It only gets called on a

Re: [PATCH 3/4] vfs: don't let the dirty time inodes get more than a day stale

2014-11-24 Thread Theodore Ts'o
On Tue, Nov 25, 2014 at 12:53:32PM +1100, Dave Chinner wrote: On Fri, Nov 21, 2014 at 02:59:23PM -0500, Theodore Ts'o wrote: Guarantee that the on-disk timestamps will be no more than 24 hours stale. Signed-off-by: Theodore Ts'o ty...@mit.edu If we put these inodes on the dirty inode

[PATCH-v3 1/6] fs: split update_time() into update_time() and write_time()

2014-11-24 Thread Theodore Ts'o
() check; otherwise we could drop the update_time() inode operation entirely. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: x...@oss.sgi.com Cc: linux-btrfs@vger.kernel.org --- Documentation/filesystems/Locking | 2 ++ fs/btrfs/inode.c | 10 ++ fs/inode.c

[PATCH-v3 3/6] vfs: don't let the dirty time inodes get more than a day stale

2014-11-24 Thread Theodore Ts'o
Guarantee that the on-disk timestamps will be no more than 24 hours stale. Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 1 + fs/inode.c | 28 +++- include/linux/fs.h | 1 + 3 files changed, 25 insertions(+), 5 deletions(-) diff --git a/fs

[PATCH-v3 0/6] add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
- Fix type used for days_since_boot - Improve SMP scalability in update_time and ext4_update_other_inodes_time - Added tracepoints to help test and characterize how often and under what circumstances inodes have their timestamps lazily updated Theodore Ts'o (6): fs: split update_time

[PATCH-v3 5/6] ext4: add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/ext4/inode.c | 48

[PATCH-v3 2/6] vfs: add support for a lazytime mount option

2014-11-24 Thread Theodore Ts'o
Track Interference (ATI) remediation latencies, which very negatively impact 99.9 percentile latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 38

[PATCH-v3 6/6] btrfs: add an is_readonly() so btrfs can use common code for update_time()

2014-11-24 Thread Theodore Ts'o
places where the VFS layer may want to know that btrfs would want to treat an inode as read-only. With this commit, there are no remaining users of update_time() in the inode operations structure, so we can remove it and simplify things further. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: linux

[PATCH-v3 4/6] vfs: add lazytime tracepoints for better debugging

2014-11-24 Thread Theodore Ts'o
Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 5 - fs/inode.c| 5 + 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index eb04277..cab2d6d 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -27,6 +27,7

Re: [PATCH-v3 3/6] vfs: don't let the dirty time inodes get more than a day stale

2014-11-25 Thread Theodore Ts'o
On Tue, Nov 25, 2014 at 03:58:01PM +0100, Rasmus Villemoes wrote: I think days_since_boot was a lot clearer than daycode. In any case, please make the comment and the code consistent. Yeah, I was going back and forth between days since the epoch and days since boot, but found it was more

Re: [PATCH 2/4] vfs: add support for a lazytime mount option

2014-11-25 Thread Theodore Ts'o
On Tue, Nov 25, 2014 at 06:19:27PM +0100, Jan Kara wrote: Actually, I'd also prefer to do the writing from iput_final(). My main reason is that shrinker starts behaving very differently when you put inodes with I_DIRTY_TIME to the LRU. See inode_lru_isolate() and in particular: /*

Re: [PATCH 2/4] vfs: add support for a lazytime mount option

2014-11-25 Thread Theodore Ts'o
On Tue, Nov 25, 2014 at 06:30:40PM +0100, Jan Kara wrote: This would be possible and as Boaz says, it might be possible to reuse the same list_head in the inode for this. Getting rid of the full scan of all superblock inodes would be nice (as the scan gets really expensive for large numbers

Re: [PATCH 3/4] vfs: don't let the dirty time inodes get more than a day stale

2014-11-26 Thread Theodore Ts'o
On Wed, Nov 26, 2014 at 10:48:51AM +1100, Dave Chinner wrote: No abuse necessary at all. Just a different inode_dirtied_after() check is required if the inode is on the time dirty list in move_expired_inodes(). I'm still not sure what you have in mind here. When would this be checked? It

[PATCH-v4 4/7] vfs: add lazytime tracepoints for better debugging

2014-11-26 Thread Theodore Ts'o
Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 1 + fs/inode.c| 5 + 2 files changed, 6 insertions(+) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 529480a..3d87174 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -27,6 +27,7 @@ #include linux

[PATCH-v4 3/7] vfs: don't let the dirty time inodes get more than a day stale

2014-11-26 Thread Theodore Ts'o
Guarantee that the on-disk timestamps will be no more than 24 hours stale. Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 1 + fs/inode.c | 18 ++ include/linux/fs.h | 1 + 3 files changed, 20 insertions(+) diff --git a/fs/fs-writeback.c b/fs/fs

[PATCH-v4 5/7] vfs: add find_active_inode_nowait() function

2014-11-26 Thread Theodore Ts'o
not mean that inode number is free for use. It is useful for callers that want to opportunistically do some work on an inode only if it is present and available in the cache, and where blocking is not an option. Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/inode.c | 36

[PATCH-v4 7/7] btrfs: add an is_readonly() so btrfs can use common code for update_time()

2014-11-26 Thread Theodore Ts'o
places where the VFS layer may want to know that btrfs would want to treat an inode as read-only. With this commit, there are no remaining users of update_time() in the inode operations structure, so we can remove it and simplify things further. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: linux

[PATCH-v4 0/7] add support for a lazytime mount option

2014-11-26 Thread Theodore Ts'o
for days_since_boot - Improve SMP scalability in update_time and ext4_update_other_inodes_time - Added tracepoints to help test and characterize how often and under what circumstances inodes have their timestamps lazily updated Theodore Ts'o (7): vfs: split update_time() into update_time

[PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-26 Thread Theodore Ts'o
() check; otherwise we could drop the update_time() inode operation entirely. Signed-off-by: Theodore Ts'o ty...@mit.edu Cc: x...@oss.sgi.com Cc: linux-btrfs@vger.kernel.org Acked-by: David Sterba dste...@suse.cz --- Documentation/filesystems/Locking | 2 ++ fs/btrfs/inode.c | 10

[PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-26 Thread Theodore Ts'o
the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/ext4/inode.c | 49

[PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-26 Thread Theodore Ts'o
Track Interference (ATI) remediation latencies, which very negatively impact 99.9 percentile latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o ty...@mit.edu --- fs/fs-writeback.c | 55

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
On Wed, Nov 26, 2014 at 11:23:28AM -0800, Christoph Hellwig wrote: As mentioned last round please move the addition of the is_readonly operation to the first thing in the series, so that the ordering makes more sense. OK, will fix. Second I think this patch is incorrect for XFS - XFS uses

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
This is what I'm currently playing with which I believe fixes the iput() problem. In fs/ext4/inode.c: struct other_inode { unsigned long orig_ino; struct ext4_inode *raw_inode; }; static int other_inode_match(struct inode * inode, unsigned long ino,

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
Christoph, can you take a quick look at this? I'm not sure I got the xfs inode transaction logging correct. Thanks!! - Ted commit cd58addfa340c9cf88b1f9b2d31a42e2e65c7252 Author: Theodore Ts'o ty...@mit.edu Date: Thu Nov 27 10:14:27 2014 -0500

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 04:41:59PM +0100, Jan Kara wrote: Hum, but this puts lots of stuff under inode_hash_lock, including writeback list lock. I don't like this too much. I understand that getting handle for each inode is rather more CPU intensive but it should still be a clear win over

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 02:14:21PM +0100, Jan Kara wrote: Looking into the code your patch I'd prefer to do something like: * add support for I_DIRTY_TIME in __mark_inode_dirty() - update_time will call __mark_inode_dirty() with this flag if any of the times was updated. That way we can

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 08:49:52AM -0800, Christoph Hellwig wrote: I don't think this scheme works well. As mentioned earlier XFS doesn't even use vfs dirty tracking at the moment, so introducing this in a hidden way sounds like a bad idea. Probably the same for btrfs. I'd rather keep

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 02:14:21PM +0100, Jan Kara wrote: * change queue_io() to also call moved += move_expired_inodes(&wb->b_dirty_time, &wb->b_io, time + 24hours) For this you need to tweak move_expired_inodes() to take pointer to timestamp instead of pointer to work but that's

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 06:00:16PM -0500, Theodore Ts'o wrote: Well it's not quite enough. The problem is that for ext3 and ext4, the actual work of writing the inode happens in dirty_inode(), not in write_inode(). Which means we need to do something like this. I'm not entirely sure

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-12-01 Thread Theodore Ts'o
On Mon, Dec 01, 2014 at 01:28:10AM -0800, Christoph Hellwig wrote: The ->is_readonly method seems like a clear winner to me, I'm all for adding it, and thus suggested moving it first in the series. It's a real winner for me as well, but the reason why I dropped it is because if btrfs() has to

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-12-02 Thread Theodore Ts'o
On Tue, Dec 02, 2014 at 01:20:33AM -0800, Christoph Hellwig wrote: Why do you need the additional I_DIRTY flag? A lesser __mark_inode_dirty should never override a stronger one. Agreed, will fix. Otherwise this looks fine to me, except that I would split the default implementation into a

Re: [PATCH-v5 1/5] vfs: add support for a lazytime mount option

2014-12-02 Thread Theodore Ts'o
On Tue, Dec 02, 2014 at 07:55:48PM +0200, Boaz Harrosh wrote: This I do not understand. I thought that I_DIRTY_TIME, and the whole lazytime mount option, is only for atime. So if there are dirty pages then there are also m/ctime that changed and surely we want to write these times to disk ASAP.

Re: [PATCH-v5 1/5] vfs: add support for a lazytime mount option

2014-12-02 Thread Theodore Ts'o
On Tue, Dec 02, 2014 at 01:37:27PM -0700, Andreas Dilger wrote: One thing that comes to mind is touch/utimes()/utimensat(). Those should definitely not result in timestamps being kept only in memory for 24h, since the whole point of those calls is to update the times. It makes sense for

Re: Announce re-factor all current xfstests patches request

2013-03-27 Thread Theodore Ts'o
On Wed, Mar 27, 2013 at 08:23:07AM -0500, Rich Johnston wrote: All xfstest developers, Thanks again for all your time in submitting and reviewing patches for xfstests. The latest patchset posted here: http://oss.sgi.com/archives/xfs/2013-03/msg00467.html requires all current patches to

Re: Announce re-factor all current xfstests patches request

2013-03-27 Thread Theodore Ts'o
What do you think about renaming the existing tests from NNN to NNN-descriptive-name? That way it will be easier for people who are trying to track regressions, since they can easily map from the new more descriptive name to the old test number for comparison purposes (i.e., to see whether a

Re: Announce re-factor all current xfstests patches request

2013-03-27 Thread Theodore Ts'o
On Thu, Mar 28, 2013 at 07:54:07AM +1100, Dave Chinner wrote: Support for named tests have not yet been added. From the check script: SUPPORTED_TESTS=[0-9][0-9][0-9] [0-9][0-9][0-9][0-9] Ah, I thought support for named tests was there. For right now, though, if we have test ext4/123 and

Re: Snapshot cannot be deleted

2015-01-20 Thread Theodore Ts'o
+linux-btrfs,-linux-ext4 On Tue, Jan 20, 2015 at 01:28:04PM +0100, Andreas Philipp wrote: Due to the known (and fixed) bug in kernel 3.17.0 one of my btrfs volume suffers from unreadable and - even worse - uneraseable snapshots. Whenever such a snapshot is accessed there is parent transid

Re: Documenting MS_LAZYTIME

2015-02-27 Thread Theodore Ts'o
With Omar's suggestions, this looks great. Thanks!! - Ted

Re: Documenting MS_LAZYTIME

2015-02-26 Thread Theodore Ts'o
On Thu, Feb 26, 2015 at 09:49:39AM +0100, Michael Kerrisk (man-pages) wrote: How about something like This mount significantly reduces writes needed to update the inode's timestamps, especially mtime and actime. What is actime in the preceding line? Should it be ctime? Sorry, no, it should

Re: Documenting MS_LAZYTIME

2015-02-20 Thread Theodore Ts'o
On Fri, Feb 20, 2015 at 09:49:34AM -0600, Eric Sandeen wrote: This mount option significantly reduces writes to the inode table for workloads that perform frequent random writes to preallocated files. This seems like an overly specific

Re: Documenting MS_LAZYTIME

2015-02-26 Thread Theodore Ts'o
On Thu, Feb 26, 2015 at 02:36:33PM +0100, Michael Kerrisk (man-pages) wrote: The disadvantage of MS_STRICTATIME | MS_LAZYTIME is that in the case of a system crash, the atime and mtime fields on disk might be out of date by at most 24 hours. I'd change to The disadvantage of

Re: i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION)

2015-06-23 Thread Theodore Ts'o
On Thu, Jun 18, 2015 at 04:38:56PM +0200, David Sterba wrote: Moving the discussion to fsdevel. Summary: disabling MS_I_VERSION brings some speedups to btrfs, but the generic 'noiversion' option cannot be used to achieve that. It is processed before it reaches btrfs superblock callback,

Re: i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION)

2015-06-25 Thread Theodore Ts'o
On Thu, Jun 25, 2015 at 02:46:44PM -0400, J. Bruce Fields wrote: Does this sound reasonable? Just to make sure I understand, the logic is something like: to read the i_version: inode->i_version_seen = true; return inode->i_version to update the

Re: i_version vs iversion (Was: Re: [RFC PATCH v2 1/2] Btrfs: add noi_version option to disable MS_I_VERSION)

2015-06-24 Thread Theodore Ts'o
On Wed, Jun 24, 2015 at 08:02:15PM +0200, David Sterba wrote: This sounds similar to what Dave proposed, a per-inode I_VERSION attribute that can be changed through chattr. Though the negated meaning of the flag could be confusing, I had to reread the paragraph again. Dave did not specify an

Re: [RFC 4/8] jbd, jbd2: Do not fail journal because of frozen_buffer allocation failure

2015-08-15 Thread Theodore Ts'o
On Wed, Aug 12, 2015 at 11:14:11AM +0200, Michal Hocko wrote: Is this if (!committed_data) { check now dead code? I also see other similar suspected dead sites in the rest of the series. You are absolutely right. I have updated the patches. Have you sent out an updated version of these

Re: [PATCH] generic/224: Increase filesystem instance size to 1.5 GiB

2015-08-31 Thread Theodore Ts'o
On Tue, Sep 01, 2015 at 05:49:14AM +0530, Chandan Rajendra wrote: > mkfs.btrfs when invoked on small filesystems by "not" specifying any block > sizes (i.e. mkfs.btrfs -f /dev/sda1) will automatically create filesystem > instance with "data block size" == "metadata block size". However in the >

Re: [PATCH] generic/224: Increase filesystem instance size to 1.5 GiB

2015-08-31 Thread Theodore Ts'o
On Sun, Aug 30, 2015 at 08:16:21PM +0530, Chandan Rajendra wrote: > For small filesystem instances (i.e. size <= 1 GiB), mkfs.btrfs fails when > "data block size" does not match with the "metadata block size" specified on > the mkfs.btrfs command line. This commit increases the size of filesystem

Re: [PATCH] generic/224: Increase filesystem instance size to 1.5 GiB

2015-08-31 Thread Theodore Ts'o
On Mon, Aug 31, 2015 at 03:19:22PM -0400, Austin S Hemmelgarn wrote: > AFAIK, it shouldn't be failing that way, and should automatically switch to > mixed mode allocation. A 1G filesystem should work fine for BTRFS, but > smaller ones will have higher chances of ENOSPC issues (inversely >

Re: [PATCH 12/32] dio: unwritten conversion bug tests

2016-02-12 Thread Theodore Ts'o
On Fri, Feb 12, 2016 at 02:52:53PM +1100, Dave Chinner wrote: > On Thu, Feb 11, 2016 at 03:40:37PM -0800, Darrick J. Wong wrote: > > Check that we don't expose old disk contents when a directio write to > > an unwritten extent fails due to IO errors. This primarily affects > > XFS and ext4. > >

Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

2017-01-26 Thread Theodore Ts'o
On Thu, Jan 26, 2017 at 08:44:55AM +0100, Michal Hocko wrote: > > > I'm convinced the current series is OK, only real life will tell us > > > whether > > > we missed something or not ;) > > > > I would like to extend the changelog of "jbd2: mark the transaction > > context with the scope

Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

2017-01-27 Thread Theodore Ts'o
On Fri, Jan 27, 2017 at 10:37:35AM +0100, Michal Hocko wrote: > If this ever turn out to be a problem and with the vmapped stacks we > have good chances to get a proper stack traces on a potential overflow > we can add the scope API around the problematic code path with the > explanation why it is

Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

2017-01-17 Thread Theodore Ts'o
On Tue, Jan 17, 2017 at 04:18:17PM +0100, Michal Hocko wrote: > > OK, so I've been staring into the code and AFAIU current->journal_info > can contain my stored information. I could either hijack part of the > word as the ref counting is only consuming low 12b. But that looks too > ugly to live.

Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

2017-01-16 Thread Theodore Ts'o
On Fri, Jan 06, 2017 at 03:11:07PM +0100, Michal Hocko wrote: > From: Michal Hocko > > This reverts commit 216553c4b7f3e3e2beb4981cddca9b2027523928. Now that > the transaction context uses memalloc_nofs_save and all allocations > within the this context inherit GFP_NOFS

Re: [PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp"

2017-01-16 Thread Theodore Ts'o
On Fri, Jan 06, 2017 at 03:11:06PM +0100, Michal Hocko wrote: > From: Michal Hocko > > This reverts commit c45653c341f5c8a0ce19c8f0ad4678640849cb86 because > sb_getblk_gfp is not really needed as > sb_getblk > __getblk_gfp > __getblk_slow > grow_buffers >

Re: Is is possible to submit binary image as fstest test case?

2016-10-06 Thread Theodore Ts'o
On Thu, Oct 06, 2016 at 08:29:28AM -0400, Brian Foster wrote: > Doesn't necessarily bother me one way or the other, but something we've > done with XFS in such situations is introduce a DEBUG mode only sysfs > tunable that delays certain infrastructure (log recovery in our case) to > coordinate

Re: Experimental btrfs encryption

2016-09-19 Thread Theodore Ts'o
(I'm not on linux-btrfs@, so please keep me on the cc list. Or perhaps better yet, maybe we can move discussion to the linux-fsdevel@ list.) Hi Anand, After reading this thread on the web archives, and seeing that some folks seem to be a bit confused about "vfs level crypto", fs/crypto, and

Re: ChaCha20 vs. AES performance

2016-09-20 Thread Theodore Ts'o
On Tue, Sep 20, 2016 at 03:15:19AM -0800, Kent Overstreet wrote: > Not on the list or I would've replied directly, but on Haswell, ChaCha20 (in software) is over 2x as fast as AES (in hardware), at realistic (for a filesystem) block sizes: On Skylake and Broadwell processors, AES is faster

Re: [PATCH] fstests: generic: Check if cycle mount and sleep can affect fiemap result

2017-04-06 Thread Theodore Ts'o
On Wed, Apr 05, 2017 at 10:35:26AM +0800, Eryu Guan wrote: > > Test fails with ext3/2 when driving with ext4 driver, fiemap changed after umount/mount cycle, then changed back to original result after sleeping some time. An ext4 bug? (cc'ed linux-ext4 list.) I haven't had time to look at

Re: [PATCH RFC] vfs: add mount umount logs

2017-05-19 Thread Theodore Ts'o
On Fri, May 19, 2017 at 08:17:55AM +0800, Anand Jain wrote: > > XFS already logs its own unmounts. > > Nice. As far as I know it's only in XFS. Ext4 logs mounts, but not unmounts. > > I prefer to let each filesystem log its own unmount, because then the mount/unmount messages also have the

Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-09 Thread Theodore Ts'o
On Mon, Oct 09, 2017 at 08:54:16AM -0400, Josef Bacik wrote: > I purposefully used as little as possible, just json and sqlite, and I tried to use as little python3 isms as possible. Any rpm based systems should have these libraries already installed, I agree that using any of the PyPI

Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-09 Thread Theodore Ts'o
On Mon, Oct 09, 2017 at 02:54:34PM +0800, Eryu Guan wrote: > I have no problem either if python is really needed, after all this is a very useful infrastructure improvement. But the python version problem brought up by Ted made me a bit nervous, we need to work that round carefully.

Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-08 Thread Theodore Ts'o
On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote: > > Probably should have led with that shouldn't I have? There's nothing keeping me from doing it, but I didn't want to try and shoehorn in a python thing into fstests. I need python to do the sqlite and the json parsing to

Re: [dm-devel] Ideas to reuse filesystem's checksum to enhance dm-raid1/10/5/6?

2017-11-20 Thread Theodore Ts'o
On Thu, Nov 16, 2017 at 03:32:05PM -0700, Chris Murphy wrote: > > XFS by default does metadata csums. But ext4 doesn't use it for either metadata or the journal by default still, it is still optional. So for now it mainly benefits XFS. Metadata checksums are enabled by default in the version

Re: Lockdep is less useful than it was

2017-12-08 Thread Theodore Ts'o
On Thu, Dec 07, 2017 at 02:38:03PM -0800, Matthew Wilcox wrote: > I think it was a mistake to force these on for everybody; they have a much higher false-positive rate than the rest of lockdep, so as you say forcing them on leads to fewer people using *any* of lockdep. > > The bug you're

Re: [PATCH v4 72/73] xfs: Convert mru cache to XArray

2017-12-07 Thread Theodore Ts'o
On Wed, Dec 06, 2017 at 06:06:48AM -0800, Matthew Wilcox wrote: > > Unfortunately for you, I don't find arguments along the lines of "lockdep will save us" at all convincing. lockdep already throws too many false positives to be useful as a tool that reliably and accurately points out

Re: [GIT PULL] inode->i_version rework for v4.16

2018-01-30 Thread Theodore Ts'o
On Tue, Jan 30, 2018 at 07:05:48AM -0500, Jeff Layton wrote: > > I want to make sure I understand what's actually broken here though. Is it only broken when the two values are more than 2**63 apart, or is there something else more fundamentally wrong here? The other problem is that returning