[LSF/MM TOPIC]: Btrfs: Decoupling block-size and page-size in BTRFS.

2014-11-27 Thread Chandan Rajendra
In BTRFS, BLOCK_SIZE, the basic IO size of the filesystem, is equal to the PAGE_SIZE of the architecture. Some 64-bit architectures, like PPC64 and ARM64, can and do have a default PAGE_SIZE of 64K, which means the filesystems handled on these architectures have a BLOCK_SIZE of 64K. This works fine as

Re: BTRFS messes up snapshot LV with origin

2014-11-27 Thread Duncan
Robert White posted on Wed, 26 Nov 2014 14:08:14 -0800 as excerpted: > On 11/25/2014 07:22 PM, Duncan wrote: >> From my perspective, however, btrfs is simply incompatible with lvm >> snapshots, because the basic assumptions are incompatible. Btrfs >> assumes UUIDs will be exactly what they say on

Re: Can't cp --reflink files on a Ext4-converted FS w/o checksums

2014-11-27 Thread Duncan
Robert White posted on Wed, 26 Nov 2014 15:18:26 -0800 as excerpted: > I also don't see anything in the code that says "this ioctl will create > the checksums for the selected file" so you may have to do the copy you > tried to avoid. Note that btrfs check has an --init-csum-tree switch. In a n

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Jan Kara
On Wed 26-11-14 11:23:28, Christoph Hellwig wrote: > As mentioned last round please move the addition of the is_readonly > operation to the first thing in the series, so that the ordering makes > more sense. > > Second I think this patch is incorrect for XFS - XFS uses ->update_time > to set the t

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Jan Kara
On Wed 26-11-14 05:23:52, Ted Tso wrote: > Add a new mount option which enables a new "lazytime" mode. This mode > causes atime, mtime, and ctime updates to only be made to the > in-memory version of the inode. The on-disk times will only get > updated when (a) if the inode needs to be updated fo

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Jan Kara
On Thu 27-11-14 10:35:37, Dave Chinner wrote: > On Wed, Nov 26, 2014 at 04:10:44PM -0700, Andreas Dilger wrote: > > On Nov 26, 2014, at 3:48 PM, Dave Chinner wrote: > > > > > > On Wed, Nov 26, 2014 at 05:23:56AM -0500, Theodore Ts'o wrote: > > >> Add an optimization for the MS_LAZYTIME mount opti

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Jan Kara
On Thu 27-11-14 14:27:52, Jan Kara wrote: > On Thu 27-11-14 10:35:37, Dave Chinner wrote: > > On Wed, Nov 26, 2014 at 04:10:44PM -0700, Andreas Dilger wrote: > > > On Nov 26, 2014, at 3:48 PM, Dave Chinner wrote: > > > > > > > > On Wed, Nov 26, 2014 at 05:23:56AM -0500, Theodore Ts'o wrote: > > >

Balance and RAID-1

2014-11-27 Thread Russell Coker
I had a RAID-1 filesystem with 2*3TB disks and 330G of disk space free according to df -h. I replaced a 3TB disk with a 4TB disk and df reported no change in the free space (as expected). I added a 1TB disk to the filesystem and there was still no change! I expected that adding a 1TB disk wou
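
The capacity arithmetic behind this question can be sketched as follows. This is a rough estimate only, assuming RAID-1 stores exactly two copies of every chunk on two different devices and ignoring metadata overhead and partially filled chunks. The helper name `raid1_usable` is hypothetical and not part of any btrfs tool.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate the usable capacity of a two-copy RAID-1 layout.
 * ASSUMPTION: each chunk has exactly two copies on two different
 * devices, so usable space is limited both by half of the total and
 * by the space available outside the largest device.
 * Sizes may be in any unit (GB here), as long as it is consistent.
 */
unsigned long long raid1_usable(const unsigned long long *sizes, size_t n)
{
    unsigned long long total = 0, largest = 0, half, rest;
    size_t i;

    assert(sizes != NULL || n == 0);
    if (n < 2)
        return 0; /* two copies need at least two devices */

    for (i = 0; i < n; i++) {
        total += sizes[i];
        if (sizes[i] > largest)
            largest = sizes[i];
    }

    half = total / 2;       /* every byte is stored twice */
    rest = total - largest; /* mirror partners for the largest device */
    return half < rest ? half : rest;
}
```

Under these assumptions, devices of 3000 and 4000 GB give 3000 GB usable, and adding a 1000 GB device raises that to 4000 GB. If the replaced device were still limited to 3000 GB, the same three devices would give 3500 GB.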

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
On Wed, Nov 26, 2014 at 11:23:28AM -0800, Christoph Hellwig wrote: > As mentioned last round please move the addition of the is_readonly > operation to the first thing in the series, so that the ordering makes > more sense. OK, will fix. > Second I think this patch is incorrect for XFS - XFS uses

Re: Balance and RAID-1

2014-11-27 Thread Zygo Blaxell
On Fri, Nov 28, 2014 at 01:37:50AM +1100, Russell Coker wrote: > I had a RAID-1 filesystem with 2*3TB disks and 330G of disk space free > according to df -h. I replaced a 3TB disk with a 4TB disk and df reported no > change in the free space (as expected). Did you btrfs resize that 4TB disk? I

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Christoph Hellwig
On Thu, Nov 27, 2014 at 01:34:29PM +0100, Jan Kara wrote: > But Ted changed XFS to copy timestamps to on-disk structure from the > in-memory inode fields after VFS updated the timestamps. So the stamps > should be coherent AFAICT, shouldn't they? Not coherent enough. We need the XFS ilock to sy

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
This is what I'm currently playing with which I believe fixes the iput() problem. In fs/ext4/inode.c: struct other_inode { unsigned long orig_ino; struct ext4_inode *raw_inode; }; static int other_inode_match(struct inode * inode, unsigned long ino,

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Christoph Hellwig
FYI, I suspect for now the best might be to let filesystems that define ->update_times work as-is and not tie them into the infrastructure. At least for XFS I suspect the lazy updates might better be handled internally, although I'm not entirely sure yet.

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
Christoph, can you take a quick look at this? I'm not sure I got the xfs inode transaction logging correct. Thanks!! - Ted commit cd58addfa340c9cf88b1f9b2d31a42e2e65c7252 Author: Theodore Ts'o Date: Thu Nov 27 10:14:27 2014 -0500 vfs: spli

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Jan Kara
On Thu 27-11-14 10:25:24, Ted Tso wrote: > This is what I'm currently playing with which I believe fixes the iput() > problem. In fs/ext4/inode.c: > > struct other_inode { > unsigned long orig_ino; > struct ext4_inode *raw_inode; > }; > static int other_inode_match(str

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Christoph Hellwig
I don't think this scheme works well. As mentioned earlier XFS doesn't even use vfs dirty tracking at the moment, so introducing this in a hidden way sounds like a bad idea. Probably the same for btrfs. I'd rather keep update_time as-is for now, don't add ->write_time and let btrfs and XFS figur

Re: File test operator for subvols; possible bug in 'btrfs show '

2014-11-27 Thread David Sterba
On Tue, Nov 25, 2014 at 08:32:36AM +0100, Goffredo Baroncelli wrote: > On 11/25/2014 03:11 AM, boris wrote: > > Hi all, > > > > I was looking for a quick method of testing whether a working directory is > > a > > subvolume. > > Currently btrfs check that: > - the inode number is 255 It's 256.

Re: Changing label few times killed filesystem?

2014-11-27 Thread Boris Chernov
Since nobody had any other suggestions, I decided to attempt to run a modified btrfsck with the --repair option (without the BUG_ON(rec->is_root) assertion). Surprisingly, the modified btrfsck --repair fixed all errors but one (according to btrfsck), but btrfsck asked me to run btrfsck --repair one

Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 04:41:59PM +0100, Jan Kara wrote: > Hum, but this puts lots of stuff under inode_hash_lock, including > writeback list lock. I don't like this too much. I understand that getting > handle for each inode is rather more CPU intensive but it should still be a > clear win over

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 02:14:21PM +0100, Jan Kara wrote: > Looking into the code & your patch I'd prefer to do something like: > * add support for I_DIRTY_TIME in __mark_inode_dirty() - update_time will > call __mark_inode_dirty() with this flag if any of the times was updated. > That way we c

Re: [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time()

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 08:49:52AM -0800, Christoph Hellwig wrote: > I don't think this scheme works well. As mentioned earlier XFS doesn't > even use vfs dirty tracking at the moment, so introducing this in a > hidden way sounds like a bad idea. Probably the same for btrfs. > > I'd rather keep

[PATCH v4 4/6] Btrfs: fix race between fs trimming and block group remove/allocation

2014-11-27 Thread Filipe Manana
Our fs trim operation, which is completely transactionless (it doesn't start or join an existing transaction), consists of visiting all block groups and then, for each one, iterating over its free space entries and performing a discard operation against the space range represented by each free space entry. Ho

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 02:14:21PM +0100, Jan Kara wrote: > * change queue_io() to also call > moved += move_expired_inodes(&wb->b_dirty_time, &wb->b_io, time + > 24hours) > For this you need to tweak move_expired_inodes() to take pointer to > timestamp instead of pointer to work but tha

Re: Balance and RAID-1

2014-11-27 Thread Russell Coker
On Fri, 28 Nov 2014, Zygo Blaxell wrote: > On Fri, Nov 28, 2014 at 01:37:50AM +1100, Russell Coker wrote: > > I had a RAID-1 filesystem with 2*3TB disks and 330G of disk space free > > according to df -h. I replaced a 3TB disk with a 4TB disk and df > > reported no change in the free space (as

Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option

2014-11-27 Thread Theodore Ts'o
On Thu, Nov 27, 2014 at 06:00:16PM -0500, Theodore Ts'o wrote: > Well it's not quite enough. The problem is that for ext3 and > ext4, the actual work of writing the inode happens in dirty_inode(), > not in write_inode(). Which means we need to do something like this. > > I'm not entirely sur

Re: BTRFS messes up snapshot LV with origin

2014-11-27 Thread Chris Murphy
On Thu, Nov 27, 2014 at 2:08 AM, Duncan <1i5t5.dun...@cox.net> wrote: > So, umm... kinda late now, but read that "copy" as if it had a footnote > attached, saying "Yes, I know it's not actual copy, it's two views of the > same thing using COW, but my point is, from the btrfs perspective it's a > co

Re: Can't cp --reflink files on a Ext4-converted FS w/o checksums

2014-11-27 Thread Robert White
On 11/27/2014 01:27 AM, Duncan wrote: Robert White posted on Wed, 26 Nov 2014 15:18:26 -0800 as excerpted: I also don't see anything in the code that says "this ioctl will create the checksums for the selected file" so you may have to do the copy you tried to avoid. Note that btrfs check has a