Metadata size
I'm a little concerned about the size of my metadata. I'm doing raid10 on both data and metadata, and:

h...@vlad:mnt $ sudo btrfs fi df /mnt
Data: total=488.01GB, used=487.23GB
Metadata: total=3.01GB, used=677.73MB
System: total=11.88MB, used=52.00KB
h...@vlad:mnt $ find /mnt | wc -l
20137

By my calculations, that's something on the order of 17.5K per filesystem object. This is mostly media files, plus some small metadata files. 17.5K on average seems very large to me. I have quite a bit of space on this system, so I'm not too concerned, but I wasn't sure if this kind of figure was representative or not.

Overall file count by size:

0-10           21
10-100        153
100-1K        778
1K-10K        279
10K-100K       96
100K-1M       238
1M-10M      12556
10M-100M     3452
100M-1G       332
1G-10G        171

0-1K          952
1K-1M         613
1M-1G       16340
1G+           171

Interestingly, the metadata value was closer to 15K/object until my last batch of writing, which was the 171 1G+ files (and a few in the 100M-1G range), plus an equal number of small (2K) files.

Hugo.

--
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Trouble rather the tiger in his lair than the sage amongst ---
his books for to you kingdoms and their armies are mighty and enduring, but to him they are but toys of the moment to be overturned by the flicking of a finger.
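For reference, the per-object figure can be approximated with a quick calculation. The sketch below is only an illustration: the function name is made up, MB is assumed to mean 2^20 bytes, and the `copies` divisor assumes that the reported metadata usage might count both raid10 copies, which the message does not state.

```c
#include <stdint.h>

/* Hypothetical helper: average metadata bytes per filesystem object.
 * `copies` is the number of mirrored copies assumed to be included in
 * metadata_used_bytes (1 = no adjustment). Returns 0.0 on bad input. */
static double metadata_bytes_per_object(double metadata_used_bytes,
                                        uint64_t object_count,
                                        unsigned copies)
{
    if (object_count == 0 || copies == 0)
        return 0.0;
    return metadata_used_bytes / (double)copies / (double)object_count;
}
```

With the figures above (677.73MB used, 20137 objects), `metadata_bytes_per_object(677.73 * 1048576.0, 20137, 1)` comes out at roughly 35,000 bytes, and with `copies = 2` at roughly 17,600 bytes, which is in the neighbourhood of the quoted 17.5K. How the original figure was actually derived is not stated.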
Re: btrfs filesystem df not working
On Wed, Oct 13, 2010 at 4:08 PM, Chris Mason chris.ma...@oracle.com wrote:
> On Wed, Oct 13, 2010 at 10:52:57AM +0100, Leonidas Spyropoulos wrote:
>> On Wed, Oct 13, 2010 at 1:43 AM, Chris Mason chris.ma...@oracle.com wrote:
>>> On Tue, Oct 12, 2010 at 02:45:19PM +0100, Leonidas Spyropoulos wrote:
>>>> On Tue, Oct 12, 2010 at 2:43 PM, cwillu cwi...@cwillu.com wrote:
>>>>> On Tue, Oct 12, 2010 at 4:12 AM, Leonidas Spyropoulos artafi...@gmail.com wrote:
>>>>>> The above command is not working on my system.
>>>>>> Information:
>>>>>> btrfs f df /media/data
>>>>>
>>>>> btrfs f isn't unique; fi is the minimum to specify filesystem
>>>>
>>>> I tried even with btrfs filesystem df /media/data and same results.
>>>
>>> Does strace give us any clues?
>>
>> According to strace there is inappropriate ioctl for the device. Here is the log
>
> I missed this before: 2.6.32-5-amd64
>
> The df ioctl was added after 2.6.32 (2.6.33 I think).

So in debian squeeze/unstable, which is currently on 2.6.32 (and won't change any time soon), I cannot use btrfs. All I can do is try experimental kernels?

My question though is, if I use experimental kernels can I then load an old kernel and still use the btrfs filesystem? Or do the newer kernels write anything special to inodes which the old ones cannot read?

> -chris

--
Caution: breathing may be hazardous to your health.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
PATCH zeroing ioctl21 flags progs-side
This is once again the whole patch starting from 075587c96c2f39e227847d13ca0ef305b13cd7d3 (Chris Mason, April 06 2010).

The differences between this one and yesterday's are:
1: the file descriptor leak is corrected
2: the ioctl21 flags field is explicitly zeroed, for forwards compatibility.

The intended semantics of the flags field are as follows. Zeroes mean: wait for everything we know how to wait for using ioctl#21. A 1 will mean: ignore completion of that set of deferred tasks, when there are other deferred tasks ioctl#21 can be used to wait for. Also, a 1 in a position associated with a reprioritization directive would mean do something, and a zero would mean do nothing. So all-zeroes is supposed to mean, in a possible future where ioctl#21 does more: wait for completion of everything we know about, and don't do any other optional anything.

-- Forwarded message --
Date: Thu, Oct 14, 2010 at 9:32 AM
Subject: zeroing ioctl21 flags progs-side
To: davidni...@gmail.com

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e9bf864..a350b75 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -895,6 +895,7 @@ struct btrfs_fs_info {
     struct list_head trans_list;
     struct list_head hashers;
     struct list_head dead_roots;
+    wait_queue_head_t cleaner_notification_registration;
     struct list_head caching_block_groups;

     spinlock_t delayed_iput_lock;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 34f7c37..6a35257 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1451,6 +1451,7 @@ static int cleaner_kthread(void *arg)
             mutex_trylock(&root->fs_info->cleaner_mutex)) {
             btrfs_run_delayed_iputs(root);
             btrfs_clean_old_snapshots(root);
+            wake_up_all(&root->fs_info->cleaner_notification_registration);
             mutex_unlock(&root->fs_info->cleaner_mutex);
         }

@@ -1581,6 +1582,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
     INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC);
     INIT_LIST_HEAD(&fs_info->trans_list);
     INIT_LIST_HEAD(&fs_info->dead_roots);
+    init_waitqueue_head(&fs_info->cleaner_notification_registration);
     INIT_LIST_HEAD(&fs_info->delayed_iputs);
     INIT_LIST_HEAD(&fs_info->hashers);
     INIT_LIST_HEAD(&fs_info->delalloc_inodes);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9254b3d..ffc86a8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1212,6 +1212,65 @@ static noinline int btrfs_ioctl_ino_lookup(struct file *file,
     return ret;
 }

+static int btrfs_ioctl_cleaner_wait(struct btrfs_root *root, void __user *arg)
+{
+    struct btrfs_ioctl_cleaner_wait_args *bicwa;
+    long remainingjiffies;
+    int err;
+
+    bicwa = memdup_user(arg, sizeof(*bicwa));
+    if (IS_ERR(bicwa))
+        return PTR_ERR(bicwa);
+
+    /* the bicwa flags field is intended to hold bits
+       that will be set to 1 to disable a cleanliness
+       test. Currently there is only one test, but
+       when there are more (or other things, like
+       reprioritizing the cleaner thread because something
+       is waiting on it, although that happens already
+       because the waiting thing has yielded, so that
+       isn't really a hot to-do item) this function
+       will of course get modified to implement them. */
+
+    if (bicwa->flags > 0x01) /* the highest flag we know about */
+    {
+        err = -EINVAL;
+        goto done_with_bicwa;
+    }
+
+    if (bicwa->ms > 0)
+    {
+        remainingjiffies = wait_event_interruptible_timeout(
+            root->fs_info->cleaner_notification_registration,
+            /* && together multiple FLAG OR TEST sequences
+               when there are more than one */
+            ( bicwa->flags & 0x01 ? 1 :
+              list_empty(&root->fs_info->dead_roots)
+            ),
+            msecs_to_jiffies(bicwa->ms)
+        );
+        if (remainingjiffies > 0)
+            err = 0;
+        else if (remainingjiffies < 0 )
+            err = -EAGAIN;
+        else
+            err = -ETIME;
+    }
+    else
+    {
+        err = wait_event_interruptible(
+            root->fs_info->cleaner_notification_registration,
+            list_empty(&root->fs_info->dead_roots)
+        );
+    };
+
+    done_with_bicwa:
+    kfree(bicwa);
+    return err;
+
+}
+
+
 static noinline int btrfs_ioctl_snap_destroy(struct file *file,
                                              void __user *arg)
 {
@@ -2003,6 +2062,8 @@ long btrfs_ioctl(struct file *file, unsigned int
         return btrfs_ioctl_snap_create(file, argp, 1);
     case BTRFS_IOC_SNAP_DESTROY:
         return btrfs_ioctl_snap_destroy(file, argp);
+    case BTRFS_IOC_CLEANER_WAIT:
+        return
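The flag and timeout handling described in this message can be modelled in user space. The sketch below is not kernel code: the function and macro names are hypothetical, the wait itself is replaced by plain integer inputs, and the comparisons assume the patch checks `flags > 0x01`, `flags & 0x01`, and the sign of the remaining jiffies.

```c
#include <errno.h>

#define CLEANER_WAIT_SKIP_DEAD_ROOTS 0x01u  /* hypothetical name for bit 0 */

/* Reject any flag value above the highest bit this version knows about. */
static int cleaner_wait_check_flags(unsigned int flags)
{
    return flags > CLEANER_WAIT_SKIP_DEAD_ROOTS ? -EINVAL : 0;
}

/* Wait condition: a set bit disables its test, so all-zero flags
 * means "test everything" (here: the dead-roots list is empty). */
static int cleaner_wait_condition(unsigned int flags, int dead_roots_empty)
{
    return (flags & CLEANER_WAIT_SKIP_DEAD_ROOTS) ? 1 : dead_roots_empty;
}

/* Map the timed wait's return value to an error code the way the
 * patch appears to: positive = done, negative = interrupted, 0 = timeout. */
static int cleaner_wait_result(long remaining_jiffies)
{
    if (remaining_jiffies > 0)
        return 0;
    if (remaining_jiffies < 0)
        return -EAGAIN;
    return -ETIME;
}
```

Under this model a caller that zeroes the flags field waits on every test the kernel knows about, which is the forwards-compatible default the message argues for.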
PATCH added flags to cleaner-wait structure kernel-side
This is once again based on 2ebc3464781ad24474abcbd2274e6254689853b5 (Dan Rosenberg, July 19 2010).

The delta between this and the previous ioctl#21 kernel patch I posted is that this one defines the flags field in the arguments structure, has a comment about its intended semantics, and tests the low bit: if the flags field is set to 1 the ioctl returns immediately. If it is set to anything greater than 1 that is an EINVAL, because this version of the kernel doesn't know that flag, and it is better to safely full-stop than to ignore what might be an important flag.

Or is it better practice to ignore unexpected fields in such things? I think the proposed flag semantics as described in the introduction to the latest revision of the prog-side code might make it okay to ignore unexpected fields instead of refusing. The scenario where it matters is running a newer, future ioctl21 invoker that knows about some future flag against an old kernel (such as the current one, after applying this patch) that doesn't. Fail or ignore?

Or do I revise it again to have two flags, one to ignore the one defined completion test, and the other to specify ignore (0) or fail (1) semantics for unrecognized flag bits? Then I'd have to add a command line arg for that, possibly --no-forward-compat, which would set the fail-on-unrecognized-flag-bit flag. My crystal ball might need a little adjustment, I don't know.
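The two-flag variant floated above could look roughly like this. It is a sketch only: the bit assignments and names are invented for illustration, and nothing here is taken from the patch.

```c
#include <errno.h>

#define CW_SKIP_DEAD_ROOTS  0x01u  /* hypothetical: ignore the one defined test */
#define CW_FAIL_ON_UNKNOWN  0x02u  /* hypothetical: strict mode for unknown bits */
#define CW_KNOWN_FLAGS      (CW_SKIP_DEAD_ROOTS | CW_FAIL_ON_UNKNOWN)

/* Returns 0 to proceed, -EINVAL when strict mode meets an unknown bit. */
static int cw_validate_flags(unsigned int flags)
{
    unsigned int unknown = flags & ~CW_KNOWN_FLAGS;

    if (unknown && (flags & CW_FAIL_ON_UNKNOWN))
        return -EINVAL;
    return 0;  /* unknown bits are ignored unless the caller asked to fail */
}
```

In this arrangement a --no-forward-compat style option in the userspace tool would simply set `CW_FAIL_ON_UNKNOWN`, so the choice between fail and ignore moves from the kernel to the caller.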
-- Forwarded message --
Date: Thu, Oct 14, 2010 at 8:58 AM
Subject: added flags to cleaner-wait structure kernel-side
To: davidni...@gmail.com

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e9bf864..a350b75 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -895,6 +895,7 @@ struct btrfs_fs_info {
     struct list_head trans_list;
     struct list_head hashers;
     struct list_head dead_roots;
+    wait_queue_head_t cleaner_notification_registration;
     struct list_head caching_block_groups;

     spinlock_t delayed_iput_lock;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 34f7c37..6a35257 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1451,6 +1451,7 @@ static int cleaner_kthread(void *arg)
             mutex_trylock(&root->fs_info->cleaner_mutex)) {
             btrfs_run_delayed_iputs(root);
             btrfs_clean_old_snapshots(root);
+            wake_up_all(&root->fs_info->cleaner_notification_registration);
             mutex_unlock(&root->fs_info->cleaner_mutex);
         }

@@ -1581,6 +1582,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
     INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC);
     INIT_LIST_HEAD(&fs_info->trans_list);
     INIT_LIST_HEAD(&fs_info->dead_roots);
+    init_waitqueue_head(&fs_info->cleaner_notification_registration);
     INIT_LIST_HEAD(&fs_info->delayed_iputs);
     INIT_LIST_HEAD(&fs_info->hashers);
     INIT_LIST_HEAD(&fs_info->delalloc_inodes);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9254b3d..ffc86a8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1212,6 +1212,65 @@ static noinline int btrfs_ioctl_ino_lookup(struct file *file,
     return ret;
 }

+static int btrfs_ioctl_cleaner_wait(struct btrfs_root *root, void __user *arg)
+{
+    struct btrfs_ioctl_cleaner_wait_args *bicwa;
+    long remainingjiffies;
+    int err;
+
+    bicwa = memdup_user(arg, sizeof(*bicwa));
+    if (IS_ERR(bicwa))
+        return PTR_ERR(bicwa);
+
+    /* the bicwa flags field is intended to hold bits
+       that will be set to 1 to disable a cleanliness
+       test. Currently there is only one test, but
+       when there are more (or other things, like
+       reprioritizing the cleaner thread because something
+       is waiting on it, although that happens already
+       because the waiting thing has yielded, so that
+       isn't really a hot to-do item) this function
+       will of course get modified to implement them. */
+
+    if (bicwa->flags > 0x01) /* the highest flag we know about */
+    {
+        err = -EINVAL;
+        goto done_with_bicwa;
+    }
+
+    if (bicwa->ms > 0)
+    {
+        remainingjiffies = wait_event_interruptible_timeout(
+            root->fs_info->cleaner_notification_registration,
+            /* && together multiple FLAG OR TEST sequences
+               when there are more than one */
+            ( bicwa->flags & 0x01 ? 1 :
+              list_empty(&root->fs_info->dead_roots)
+            ),
+            msecs_to_jiffies(bicwa->ms)
+        );
+        if (remainingjiffies > 0)
+            err = 0;
+        else if (remainingjiffies < 0 )
+            err = -EAGAIN;
+        else
+            err = -ETIME;
+    }
+    else
+    {
+        err = wait_event_interruptible(
+            root->fs_info->cleaner_notification_registration,
+
Re: btrfs filesystem df not working
On Thu, Oct 14, 2010 at 12:45:59PM +0100, Leonidas Spyropoulos wrote:
> On Wed, Oct 13, 2010 at 4:08 PM, Chris Mason chris.ma...@oracle.com wrote:
>> On Wed, Oct 13, 2010 at 10:52:57AM +0100, Leonidas Spyropoulos wrote:
>>> On Wed, Oct 13, 2010 at 1:43 AM, Chris Mason chris.ma...@oracle.com wrote:
>>>> On Tue, Oct 12, 2010 at 02:45:19PM +0100, Leonidas Spyropoulos wrote:
>>>>> On Tue, Oct 12, 2010 at 2:43 PM, cwillu cwi...@cwillu.com wrote:
>>>>>> On Tue, Oct 12, 2010 at 4:12 AM, Leonidas Spyropoulos artafi...@gmail.com wrote:
>>>>>>> The above command is not working on my system.
>>>>>>> Information:
>>>>>>> btrfs f df /media/data
>>>>>>
>>>>>> btrfs f isn't unique; fi is the minimum to specify filesystem
>>>>>
>>>>> I tried even with btrfs filesystem df /media/data and same results.
>>>>
>>>> Does strace give us any clues?
>>>
>>> According to strace there is inappropriate ioctl for the device. Here is the log
>>
>> I missed this before: 2.6.32-5-amd64
>>
>> The df ioctl was added after 2.6.32 (2.6.33 I think).
>
> So in debian squeeze/unstable, which is currently on 2.6.32 (and won't change any time soon), I cannot use btrfs. All I can do is try experimental kernels?

Or backport the changes, yes.

> My question though is, if I use experimental kernels can I then load an old kernel and still use the btrfs filesystem? Or do the newer kernels write anything special to inodes which the old ones cannot read?

We haven't made any of those changes, you'll be fine going back and forth.

-chris
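The version comparison at the heart of this thread can be sketched as a small helper. This is an illustration, not part of btrfs-progs: the function name is made up, and the 2.6.33 threshold is only the version Chris recalls ("2.6.33 I think").

```c
#include <stdio.h>

/* Returns 1 if release (e.g. "2.6.32-5-amd64") is at least maj.min.patch,
 * 0 if it is older, and -1 if the string cannot be parsed. */
static int kernel_at_least(const char *release, int maj, int min, int patch)
{
    int a = 0, b = 0, c = 0;

    /* Accept "X.Y" as well as "X.Y.Z"; a missing Z counts as 0. */
    if (sscanf(release, "%d.%d.%d", &a, &b, &c) < 2)
        return -1;
    if (a != maj)
        return a > maj;
    if (b != min)
        return b > min;
    return c >= patch;
}
```

The release string would normally come from `uname(2)` (`utsname.release`). A version check cannot see backported changes, so attempting the ioctl and treating `ENOTTY` ("Inappropriate ioctl for device", as strace showed here) as "not supported" is the more robust test.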
[PATCH] Btrfs: fix df regression
The new ENOSPC stuff breaks out the raid types which breaks the way we were reporting df to the system. This fixes it back so that Available is the total space available to data and used is the actual bytes used by the filesystem. This means that Available is Total - data used - all of the metadata space. Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/ctree.h       |    5 ++++-
 fs/btrfs/extent-tree.c |    2 ++
 fs/btrfs/super.c       |   11 +++++++++--
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e25e96e..4833a01 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -696,7 +696,8 @@ struct btrfs_block_group_item {
 struct btrfs_space_info {
     u64 flags;
-    u64 total_bytes;    /* total bytes in the space */
+    u64 total_bytes;    /* total bytes in the space,
+                           this doesn't take mirrors into account */
     u64 bytes_used;     /* total bytes used,
                            this does't take mirrors into account */
     u64 bytes_pinned;   /* total bytes pinned, will be freed when the
@@ -708,6 +709,8 @@ struct btrfs_space_info {
     u64 bytes_may_use;  /* number of bytes that may be used for
                            delalloc/allocations */
     u64 disk_used;      /* total bytes used on disk */
+    u64 disk_total;     /* total bytes on disk, takes mirrors into
+                           account */
     int full;           /* indicates that we cannot allocate any more
                            chunks for this space */
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3f8aee5..72c3d5f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2982,6 +2982,7 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags,
     if (found) {
         spin_lock(&found->lock);
         found->total_bytes += total_bytes;
+        found->disk_total += total_bytes * factor;
         found->bytes_used += bytes_used;
         found->disk_used += bytes_used * factor;
         found->full = 0;
@@ -3001,6 +3002,7 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags,
                 BTRFS_BLOCK_GROUP_SYSTEM |
                 BTRFS_BLOCK_GROUP_METADATA);
     found->total_bytes = total_bytes;
+    found->disk_total = total_bytes * factor;
     found->bytes_used = bytes_used;
     found->disk_used = bytes_used * factor;
     found->bytes_pinned = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1b92f57..0570211 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -729,18 +729,25 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
     struct list_head *head = &root->fs_info->space_info;
     struct btrfs_space_info *found;
     u64 total_used = 0;
+    u64 total_used_data = 0;
     int bits = dentry->d_sb->s_blocksize_bits;
     __be32 *fsid = (__be32 *)root->fs_info->fsid;

     rcu_read_lock();
-    list_for_each_entry_rcu(found, head, list)
+    list_for_each_entry_rcu(found, head, list) {
+        if (found->flags & (BTRFS_BLOCK_GROUP_METADATA |
+                            BTRFS_BLOCK_GROUP_SYSTEM))
+            total_used_data += found->disk_total;
+        else
+            total_used_data += found->disk_used;
         total_used += found->disk_used;
+    }
     rcu_read_unlock();

     buf->f_namelen = BTRFS_NAME_LEN;
     buf->f_blocks = btrfs_super_total_bytes(disk_super) >> bits;
     buf->f_bfree = buf->f_blocks - (total_used >> bits);
-    buf->f_bavail = buf->f_bfree;
+    buf->f_bavail = buf->f_blocks - (total_used_data >> bits);
     buf->f_bsize = dentry->d_sb->s_blocksize;
     buf->f_type = BTRFS_SUPER_MAGIC;

--
1.6.6.1
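The accounting this patch introduces can be checked with a small user-space model. The sketch below is an approximation for illustration: the struct and function names are invented, the block group bit values are assumed to match the btrfs definitions, and a plain array stands in for the kernel's RCU-protected list.

```c
#include <stdint.h>
#include <stddef.h>

/* Block group type bits, redefined here so the model stands alone
 * (assumed to mirror BTRFS_BLOCK_GROUP_DATA/SYSTEM/METADATA). */
#define BG_DATA     (1ULL << 0)
#define BG_SYSTEM   (1ULL << 1)
#define BG_METADATA (1ULL << 2)

struct space_info_model {
    uint64_t flags;
    uint64_t disk_used;   /* bytes used on disk, mirrors included */
    uint64_t disk_total;  /* bytes allocated on disk, mirrors included */
};

struct statfs_model {
    uint64_t f_blocks, f_bfree, f_bavail;
};

/* Metadata and system space count in full against "available";
 * data space counts only for what is actually used. */
static struct statfs_model model_statfs(const struct space_info_model *si,
                                        size_t n, uint64_t total_bytes,
                                        unsigned bits)
{
    uint64_t total_used = 0, total_used_data = 0;
    struct statfs_model out;

    for (size_t i = 0; i < n; i++) {
        if (si[i].flags & (BG_METADATA | BG_SYSTEM))
            total_used_data += si[i].disk_total;
        else
            total_used_data += si[i].disk_used;
        total_used += si[i].disk_used;
    }
    out.f_blocks = total_bytes >> bits;
    out.f_bfree  = out.f_blocks - (total_used >> bits);
    out.f_bavail = out.f_blocks - (total_used_data >> bits);
    return out;
}
```

For a 1000-block filesystem with 100 data blocks used and a 50-block metadata allocation of which 10 blocks are used, this gives `f_bfree` of 890 and `f_bavail` of 850, matching "Available is Total - data used - all of the metadata space".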