On Mon, 24 Nov 2014 13:23:05 +0800, Liu Bo wrote:
This brings a strong-but-slow checksum algorithm, sha256.
Actually btrfs used sha256 at the early time, but then moved to crc32c for
performance purposes.
As crc32c is sort of weak due to its hash collision issue, we need a stronger
What's the test coverage for this? xfstest generic/192 tests that
atime is persisted over remounts, which we had a bug with when XFS
used to have a lazy atime implementation somewhat similar to the
proposal.
We should have something similar for c/mtime as well. Also a test to
ensure timestamps
[SUMMARY]
Introduce the new 'lost+found' dir and related infrastructure to create it
in btrfs-progs.
[BUG]
With the new infrastructure, fix a bug that some people reported in both
the kernel BZ and the mailing list, in which some files' nlink is 1 but the
backref points to a non-existent parent.
The two
Import the btrfs_insert/del/lookup_extref() functions from the kernel for the
incoming btrfs_add_link() and btrfs_unlink() functions.
These serve as the base of the incoming btrfs 'lost+found' recovery mechanism.
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
ctree.h | 14
inode-item.c | 206
With the previous btrfs inode operations patches, now we can use
btrfs_mkdir() to create the 'lost+found' dir to do some data salvage in
btrfsck.
This patch along with previous ones will make data salvage easier.
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
ctree.h | 2 ++
inode.c | 92
Before this patch, when btrfsck found an error in the root dir, it would only
output the message "root %llu root dir %llu error" without any
detailed error.
Add a print_inode_error() call to print out the whole error.
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
cmds-check.c | 1 +
1
[BUG]
At least two users have already hit a bug in btrfs that causes a file to go
missing (a Chromium config file).
The missing file's nlink is still 1 but its backref points to a non-existent
parent inode.
This should be a kernel bug, but a btrfsck fix is needed anyway.
[FIX]
For such nlink mismatch inode, we will
Import the lookup/del_inode_ref() functions into inode-item.c, as base functions
for the incoming btrfs_add_link() and btrfs_unlink() functions.
Also modify btrfs_insert_inode_ref() and split_leaf(), making them able
to deal with the EXTENDED_IREF incompat flag.
Signed-off-by: Qu Wenruo
Add btrfs_unlink() and btrfs_add_link() functions in inode.c,
for the incoming btrfs_mkdir() and later inode operations functions.
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Makefile | 2 +-
cmds-check.c | 7 +-
ctree.h | 12 ++
inode.c | 361
This patch implements the common RAID5/6 data repair function. The
implementation is similar to scrub on other RAID levels such as
RAID1; the difference is that we don't read the data from the
mirror but use the RAID5/6 data repair function instead.
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
On Mon, Nov 24, 2014 at 01:07:55AM -0800, Christoph Hellwig wrote:
What's the test coverage for this? xfstest generic/192 tests that
atime is persisted over remounts, which we had a bug with when XFS
used to have a lazy atime implementation somewhat similar to the
proposal.
We should have
On Sat, Nov 22 2014, Theodore Ts'o ty...@mit.edu wrote:
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.
static int update_time(struct inode *inode, struct timespec *time, int flags)
{
+ unsigned short days_since_boot = jiffies / (HZ * 86400);
int ret;
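For the `days_since_boot` computation in the hunk above, it can help to work out how long a tick counter runs before it wraps. The following is a minimal sketch, not kernel code: the helper name `counter_wrap_days` is hypothetical, and the 32-bit counter width and the HZ values in the usage note are assumptions chosen for illustration (on 64-bit kernels `jiffies` is 64 bits wide).

```c
/*
 * Days until a free-running tick counter that is `bits` wide wraps
 * around, given `hz` ticks per second. Hypothetical helper for
 * illustration only.
 */
double counter_wrap_days(unsigned int bits, unsigned int hz)
{
    double ticks = 1.0;
    unsigned int i;

    for (i = 0; i < bits; i++)
        ticks *= 2.0;                       /* 2^bits ticks before wrap */

    return ticks / ((double)hz * 86400.0);  /* 86400 seconds per day */
}
```

Under these assumptions, `counter_wrap_days(32, 1000)` is roughly 49.7 days and `counter_wrap_days(32, 100)` is roughly 497 days, so a value derived from a 32-bit tick counter cannot grow monotonically past that point.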
Not yet ready for integration, for review of the new sysfs layout.
This patch makes btrfs_fs_devices and btrfs_device information readable
from sysfs. This uses the sysfs group visible entry point to mark
certain attributes visible/hidden depending the FS state.
The new kobject 'by_fsid' will be
On Fri, Nov 21, 2014 at 02:59:21PM -0500, Theodore Ts'o wrote:
We needed to preserve update_time() because btrfs wants to have a
special btrfs_root_readonly() check; otherwise we could drop the
update_time() inode operation entirely.
Can't btrfs just set the immutable flag on every inode that
On Mon, Nov 24, 2014 at 07:21:01AM -0800, Christoph Hellwig wrote:
On Fri, Nov 21, 2014 at 02:59:21PM -0500, Theodore Ts'o wrote:
We needed to preserve update_time() because btrfs wants to have a
special btrfs_root_readonly() check; otherwise we could drop the
update_time() inode operation
On Fri, Nov 21, 2014 at 04:42:45PM -0500, Theodore Ts'o wrote:
Out of curiosity, why does btrfs_update_time() need to call
btrfs_root_readonly()? Why can't it just depend on the
__mnt_want_write() call in touch_atime()?
mnt_want_write looks only at the mountpoint flags, the readonly
subvolume
On Mon, Nov 24, 2014 at 01:27:21PM +0100, Rasmus Villemoes wrote:
On Sat, Nov 22 2014, Theodore Ts'o ty...@mit.edu wrote:
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.
+ unsigned short days_since_boot = jiffies / (HZ * 86400);
This seems to wrap every
On Mon, Nov 24, 2014 at 05:38:30PM +0100, David Sterba wrote:
It is necessary and the whole .update_time callback was added
intentionally, see commits
c3b2da314834499f34cba94f7053e55f6d6f92d8
fs: introduce inode operation ->update_time
e41f941a23115e84a8550b3d901a13a14b2edc2f
Btrfs:
On Mon, Nov 24, 2014 at 07:21:01AM -0800, Christoph Hellwig wrote:
On Fri, Nov 21, 2014 at 02:59:21PM -0500, Theodore Ts'o wrote:
We needed to preserve update_time() because btrfs wants to have a
special btrfs_root_readonly() check; otherwise we could drop the
update_time() inode operation
Hi Chris,
I thought the fix for the scrub/replace deadlock would be included in
this pull, I can reproduce it on each run of xfstests with 3.18-rc.
btrfs: fix dead lock while running replace and defrag concurrently
https://patchwork.kernel.org/patch/5264531/
I've retested it again including
On Mon, Nov 24, 2014 at 12:39 PM, David Sterba dste...@suse.cz wrote:
Hi Chris,
I thought the fix for the scrub/replace deadlock would be included in
this pull, I can reproduce it on each run of xfstests with 3.18-rc.
btrfs: fix dead lock while running replace and defrag concurrently
On Mon, Nov 24, 2014 at 12:22:16PM -0500, Theodore Ts'o wrote:
On Mon, Nov 24, 2014 at 05:38:30PM +0100, David Sterba wrote:
It is necessary and the whole .update_time callback was added
intentionally, see commits
c3b2da314834499f34cba94f7053e55f6d6f92d8
fs: introduce inode
On 2014/11/23 03:07, Marc MERLIN wrote:
On Sun, Nov 23, 2014 at 12:05:04AM +, Hugo Mills wrote:
Which is correct?
Less than or equal to 55% full.
This confuses me. Does that mean that the fullest blocks do not get
rebalanced?
Balance has three primary benefits:
- free up some
On Mon, Nov 24, 2014 at 01:01:10PM -0500, Chris Mason wrote:
I've retested it again including this pull and still deadlocks
reliably
at btrfs/070.
This wasn't a new problem, so I had it queued for the merge window.
Well, I don't remember seeing this problem with anything 3.17 based but
On Fri, Nov 21, 2014 at 05:00:31PM -0500, Josef Bacik wrote:
Hello,
I'm hoping some FS guys can weigh in and verify my approach for testing power
fail conditions, and the DM guys to of course verify I didn't completely fail at
making a DM target. All suggestions welcome, I want to have a
Holger Hoffstätte posted on Mon, 24 Nov 2014 08:23:25 + as excerpted:
Users can choose sha256 from mkfs.btrfs via
$ mkfs.btrfs -C 256 /device
Not sure how others feel about this, but it's probably easier for
sysadmins to specify the algorithm by name from the set of supported
ones,
On 11/24/2014 01:45 PM, Zach Brown wrote:
On Fri, Nov 21, 2014 at 05:00:31PM -0500, Josef Bacik wrote:
Hello,
I'm hoping some FS guys can weigh in and verify my approach for testing power
fail conditions, and the DM guys to of course verify I didn't completely fail at
making a DM target. All
On Mon, Nov 24, 2014 at 12:23 AM, Holger Hoffstätte
holger.hoffstae...@googlemail.com wrote:
Would there be room for a compromise with e.g. 128 bits?
For example, Spooky V2 hash is 128 bits and is very fast. It is
noncryptographic, but it is more than adequate for data checksums.
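To put the 32-, 128- and 256-bit options being discussed side by side, here is a rough back-of-the-envelope sketch. It assumes an idealized checksum whose values are uniformly random, which is a simplification (CRC32C in particular is not a random function), and the helper names are hypothetical. It gives the chance that a random corruption goes undetected (about 2^-bits) and the approximate number of blocks at which two blocks share a checksum with about 50% probability (the birthday bound, sqrt(2 ln 2) * 2^(bits/2)).

```c
/* Return 2^exp as a double (exp may be negative); avoids needing libm. */
double two_to_the(int exp)
{
    double v = 1.0;
    int n = exp < 0 ? -exp : exp;
    int i;

    for (i = 0; i < n; i++)
        v *= 2.0;
    return exp < 0 ? 1.0 / v : v;
}

/* Chance that a random corruption still matches an ideal `bits`-wide checksum. */
double undetected_corruption_probability(int bits)
{
    return two_to_the(-bits);
}

/*
 * Approximate number of blocks at which some pair collides with ~50%
 * probability: sqrt(2 ln 2) * 2^(bits/2), with sqrt(2 ln 2) ~= 1.1774.
 */
double birthday_bound_blocks(int bits)
{
    return 1.1774 * two_to_the(bits / 2);
}
```

With these idealized numbers, a 32-bit checksum reaches the 50% collision point at roughly 77,000 blocks, while a 128-bit one needs on the order of 2^64 blocks. Collisions between different blocks matter mainly for uses such as deduplication; for plain corruption detection the first function is the relevant one.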
This implements a writeback cache in kernel data structures so that you
can race to throw away cached blocks that haven't been flushed. How is
that meaningfully different than using an actual writeback caching dm
target and racing to invalidate it?
I didn't think of the dm-cache target,
On Mon, Nov 24, 2014 at 12:23 AM, Liu Bo bo.li@oracle.com wrote:
This brings a strong-but-slow checksum algorithm, sha256.
Actually btrfs used sha256 at the early time, but then moved to
crc32c for
performance purposes.
As crc32c is sort of weak due to its hash collision issue, we need a
On 11/24/2014 02:57 PM, Zach Brown wrote:
This implements a writeback cache in kernel data structures so that you
can race to throw away cached blocks that haven't been flushed. How is
that meaningfully different than using an actual writeback caching dm
target and racing to invalidate it?
I
On Mon, Nov 24, 2014 at 3:15 PM, Josef Bacik jba...@fb.com wrote:
On 11/24/2014 02:57 PM, Zach Brown wrote:
That is way complicated, I was just going to take two devices, one
that's a linear mapping and the other that's the log, and then write
to the log the sector+data that was written in
On Mon, Nov 24, 2014 at 03:07:45PM -0500, Chris Mason wrote:
On Mon, Nov 24, 2014 at 12:23 AM, Liu Bo bo.li@oracle.com wrote:
This brings a strong-but-slow checksum algorithm, sha256.
Actually btrfs used sha256 at the early time, but then moved to
crc32c for
performance purposes.
As
On Sat, Nov 22, 2014 at 12:03:57PM -0800, Omar Sandoval wrote:
On Fri, Nov 21, 2014 at 07:00:45PM +0100, David Sterba wrote:
+ ret = -EINVAL;
+ goto out;
+ }
+ if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
+
On Mon, Nov 24, 2014 at 03:15:25PM -0500, Josef Bacik wrote:
On 11/24/2014 02:57 PM, Zach Brown wrote:
This implements a writeback cache in kernel data structures so that you
can race to throw away cached blocks that haven't been flushed. How is
that meaningfully different than using an
On Mon, Nov 24, 2014 at 06:57:27AM -0500, Theodore Ts'o wrote:
If we want to be paranoid, we handle i_version updates non-lazily; I
can see arguments in favor of that.
Ext4 only enables MS_I_VERSION if the user asks for it explicitly, so
it wouldn't cause me any problems. However, xfs and
On Fri, Nov 21, 2014 at 02:19:14AM -0800, Christoph Hellwig wrote:
On Fri, Nov 21, 2014 at 02:15:31AM -0800, Omar Sandoval wrote:
Sorry for the noise, looks like Christoph got back to me on the previous RFC
just before I sent this out -- disregard this for now.
If the NFS people are fine
On 11/24/2014 05:10 PM, Zach Brown wrote:
On Mon, Nov 24, 2014 at 03:15:25PM -0500, Josef Bacik wrote:
On 11/24/2014 02:57 PM, Zach Brown wrote:
This implements a writeback cache in kernel data structures so that you
can race to throw away cached blocks that haven't been flushed. How is
that
That is way complicated, I was just going to take two devices, one that's a
linear mapping and the other that's the log, and then write to the log the
sector+data that was written in order that it completes, and then have
userspace do the replay. So basically do the flush tracking like I am,
On Mon, Nov 24, 2014 at 05:11:45PM -0500, J. Bruce Fields wrote:
On Mon, Nov 24, 2014 at 06:57:27AM -0500, Theodore Ts'o wrote:
If we want to be paranoid, we handle i_version updates non-lazily; I
can see arguments in favor of that.
Ext4 only enables MS_I_VERSION if the user asks for it
On Fri, Nov 21, 2014 at 02:59:22PM -0500, Theodore Ts'o wrote:
Add a new mount option which enables a new lazytime mode. This mode
causes atime, mtime, and ctime updates to only be made to the
in-memory version of the inode. The on-disk times will only get
updated when (a) if the inode needs
On Fri, Nov 21, 2014 at 02:59:23PM -0500, Theodore Ts'o wrote:
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.
Signed-off-by: Theodore Ts'o ty...@mit.edu
If we put these inodes on the dirty inode list with a writeback
time of 24 hours, this is completely
Hi all,
I was looking for a quick method of testing whether a working directory is a
subvolume.
Couldn't see an obvious one, so tried 'btrfs show somesubvol≥'. It printed
a fail message as expected but returned 0 exit status. Bug?
Can I put in a feature request for a shell file test operator
Original Message
Subject: Re: [RFC PATCH] Btrfs: add sha256 checksum option
From: Hugo Mills h...@carfax.org.uk
To: Chris Mason c...@fb.com
Date: 2014-11-25 04:58
On Mon, Nov 24, 2014 at 03:07:45PM -0500, Chris Mason wrote:
On Mon, Nov 24, 2014 at 12:23 AM, Liu Bo
On Tue, Nov 25, 2014 at 12:52:39PM +1100, Dave Chinner wrote:
+static void flush_sb_dirty_time(struct super_block *sb)
+{
...
+}
This just seems wrong to me, not to mention extremely expensive when we have
millions of cached inodes on the superblock.
#1, It only gets called on a
On Tue, Nov 25, 2014 at 12:53:32PM +1100, Dave Chinner wrote:
On Fri, Nov 21, 2014 at 02:59:23PM -0500, Theodore Ts'o wrote:
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.
Signed-off-by: Theodore Ts'o ty...@mit.edu
If we put these inodes on the dirty inode
Hello,
After I had some brief stability issues with my computer, it seems
some form of metadata corruption took place in my BTRFS filesystem,
and now a particular file seems to exist, but I cannot access any
details on it or delete it.
If I try to `ls` in the directory it is in, that's what I
Hi,
What's the btrfsck output, without the --repair option?
Also, if it is OK with you, would you please dump the filesystem with the
'btrfs-image' command?
The '-c 9' option is highly recommended considering its size.
This will help developers a lot in testing the btrfsck repair function.
Thanks,
Qu
On Mon, Nov 24, 2014 at 08:58:25PM +, Hugo Mills wrote:
On Mon, Nov 24, 2014 at 03:07:45PM -0500, Chris Mason wrote:
On Mon, Nov 24, 2014 at 12:23 AM, Liu Bo bo.li@oracle.com wrote:
This brings a strong-but-slow checksum algorithm, sha256.
Actually btrfs used sha256 at the early
I'll go run that and get you the output.
I can do the image dump, sure. I don't know how long it might take to
upload it somewhere though. Right now `btrfs fi df` shows about 2GiB
of metadata (it's a 120GiB volume). I'll see how large it ends up
after compression.
Thanks for the quick response,
Original Message
Subject: Re: Apparent metadata corruption (file that simultaneously
does/does not exist) on kernel 3.17.3
From: Daniel Miranda danielk...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-11-25 13:14
I'll go run that and get you the output.
In preparation for adding support for the lazytime mount option, we
need to be able to separate out the update_time() and write_time()
inode operations. Currently, only btrfs and xfs use update_time().
We needed to preserve update_time() because btrfs wants to have a
special
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.
Signed-off-by: Theodore Ts'o ty...@mit.edu
---
fs/fs-writeback.c | 1 +
fs/inode.c | 28 +++-
include/linux/fs.h | 1 +
3 files changed, 25 insertions(+), 5 deletions(-)
diff --git
This is an updated version of what had originally been an
ext4-specific patch which significantly improves performance by lazily
writing timestamp updates (and in particular, mtime updates) to disk.
The in-memory timestamps are always correct, but they are only written
to disk when required for
Add an optimization for the MS_LAZYTIME mount option so that we will
opportunistically write out any inodes with the I_DIRTY_TIME flag set
in a particular inode table block when we need to update some inode in
that inode table block anyway.
Also add some temporary code so that we can set the
On Tue, 2014-11-25 at 02:11 +, boris wrote:
Hi all,
I was looking for a quick method of testing whether a working directory is a
subvolume.
Couldn't see an obvious one, so tried 'btrfs show somesubvol≥'. It printed
a fail message as expected but returned 0 exit status. Bug?
Hi
Add a new mount option which enables a new lazytime mode. This mode
causes atime, mtime, and ctime updates to only be made to the
in-memory version of the inode. The on-disk times will only get
updated when (a) if the inode needs to be updated for some non-time
related change, (b) if userspace
The only reason btrfs cloned code from the VFS layer was so it could
add a check to see if a subvolume is read-only. Instead of doing that,
let's add a new inode operation which allows a file system to return
an error if the inode is read-only, and use that in update_time().
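The shape of that idea can be modeled in userspace. This is only a toy sketch under stated assumptions: every name below (`toy_inode`, `toy_inode_ops`, `is_readonly`, `toy_update_time`, `toy_subvol_is_readonly`) is hypothetical and does not correspond to the actual kernel interface proposed in the patch.

```c
#include <errno.h>
#include <stddef.h>

struct toy_inode;

/* Hypothetical per-filesystem operations table (names are illustrative). */
struct toy_inode_ops {
    /* Optional hook: return nonzero if the inode must not be modified. */
    int (*is_readonly)(struct toy_inode *inode);
};

struct toy_inode {
    const struct toy_inode_ops *ops;
    long atime;            /* in-memory timestamp, in seconds */
    int subvol_readonly;   /* stand-in for a per-subvolume read-only flag */
};

/* Generic time update: ask the filesystem first, then touch the inode. */
int toy_update_time(struct toy_inode *inode, long now)
{
    if (inode->ops && inode->ops->is_readonly &&
        inode->ops->is_readonly(inode))
        return -EROFS;
    inode->atime = now;
    return 0;
}

/* What a filesystem with read-only subvolumes might plug into the hook. */
int toy_subvol_is_readonly(struct toy_inode *inode)
{
    return inode->subvol_readonly;
}
```

In this model the generic code stays in one place and a filesystem only supplies the predicate, rather than carrying its own copy of the whole update routine.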
There may be other
Signed-off-by: Theodore Ts'o ty...@mit.edu
---
fs/fs-writeback.c | 5 -
fs/inode.c | 5 +
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index eb04277..cab2d6d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -27,6 +27,7
On 2014/11/25 00:03, Omar Sandoval wrote:
[snip]
The snapshot issue is a little trickier to resolve. I see a few options:
1. Just do the COW and hope for the best
2. As part of btrfs_swap_activate, COW any shared extents. If a snapshot
happens while a swap file is active, we'll fall back to 1.
Here are the logs. I'll send you a link to my dump directly after I
finish uploading it. Please notify me when you have downloaded it so I
can delete it.
checking extents
checking free space cache
checking fs roots
root 5 inode 17149868 errors 2000, link count wrong
unresolved ref dir
Steps to reproduce:
# mkfs.btrfs -f /dev/sdb
# mount -t btrfs /dev/sdb /mnt
# btrfs sub create /mnt/dir
# mount -t btrfs /dev/sdb /mnt -o subvol=dir,subvol=dir
It fails with:
mount: mount(2) failed: No such file or directory
Btrfs deals with subvolume mounting in a recursive way,
to avoid
Original Message
Subject: Re: Apparent metadata corruption (file that simultaneously
does/does not exist) on kernel 3.17.3
From: Daniel Miranda danielk...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-11-25 15:20
Here are the logs. I'll send you a link to my
On 11/25/2014 03:11 AM, boris wrote:
Hi all,
I was looking for a quick method of testing whether a working directory is a
subvolume.
Currently btrfs checks that:
- the inode number is 256
- the path is a directory
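The two checks listed above can be sketched with plain stat(2). This is a minimal illustration, assuming the root directory of a subvolume has inode number 256 (the value btrfs-progs defines as BTRFS_FIRST_FREE_OBJECTID); the function name `looks_like_subvolume` and the macro `SUBVOL_ROOT_INO` are hypothetical, and the sketch does not confirm that the path is actually on a btrfs filesystem.

```c
#include <sys/stat.h>

/* Assumed inode number of a subvolume's root directory on btrfs. */
#define SUBVOL_ROOT_INO 256

/*
 * Returns 1 if `path` is a directory whose inode number is
 * SUBVOL_ROOT_INO, 0 if it is not, and -1 if stat() fails
 * (errno is then set by stat()).
 */
int looks_like_subvolume(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return st.st_ino == SUBVOL_ROOT_INO && S_ISDIR(st.st_mode);
}
```

A small wrapper program could map the return value to an exit status so that a shell script can test a directory, which is the kind of quick check asked about in the quoted message.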
From cmds-subvolume.c
[...]
/*
* test if path is a subvolume:
* this
Steps to reproduce:
# mkfs.btrfs -f /dev/sdb
# mount -t btrfs /dev/sdb /mnt
# btrfs sub create /mnt/dir
# mount -t btrfs /dev/sdb /mnt -o subvol=dir,subvol=dir
It fails with:
mount: mount(2) failed: No such file or directory
Btrfs deals with subvolume mounting in a recursive way,
to
I just ran the repair but the ghost file has not disappeared, unfortunately.
On Tue, Nov 25, 2014 at 5:26 AM, Qu Wenruo quwen...@cn.fujitsu.com wrote:
Original Message
Subject: Re: Apparent metadata corruption (file that simultaneously
does/does not exist) on kernel 3.17.3