Re: [PATCH] btrfs: take the error path out if btrfs_attach_transaction() fails

2017-09-28 Thread Anand Jain



On 09/27/2017 10:17 PM, David Sterba wrote:

On Wed, Sep 27, 2017 at 05:50:52PM +0800, Anand Jain wrote:

btrfs_init_new_device() calls btrfs_attach_transaction() to
commit sys chunks; if that call fails, take the error path out
instead of returning directly.

Signed-off-by: Anand Jain 
---
  fs/btrfs/volumes.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fad3b10a1f81..b526d13a74da 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2494,7 +2494,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, 
const char *device_path
if (IS_ERR(trans)) {
if (PTR_ERR(trans) == -ENOENT)
return 0;
-   return PTR_ERR(trans);
+   ret = PTR_ERR(trans);
+   goto error_sysfs;


The label is introduced by another patch. Please resend the whole
patchset: I've seen several iterations and feedback from various people,
and I'm not sure I'm looking at the latest version.


 Please consider v4, posted to the mailing list.


Regarding error handling in btrfs_init_new_device, the seeding device
makes it hard to read. This patch would lead to a double unlock of
uuid_mutex and sb::s_umount, because the code at label error_sysfs repeats
cleanups that were already partially done in the enclosing
'if (seeding_dev)' block where the test fails.


 Fixed this in
   [PATCH v4 3/3] btrfs: error out if btrfs_attach_transaction() fails


I'd suggest first getting rid of the in-place returns and adding the
necessary labels or separate exit sequences, and only then addressing the
new error handling.


  As it goes deeper, there are quite a number of things which aren't
  undone on the error return path; one more to add to the list
  is the sb->super_copy updates. With the current design of this function
  it is too difficult to undo everything and return an error, because
  btrfs_init_new_device() is shared between normal device add and
  sprout device add. I am mulling over moving the seed part completely
  outside of btrfs_init_new_device(), such as:
    prepare sprout
    ret = btrfs_init_new_device(), which is now without the seed part
    if (ret) undo_prepare_sprout
    else finish sprouting
  Also, with this I think we would find a few duplicate code sections
  between btrfs_init_new_device() and replace device, which would be
  a nice cleanup as a whole. This is a long-term plan; for now
  I think v4 is good.

Thanks, Anand
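
The restructuring outlined in the message above can be sketched in plain C
roughly as follows. This is only an illustration under stated assumptions:
struct sprout_state, prepare_sprout(), undo_prepare_sprout(), finish_sprout()
and init_new_device_noseed() are hypothetical stand-ins invented for this
sketch, not existing btrfs functions, and the fail_with parameter merely
simulates an error from the device-add step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state tracker; stands in for the real seed/sprout bookkeeping. */
struct sprout_state {
	bool prepared;	/* seed-specific setup has been done */
	bool finished;	/* sprouting has been completed */
};

/* Hypothetical helpers, named after the steps in the outline. */
static int prepare_sprout(struct sprout_state *s)
{
	s->prepared = true;
	return 0;
}

static void undo_prepare_sprout(struct sprout_state *s)
{
	s->prepared = false;
}

static void finish_sprout(struct sprout_state *s)
{
	s->finished = true;
}

/* Stand-in for btrfs_init_new_device() without the seed part. */
static int init_new_device_noseed(int fail_with)
{
	return fail_with;
}

/* Caller owns the seed handling, so the shared step has nothing seed-specific to unwind. */
static int add_device(struct sprout_state *s, bool seeding, int fail_with)
{
	int ret;

	assert(s != NULL);

	if (seeding) {
		ret = prepare_sprout(s);
		if (ret)
			return ret;
	}

	ret = init_new_device_noseed(fail_with);
	if (ret) {
		if (seeding)
			undo_prepare_sprout(s);
		return ret;
	}

	if (seeding)
		finish_sprout(s);
	return 0;
}
```

With this shape, a failing device add on a seeding filesystem leaves the
state exactly as it was before prepare_sprout() ran, which is the property
the current combined function makes hard to guarantee.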



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[PATCH] btrfs-progs: tests: arg override in command line

2017-09-28 Thread Su Yue
Lowmem mode only repairs the few cases that have a beacon file
".lowmem_repairable" in the test case's directory.

However, defining TEST_ENABLE_OVERRIDE=true on the command line does not work
with the above strategy, because _skip_spec() in tests/common.local isn't
interpreted by the shell in that case.

Solve it by making sure _skip_spec() is always defined in common.local.

Reported-by: David Sterba 
Signed-off-by: Su Yue 
---
 tests/common   |  8 ++--
 tests/common.local | 11 +--
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/tests/common b/tests/common
index eb525a4d..690fee93 100644
--- a/tests/common
+++ b/tests/common
@@ -97,12 +97,8 @@ _get_spec_ins()
 _cmd_spec()
 {
if [ "$TEST_ENABLE_OVERRIDE" = 'true' ]; then
-   # if defined via common.local, use it, otherwise pass make
-   # arguments
-   if [ "$(type -t _skip_spec)" = 'function' ]; then
-   if _skip_spec "$@"; then
-   return
-   fi
+   if _skip_spec "$@"; then
+   return
fi
case "$1" in
check) echo -n "$TEST_ARGS_CHECK" ;;
diff --git a/tests/common.local b/tests/common.local
index 4864e391..f5e96f5b 100644
--- a/tests/common.local
+++ b/tests/common.local
@@ -3,14 +3,13 @@
 # additional arguments to various commands
 
 # already defined, eg. via make argument
-if [ -n "$TEST_ENABLE_OVERRIDE" ]; then
-   return
-fi
+if [ -z "$TEST_ENABLE_OVERRIDE" ]; then
+# set to 'true'
+TEST_ENABLE_OVERRIDE=false
 
-# set to 'true'
-TEST_ENABLE_OVERRIDE=false
+TEST_ARGS_CHECK=--mode=lowmem
+fi
 
-TEST_ARGS_CHECK=--mode=lowmem
 
 # gets arguments of a current command and can decide if the argument insertion
 # should happen, eg. if some option combination does not make sense or would
-- 
2.14.1





[PATCH 1/2] btrfs: Explicitly handle btrfs_update_root failure

2017-09-28 Thread Nikolay Borisov
btrfs_update_root can fail, in which case it aborts the transaction. The correct
way to handle an aborted transaction is to explicitly end it with
btrfs_end_transaction. The current code is still correct, since
btrfs_commit_transaction would handle an aborted transaction, but that is more
of an implementation detail. So let's be explicit in handling failure in
btrfs_update_root.

Furthermore, btrfs_commit_transaction can also fail, and by ignoring its return
value we could have left the in-memory copy of the root item in an inconsistent
state. So capture the error value, which allows us to correctly revert the RO/RW
flags in case of commit failure.

Signed-off-by: Nikolay Borisov 
---
 fs/btrfs/ioctl.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d6715c2bcdc4..ee4ee7cbba72 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1842,8 +1842,13 @@ static noinline int btrfs_ioctl_subvol_setflags(struct 
file *file,
 
ret = btrfs_update_root(trans, fs_info->tree_root,
&root->root_key, &root->root_item);
+   if (ret < 0) {
+   btrfs_end_transaction(trans);
+   goto out_reset;
+   }
+
+   ret = btrfs_commit_transaction(trans);
 
-   btrfs_commit_transaction(trans);
 out_reset:
if (ret)
btrfs_set_root_flags(&root->root_item, root_flags);
-- 
2.7.4
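
The error-handling pattern this patch describes can be modeled in userspace
roughly as below. This is a sketch under stated assumptions, not the kernel
code: struct trans and the update_root(), end_transaction() and
commit_transaction() helpers are hypothetical stand-ins that only record what
happened, and the update_err/commit_err parameters simply inject failures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transaction handle; records whether it was ended/committed. */
struct trans {
	bool ended;
	bool committed;
};

/* Stand-in for btrfs_update_root(); returns the injected error. */
static int update_root(int err)
{
	return err;
}

/* Stand-in for btrfs_end_transaction(). */
static void end_transaction(struct trans *t)
{
	t->ended = true;
}

/* Stand-in for btrfs_commit_transaction(); a commit also ends the handle. */
static int commit_transaction(struct trans *t, int err)
{
	t->ended = true;
	t->committed = (err == 0);
	return err;
}

static int set_flags(struct trans *t, unsigned int *flags,
		     unsigned int new_flags, int update_err, int commit_err)
{
	unsigned int old_flags = *flags;
	int ret;

	*flags = new_flags;

	ret = update_root(update_err);
	if (ret < 0) {
		/* End the handle explicitly instead of relying on commit. */
		end_transaction(t);
		goto out_reset;
	}

	/* Capture the commit result so a failure can be rolled back too. */
	ret = commit_transaction(t, commit_err);

out_reset:
	if (ret)
		*flags = old_flags;
	/* Every path out of here must have released the handle. */
	assert(t->ended);
	return ret;
}
```

Both failure modes end up at out_reset with a non-zero ret, so the in-memory
flags are restored whether the update or the commit failed.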



[PATCH v3 2/2] btrfs: Remove received_uuid during received snapshot ro->rw switch

2017-09-28 Thread Nikolay Borisov
Currently, when a read-only snapshot is received and its ro property is
subsequently set to false, i.e. it is switched to rw mode, the received_uuid of
that subvol remains intact. However, once the received volume is switched to RW
mode we cannot guarantee that it contains the same data, so it makes sense to
remove the received uuid. The presence of the received_uuid can also cause
problems when the volume is being sent.

Signed-off-by: Nikolay Borisov 
---

v3:
 * Rework the patch considering latest feedback from David Sterba i.e. 
 explicitly use btrfs_end_transaction 

 fs/btrfs/ioctl.c | 36 +---
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ee4ee7cbba72..c0374125cec2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1811,6 +1811,17 @@ static noinline int btrfs_ioctl_subvol_setflags(struct 
file *file,
goto out_drop_sem;
 
root_flags = btrfs_root_flags(&root->root_item);
+
+   /*
+* 1 - root item
+* 1 - uuid item
+*/
+   trans = btrfs_start_transaction(root, 2);
+   if (IS_ERR(trans)) {
+   ret = PTR_ERR(trans);
+   goto out_drop_sem;
+   }
+
if (flags & BTRFS_SUBVOL_RDONLY) {
btrfs_set_root_flags(&root->root_item,
 root_flags | BTRFS_ROOT_SUBVOL_RDONLY);
@@ -1824,22 +1835,33 @@ static noinline int btrfs_ioctl_subvol_setflags(struct 
file *file,
btrfs_set_root_flags(&root->root_item,
 root_flags & ~BTRFS_ROOT_SUBVOL_RDONLY);
spin_unlock(&root->root_item_lock);
+   if 
(!btrfs_is_empty_uuid(root->root_item.received_uuid)) {
+   ret = btrfs_uuid_tree_rem(trans, fs_info,
+  root->root_item.received_uuid,
+  BTRFS_UUID_KEY_RECEIVED_SUBVOL,
+  root->root_key.objectid);
+
+   if (ret && ret != -ENOENT) {
+   btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
+   goto out_reset;
+   }
+
+   memset(root->root_item.received_uuid, 0,
+  BTRFS_UUID_SIZE);
+   }
} else {
spin_unlock(&root->root_item_lock);
btrfs_warn(fs_info,
   "Attempt to set subvolume %llu read-write 
during send",
   root->root_key.objectid);
ret = -EPERM;
-   goto out_drop_sem;
+   btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
+   goto out_reset;
}
}
 
-   trans = btrfs_start_transaction(root, 1);
-   if (IS_ERR(trans)) {
-   ret = PTR_ERR(trans);
-   goto out_reset;
-   }
-
ret = btrfs_update_root(trans, fs_info->tree_root,
&root->root_key, &root->root_item);
if (ret < 0) {
-- 
2.7.4
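
A minimal userspace model of the ro->rw switch described in this patch might
look like the following. It is an assumption-laden sketch: struct subvol,
SUBVOL_RDONLY, UUID_SIZE, uuid_is_empty() and set_read_write() are hypothetical
names chosen for illustration, and both the removal of the entry from the uuid
tree and all transaction handling are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE	16
#define SUBVOL_RDONLY	(1u << 0)	/* hypothetical flag bit */

/* Hypothetical in-memory view of a subvolume's root item. */
struct subvol {
	unsigned int flags;
	uint8_t received_uuid[UUID_SIZE];
};

/* True if every byte of the uuid is zero. */
static bool uuid_is_empty(const uint8_t *uuid)
{
	size_t i;

	for (i = 0; i < UUID_SIZE; i++)
		if (uuid[i])
			return false;
	return true;
}

/* Clear the read-only flag and forget the received uuid, if any. */
static void set_read_write(struct subvol *sv)
{
	assert(sv != NULL);

	sv->flags &= ~SUBVOL_RDONLY;
	if (!uuid_is_empty(sv->received_uuid))
		memset(sv->received_uuid, 0, UUID_SIZE);
}
```

Once the subvolume is writable, nothing in this model still claims that it
matches the snapshot that was originally received.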



Re: [PATCH 5/5] btrfs: tree-checker: Enhance output for check_extent_data_item

2017-09-28 Thread Qu Wenruo



On 2017-09-28 14:09, Nikolay Borisov wrote:



On 28.09.2017 06:36, Qu Wenruo wrote:

Output the invalid member name and its bad value, along with its
expected value range or alignment.

Signed-off-by: Qu Wenruo 
---
  fs/btrfs/tree-checker.c | 92 +
  1 file changed, 70 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 52e9ab8c2a79..1324fcae93c0 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -63,6 +63,47 @@ static void generic_err(const struct btrfs_root *root,
va_end(args);
  }
  
+/*

+ * Customized reporter for extent data item, since its key objectid and
+ * offset has its own meaning.
+ */
+__printf(4, 5)
+static void file_extent_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct btrfs_key key;
+   struct va_format vaf;
+   va_list args;
+
+   btrfs_item_key_to_cpu(eb, &key, slot);
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s root=%llu tree_block=%llu slot=%d ino=%llu 
file_offset=%llu: %pV",


nit: Again, consider whether we should have : after the first %s so that
the string is consistent among different verifiers.


Consistency is important indeed.

I'll update the patchset to address it.

Thanks for pointing it out,
Qu




+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   key.objectid, key.offset, &vaf);
+   va_end(args);
+}
+
+/*
+ * Return 0 if the btrfs_file_extent_##name is aligned to @align
+ * Else return 1
+ */
+#define CHECK_FI_ALIGN(root, leaf, slot, fi, name, align)  \
+({ \
+   if (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align)) \
+   file_extent_err(root, leaf, slot,   \
+   "invalid file extent %s, have %llu, should be aligned to 
%u",\
+   #name, btrfs_file_extent_##name(leaf, fi),  \
+   align); \
+   (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align));   \
+})
+
  static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -72,15 +113,19 @@ static int check_extent_data_item(struct btrfs_root *root,
u32 item_size = btrfs_item_size_nr(leaf, slot);
  
  	if (!IS_ALIGNED(key->offset, sectorsize)) {

-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "unaligned key offset, have %llu should be aligned to 
%u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
  
  	fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
  
  	if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {

-   CORRUPT("invalid file extent type", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid file extent type, have %u expect range [0, 
%u]",
+   btrfs_file_extent_type(leaf, fi),
+   BTRFS_FILE_EXTENT_TYPES);
return -EUCLEAN;
}
  
@@ -89,18 +134,24 @@ static int check_extent_data_item(struct btrfs_root *root,

 * and must be caught in open_ctree().
 */
if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid file extent compression, have %u expect range [0, 
%u]",
+   btrfs_file_extent_compression(leaf, fi),
+   BTRFS_COMPRESS_TYPES);
return -EUCLEAN;
}
if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid file extent encryption, have %u expect 0",
+   btrfs_file_extent_encryption(leaf, fi));
return -EUCLEAN;
}
if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
/* Inline extent must have 0 as key offset */
if (key->offset) {
-   CORRUPT("inline extent has non-zero key offset",
-   leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid offset for in

[PATCH v2 1/2] btrfs: Refactor transaction handling

2017-09-28 Thread Nikolay Borisov
If btrfs_commit_transaction fails, it proceeds to call cleanup_transaction,
which in turn already calls btrfs_abort_transaction. So let's remove the
unnecessary code duplication. Also, let's be explicit about handling failure
of btrfs_uuid_tree_add by calling btrfs_end_transaction.

Signed-off-by: Nikolay Borisov 
---

v2:
 * Collapse previous 1/3 and 2/3 into a single patch
 * Add the btrfs_end_transaction() call 

 fs/btrfs/ioctl.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d6715c2bcdc4..21f14f755f86 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5156,15 +5156,12 @@ static long _btrfs_ioctl_set_received_subvol(struct 
file *file,
  root->root_key.objectid);
if (ret < 0 && ret != -EEXIST) {
btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
goto out;
+
}
}
ret = btrfs_commit_transaction(trans);
-   if (ret < 0) {
-   btrfs_abort_transaction(trans, ret);
-   goto out;
-   }
-
 out:
up_write(&fs_info->subvol_sem);
mnt_drop_write_file(file);
-- 
2.7.4



[PATCH v2 2/2] btrfs: Fix transaction abort during failure in btrfs_rm_dev_item

2017-09-28 Thread Nikolay Borisov
btrfs_rm_dev_item calls several functions under an active transaction; however,
it fails to abort it if an error happens. Fix this by adding explicit
btrfs_abort_transaction/btrfs_end_transaction calls.

Signed-off-by: Nikolay Borisov 
---

v2: 
 * Explicitly handle every failure case w.r.t transaction abort rather than 
 rely on final btrfs_commit_transaction() to do the right thing. 

 * Also consider the -ENOENT case from btrfs_search_slot as a failure.

 fs/btrfs/volumes.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0e8f16c305df..4709c7919ef2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1765,20 +1765,29 @@ static int btrfs_rm_dev_item(struct btrfs_fs_info 
*fs_info,
key.offset = device->devid;
 
ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
-   if (ret < 0)
+   if (ret < 0) {
+   btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
goto out;
+   }
 
if (ret > 0) {
ret = -ENOENT;
+   btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
goto out;
}
 
ret = btrfs_del_item(trans, root, path);
-   if (ret)
+   if (ret) {
+   btrfs_abort_transaction(trans, ret);
+   btrfs_end_transaction(trans);
goto out;
+   }
+
+   ret = btrfs_commit_transaction(trans);
 out:
btrfs_free_path(path);
-   btrfs_commit_transaction(trans);
return ret;
 }
 
-- 
2.7.4



task btrfs-transacti:651 blocked for more than 120 seconds

2017-09-28 Thread Olivier Bonvalet
Hi!

I have a virtual server (Xen) which very frequently hangs with only
this error in the logs:

[ 1330.144124] INFO: task btrfs-transacti:651 blocked for more than 120 seconds.
[ 1330.144141]   Not tainted 4.9-dae-xen #2
[ 1330.144146] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1330.144179] btrfs-transacti D0   651  2 0x
[ 1330.144184]  8803a6c85b40  8803af857880 
8803a9762180
[ 1330.144190]  8803a7bb8140 c900173bfb10 8150ff1f 

[ 1330.144195]  8803a7bb8140 7fff 81510710 
c900173bfc18
[ 1330.144200] Call Trace:
[ 1330.144211]  [] ? __schedule+0x17f/0x530
[ 1330.144215]  [] ? bit_wait+0x50/0x50
[ 1330.144218]  [] ? schedule+0x2d/0x80
[ 1330.144221]  [] ? schedule_timeout+0x17e/0x2a0
[ 1330.144226]  [] ? xen_clocksource_get_cycles+0x11/0x20
[ 1330.144231]  [] ? ktime_get+0x36/0xa0
[ 1330.144234]  [] ? bit_wait+0x50/0x50
[ 1330.144237]  [] ? io_schedule_timeout+0x98/0x100
[ 1330.144240]  [] ? _raw_spin_unlock_irqrestore+0x11/0x20
[ 1330.144246]  [] ? bit_wait_io+0x12/0x60
[ 1330.144250]  [] ? __wait_on_bit+0x4e/0x80
[ 1330.144256]  [] ? wait_on_page_bit+0x6c/0x80
[ 1330.144261]  [] ? autoremove_wake_function+0x30/0x30
[ 1330.144265]  [] ? __filemap_fdatawait_range+0xc8/0x110
[ 1330.144270]  [] ? filemap_fdatawait_range+0x9/0x20
[ 1330.144298]  [] ? btrfs_wait_ordered_range+0x63/0x100 
[btrfs]
[ 1330.144310]  [] ? btrfs_wait_cache_io+0x58/0x1e0 [btrfs]
[ 1330.144320]  [] ? 
btrfs_start_dirty_block_groups+0x1c2/0x450 [btrfs]
[ 1330.144328]  [] ? do_group_exit+0x35/0xa0
[ 1330.144338]  [] ? btrfs_commit_transaction+0x147/0x9b0 
[btrfs]
[ 1330.144348]  [] ? start_transaction+0x92/0x3f0 [btrfs]
[ 1330.144357]  [] ? transaction_kthread+0x1d7/0x1f0 [btrfs]
[ 1330.144366]  [] ? btrfs_cleanup_transaction+0x4f0/0x4f0 
[btrfs]
[ 1330.144373]  [] ? kthread+0xc2/0xe0
[ 1330.144377]  [] ? kthread_create_on_node+0x40/0x40
[ 1330.144381]  [] ? ret_from_fork+0x25/0x30


It's a Debian Stretch system, running a 4.9.52 Linux kernel (on a Xen 4.8.2
hypervisor).
With an old 4.1.x Linux kernel, I don't have any problems.


Is it a Btrfs bug? Should I try a more recent kernel (and if so, which one)?

Thanks in advance,

Olivier


Re: task btrfs-transacti:651 blocked for more than 120 seconds

2017-09-28 Thread Nikolay Borisov


On 28.09.2017 13:16, Olivier Bonvalet wrote:
> Hi!
> 
> I have a virtual server (Xen) which very frequently hangs with only
> this error in the logs:
> 
> [ 1330.144124] INFO: task btrfs-transacti:651 blocked for more than 120 
> seconds.
> [ 1330.144141]   Not tainted 4.9-dae-xen #2
> [ 1330.144146] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [ 1330.144179] btrfs-transacti D0   651  2 0x
> [ 1330.144184]  8803a6c85b40  8803af857880 
> 8803a9762180
> [ 1330.144190]  8803a7bb8140 c900173bfb10 8150ff1f 
> 
> [ 1330.144195]  8803a7bb8140 7fff 81510710 
> c900173bfc18
> [ 1330.144200] Call Trace:
> [ 1330.144211]  [] ? __schedule+0x17f/0x530
> [ 1330.144215]  [] ? bit_wait+0x50/0x50
> [ 1330.144218]  [] ? schedule+0x2d/0x80
> [ 1330.144221]  [] ? schedule_timeout+0x17e/0x2a0
> [ 1330.144226]  [] ? xen_clocksource_get_cycles+0x11/0x20
> [ 1330.144231]  [] ? ktime_get+0x36/0xa0
> [ 1330.144234]  [] ? bit_wait+0x50/0x50
> [ 1330.144237]  [] ? io_schedule_timeout+0x98/0x100
> [ 1330.144240]  [] ? _raw_spin_unlock_irqrestore+0x11/0x20
> [ 1330.144246]  [] ? bit_wait_io+0x12/0x60
> [ 1330.144250]  [] ? __wait_on_bit+0x4e/0x80
> [ 1330.144256]  [] ? wait_on_page_bit+0x6c/0x80
> [ 1330.144261]  [] ? autoremove_wake_function+0x30/0x30
> [ 1330.144265]  [] ? __filemap_fdatawait_range+0xc8/0x110
> [ 1330.144270]  [] ? filemap_fdatawait_range+0x9/0x20
> [ 1330.144298]  [] ? btrfs_wait_ordered_range+0x63/0x100 
> [btrfs]
> [ 1330.144310]  [] ? btrfs_wait_cache_io+0x58/0x1e0 [btrfs]
> [ 1330.144320]  [] ? 
> btrfs_start_dirty_block_groups+0x1c2/0x450 [btrfs]
> [ 1330.144328]  [] ? do_group_exit+0x35/0xa0
> [ 1330.144338]  [] ? btrfs_commit_transaction+0x147/0x9b0 
> [btrfs]
> [ 1330.144348]  [] ? start_transaction+0x92/0x3f0 [btrfs]
> [ 1330.144357]  [] ? transaction_kthread+0x1d7/0x1f0 [btrfs]
> [ 1330.144366]  [] ? btrfs_cleanup_transaction+0x4f0/0x4f0 
> [btrfs]
> [ 1330.144373]  [] ? kthread+0xc2/0xe0
> [ 1330.144377]  [] ? kthread_create_on_node+0x40/0x40
> [ 1330.144381]  [] ? ret_from_fork+0x25/0x30

So what this stack trace means is that the transaction commit has hung,
at least judging by the called functions (assuming they are correct, though
the '?' markers aren't very encouraging). Concretely, it means that I/O has
been started for a certain range of addresses and the transaction commit is
now waiting to be woken up upon completion of the write. When this occurs, can
you see if there is I/O activity from that particular guest (assuming you
have access to the hypervisor)? It might be a bug in btrfs, or you might
be hitting something else in the hypervisor.


> 
> 
> It's a Debian Stretch system, running a 4.9.52 Linux kernel (on a Xen 4.8.2
> hypervisor).
> With an old 4.1.x Linux kernel, I don't have any problems.
> 
> 
> Is it a Btrfs bug? Should I try a more recent kernel (and if so, which one)?
> 
> Thanks in advance,
> 
> Olivier


Re: [PATCH 2/2] Remove misleading BCP 78 boilerplate

2017-09-28 Thread Nicholas D Steeves
Hi David,

On 18 September 2017 at 10:40, David Sterba  wrote:
> On Sun, Sep 17, 2017 at 07:52:27PM -0400, Nicholas D Steeves wrote:
>> BCP 78 applies to RFC 6234, but sha224-256.c is Simplified BSD.
>>
>> This causes the following lintian error when building on Debian and
>> Debian derivatives:
>>
>> E: btrfs-progs source: license-problem-non-free-RFC-BCP78
>>tests/sha224-256.c
>>
>> Please consult the following email from debian-le...@lists.debian.org
>> for more information:
>>
>> https://lists.debian.org/debian-legal/2017/08/msg4.html
>
> Thanks, this looks like I've copied too much from the RFC and was not
> aware of the BCP license issues. I believe the copyright notice(s) past
> the line mentioning the filename(s) should be enough to satisfy the
> licensing requirements and also the debian license checker.

Thank you for applying these so quickly, and for the new release :-)

Sincerely,
Nicholas


Re : task btrfs-transacti:651 blocked for more than 120 seconds

2017-09-28 Thread Olivier Bonvalet
On Thursday, 28 September 2017 at 14:18 +0300, Nikolay Borisov wrote:
> So what this stack trace means is that the transaction commit has hung,
> at least judging by the called functions (assuming they are correct, though
> the '?' markers aren't very encouraging). Concretely, it means that I/O has
> been started for a certain range of addresses and the transaction commit is
> now waiting to be woken up upon completion of the write. When this occurs,
> can you see if there is I/O activity from that particular guest (assuming
> you have access to the hypervisor)? It might be a bug in btrfs, or you
> might be hitting something else in the hypervisor.


Hello,

Thanks for your answer. From the hypervisor, I don't see any I/O during
this hang.

I cloned the VM to reproduce the problem, and I also get it
without Btrfs:

[ 3263.452023] INFO: task systemd:1 blocked for more than 120 seconds.
[ 3263.452040]   Tainted: GW   4.9-dae-xen #2
[ 3263.452044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 3263.452052] systemd D0 1  0 0x
[ 3263.452060]  8803a71ca000  8803af857880 
8803a9762dc0
[ 3263.452070]  8803a96fcc80 c9001623f990 8150ff1f 

[ 3263.452079]  8803a96fcc80 7fff 81510710 
c9001623faa0
[ 3263.452087] Call Trace:
[ 3263.452099]  [] ? __schedule+0x17f/0x530
[ 3263.452105]  [] ? bit_wait+0x50/0x50
[ 3263.452110]  [] ? schedule+0x2d/0x80
[ 3263.452116]  [] ? schedule_timeout+0x17e/0x2a0
[ 3263.452121]  [] ? xen_clocksource_get_cycles+0x11/0x20
[ 3263.452126]  [] ? ktime_get+0x36/0xa0
[ 3263.452130]  [] ? bit_wait+0x50/0x50
[ 3263.452134]  [] ? io_schedule_timeout+0x98/0x100
[ 3263.452137]  [] ? _raw_spin_unlock_irqrestore+0x11/0x20
[ 3263.452141]  [] ? bit_wait_io+0x12/0x60
[ 3263.452145]  [] ? __wait_on_bit+0x4e/0x80
[ 3263.452149]  [] ? bit_wait+0x50/0x50
[ 3263.452153]  [] ? out_of_line_wait_on_bit+0x69/0x80
[ 3263.452157]  [] ? autoremove_wake_function+0x30/0x30
[ 3263.452163]  [] ? ext4_find_entry+0x350/0x5d0
[ 3263.452168]  [] ? d_alloc_parallel+0xa0/0x480
[ 3263.452172]  [] ? __d_lookup_done+0x68/0xd0
[ 3263.452175]  [] ? d_splice_alias+0x158/0x3b0
[ 3263.452179]  [] ? ext4_lookup+0x42/0x1f0
[ 3263.452184]  [] ? lookup_slow+0x8e/0x130
[ 3263.452187]  [] ? walk_component+0x1ca/0x300
[ 3263.452193]  [] ? link_path_walk+0x18e/0x570
[ 3263.452199]  [] ? path_init+0x1c3/0x320
[ 3263.452207]  [] ? path_openat+0xe2/0x1380
[ 3263.452214]  [] ? do_filp_open+0x79/0xd0
[ 3263.45]  [] ? kmem_cache_alloc+0x71/0x400
[ 3263.452228]  [] ? __check_object_size+0xf7/0x1c4
[ 3263.452235]  [] ? do_sys_open+0x11f/0x1f0
[ 3263.452238]  [] ? entry_SYSCALL_64_fastpath+0x1a/0xa9


So I will take this up with the Xen developers.

Thanks,

Olivier


[PATCH v8 0/6] Btrfs: populate heuristic with code

2017-09-28 Thread Timofey Titovets
Based on linux master 4.14-rc2
Duplicated to github:
https://github.com/Nefelim4ag/linux/tree/heuristic_v8

Compile tested, hand tested on live system

Patches in short:
1. Implement workspaces for the heuristic
   Separate heuristic/compression workspaces
   Main target for this patch:
   maximum code sharing, minimum changes

2. Add heuristic counters and buffer to workspaces
   Add some base macros for the heuristic

3. Implement simple input data sampling
   It takes 16-byte samples at 256-byte intervals
   over the input data and collects info about how many
   different bytes (symbols) have been found in the sample data
   (i.e. systematic sampling is used for now)

4. Implement a check of the sample for repeated data
   Just iterate over the sample and do memcmp()
   e.g. this will detect zeroed data

5. Add code to calculate how many unique bytes have been found
   in the sample data.
   This heuristic can detect text-like data (configs, xml, json, html, etc.),
   because in most text-like data the byte set is restricted to a limited
   number of possible characters, and that restriction in most cases
   makes the data easily compressible.

6. Add code to calculate the byte core set size
   i.e. how many unique bytes make up 90% of the sample data

   Several types of structured binary data contain, in general,
   nearly all byte values, but the distribution can be uniform,
   where all byte values in the bucket have nearly the same count
   (e.g. encrypted data),
   or, for example, normal (Gaussian), where the byte counts are less even.

   This code requires the numbers in the bucket to be sorted.
   It can detect easily compressible data with many repeated bytes.
   It can detect incompressible data with evenly distributed bytes.
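
Steps 3 and 5 of the list above can be sketched in userspace C as follows.
This is a simplified illustration, not the code from the patches: the macro
names SAMPLING_READ_SIZE, SAMPLING_INTERVAL and BUCKET_SIZE follow the series,
the values 16 and 256 for sampling come from this cover letter, BUCKET_SIZE of
256 is an assumption (one counter per possible byte value), and the functions
take_samples() and byte_set_size() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLING_READ_SIZE	16	/* bytes copied per sample */
#define SAMPLING_INTERVAL	256	/* distance between sample starts */
#define BUCKET_SIZE		256	/* assumed: one counter per byte value */

/*
 * Systematic sampling: copy SAMPLING_READ_SIZE bytes every
 * SAMPLING_INTERVAL bytes. Returns the number of bytes sampled.
 */
static size_t take_samples(const uint8_t *in, size_t in_len,
			   uint8_t *sample, size_t sample_cap)
{
	size_t off, n = 0;

	for (off = 0; off + SAMPLING_READ_SIZE <= in_len;
	     off += SAMPLING_INTERVAL) {
		/* The caller must size the sample buffer for the input. */
		assert(n + SAMPLING_READ_SIZE <= sample_cap);
		memcpy(sample + n, in + off, SAMPLING_READ_SIZE);
		n += SAMPLING_READ_SIZE;
	}
	return n;
}

/* Count how many distinct byte values occur in the sample. */
static unsigned int byte_set_size(const uint8_t *sample, size_t n)
{
	uint32_t bucket[BUCKET_SIZE] = { 0 };
	unsigned int distinct = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bucket[sample[i]]++;
	for (i = 0; i < BUCKET_SIZE; i++)
		if (bucket[i])
			distinct++;
	return distinct;
}
```

A small byte set, as in plain text, suggests the data is worth compressing,
while a byte set close to 256 says little on its own, which is where the core
set check of step 6 comes in.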

Changes v1 -> v2:
  - Change input data iterator shift 512 -> 256
  - Replace magic macro numbers with direct values
  - Drop useless symbol population in bucket,
    as no one cares about where and what symbol is stored
    in the bucket for now

Changes v2 -> v3 (only update #3 patch):
  - Fix u64 division problem by using u32 for input_size
  - Fix input size calculation start - end -> end - start
  - Add missing sort.h header

Changes v3 -> v4 (only update #1 patch):
  - Change counter type in bucket item u16 -> u32
  - Drop other fields from bucket item for now,
    no one uses them

Changes v4 -> v5:
  - Move heuristic code to external file
  - Make heuristic use compression workspaces
  - Add check of sample for zeroes

Changes v5 -> v6:
  - Add some code to handle page-unaligned range start/end
  - Replace sample zeroed check with check for repeated data

Changes v6 -> v7:
  - Add missing part of first patch
  - Make use of IS_ALIGNED() to check tail alignment

Changes v7 -> v8:
  - All code moved to compression.c (again)
  - Heuristic workspaces implemented another way,
    i.e. only share logic with compression workspaces
  - Some style fixes suggested by David
  - Move sampling function out of heuristic code
    (I'm afraid of big functions)
  - Many more comments and explanations

Timofey Titovets (6):
  Btrfs: compression.c separated heuristic/compression workspaces
  Btrfs: heuristic workspace add bucket and sample items, macros
  Btrfs: implement heuristic sampling logic
  Btrfs: heuristic add detection of repeated data patterns
  Btrfs: heuristic add byte set calculation
  Btrfs: heuristic add byte core set calculation

 fs/btrfs/compression.c | 393 +
 1 file changed, 366 insertions(+), 27 deletions(-)

--
2.14.2


[PATCH v8 6/6] Btrfs: heuristic add byte core set calculation

2017-09-28 Thread Timofey Titovets
Calculate the byte core set for a data sample:
Sort the bucket's counts in decreasing order
Count how many byte values make up 90% of the sample
If the core set is small (<=25%), the data is easily compressible
If the core set is large (>=80%), the data is not compressible

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 68 ++
 1 file changed, 68 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 83daf2f9d051..1aa04ae214a7 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -1201,6 +1202,62 @@ int btrfs_decompress_buf2page(const char *buf, unsigned 
long buf_start,
 }


+/* Compare buckets by count, descending */
+static inline int bucket_comp_rev(const void *lv, const void *rv)
+{
+   const struct bucket_item *l = (struct bucket_item *)(lv);
+   const struct bucket_item *r = (struct bucket_item *)(rv);
+
+   return r->count - l->count;
+}
+
+/*
+ * Byte core set size
+ * How many byte values make up 90% of the sample
+ *
+ * Several types of structured binary data contain, in general,
+ * nearly all byte values, but the distribution can be uniform,
+ * where all byte values in the bucket have nearly the same count
+ * (e.g. encrypted data),
+ * or, for example, normal (Gaussian), where the byte counts are less even.
+ * In that case the data can be compressible, probably compressible, or
+ * not compressible, so assume:
+ *
+ * @BYTE_CORE_SET_LOW - the main part of the byte values repeats frequently;
+ *  a compression algorithm can easily handle that
+ * @BYTE_CORE_SET_HIGH - the data has a uniform distribution and is, with
+ *   high probability, not compressible
+ *
+ */
+
+#define BYTE_CORE_SET_LOW  64
+#define BYTE_CORE_SET_HIGH 200
+
+static int byte_core_set_size(struct heuristic_ws *ws)
+{
+   u32 i;
+   u32 coreset_sum = 0;
+   u32 core_set_threshold = ws->sample_size * 90 / 100;
+   struct bucket_item *bucket = ws->bucket;
+
+   /* Sort in reverse order */
+   sort(bucket, BUCKET_SIZE, sizeof(*bucket), &bucket_comp_rev, NULL);
+
+   for (i = 0; i < BYTE_CORE_SET_LOW; i++)
+   coreset_sum += bucket[i].count;
+
+   if (coreset_sum > core_set_threshold)
+   return i;
+
+   for (; i < BYTE_CORE_SET_HIGH && bucket[i].count > 0; i++) {
+   coreset_sum += bucket[i].count;
+   if (coreset_sum > core_set_threshold)
+   break;
+   }
+
+   return i;
+}
+
 /*
  * Count byte types in bucket
  * That heuristic can detect text like data (configs, xml, json, html & etc)
@@ -1348,6 +1405,17 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
goto out;
}

+   i = byte_core_set_size(ws);
+   if (i <= BYTE_CORE_SET_LOW) {
+   ret = 3;
+   goto out;
+   }
+
+   if (i >= BYTE_CORE_SET_HIGH) {
+   ret = 0;
+   goto out;
+   }
+
 out:
__free_workspace(0, ws_list, true);
return ret;
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
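
The core-set check described in this patch can be illustrated in userspace. What follows is a minimal sketch, not the kernel code: it uses qsort() in place of the kernel's sort(), counts the bytes itself, and returns the number of byte values needed to exceed 90% of the sample instead of the loop index capped at BYTE_CORE_SET_HIGH. The names core_set_size() and count_cmp_desc() are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BUCKET_SIZE 256

/* Order counters from largest to smallest (descending). */
static int count_cmp_desc(const void *lv, const void *rv)
{
	uint32_t l = *(const uint32_t *)lv;
	uint32_t r = *(const uint32_t *)rv;

	return (l < r) - (l > r);
}

/*
 * Return how many of the most frequent byte values are needed to
 * cover more than 90% of the sample (hypothetical helper).
 */
static unsigned int core_set_size(const uint8_t *sample, size_t len)
{
	uint32_t count[BUCKET_SIZE] = { 0 };
	uint64_t threshold = (uint64_t)len * 90 / 100;
	uint64_t sum = 0;
	unsigned int i;

	assert(sample != NULL || len == 0);

	/* Build the per-byte-value histogram. */
	for (size_t k = 0; k < len; k++)
		count[sample[k]]++;

	qsort(count, BUCKET_SIZE, sizeof(count[0]), count_cmp_desc);

	/* Accumulate the largest counters until the threshold is passed. */
	for (i = 0; i < BUCKET_SIZE && count[i] > 0; i++) {
		sum += count[i];
		if (sum > threshold)
			return i + 1;
	}
	return i;
}
```

With the thresholds from the patch, a result at or below 64 would be treated as compressible and a result at or above 200 as not compressible; values in between stay undecided.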


[PATCH v8 2/6] Btrfs: heuristic workspace add bucket and sample items, macros

2017-09-28 Thread Timofey Titovets
Added macros:
 - For future sampling algo
 - For bucket size

Heuristic workspace:
 - Add bucket for storing byte type counters
 - Add sample array for storing partial copy of
   input data range
 - Add counter to store the current sample size in the workspace

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 57 +-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index c3624e8e3919..1715655d050e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -691,7 +691,50 @@ blk_status_t btrfs_submit_compressed_read(struct inode 
*inode, struct bio *bio,
 }


+/*
+ * The heuristic uses systematic sampling to collect data from the
+ * input data range, so some constants are needed to control the algorithm
+ *
+ * @SAMPLING_READ_SIZE - how many bytes are copied for each sample
+ * @SAMPLING_INTERVAL  - period used to iterate over the input data range
+ */
+#define SAMPLING_READ_SIZE 16
+#define SAMPLING_INTERVAL 256
+
+/*
+ * For statistical analysis of the input data range,
+ * consider that the data consists of bytes,
+ * so this is a Galois field with 256 objects.
+ * Each object has a count attribute, i.e. how many times
+ * that object was detected in the sample
+ */
+#define BUCKET_SIZE 256
+
+/*
+ * The size of the sample is based on a statistical sampling rule of thumb.
+ * It is common to perform sampling tests as long as the number of elements
+ * in each cell is at least five.
+ *
+ * Instead of five, for now choose 32 to obtain more accurate results.
+ * If the data contains the maximum number of symbols, which is 256,
+ * this gives a sample size bound of 8192.
+ *
+ * So sample at most 8KB of data per data range: 16 consecutive
+ * bytes from up to 512 locations.
+ */
+#define MAX_SAMPLE_SIZE (BTRFS_MAX_UNCOMPRESSED * \
+ SAMPLING_READ_SIZE / SAMPLING_INTERVAL)
+
+
+struct bucket_item {
+   u32 count;
+};
+
 struct heuristic_ws {
+   /* Partial copy of input data */
+   u8 *sample;
+   /* Bucket store counter for each byte type */
+   struct bucket_item *bucket;
struct list_head list;
 };

@@ -701,6 +744,8 @@ static void free_heuristic_ws(struct list_head *ws)

workspace = list_entry(ws, struct heuristic_ws, list);

+   kvfree(workspace->sample);
+   kfree(workspace->bucket);
kfree(workspace);
 }

@@ -711,9 +756,19 @@ static struct list_head *alloc_heuristic_ws(void){
if (!ws)
return ERR_PTR(-ENOMEM);

-   INIT_LIST_HEAD(&ws->list);
+   ws->sample = kvmalloc(MAX_SAMPLE_SIZE, GFP_KERNEL);
+   if (!ws->sample)
+   goto fail;
+
+   ws->bucket = kcalloc(BUCKET_SIZE, sizeof(*ws->bucket), GFP_KERNEL);
+   if (!ws->bucket)
+   goto fail;

+   INIT_LIST_HEAD(&ws->list);
return &ws->list;
+fail:
+   free_heuristic_ws(&ws->list);
+   return ERR_PTR(-ENOMEM);
 }

 struct workspaces_list {
--
2.14.2
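
The sample-size bound in the comment above can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: the maximum range length is taken to be 128KiB, the figure the series itself quotes for the range compression handles; MAX_UNCOMPRESSED and the two helper names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLING_READ_SIZE	16
#define SAMPLING_INTERVAL	256
#define BUCKET_SIZE		256
/* Assumed value: the series describes the handled range as 128KiB. */
#define MAX_UNCOMPRESSED	(128 * 1024)

/* Number of sampling locations in a range of the given length. */
static uint32_t sample_locations(uint32_t range_len)
{
	return range_len / SAMPLING_INTERVAL;
}

/* Upper bound on the bytes copied into the sample buffer. */
static uint32_t max_sample_size(uint32_t range_len)
{
	assert(range_len <= MAX_UNCOMPRESSED);
	return sample_locations(range_len) * SAMPLING_READ_SIZE;
}
```

For the full range this gives 512 locations and 8192 sample bytes, i.e. 32 bytes per bucket cell when all 256 byte values are present.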


[PATCH v8 3/6] Btrfs: implement heuristic sampling logic

2017-09-28 Thread Timofey Titovets
Copy sample data from input data range to sample buffer
then calculate byte type count for that sample into bucket.

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 71 +++---
 1 file changed, 61 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 1715655d050e..e2419639ae7f 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -725,7 +725,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode 
*inode, struct bio *bio,
 #define MAX_SAMPLE_SIZE (BTRFS_MAX_UNCOMPRESSED * \
  SAMPLING_READ_SIZE / SAMPLING_INTERVAL)

-
 struct bucket_item {
u32 count;
 };
@@ -733,6 +732,7 @@ struct bucket_item {
 struct heuristic_ws {
/* Partial copy of input data */
u8 *sample;
+   u32 sample_size;
/* Bucket store counter for each byte type */
struct bucket_item *bucket;
struct list_head list;
@@ -1200,6 +1200,57 @@ int btrfs_decompress_buf2page(const char *buf, unsigned 
long buf_start,
return 1;
 }

+static void heuristic_collect_sample(struct inode *inode, u64 start, u64 end,
+struct heuristic_ws *ws)
+{
+   struct page *page;
+   u64 index, index_end;
+   u32 i, curr_sample_pos;
+   u8 *in_data;
+
+   /*
+* Compression handles only the first 128kb of the input range
+* and then shifts over the range in a loop to compress the rest.
+* Let's do the same.
+*
+* MAX_SAMPLE_SIZE is calculated on the assumption that the heuristic
+* processes no more than BTRFS_MAX_UNCOMPRESSED per run
+*/
+   if (end - start > BTRFS_MAX_UNCOMPRESSED)
+   end = start + BTRFS_MAX_UNCOMPRESSED;
+
+   index = start >> PAGE_SHIFT;
+   index_end = end >> PAGE_SHIFT;
+
+   /* Don't miss unaligned end */
+   if (!IS_ALIGNED(end, PAGE_SIZE))
+   index_end++;
+
+   curr_sample_pos = 0;
+   while (index < index_end) {
+   page = find_get_page(inode->i_mapping, index);
+   in_data = kmap(page);
+   /* Handle case where start unaligned to PAGE_SIZE */
+   i = start % PAGE_SIZE;
+   while (i < PAGE_SIZE - SAMPLING_READ_SIZE) {
+   /* Don't sample mem trash from last page */
+   if (start > end - SAMPLING_READ_SIZE)
+   break;
+   memcpy(&ws->sample[curr_sample_pos],
+  &in_data[i], SAMPLING_READ_SIZE);
+   i += SAMPLING_INTERVAL;
+   start += SAMPLING_INTERVAL;
+   curr_sample_pos += SAMPLING_READ_SIZE;
+   }
+   kunmap(page);
+   put_page(page);
+
+   index++;
+   }
+
+   ws->sample_size = curr_sample_pos;
+}
+
 /*
  * Compression heuristic.
  *
@@ -1219,19 +1270,19 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
 {
struct list_head *ws_list = __find_workspace(0, true);
struct heuristic_ws *ws;
-   u64 index = start >> PAGE_SHIFT;
-   u64 end_index = end >> PAGE_SHIFT;
-   struct page *page;
+   u32 i;
+   u8 byte;
int ret = 1;

ws = list_entry(ws_list, struct heuristic_ws, list);

-   while (index <= end_index) {
-   page = find_get_page(inode->i_mapping, index);
-   kmap(page);
-   kunmap(page);
-   put_page(page);
-   index++;
+   heuristic_collect_sample(inode, start, end, ws);
+
+   memset(ws->bucket, 0, sizeof(*ws->bucket)*BUCKET_SIZE);
+
+   for (i = 0; i < ws->sample_size; i++) {
+   byte = ws->sample[i];
+   ws->bucket[byte].count++;
}

__free_workspace(0, ws_list, true);
--
2.14.2
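
The sampling step can be tried outside the kernel on a flat buffer. The following is a simplified sketch, not the patch's page-cache walk: there is no page mapping and no handling of an unaligned start, and collect_sample() and count_bytes() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLING_READ_SIZE	16
#define SAMPLING_INTERVAL	256
#define BUCKET_SIZE		256

/*
 * Copy SAMPLING_READ_SIZE bytes every SAMPLING_INTERVAL bytes from in[]
 * to sample[] and return the number of bytes copied. sample_cap is the
 * capacity of sample[]; copying stops before it would overflow.
 */
static size_t collect_sample(const uint8_t *in, size_t len,
			     uint8_t *sample, size_t sample_cap)
{
	size_t pos, copied = 0;

	for (pos = 0; pos + SAMPLING_READ_SIZE <= len; pos += SAMPLING_INTERVAL) {
		if (copied + SAMPLING_READ_SIZE > sample_cap)
			break;
		memcpy(&sample[copied], &in[pos], SAMPLING_READ_SIZE);
		copied += SAMPLING_READ_SIZE;
	}
	return copied;
}

/* Count how often each byte value occurs in the sample. */
static void count_bytes(const uint8_t *sample, size_t n, uint32_t *bucket)
{
	assert(bucket != NULL);

	memset(bucket, 0, BUCKET_SIZE * sizeof(*bucket));
	for (size_t i = 0; i < n; i++)
		bucket[sample[i]]++;
}
```

A 4096-byte input yields 16 sampling positions and therefore a 256-byte sample.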


[PATCH v8 4/6] Btrfs: heuristic add detection of repeated data patterns

2017-09-28 Thread Timofey Titovets
Walk over data sample and use memcmp to detect
repeated data (like zeroed)

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e2419639ae7f..1cb4df023d5e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -1200,6 +1200,16 @@ int btrfs_decompress_buf2page(const char *buf, unsigned 
long buf_start,
return 1;
 }

+
+static bool sample_repeated_patterns(struct heuristic_ws *ws)
+{
+   u32 half_of_sample = ws->sample_size / 2;
+   u8 *p = ws->sample;
+
+   return !memcmp(&p[0], &p[half_of_sample], half_of_sample);
+}
+
+
 static void heuristic_collect_sample(struct inode *inode, u64 start, u64 end,
 struct heuristic_ws *ws)
 {
@@ -1278,6 +1288,11 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)

heuristic_collect_sample(inode, start, end, ws);

+   if (sample_repeated_patterns(ws)) {
+   ret = 1;
+   goto out;
+   }
+
memset(ws->bucket, 0, sizeof(*ws->bucket)*BUCKET_SIZE);

for (i = 0; i < ws->sample_size; i++) {
@@ -1285,7 +1300,7 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
ws->bucket[byte].count++;
}

+out:
__free_workspace(0, ws_list, true);
-
return ret;
 }
--
2.14.2
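
The repeated-pattern test compares the first half of the sample with the second half. A minimal userspace sketch of the same idea follows; sample_repeats() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return true when the first half of the sample is byte-for-byte
 * identical to the second half. A sample shorter than two bytes
 * compares equal trivially, because zero bytes are compared.
 */
static bool sample_repeats(const uint8_t *sample, size_t len)
{
	size_t half = len / 2;

	assert(sample != NULL);

	return memcmp(sample, sample + half, half) == 0;
}
```

Zero-filled data and any pattern whose period divides half the sample length pass the check; a sample whose halves differ in even one byte does not.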


[PATCH v8 5/6] Btrfs: heuristic add byte set calculation

2017-09-28 Thread Timofey Titovets
Calculate the byte set size for the data sample:
Calculate how many unique byte values occur in the sample
by counting all bytes in the bucket with count > 0.
If the byte set is small (~25%), the data is easily compressible.
Otherwise additional analysis is needed.

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 48 
 1 file changed, 48 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 1cb4df023d5e..83daf2f9d051 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -1201,6 +1201,48 @@ int btrfs_decompress_buf2page(const char *buf, unsigned 
long buf_start,
 }


+/*
+ * Count byte types in bucket
+ * That heuristic can detect text like data (configs, xml, json, html & etc)
+ * Because in most text-like data the byte set is restricted to a limited
+ * number of possible characters, and that restriction in most cases
+ * makes the data easily compressible.
+ *
+ * @BYTE_SET_THRESHOLD - assume that all data with a byte set size:
+ * less - compressible
+ * more - need additional analysis
+ */
+
+#define BYTE_SET_THRESHOLD 64
+
+static u32 byte_set_size(const struct heuristic_ws *ws)
+{
+   u32 i;
+   u32 byte_set_size = 0;
+
+   for (i = 0; i < BYTE_SET_THRESHOLD; i++) {
+   if (ws->bucket[i].count > 0)
+   byte_set_size++;
+   }
+
+   /*
+* Continue counting byte types in the bucket.
+* If the byte set size is bigger than the threshold,
+* it is useless to continue, because for that data type
+* the detection technique fails
+*/
+   for (; i < BUCKET_SIZE; i++) {
+   if (ws->bucket[i].count > 0) {
+   byte_set_size++;
+   if (byte_set_size > BYTE_SET_THRESHOLD)
+   return byte_set_size;
+   }
+   }
+
+   return byte_set_size;
+}
+
+
 static bool sample_repeated_patterns(struct heuristic_ws *ws)
 {
u32 half_of_sample = ws->sample_size / 2;
@@ -1300,6 +1342,12 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
ws->bucket[byte].count++;
}

+   i = byte_set_size(ws);
+   if (i < BYTE_SET_THRESHOLD) {
+   ret = 2;
+   goto out;
+   }
+
 out:
__free_workspace(0, ws_list, true);
return ret;
--
2.14.2
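
The byte-set check counts how many distinct byte values occur in the sample. Below is a small userspace sketch of the idea; unlike the patch it has no early exit once the threshold is exceeded, and looks_like_text() is a hypothetical wrapper added for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_SIZE		256
#define BYTE_SET_THRESHOLD	64

/* Count the distinct byte values that occur in the sample. */
static unsigned int byte_set_size(const uint8_t *sample, size_t len)
{
	uint32_t bucket[BUCKET_SIZE] = { 0 };
	unsigned int set_size = 0;

	assert(sample != NULL || len == 0);

	for (size_t i = 0; i < len; i++)
		bucket[sample[i]]++;

	for (unsigned int i = 0; i < BUCKET_SIZE; i++) {
		if (bucket[i] > 0)
			set_size++;
	}
	return set_size;
}

/* Hypothetical wrapper: a small byte set suggests text-like data. */
static int looks_like_text(const uint8_t *sample, size_t len)
{
	return byte_set_size(sample, len) < BYTE_SET_THRESHOLD;
}
```

The string "hello world" uses 8 distinct byte values and falls well below the threshold, while a buffer containing every byte value does not.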


[PATCH v8 1/6] Btrfs: compression.c separated heuristic/compression workspaces

2017-09-28 Thread Timofey Titovets
The compression heuristic itself is not a compression type.
As the current infrastructure is designed to provide workspaces
for several compression types, it's difficult to just add
a heuristic workspace.

Refactor the code to support compression/heuristic
workspaces with maximum code sharing and minimum changes to it.

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 138 ++---
 1 file changed, 120 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index b51d23f5cafa..c3624e8e3919 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -690,7 +690,33 @@ blk_status_t btrfs_submit_compressed_read(struct inode 
*inode, struct bio *bio,
return ret;
 }

-static struct {
+
+struct heuristic_ws {
+   struct list_head list;
+};
+
+static void free_heuristic_ws(struct list_head *ws)
+{
+   struct heuristic_ws *workspace;
+
+   workspace = list_entry(ws, struct heuristic_ws, list);
+
+   kfree(workspace);
+}
+
+static struct list_head *alloc_heuristic_ws(void){
+   struct heuristic_ws *ws;
+
+   ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+   if (!ws)
+   return ERR_PTR(-ENOMEM);
+
+   INIT_LIST_HEAD(&ws->list);
+
+   return &ws->list;
+}
+
+struct workspaces_list {
struct list_head idle_ws;
spinlock_t ws_lock;
/* Number of free workspaces */
@@ -699,7 +725,11 @@ static struct {
atomic_t total_ws;
/* Waiters for a free workspace */
wait_queue_head_t ws_wait;
-} btrfs_comp_ws[BTRFS_COMPRESS_TYPES];
+};
+
+static struct workspaces_list btrfs_comp_ws[BTRFS_COMPRESS_TYPES];
+
+static struct workspaces_list btrfs_heuristic_ws;

 static const struct btrfs_compress_op * const btrfs_compress_op[] = {
&btrfs_zlib_compress,
@@ -709,11 +739,24 @@ static const struct btrfs_compress_op * const 
btrfs_compress_op[] = {

 void __init btrfs_init_compress(void)
 {
+   struct list_head *workspace;
int i;

-   for (i = 0; i < BTRFS_COMPRESS_TYPES; i++) {
-   struct list_head *workspace;
+   INIT_LIST_HEAD(&btrfs_heuristic_ws.idle_ws);
+   spin_lock_init(&btrfs_heuristic_ws.ws_lock);
+   atomic_set(&btrfs_heuristic_ws.total_ws, 0);
+   init_waitqueue_head(&btrfs_heuristic_ws.ws_wait);

+   workspace = alloc_heuristic_ws();
+   if (IS_ERR(workspace)) {
+   pr_warn("BTRFS: cannot preallocate heuristic workspace, will 
try later\n");
+   } else {
+   atomic_set(&btrfs_heuristic_ws.total_ws, 1);
+   btrfs_heuristic_ws.free_ws = 1;
+   list_add(workspace, &btrfs_heuristic_ws.idle_ws);
+   }
+
+   for (i = 0; i < BTRFS_COMPRESS_TYPES; i++) {
INIT_LIST_HEAD(&btrfs_comp_ws[i].idle_ws);
spin_lock_init(&btrfs_comp_ws[i].ws_lock);
atomic_set(&btrfs_comp_ws[i].total_ws, 0);
@@ -740,18 +783,33 @@ void __init btrfs_init_compress(void)
  * Preallocation makes a forward progress guarantees and we do not return
  * errors.
  */
-static struct list_head *find_workspace(int type)
+static struct list_head *__find_workspace(int type, bool heuristic)
 {
struct list_head *workspace;
int cpus = num_online_cpus();
int idx = type - 1;
unsigned nofs_flag;

-   struct list_head *idle_ws   = &btrfs_comp_ws[idx].idle_ws;
-   spinlock_t *ws_lock = &btrfs_comp_ws[idx].ws_lock;
-   atomic_t *total_ws  = &btrfs_comp_ws[idx].total_ws;
-   wait_queue_head_t *ws_wait  = &btrfs_comp_ws[idx].ws_wait;
-   int *free_ws= &btrfs_comp_ws[idx].free_ws;
+   struct list_head *idle_ws;
+   spinlock_t *ws_lock;
+   atomic_t *total_ws;
+   wait_queue_head_t *ws_wait;
+   int *free_ws;
+
+   if (!heuristic) {
+   idle_ws = &btrfs_comp_ws[idx].idle_ws;
+   ws_lock = &btrfs_comp_ws[idx].ws_lock;
+   total_ws= &btrfs_comp_ws[idx].total_ws;
+   ws_wait = &btrfs_comp_ws[idx].ws_wait;
+   free_ws = &btrfs_comp_ws[idx].free_ws;
+   } else {
+   idle_ws = &btrfs_heuristic_ws.idle_ws;
+   ws_lock = &btrfs_heuristic_ws.ws_lock;
+   total_ws= &btrfs_heuristic_ws.total_ws;
+   ws_wait = &btrfs_heuristic_ws.ws_wait;
+   free_ws = &btrfs_heuristic_ws.free_ws;
+   }
+
 again:
spin_lock(ws_lock);
if (!list_empty(idle_ws)) {
@@ -781,7 +839,10 @@ static struct list_head *find_workspace(int type)
 * context of btrfs_compress_bio/btrfs_compress_pages
 */
nofs_flag = memalloc_nofs_save();
-   workspace = btrfs_compress_op[idx]->alloc_workspace();
+   if (!heuristic)
+   workspace = btrfs_compress_op[idx]->alloc_workspace();
+   else
+   w

[PATCH] btrfs: Use DIV_ROUND_UP rather than open-coding it

2017-09-28 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov 
---
 fs/btrfs/extent-tree.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e2d7e86b51d1..9e67616892cd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2896,9 +2896,8 @@ u64 btrfs_csum_bytes_to_leaves(struct btrfs_fs_info 
*fs_info, u64 csum_bytes)
num_csums_per_leaf = div64_u64(csum_size,
(u64)btrfs_super_csum_size(fs_info->super_copy));
num_csums = div64_u64(csum_bytes, fs_info->sectorsize);
-   num_csums += num_csums_per_leaf - 1;
-   num_csums = div64_u64(num_csums, num_csums_per_leaf);
-   return num_csums;
+
+   return DIV_ROUND_UP(num_csums, num_csums_per_leaf);
 }
 
 int btrfs_check_space_for_delayed_refs(struct btrfs_trans_handle *trans,
-- 
2.7.4

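
The equivalence the patch relies on can be checked in userspace. This sketch copies the usual round-up division idiom into a local macro and compares it with the removed open-coded sequence; leaves_open_coded() and leaves_round_up() are names invented for the comparison, and plain division stands in for div64_u64().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace copy of the usual round-up division idiom. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* The sequence removed by the patch: add (divisor - 1), then divide. */
static uint64_t leaves_open_coded(uint64_t num_csums, uint64_t per_leaf)
{
	num_csums += per_leaf - 1;
	return num_csums / per_leaf;
}

/* The replacement: the same arithmetic expressed through the macro. */
static uint64_t leaves_round_up(uint64_t num_csums, uint64_t per_leaf)
{
	assert(per_leaf > 0);
	return DIV_ROUND_UP(num_csums, per_leaf);
}
```

Both forms round the quotient up, so 8 checksums at 7 per leaf need 2 leaves and 0 checksums need none.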


Re : [Xen-devel] Re : task btrfs-transacti:651 blocked for more than 120 seconds

2017-09-28 Thread Olivier Bonvalet
On Thursday 28 September 2017 at 16:28 +0200, Olivier Bonvalet wrote:
> [ 3263.452023] INFO: task systemd:1 blocked for more than 120
> seconds.
> [ 3263.452040]   Tainted: GW   4.9-dae-xen #2
> [ 3263.452044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 3263.452052] systemd D0 1  0 0x
> [ 3263.452060]  8803a71ca000  8803af857880
> 8803a9762dc0
> [ 3263.452070]  8803a96fcc80 c9001623f990 8150ff1f
> 
> [ 3263.452079]  8803a96fcc80 7fff 81510710
> c9001623faa0
> [ 3263.452087] Call Trace:
> [ 3263.452099]  [] ? __schedule+0x17f/0x530
> [ 3263.452105]  [] ? bit_wait+0x50/0x50
> [ 3263.452110]  [] ? schedule+0x2d/0x80
> [ 3263.452116]  [] ? schedule_timeout+0x17e/0x2a0
> [ 3263.452121]  [] ?
> xen_clocksource_get_cycles+0x11/0x20
> [ 3263.452126]  [] ? ktime_get+0x36/0xa0
> [ 3263.452130]  [] ? bit_wait+0x50/0x50
> [ 3263.452134]  [] ? io_schedule_timeout+0x98/0x100
> [ 3263.452137]  [] ?
> _raw_spin_unlock_irqrestore+0x11/0x20
> [ 3263.452141]  [] ? bit_wait_io+0x12/0x60
> [ 3263.452145]  [] ? __wait_on_bit+0x4e/0x80
> [ 3263.452149]  [] ? bit_wait+0x50/0x50
> [ 3263.452153]  [] ?
> out_of_line_wait_on_bit+0x69/0x80
> [ 3263.452157]  [] ?
> autoremove_wake_function+0x30/0x30
> [ 3263.452163]  [] ? ext4_find_entry+0x350/0x5d0
> [ 3263.452168]  [] ? d_alloc_parallel+0xa0/0x480
> [ 3263.452172]  [] ? __d_lookup_done+0x68/0xd0
> [ 3263.452175]  [] ? d_splice_alias+0x158/0x3b0
> [ 3263.452179]  [] ? ext4_lookup+0x42/0x1f0
> [ 3263.452184]  [] ? lookup_slow+0x8e/0x130
> [ 3263.452187]  [] ? walk_component+0x1ca/0x300
> [ 3263.452193]  [] ? link_path_walk+0x18e/0x570
> [ 3263.452199]  [] ? path_init+0x1c3/0x320
> [ 3263.452207]  [] ? path_openat+0xe2/0x1380
> [ 3263.452214]  [] ? do_filp_open+0x79/0xd0
> [ 3263.45]  [] ? kmem_cache_alloc+0x71/0x400
> [ 3263.452228]  [] ? __check_object_size+0xf7/0x1c4
> [ 3263.452235]  [] ? do_sys_open+0x11f/0x1f0
> [ 3263.452238]  [] ?
> entry_SYSCALL_64_fastpath+0x1a/0xa9

Just in case, another example:

[ 1088.476044] INFO: task jbd2/xvdb-8:494 blocked for more than 120 seconds.
[ 1088.476058]   Tainted: GW   4.9-dae-xen #2
[ 1088.476061] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1088.476066] jbd2/xvdb-8 D0   494  2 0x
[ 1088.476072]  8800fd036480  8803af8d7880 
8803a8c6e580
[ 1088.476079]  88038756d280 c9001737fb90 8150ff1f 
1001
[ 1088.476085]  88038756d280 7fff 81510710 
c9001737fc98
[ 1088.476091] Call Trace:
[ 1088.476102]  [] ? __schedule+0x17f/0x530
[ 1088.476107]  [] ? bit_wait+0x50/0x50
[ 1088.476114]  [] ? schedule+0x2d/0x80
[ 1088.476117]  [] ? schedule_timeout+0x17e/0x2a0
[ 1088.476123]  [] ? xen_clocksource_get_cycles+0x11/0x20
[ 1088.476126]  [] ? xen_clocksource_get_cycles+0x11/0x20
[ 1088.476132]  [] ? ktime_get+0x36/0xa0
[ 1088.476136]  [] ? bit_wait+0x50/0x50
[ 1088.476139]  [] ? io_schedule_timeout+0x98/0x100
[ 1088.476143]  [] ? _raw_spin_unlock_irqrestore+0x11/0x20
[ 1088.476147]  [] ? bit_wait_io+0x12/0x60
[ 1088.476151]  [] ? __wait_on_bit+0x4e/0x80
[ 1088.476155]  [] ? bit_wait+0x50/0x50
[ 1088.476159]  [] ? out_of_line_wait_on_bit+0x69/0x80
[ 1088.476163]  [] ? autoremove_wake_function+0x30/0x30
[ 1088.476170]  [] ? 
jbd2_journal_commit_transaction+0xe7e/0x1610
[ 1088.476177]  [] ? lock_timer_base+0x76/0x90
[ 1088.476182]  [] ? kjournald2+0xad/0x230
[ 1088.476189]  [] ? wake_atomic_t_function+0x50/0x50
[ 1088.476193]  [] ? commit_timeout+0x10/0x10
[ 1088.476197]  [] ? do_group_exit+0x35/0xa0
[ 1088.476201]  [] ? kthread+0xc2/0xe0
[ 1088.476205]  [] ? kthread_create_on_node+0x40/0x40
[ 1088.476209]  [] ? ret_from_fork+0x25/0x30



and also from the Dom0 (retyped from a screenshot):

watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [kworker/11:0:26273]
Modules linked in: ...
CPU: 11 PID: 26273 Comm: kworker/11:0 Tainted: G D W L 4.13-dae-dom0 #2
Hardware name: Intel Corporation S2600CWR/S2600CWR, BIOS 
SE5C610.86B.01.01.0019.101220160604 10/12/2016
Workqueue: events wait_rcu_exp_gp
task: ... task.stack: ...
RIP: e030:smp_call_function_single+0x6b/0xc0
...
Call Trace:
 ? sync_rcu_exp_select_cpus+0x2b5/0x410
 ? rcu_barrier_func+0x40/0x40
 ? wait_rcu_exp_gp+0x16/0x30
 ? process_one_work+0x1ad/0x340
 ? worker_thread+0x45/0x3f0
 ? kthread+0xf2/0x130
 ? process_one_work+0x340/0x340
 ? kthread_create_on_node+0x40/0x40
 ? do_group_exit+0x35/0xa0
 ? ret_from_fork+0x25/0x30
...





[PATCH] btrfs-progs: fix invalid assert in backref.c

2017-09-28 Thread Josef Bacik
This should be verifying that we have an empty key, not that we have a
filled-out key.

Signed-off-by: Josef Bacik 
---
Dave this is on top of your ext/jeffm/extent-cache branch and fixes the segfault
you reported.

 backref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backref.c b/backref.c
index 8fc0fae779f2..8615f6b8677a 100644
--- a/backref.c
+++ b/backref.c
@@ -465,7 +465,7 @@ static int __add_missing_keys(struct btrfs_fs_info *fs_info,
 
ASSERT(ref->root_id);
ASSERT(!ref->parent);
-   ASSERT(ref->key_for_search.type);
+   ASSERT(!ref->key_for_search.type);
BUG_ON(!ref->wanted_disk_byte);
eb = read_tree_block(fs_info, ref->wanted_disk_byte, 0);
if (!extent_buffer_uptodate(eb)) {
-- 
2.7.4



Mount error on 32 bits, ok on 64 bits

2017-09-28 Thread Jean-Denis Girard
Hi list,

I have an Alix motherboard with an SD card using btrfs that has been running
fine since 2010. Today, I wanted to upgrade from kernel 4.9.52 to 4.13.4
(i586). As always, I cross-compiled on my main system and installed on the
Alix, but boot failed while trying to mount root: "BTRFS critical (device
sda1): unable to find logical 1740640256 length 4096".

Rebooting to 4.9.52 on the Alix works just fine. Btrfs scrub returns no
error.

Then I took out the SD card, plugged it into my Intel desktop (running
kernel 4.13.3-x86_64), and ran a check: no errors were reported. I
was also able to mount the file system on that other system without
trouble. I mounted with clear_cache, just in case, but it still would
not mount on the Alix system with kernel 4.13.

One obvious difference between the two systems is 32 vs. 64 bits. Is this a
known problem? How can I help to fix it? My Alix kernel config is attached.

Check and mount on the Intel system:

[jdg@tiare linux]$ uname -a
Linux tiare.sysnux.pf 4.13.3-snx #2 SMP Wed Sep 20 09:09:58 -10 2017
x86_64 x86_64 x86_64 GNU/Linux

[jdg@tiare ~]$ sudo btrfs check /dev/sdd1
Checking filesystem on /dev/sdd1
UUID: 2e9cb3a5-f719-4c0a-a1f6-eef1dd9f84a8
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 598966272 bytes used, no error found
total csum bytes: 545176
total tree bytes: 40312832
total fs tree bytes: 36491264
total extent tree bytes: 2760704
btree space waste bytes: 10016852
file data blocks allocated: 558653440
referenced 987086848

 BTRFS: device fsid 2e9cb3a5-f719-4c0a-a1f6-eef1dd9f84a8 devid 1 transid
1982285 /dev/sdd1
 BTRFS info (device sdd1): force clearing of disk cache
 BTRFS info (device sdd1): disk space caching is enabled
 BTRFS info (device sdd1): has skinny extents
 BTRFS info (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0,
corrupt 4, gen 0
 BTRFS info (device sdd1): checking UUID tree


Info from the running Alix system:

jdg@gw:~$ uname -a
Linux gw.sysnux.pf 4.9.52-sysnux-ix100 #1 Thu Sep 28 12:49:53 -10 2017
i586 GNU/Linux

jdg@gw:~$ sudo btrfs scrub start -B /
scrub done for 2e9cb3a5-f719-4c0a-a1f6-eef1dd9f84a8
scrub started at Thu Sep 28 14:48:17 2017 and finished after
00:00:30
total bytes scrubbed: 570.76MiB with 0 errors
jdg@gw:~$ sudo btrfs dev stats /
[/dev/sda1].write_io_errs0
[/dev/sda1].read_io_errs 0
[/dev/sda1].flush_io_errs0
[/dev/sda1].corruption_errs  4
[/dev/sda1].generation_errs  0


Boot failure on Alix with 4.13.4:

 Btrfs loaded, crc32c=crc32c-generic
 ata1.00: CFA: SanDisk SDCFH-008G, HDX 6.03, max UDMA/66
 ata1.00: 15625216 sectors, multi 0: LBA48  
 ata1.00: limited to UDMA/33 due to 40-wire cable
 ata1.00: configured for UDMA/33
 scsi 0:0:0:0: Direct-Access ATA  SanDisk SDCFH-00 6.03 PQ: 0
ANSI: 5
 sd 0:0:0:0: [sda] 15625216 512-byte logical blocks: (8.00 GB/7.45 GiB)
 sd 0:0:0:0: [sda] Write Protect is off
 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
  sda: sda1
 sd 0:0:0:0: [sda] Attached SCSI removable disk
 BTRFS: device fsid 2e9cb3a5-f719-4c0a-a1f6-eef1dd9f84a8 devid 1 transid
1982236 /dev/root
 BTRFS info (device sda1): disk space caching is enabled
 BTRFS info (device sda1): has skinny extents
 BTRFS critical (device sda1): unable to find logical 1740640256 length
4096
 BTRFS error (device sda1): failed to read chunk root
 BTRFS error (device sda1): open_ctree failed
 List of all partitions:
 0800 7812608 sda  
  driver: sd
   0801 2097152 sda1 -01
  
 No filesystem could mount root, tried:  
  btrfs
  
 Kernel panic - not syncing: VFS: Unable to mount root fs on
unknown-block(8,1)


Thanks,
-- 
Jean-Denis Girard

SysNux   Systèmes   Linux   en   Polynésie  française
https://www.sysnux.pf/   Tél: +689 40.50.10.40 / GSM: +689 87.797.527
#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 4.13.4 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_BITS_MAX=16
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_D

[PATCH v2 1/5] btrfs-progs: Move leaf and node validation checker to tree-checker.c

2017-09-28 Thread Qu Wenruo
There is no doubt that the comprehensive tree block checker will grow larger
and larger, so moving it into its own file is quite reasonable.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/ctree.h|   4 +
 fs/btrfs/disk-io.c  | 284 +---
 fs/btrfs/tree-checker.c | 309 
 4 files changed, 317 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 962a95aefb81..88255e133ade 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-  uuid-tree.o props.o hash.o free-space-tree.o
+  uuid-tree.o props.o hash.o free-space-tree.o tree-checker.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ea9c5648ff70..6b7c6fcbc5d5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3732,4 +3732,8 @@ static inline int btrfs_is_testing(struct btrfs_fs_info 
*fs_info)
 #endif
return 0;
 }
+
+/* Tree block validation checker */
+int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf);
+int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node);
 #endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c8633f2abdf1..57a9055655d3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -543,284 +543,6 @@ static int check_tree_block_fsid(struct btrfs_fs_info 
*fs_info,
return ret;
 }
 
-#define CORRUPT(reason, eb, root, slot)
\
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
-static int check_extent_data_item(struct btrfs_root *root,
- struct extent_buffer *leaf,
- struct btrfs_key *key, int slot)
-{
-   struct btrfs_file_extent_item *fi;
-   u32 sectorsize = root->fs_info->sectorsize;
-   u32 item_size = btrfs_item_size_nr(leaf, slot);
-
-   if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
-
-   if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /*
-* Support for new compression/encrption must introduce incompat flag,
-* and must be caught in open_ctree().
-*/
-   if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
-   /* Inline extent must have 0 as key offset */
-   if (key->offset) {
-   CORRUPT("inline extent has non-zero key offset",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /* Compressed inline extent has no on-disk size, skip it */
-   if (btrfs_file_extent_compression(leaf, fi) !=
-   BTRFS_COMPRESS_NONE)
-   return 0;
-
-   /* Uncompressed inline extent size must match item size */
-   if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
-   btrfs_file_extent_ram_bytes(leaf, fi)) {
-   CORRUPT("plaintext inline extent has invalid size",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   return 0;
-   }
-
-   /* Regular or preallocated extent has fixed item size */
-   if (item_size != sizeof(*fi)) {
-   CORRUPT(
-   "regluar or preallocated extent data item size is invalid",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
-   

[PATCH v2 2/5] btrfs: tree-checker: Enhance btrfs_check_node output

2017-09-28 Thread Qu Wenruo
Use an inline function to replace the macro, since we don't need
stringification.
(The macro still exists until all callers are updated.)

Also add more info about the error.

For an nr_items error, report whether it's too large or too small, and output
the valid value range.

For the block pointer, add a new alignment check.

For key order, also output the next key to make the problem more
obvious.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 65 ++---
 1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 301243a69dea..a51f2503acc4 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -37,6 +37,48 @@
   btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
   reason, btrfs_header_bytenr(eb), root->objectid, slot)
 
+/*
+ * Error message should follow the format below:
+ * corrupt <type>: <identifier>, <reason>[, <bad_value>]
+ *
+ * @type:       Either leaf or node
+ * @identifier: The necessary info to locate the leaf/node.
+ *              It's recommended to decode key.objectid/offset if it's
+ *              meaningful.
+ * @reason:     What's wrong
+ * @bad_value:  Optional, it's recommended to output bad value and its
+ *              expected value (range).
+ *
+ * Since comma is used to separate the components, only SPACE is allowed
+ * inside each component.
+ */
+
+/*
+ * Append the generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to
+ * @fmt.
+ * Allowing user to customize their output.
+ */
+__printf(4, 5)
+static void generic_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct va_format vaf;
+   va_list args;
+
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   &vaf);
+   va_end(args);
+}
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -282,8 +324,10 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 
if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
btrfs_crit(root->fs_info,
-  "corrupt node: block %llu root %llu nritems %lu",
-  node->start, root->objectid, nr);
+   "corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%u]",
+  root->objectid, node->start,
+  nr == 0 ? "small" : "large", nr,
+  BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
return -EIO;
}
 
@@ -293,13 +337,26 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
btrfs_node_key_to_cpu(node, &next_key, slot + 1);
 
if (!bytenr) {
-   CORRUPT("invalid item slot", node, root, slot);
+   generic_err(root, node, slot,
+   "invalid node pointer, have %llu shouldn't be 0",
+   bytenr);
ret = -EIO;
goto out;
}
+   if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
+   generic_err(root, node, slot,
+   "unaligned pointer, have %llu should be aligned to %u",
+   bytenr, root->fs_info->sectorsize);
+   ret = -EUCLEAN;
+   goto out;
+   }
 
if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
-   CORRUPT("bad key order", node, root, slot);
+   generic_err(root, node, slot,
+   "bad key order, current key (%llu %u %llu) next key (%llu %u %llu)",
+   key.objectid, key.type, key.offset,
+   next_key.objectid, next_key.type,
+   next_key.offset);
ret = -EIO;
goto out;
}
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
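
The reporter pattern in generic_err() above (fixed prefix first, then the caller's own format string) can be tried outside the kernel. The following is only a minimal userspace sketch, not the patch's code: the kernel hands the va_list to printk through struct va_format and %pV, which plain C does not have, so vsnprintf() stands in for it here. struct block_id and format_generic_err() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fields the real helper reads from
 * struct btrfs_root and struct extent_buffer.
 */
struct block_id {
	unsigned long long root;	/* root objectid */
	unsigned long long bytenr;	/* logical address of the tree block */
	int level;			/* 0 means leaf, anything else node */
};

/*
 * Write the common "corrupt leaf/node ..." prefix into @buf, then append
 * the caller's formatted message.  Returns the resulting string length.
 * The format attribute lets the compiler check @fmt against its arguments,
 * which is what __printf(4, 5) does for generic_err() in the patch.
 */
__attribute__((format(printf, 5, 6)))
static size_t format_generic_err(char *buf, size_t len,
				 const struct block_id *id, int slot,
				 const char *fmt, ...)
{
	va_list args;
	int used;

	assert(buf != NULL && len > 0);

	used = snprintf(buf, len, "corrupt %s: root=%llu block=%llu slot=%d, ",
			id->level == 0 ? "leaf" : "node",
			id->root, id->bytenr, slot);
	if (used < 0) {
		buf[0] = '\0';
		return 0;
	}
	/* Only append the caller's part if the prefix left room for it. */
	if ((size_t)used < len) {
		va_start(args, fmt);
		vsnprintf(buf + used, len - (size_t)used, fmt, args);
		va_end(args);
	}
	return strlen(buf);
}
```

Calling it with level 0, root 1, block 29360128, slot 28 and the format "discontinious item end, have %u expect %u" should give the same text as the dmesg line quoted in the cover letter, minus the "BTRFS critical (device ...)" part that btrfs_crit() adds.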


[PATCH v2 5/5] btrfs: tree-checker: Enhance output for check_extent_data_item

2017-09-28 Thread Qu Wenruo
Output the invalid member name and its bad value, along with its
expected value range or alignment.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 98 +++--
 1 file changed, 70 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index a5b743763362..9aeb5a288e2b 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -31,12 +31,6 @@
 #include "disk-io.h"
 #include "compression.h"
 
-#define CORRUPT(reason, eb, root, slot) \
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
 /*
  * Error message should follow the format below:
  * corrupt <type>: <identifier>, <reason>[, <bad_value>]
@@ -79,6 +73,47 @@ static void generic_err(const struct btrfs_root *root,
va_end(args);
 }
 
+/*
+ * Customized reporter for extent data item, since its key objectid and
+ * offset has its own meaning.
+ */
+__printf(4, 5)
+static void file_extent_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct btrfs_key key;
+   struct va_format vaf;
+   va_list args;
+
+   btrfs_item_key_to_cpu(eb, &key, slot);
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d ino=%llu file_offset=%llu, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   key.objectid, key.offset, &vaf);
+   va_end(args);
+}
+
+/*
+ * Return 0 if the btrfs_file_extent_##name is aligned to @align
+ * Else return 1
+ */
+#define CHECK_FI_ALIGN(root, leaf, slot, fi, name, align)  \
+({ \
+   if (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align)) \
+   file_extent_err(root, leaf, slot,   \
+   "invalid %s for file extent, have %llu, should be aligned to %u",\
+   #name, btrfs_file_extent_##name(leaf, fi),  \
+   align); \
+   (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align));   \
+})
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -88,15 +123,19 @@ static int check_extent_data_item(struct btrfs_root *root,
u32 item_size = btrfs_item_size_nr(leaf, slot);
 
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "unaligned file_offset for file extent, have %llu should be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
 
fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
 
if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid type for file extent, have %u expect range [0, %u]",
+   btrfs_file_extent_type(leaf, fi),
+   BTRFS_FILE_EXTENT_TYPES);
return -EUCLEAN;
}
 
@@ -105,18 +144,24 @@ static int check_extent_data_item(struct btrfs_root *root,
 * and must be caught in open_ctree().
 */
if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid compression for file extent, have %u expect range [0, %u]",
+   btrfs_file_extent_compression(leaf, fi),
+   BTRFS_COMPRESS_TYPES);
return -EUCLEAN;
}
if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid encryption for file extent, have %u expect 0",
+   btrfs_file_extent_encryption(leaf, fi));
return -EUCLEAN;
}
if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
/* Inline extent must have 0 as key offset */
if (key->offset) {
-   
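
CHECK_FI_ALIGN() above combines two preprocessor features: ## pastes the member name into an accessor call, and # turns the same token into a string for the message. Below is a rough userspace sketch of that idea under simplified assumptions. struct file_extent, the fe_*() accessors, check_member_align() and CHECK_FE_ALIGN() are all hypothetical names for this illustration, and unlike the patch the macro forwards to a function so the accessor is evaluated only once and no GNU statement expression is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified, hypothetical file extent with two members to check. */
struct file_extent {
	uint64_t disk_bytenr;
	uint64_t num_bytes;
};

static uint64_t fe_disk_bytenr(const struct file_extent *fi)
{
	return fi->disk_bytenr;
}

static uint64_t fe_num_bytes(const struct file_extent *fi)
{
	return fi->num_bytes;
}

/*
 * Return 0 if @value is a multiple of @align, else write a message that
 * names the member into @buf and return 1.
 */
static int check_member_align(char *buf, size_t len, const char *name,
			      uint64_t value, uint32_t align)
{
	assert(align != 0);
	if (value % align == 0) {
		buf[0] = '\0';
		return 0;
	}
	snprintf(buf, len,
		 "invalid %s for file extent, have %llu, should be aligned to %u",
		 name, (unsigned long long)value, align);
	return 1;
}

/*
 * fe_##name selects the accessor, #name supplies the member name as text.
 * @buf must be a real array here, since sizeof(buf) is taken.
 */
#define CHECK_FE_ALIGN(buf, fi, name, align) \
	check_member_align((buf), sizeof(buf), #name, fe_##name(fi), (align))
```

With disk_bytenr set to 755944791 (the value from the cover letter's example) and an alignment of 4096, CHECK_FE_ALIGN(msg, &fi, disk_bytenr, 4096) reports the member by name without the caller having to spell it twice.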

[PATCH v2 3/5] btrfs: tree-checker: Enhance output for btrfs_check_leaf

2017-09-28 Thread Qu Wenruo
Enhance the output to print:
1) The reason
2) The bad value, if the reason alone doesn't explain enough
3) The good value (or range)

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index a51f2503acc4..94027f4215e9 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -232,8 +232,9 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
eb = btrfs_root_node(check_root);
/* if leaf is the root, then it's fine */
if (leaf != eb) {
-   CORRUPT("non-root leaf's nritems is 0",
-   leaf, check_root, 0);
+   generic_err(check_root, leaf, 0,
+   "invalid nritems, have %u shouldn't be 0 for non-root leaf",
+   nritems);
free_extent_buffer(eb);
return -EUCLEAN;
}
@@ -264,7 +265,11 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 
/* Make sure the keys are in the right order */
if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) {
-   CORRUPT("bad key order", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "bad key order, prev key (%llu %u %llu) current key (%llu %u %llu)",
+   prev_key.objectid, prev_key.type,
+   prev_key.offset, key.objectid, key.type,
+   key.offset);
return -EUCLEAN;
}
 
@@ -279,7 +284,10 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
item_end_expected = btrfs_item_offset_nr(leaf,
 slot - 1);
if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
-   CORRUPT("slot offset bad", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "discontinious item end, have %u expect %u",
+   btrfs_item_end_nr(leaf, slot),
+   item_end_expected);
return -EUCLEAN;
}
 
@@ -290,14 +298,21 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 */
if (btrfs_item_end_nr(leaf, slot) >
BTRFS_LEAF_DATA_SIZE(fs_info)) {
-   CORRUPT("slot end outside of leaf", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot end outside of leaf, have %u expect range [0, %u]",
+   btrfs_item_end_nr(leaf, slot),
+   BTRFS_LEAF_DATA_SIZE(fs_info));
return -EUCLEAN;
}
 
/* Also check if the item pointer overlaps with btrfs item. */
if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
btrfs_item_ptr_offset(leaf, slot)) {
-   CORRUPT("slot overlap with its data", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot overlap with its data, item end %lu data start %lu",
+   btrfs_item_nr_offset(slot) +
+   sizeof(struct btrfs_item),
+   btrfs_item_ptr_offset(leaf, slot));
return -EUCLEAN;
}
 
-- 
2.14.2



[PATCH v2 0/5] Enhance tree block validation checker

2017-09-28 Thread Qu Wenruo
The patchset can be fetched from github:
https://github.com/adam900710/linux/tree/checker_enhance

It's based on David's misc-next branch, with the following commit as base:
a5e50b4b444c ("btrfs: Add checker for EXTENT_CSUM")

Following David's suggestion, enhance the output format of the tree block
validation checker.

And move the checkers into a separate file: tree-checker.c.

Also add an output format rule so that all output messages
follow the same format.

Some example output using btrfsck fsck-test images looks like:

For an unaligned file extent member:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=7 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 755944791, should be aligned to 4096
---

For bad leaf holes:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=28, discontinious item end, have 9387 expect 15018
---

Changelog:
v2:
  Unify the error string format, so it should be easier to grep them
  from dmesg. Thanks Nikolay for pointing this out.
  Remove unused CORRUPT() macro.
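
To illustrate why a unified format helps with grepping, here is a small sketch that pulls the common fields out of one such dmesg line using sscanf(). It is only an illustration of the format described above: struct corrupt_prefix and parse_corrupt_prefix() are hypothetical names, and the parser assumes the "corrupt <type>: root=... block=... slot=..." layout of the two example lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields shared by the messages that follow the unified format. */
struct corrupt_prefix {
	char type[8];			/* "leaf" or "node" */
	unsigned long long root;
	unsigned long long block;
	int slot;
};

/*
 * Find "corrupt " in @line and decode type, root, block and slot.
 * Returns 0 on success, -1 if the line does not carry all four fields.
 */
static int parse_corrupt_prefix(const char *line, struct corrupt_prefix *out)
{
	const char *p;

	assert(line != NULL && out != NULL);

	p = strstr(line, "corrupt ");
	if (p == NULL)
		return -1;
	if (sscanf(p, "corrupt %7[a-z]: root=%llu block=%llu slot=%d",
		   out->type, &out->root, &out->block, &out->slot) != 4)
		return -1;
	return 0;
}
```

Messages that carry no slot field, such as the nritems report in btrfs_check_node, do not match this pattern and make the sketch return -1.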

Qu Wenruo (5):
  btrfs-progs: Move leaf and node validation checker to tree-checker.c
  btrfs: tree-checker: Enhance btrfs_check_node output
  btrfs: tree-checker: Enhance output for btrfs_check_leaf
  btrfs: tree-checker: Enhance output for check_csum_item
  btrfs: tree-checker: Enhance output for check_extent_data_item

 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/ctree.h|   4 +
 fs/btrfs/disk-io.c  | 284 +---
 fs/btrfs/tree-checker.c | 429 
 4 files changed, 437 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c

-- 
2.14.2



[PATCH v2 4/5] btrfs: tree-checker: Enhance output for check_csum_item

2017-09-28 Thread Qu Wenruo
Output the bad value and expected good value (or its alignment).

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 94027f4215e9..a5b743763362 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -163,15 +163,21 @@ static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
 
if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
-   CORRUPT("invalid objectid for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "invalid key objectid for csum item, have %llu expect %llu",
+   key->objectid, BTRFS_EXTENT_CSUM_OBJECTID);
return -EUCLEAN;
}
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned key offset for csum item, have %llu should be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
-   CORRUPT("unaligned csum item size", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned item size for csum item, have %u should be aligned to %u",
+   btrfs_item_size_nr(leaf, slot), csumsize);
return -EUCLEAN;
}
return 0;
-- 
2.14.2
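
The alignment checks in this patch all have the same shape: test a value against an alignment and, on failure, print what was found and what was expected. A minimal userspace sketch of that shape is below. is_aligned() uses the usual mask trick, which is only valid for power-of-two alignments, and check_aligned() is a hypothetical helper for this illustration rather than anything in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mask-based alignment test; @align must be a power of two. */
static int is_aligned(uint64_t value, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value & ((uint64_t)align - 1)) == 0;
}

/*
 * Return 0 if @value is aligned to @align.  Otherwise write a message in
 * the "have X should be aligned to Y" style into @buf and return -1.
 */
static int check_aligned(char *buf, size_t len, const char *what,
			 uint64_t value, uint32_t align)
{
	if (is_aligned(value, align)) {
		buf[0] = '\0';
		return 0;
	}
	snprintf(buf, len, "unaligned %s, have %llu should be aligned to %u",
		 what, (unsigned long long)value, align);
	return -1;
}
```

For example, check_aligned(buf, sizeof(buf), "key offset", 8192, 4096) returns 0, while a value of 4097 fails and leaves the bad and expected values in buf.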



Re: Mount error on 32 bits, ok on 64 bits

2017-09-28 Thread Jean-Denis Girard
On 28/09/2017 at 15:29, Jean-Denis Girard wrote:
> Hi list,
> 
> I have an Alix motherboard with a SD card using btrfs running fine since
> 2010. Today, I wanted to upgrade to kernel 4.13.4 from 4.9.52 (i586). As
> always, I cross-compiled from my main system, installed on Alix, but
> boot failed while trying to mount root: "BTRFS critical (device sda1):
> unable to find logical 1740640256 length 4096".
> 
> Rebooting to 4.9.52 on the Alix works just fine. Btrfs scrub returns no
> error.

FWIW, kernels up to 4.12.14 are OK, and 4.13-rc1 has the same problem.
Which patches should I try to revert first?


Thanks,
-- 
Jean-Denis Girard

SysNux   Systèmes   Linux   en   Polynésie  française
https://www.sysnux.pf/   Tél: +689 40.50.10.40 / GSM: +689 87.797.527



Re: [PATCH v2 2/5] btrfs: tree-checker: Enhance btrfs_check_node output

2017-09-28 Thread Nikolay Borisov


On 29.09.2017 04:36, Qu Wenruo wrote:
> Use inline function to replace macro since we don't need
> stringification.
> (Macro still exist until all caller get updated)
> 
> And add more info about the error.
> 
> For nr_items error, report if it's too large or too small, and output
> valid value range.
> 
> For blk pointer, added a new alignment checker.
> 
> For key order, also output the next key to make the problem more
> obvious.
> 
> Signed-off-by: Qu Wenruo 
> ---
>  fs/btrfs/tree-checker.c | 65 ++---
>  1 file changed, 61 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 301243a69dea..a51f2503acc4 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -37,6 +37,48 @@
>  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
>  reason, btrfs_header_bytenr(eb), root->objectid, slot)
>  
> +/*
> + * Error message should follow the format below:
> + * corrupt <type>: <identifier>, <reason>[, <bad_value>]
> + *
> + * @type:       Either leaf or node
> + * @identifier: The necessary info to locate the leaf/node.
> + *              It's recommended to decode key.objectid/offset if it's
> + *              meaningful.
> + * @reason:     What's wrong
> + * @bad_value:  Optional, it's recommended to output bad value and its
> + *              expected value (range).
> + *
> + * Since comma is used to separate the components, only SPACE is allowed
> + * inside each component.
> + */
> +
> +/*
> + * Append the generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to
> + * @fmt.
> + * Allowing user to customize their output.
> + */
> +__printf(4, 5)
> +static void generic_err(const struct btrfs_root *root,
> + const struct extent_buffer *eb,
> + int slot, const char *fmt, ...)
> +{
> + struct va_format vaf;
> + va_list args;
> +
> + va_start(args, fmt);
> +
> + vaf.fmt = fmt;
> + vaf.va = &args;
> +
> + btrfs_crit(root->fs_info,
> + "corrupt %s: root=%llu block=%llu slot=%d, %pV",
> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
> + root->objectid, btrfs_header_bytenr(eb), slot,
> + &vaf);
> + va_end(args);
> +}
> +
>  static int check_extent_data_item(struct btrfs_root *root,
> struct extent_buffer *leaf,
> struct btrfs_key *key, int slot)
> @@ -282,8 +324,10 @@ int btrfs_check_node(struct btrfs_root *root, struct 
> extent_buffer *node)
>  
>   if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
>   btrfs_crit(root->fs_info,
> -"corrupt node: block %llu root %llu nritems %lu",
> -node->start, root->objectid, nr);
> + "corrupt node: root=%llu block=%llu, nritems too %s, 
> have %lu expect range [1,%u]",
> +root->objectid, node->start,
> +nr == 0 ? "small" : "large", nr,
> +BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
>   return -EIO;

This is separate from this patch, but:

Why not EUCLEAN? We could get this error because of corrupted data, not
necessarily an I/O error. Your other patches consistently use EUCLEAN.

>   }
>  
> @@ -293,13 +337,26 @@ int btrfs_check_node(struct btrfs_root *root, struct 
> extent_buffer *node)
>   btrfs_node_key_to_cpu(node, &next_key, slot + 1);
>  
>   if (!bytenr) {
> - CORRUPT("invalid item slot", node, root, slot);
> + generic_err(root, node, slot,
> + "invalid node pointer, have %llu shouldn't be 
> 0",
> + bytenr);

nit: Perhaps just say "invalid null node pointer". If we trigger this
check, bytenr is 0, so I see no reason to do any special formatting.
It's not a big deal, so it might not be worth a resend unless there are
other comments.

>   ret = -EIO;

Ditto w.r.t. EIO?

>   goto out;
>   }
> + if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
> + generic_err(root, node, slot,
> + "unaligned pointer, have %llu should be aligned 
> to %u",
> + bytenr, root->fs_info->sectorsize);
> + ret = -EUCLEAN;
> + goto out;
> + }
>  
>   if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
> - CORRUPT("bad key order", node, root, slot);
> + generic_err(root, node, slot,
> + "bad key order, current key (%llu %u %llu) next 
> key (%llu %u %llu)",
> + key.objectid, key.type, key.offset,
> + next_key.objectid, next_key.type,
> + 

Re: [PATCH v2 2/5] btrfs: tree-checker: Enhance btrfs_check_node output

2017-09-28 Thread Qu Wenruo



On 2017-09-29 14:05, Nikolay Borisov wrote:



On 29.09.2017 04:36, Qu Wenruo wrote:

Use inline function to replace macro since we don't need
stringification.
(Macro still exist until all caller get updated)

And add more info about the error.

For nr_items error, report if it's too large or too small, and output
valid value range.

For blk pointer, added a new alignment checker.

For key order, also output the next key to make the problem more
obvious.

Signed-off-by: Qu Wenruo 
---
  fs/btrfs/tree-checker.c | 65 ++---
  1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 301243a69dea..a51f2503acc4 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -37,6 +37,48 @@
   btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
   reason, btrfs_header_bytenr(eb), root->objectid, slot)
  
+/*

+ * Error message should follow the format below:
+ * corrupt <type>: <identifier>, <reason>[, <bad_value>]
+ *
+ * @type:       Either leaf or node
+ * @identifier: The necessary info to locate the leaf/node.
+ *              It's recommended to decode key.objectid/offset if it's
+ *              meaningful.
+ * @reason:     What's wrong
+ * @bad_value:  Optional, it's recommended to output bad value and its
+ *              expected value (range).
+ *
+ * Since comma is used to separate the components, only SPACE is allowed
+ * inside each component.
+ */
+
+/*
+ * Append the generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to
+ * @fmt.
+ * Allowing user to customize their output.
+ */
+__printf(4, 5)
+static void generic_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct va_format vaf;
+   va_list args;
+
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   &vaf);
+   va_end(args);
+}
+
  static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -282,8 +324,10 @@ int btrfs_check_node(struct btrfs_root *root, struct 
extent_buffer *node)
  
  	if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {

btrfs_crit(root->fs_info,
-  "corrupt node: block %llu root %llu nritems %lu",
-  node->start, root->objectid, nr);
+   "corrupt node: root=%llu block=%llu, nritems too %s, have 
%lu expect range [1,%u]",
+  root->objectid, node->start,
+  nr == 0 ? "small" : "large", nr,
+  BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
return -EIO;


> This is separate from this patch, but:
>
> Why not EUCLEAN? We could get this error because of corrupted data, not
> necessarily an I/O error. Your other patches consistently use EUCLEAN.


I just forgot that.

It's old code that I didn't modify, but since it's being moved to a new
place, EUCLEAN makes sense.


I'll fix it in the next version of the patchset (if there is one).

Thanks for pointing this out,
Qu




}
  
@@ -293,13 +337,26 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)

btrfs_node_key_to_cpu(node, &next_key, slot + 1);
  
  		if (!bytenr) {

-   CORRUPT("invalid item slot", node, root, slot);
+   generic_err(root, node, slot,
+   "invalid node pointer, have %llu shouldn't be 
0",
+   bytenr);


> nit: Perhaps just say "invalid null node pointer". If we trigger this
> check, bytenr is 0, so I see no reason to do any special formatting.
> It's not a big deal, so it might not be worth a resend unless there are
> other comments.


ret = -EIO;


> Ditto w.r.t. EIO?


goto out;
}
+   if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
+   generic_err(root, node, slot,
+   "unaligned pointer, have %llu should be aligned to 
%u",
+   bytenr, root->fs_info->sectorsize);
+   ret = -EUCLEAN;
+   goto out;
+   }
  
  		if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {

-   CORRUPT("bad key order", node, root, slot);
+   generic_err(root, node, slot,
+   "bad key order, current key (%llu %u %llu) next key 
(%llu %u %llu)",
+   key.objectid, key.type, key.offset,
+   

WARNING: CPU: 1 PID: 13825 at fs/btrfs/backref.c:1255 find_parent_nodes+0xb5c/0x1310

2017-09-28 Thread Paul Jones
Hi,

I just ran into this warning while running deduplication. There were tens of
thousands of them over a 24-hour period. No other problems were reported.
The filesystem is raid1, freshly converted from single, with zstd compression.
The kernel is 4.14.0-rc2.


Sep 28 14:57:06 home kernel: [ cut here ]
Sep 28 14:57:06 home kernel: WARNING: CPU: 1 PID: 13825 at 
fs/btrfs/backref.c:1255 find_parent_nodes+0xb5c/0x1310
Sep 28 14:57:06 home kernel: Modules linked in: l2tp_netlink l2tp_core 
udp_tunnel ip6_udp_tunnel cls_u32 sch_htb sch_sfq nf_conntrack_pptp 
nf_conntrack_proto_gre nf_conntrack_sane nf_conntrack_sip ts_kmp 
nf_conntrack_amanda nf_conntrack_snmp nf_conntrack_h323 nf_conntrack_netbios_ns 
nf_conntrack_broadcast nf_conntrack_tftp nf_conntrack_ftp nf_conntrack_irc 
xt_NETMAP xt_TCPMSS xt_CHECKSUM ipt_rpfilter xt_DSCP xt_dscp xt_statistic xt_CT 
xt_AUDIT xt_NFLOG xt_time xt_connlimit xt_realm xt_NFQUEUE xt_tcpmss 
xt_addrtype xt_pkttype iptable_raw xt_TPROXY nf_defrag_ipv6 xt_CLASSIFY xt_mark 
xt_hashlimit xt_comment xt_length xt_connmark xt_owner xt_recent xt_iprange 
xt_physdev xt_policy iptable_mangle xt_nat xt_multiport xt_conntrack ipt_REJECT 
nf_reject_ipv4 ipt_MASQUERADE nf_nat_masquerade_ipv4 ipt_ECN ipt_CLUSTERIP 
ipt_ah
Sep 28 14:57:06 home kernel:  iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat iptable_filter ip_tables nfsd auth_rpcgss oid_registry 
nfs_acl binfmt_misc dm_cache_smq dm_cache dm_persistent_data dm_bufio 
dm_bio_prison k10temp hwmon_vid intel_powerclamp coretemp pcbc iTCO_wdt 
iTCO_vendor_support aesni_intel crypto_simd cryptd glue_helper pcspkr i2c_i801 
lpc_ich mfd_core xts aes_x86_64 cbc sha512_generic iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi ixgb macvlan igb dca i2c_algo_bit e1000 atl1c 
fuse nfs lockd grace sunrpc dm_mirror dm_region_hash dm_log dm_mod hid_sunplus 
hid_sony hid_samsung hid_pl hid_petalynx hid_gyration usbhid xhci_plat_hcd 
ohci_pci ohci_hcd uhci_hcd usb_storage megaraid_sas megaraid_mbox megaraid_mm 
megaraid mptsas scsi_transport_sas mptspi scsi_transport_spi mptscsih mptbase
Sep 28 14:57:06 home kernel:  sata_inic162x ata_piix sata_nv sata_sil24 
pata_jmicron pata_amd pata_mpiix ahci libahci xhci_pci ehci_pci r8169 xhci_hcd 
mii ehci_hcd
Sep 28 14:57:06 home kernel: CPU: 1 PID: 13825 Comm: crawl Not tainted 
4.14.0-rc2 #2
Sep 28 14:57:06 home kernel: Hardware name: System manufacturer System Product 
Name/P8Z68-V LE, BIOS 4101 05/09/2013
Sep 28 14:57:06 home kernel: task: 8803dde96140 task.stack: c90018f9
Sep 28 14:57:06 home kernel: RIP: 0010:find_parent_nodes+0xb5c/0x1310
Sep 28 14:57:06 home kernel: RSP: 0018:c90018f93b30 EFLAGS: 00010286
Sep 28 14:57:06 home kernel: RAX:  RBX: 8803f8453318 RCX: 
0001
Sep 28 14:57:06 home kernel: RDX:  RSI: 88040b9ca338 RDI: 
8802c831bec8
Sep 28 14:57:06 home kernel: RBP: c90018f93c50 R08: 8803bb36d4e0 R09: 

Sep 28 14:57:06 home kernel: R10: 8802c831bec8 R11: c90018f93bf0 R12: 
0001
Sep 28 14:57:06 home kernel: R13: c90018f93c10 R14: 8803fc295ac0 R15: 
8802c831bec8
Sep 28 14:57:06 home kernel: FS:  7f5dd2ca5700() 
GS:88041ec4() knlGS:
Sep 28 14:57:06 home kernel: CS:  0010 DS:  ES:  CR0: 80050033
Sep 28 14:57:06 home kernel: CR2: 7f8a30efc000 CR3: 00034037a005 CR4: 
001606a0
Sep 28 14:57:06 home kernel: Call Trace:
Sep 28 14:57:06 home kernel:  btrfs_find_all_roots_safe+0x91/0x100
Sep 28 14:57:06 home kernel:  ? btrfs_find_all_roots_safe+0x91/0x100
Sep 28 14:57:06 home kernel:  ? extent_same_check_offsets+0x70/0x70
Sep 28 14:57:06 home kernel:  iterate_extent_inodes+0x1d1/0x260
Sep 28 14:57:06 home kernel:  iterate_inodes_from_logical+0x7d/0xa0
Sep 28 14:57:06 home kernel:  ? iterate_inodes_from_logical+0x7d/0xa0
Sep 28 14:57:06 home kernel:  ? extent_same_check_offsets+0x70/0x70
Sep 28 14:57:06 home kernel:  btrfs_ioctl+0x8aa/0x23a0
Sep 28 14:57:06 home kernel:  ? generic_file_read_iter+0x322/0x7d0
Sep 28 14:57:06 home kernel:  ? _copy_to_user+0x26/0x30
Sep 28 14:57:06 home kernel:  ? cp_new_stat+0x108/0x120
Sep 28 14:57:06 home kernel:  do_vfs_ioctl+0x8d/0x5b0
Sep 28 14:57:06 home kernel:  ? do_vfs_ioctl+0x8d/0x5b0
Sep 28 14:57:06 home kernel:  ? SyS_newfstat+0x35/0x50
Sep 28 14:57:06 home kernel:  SyS_ioctl+0x3c/0x70
Sep 28 14:57:06 home kernel:  entry_SYSCALL_64_fastpath+0x13/0x94
Sep 28 14:57:06 home kernel: RIP: 0033:0x7f5dd2f8afd7
Sep 28 14:57:06 home kernel: RSP: 002b:7f5dd2ca25c8 EFLAGS: 0246 
ORIG_RAX: 0010
Sep 28 14:57:06 home kernel: RAX: ffda RBX: 7f5dcc38f0e0 RCX: 
7f5dd2f8afd7
Sep 28 14:57:06 home kernel: RDX: 7f5dd2ca2698 RSI: c0389424 RDI: 
0003
Sep 28 14:57:06 home kernel: RBP: 0030 R08:  R09: 
7ffe8b102080
Sep 28 14:57:06 home kernel: R10: 7f5dcc3e0870 R11: 

[PATCH v3 2/5] btrfs: tree-checker: Enhance btrfs_check_node output

2017-09-28 Thread Qu Wenruo
Use an inline function to replace the macro since we don't need
stringification.
(The macro still exists until all callers are updated.)

Also add more info about the error, and replace EIO with EUCLEAN.

For an nritems error, report whether it's too large or too small, and
output the valid value range.

For the block pointer, add a new alignment check.

For key order, also output the next key to make the problem more
obvious.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 71 -
 1 file changed, 64 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 301243a69dea..94acf3f5d6fd 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -37,6 +37,48 @@
   btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
   reason, btrfs_header_bytenr(eb), root->objectid, slot)
 
+/*
+ * Error message should follow the format below:
+ * corrupt <type>: <identifier>, <reason>[, <bad_value>]
+ *
+ * @type:       Either leaf or node
+ * @identifier: The necessary info to locate the leaf/node.
+ *              It's recommended to decode key.objectid/offset if it's
+ *              meaningful.
+ * @reason:     What's wrong
+ * @bad_value:  Optional, it's recommended to output bad value and its
+ *              expected value (range).
+ *
+ * Since comma is used to separate the components, only SPACE is allowed
+ * inside each component.
+ */
+
+/*
+ * Append the generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to
+ * @fmt.
+ * Allowing user to customize their output.
+ */
+__printf(4, 5)
+static void generic_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct va_format vaf;
+   va_list args;
+
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   &vaf);
+   va_end(args);
+}
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -282,9 +324,11 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 
if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
btrfs_crit(root->fs_info,
-  "corrupt node: block %llu root %llu nritems %lu",
-  node->start, root->objectid, nr);
-   return -EIO;
+   "corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%u]",
+  root->objectid, node->start,
+  nr == 0 ? "small" : "large", nr,
+  BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
+   return -EUCLEAN;
}
 
for (slot = 0; slot < nr - 1; slot++) {
@@ -293,14 +337,27 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
btrfs_node_key_to_cpu(node, &next_key, slot + 1);
 
if (!bytenr) {
-   CORRUPT("invalid item slot", node, root, slot);
-   ret = -EIO;
+   generic_err(root, node, slot,
+   "invalid node pointer, have %llu shouldn't be 0",
+   bytenr);
+   ret = -EUCLEAN;
+   goto out;
+   }
+   if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
+   generic_err(root, node, slot,
+   "unaligned pointer, have %llu should be aligned to %u",
+   bytenr, root->fs_info->sectorsize);
+   ret = -EUCLEAN;
goto out;
}
 
if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
-   CORRUPT("bad key order", node, root, slot);
-   ret = -EIO;
+   generic_err(root, node, slot,
+   "bad key order, current key (%llu %u %llu) next key (%llu %u %llu)",
+   key.objectid, key.type, key.offset,
+   next_key.objectid, next_key.type,
+   next_key.offset);
+   ret = -EUCLEAN;
goto out;
}
}
-- 
2.14.2
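
The switch from EIO to EUCLEAN in this version follows the review comment that these failures indicate corrupted on-disk structure rather than a failed I/O operation. The nritems range check can be sketched in userspace as follows. check_nritems() is a hypothetical helper for this illustration, and the fallback definition of EUCLEAN is an assumption for systems whose errno.h lacks it (on Linux the value is 117, "Structure needs cleaning").

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* assumed Linux value, for non-Linux builds only */
#endif

/*
 * Validate that @nr lies in [1, @max].  On failure, describe the problem
 * in @buf in the same wording as the patch and return -EUCLEAN, the error
 * used for structural corruption, instead of -EIO.
 */
static int check_nritems(char *buf, size_t len, unsigned long nr,
			 unsigned int max)
{
	assert(buf != NULL && len > 0);
	if (nr >= 1 && nr <= max) {
		buf[0] = '\0';
		return 0;
	}
	snprintf(buf, len, "nritems too %s, have %lu expect range [1,%u]",
		 nr == 0 ? "small" : "large", nr, max);
	return -EUCLEAN;
}
```

A caller can then tell corruption apart from a genuine read failure by the return value alone. The upper bound passed in is whatever BTRFS_NODEPTRS_PER_BLOCK() yields for the filesystem; any concrete number used when trying the sketch is just an example.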



[PATCH v3 4/5] btrfs: tree-checker: Enhance output for check_csum_item

2017-09-28 Thread Qu Wenruo
Output the bad value and expected good value (or its alignment).

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 183ff7faa218..c0fd192f8140 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -163,15 +163,21 @@ static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
 
if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
-   CORRUPT("invalid objectid for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "invalid key objectid for csum item, have %llu expect %llu",
+   key->objectid, BTRFS_EXTENT_CSUM_OBJECTID);
return -EUCLEAN;
}
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned key offset for csum item, have %llu should be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
-   CORRUPT("unaligned csum item size", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned item size for csum item, have %u should be aligned to %u",
+   btrfs_item_size_nr(leaf, slot), csumsize);
return -EUCLEAN;
}
return 0;
-- 
2.14.2



[PATCH v3 1/5] btrfs: Move leaf and node validation checker to tree-checker.c

2017-09-28 Thread Qu Wenruo
There is no doubt the comprehensive tree block checker will grow larger
and larger, so moving it into its own file is quite reasonable.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/ctree.h|   4 +
 fs/btrfs/disk-io.c  | 284 +---
 fs/btrfs/tree-checker.c | 309 
 4 files changed, 317 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 962a95aefb81..88255e133ade 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-  uuid-tree.o props.o hash.o free-space-tree.o
+  uuid-tree.o props.o hash.o free-space-tree.o tree-checker.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ea9c5648ff70..6b7c6fcbc5d5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3732,4 +3732,8 @@ static inline int btrfs_is_testing(struct btrfs_fs_info *fs_info)
 #endif
return 0;
 }
+
+/* Tree block validation checker */
+int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf);
+int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node);
 #endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c8633f2abdf1..57a9055655d3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -543,284 +543,6 @@ static int check_tree_block_fsid(struct btrfs_fs_info *fs_info,
return ret;
 }
 
-#define CORRUPT(reason, eb, root, slot) \
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
-static int check_extent_data_item(struct btrfs_root *root,
- struct extent_buffer *leaf,
- struct btrfs_key *key, int slot)
-{
-   struct btrfs_file_extent_item *fi;
-   u32 sectorsize = root->fs_info->sectorsize;
-   u32 item_size = btrfs_item_size_nr(leaf, slot);
-
-   if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
-
-   if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /*
-* Support for new compression/encrption must introduce incompat flag,
-* and must be caught in open_ctree().
-*/
-   if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
-   /* Inline extent must have 0 as key offset */
-   if (key->offset) {
-   CORRUPT("inline extent has non-zero key offset",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /* Compressed inline extent has no on-disk size, skip it */
-   if (btrfs_file_extent_compression(leaf, fi) !=
-   BTRFS_COMPRESS_NONE)
-   return 0;
-
-   /* Uncompressed inline extent size must match item size */
-   if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
-   btrfs_file_extent_ram_bytes(leaf, fi)) {
-   CORRUPT("plaintext inline extent has invalid size",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   return 0;
-   }
-
-   /* Regular or preallocated extent has fixed item size */
-   if (item_size != sizeof(*fi)) {
-   CORRUPT(
-   "regluar or preallocated extent data item size is invalid",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
-   

[PATCH v3 5/5] btrfs: tree-checker: Enhance output for check_extent_data_item

2017-09-28 Thread Qu Wenruo
Output the invalid member name and its bad value, along with its
expected value range or alignment.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 98 +++--
 1 file changed, 70 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index c0fd192f8140..d546c723069e 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -31,12 +31,6 @@
 #include "disk-io.h"
 #include "compression.h"
 
-#define CORRUPT(reason, eb, root, slot) \
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
 /*
  * Error message should follow the format below:
  * corrupt <type>: <identifier>, <reason>[, <bad_value>]
@@ -79,6 +73,47 @@ static void generic_err(const struct btrfs_root *root,
va_end(args);
 }
 
+/*
+ * Customized reporter for extent data item, since its key objectid and
+ * offset has its own meaning.
+ */
+__printf(4, 5)
+static void file_extent_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct btrfs_key key;
+   struct va_format vaf;
+   va_list args;
+
+   btrfs_item_key_to_cpu(eb, &key, slot);
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = &args;
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d ino=%llu file_offset=%llu, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   key.objectid, key.offset, &vaf);
+   va_end(args);
+}
+
+/*
+ * Return 0 if the btrfs_file_extent_##name is aligned to @align
+ * Else return 1
+ */
+#define CHECK_FI_ALIGN(root, leaf, slot, fi, name, align)  \
+({ \
+   if (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align)) \
+   file_extent_err(root, leaf, slot,   \
+   "invalid %s for file extent, have %llu, should be aligned to %u",\
+   #name, btrfs_file_extent_##name(leaf, fi),  \
+   align); \
+   (!IS_ALIGNED(btrfs_file_extent_##name(leaf, fi), align));   \
+})
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -88,15 +123,19 @@ static int check_extent_data_item(struct btrfs_root *root,
u32 item_size = btrfs_item_size_nr(leaf, slot);
 
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "unaligned file_offset for file extent, have %llu should be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
 
fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
 
if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid type for file extent, have %u expect range [0, %u]",
+   btrfs_file_extent_type(leaf, fi),
+   BTRFS_FILE_EXTENT_TYPES);
return -EUCLEAN;
}
 
@@ -105,18 +144,24 @@ static int check_extent_data_item(struct btrfs_root *root,
 * and must be caught in open_ctree().
 */
if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid compression for file extent, have %u expect range [0, %u]",
+   btrfs_file_extent_compression(leaf, fi),
+   BTRFS_COMPRESS_TYPES);
return -EUCLEAN;
}
if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid encryption for file extent, have %u expect 0",
+   btrfs_file_extent_encryption(leaf, fi));
return -EUCLEAN;
}
if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
/* Inline extent must have 0 as key offset */
if (key->offset) {
-   

[PATCH v3 0/5] Enhance tree block validation checker

2017-09-28 Thread Qu Wenruo
The patchset can be fetched from github:
https://github.com/adam900710/linux/tree/checker_enhance

It's based on David's misc-next branch, with the following commit as base:
a5e50b4b444c ("btrfs: Add checker for EXTENT_CSUM")

Following David's suggestion, enhance the output format of the tree block
validation checker.

Also move the checkers into a separate file: tree-checker.c.

Also add an output format rule so that all output messages follow the
same format.

Some example output using btrfsck fsck-test images looks like:

For an unaligned file extent member:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=7 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 755944791, should be aligned to 4096
---

For bad leaf holes:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=28, discontinious item end, have 9387 expect 15018
---
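
The layout of these messages (block type, then the fields that locate the
block, then the reason with bad and expected values) can be illustrated with
a small userspace sketch. This is not the kernel implementation:
format_corrupt_msg() and its parameters are hypothetical names, and plain
snprintf()/vsnprintf() stand in for btrfs_crit() and %pV.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Userspace sketch only: build "corrupt TYPE: root= block= slot=, REASON".
 * format_corrupt_msg() is a hypothetical helper, not part of the patchset.
 * Returns the message length, or -1 if the buffer is too small.
 */
static int format_corrupt_msg(char *buf, size_t len, int level,
			      unsigned long long root,
			      unsigned long long block, int slot,
			      const char *fmt, ...)
{
	va_list args;
	int n, m;

	/* Fixed prefix: block type plus the fields needed to locate it. */
	n = snprintf(buf, len, "corrupt %s: root=%llu block=%llu slot=%d, ",
		     level == 0 ? "leaf" : "node", root, block, slot);
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* Caller-supplied reason, ideally with bad and expected values. */
	va_start(args, fmt);
	m = vsnprintf(buf + n, len - n, fmt, args);
	va_end(args);
	if (m < 0 || (size_t)m >= len - n)
		return -1;

	return n + m;
}
```

Called with level 0, root 1, block 29360128, slot 28 and the reason string
from the leaf-hole case, this produces the second example line above, minus
the "BTRFS critical (device loop0): " prefix that the kernel log helper adds.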

Changelog:
v2:
  Unify the error string format, so it should be easier to grep them
  from dmesg. Thanks Nikolay for pointing this out.
  Remove unused CORRUPT() macro.
v3:
  Replace EIO with EUCLEAN in 2nd patch. Thanks Nikolay for pointing
  this out.
  Correct "btrfs-progs:" to "btrfs:" for 1st patch.

Qu Wenruo (5):
  btrfs: Move leaf and node validation checker to tree-checker.c
  btrfs: tree-checker: Enhance btrfs_check_node output
  btrfs: tree-checker: Enhance output for btrfs_check_leaf
  btrfs: tree-checker: Enhance output for check_csum_item
  btrfs: tree-checker: Enhance output for check_extent_data_item

 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/ctree.h|   4 +
 fs/btrfs/disk-io.c  | 284 +---
 fs/btrfs/tree-checker.c | 429 
 4 files changed, 437 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c

-- 
2.14.2



[PATCH v3 3/5] btrfs: tree-checker: Enhance output for btrfs_check_leaf

2017-09-28 Thread Qu Wenruo
Enhance the output to print:
1) Reason
2) Bad value
   If the reason alone can't explain enough
3) Good value (range)

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 94acf3f5d6fd..183ff7faa218 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -232,8 +232,9 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
eb = btrfs_root_node(check_root);
/* if leaf is the root, then it's fine */
if (leaf != eb) {
-   CORRUPT("non-root leaf's nritems is 0",
-   leaf, check_root, 0);
+   generic_err(check_root, leaf, 0,
+   "invalid nritems, have %u shouldn't be 0 for non-root leaf",
+   nritems);
free_extent_buffer(eb);
return -EUCLEAN;
}
@@ -264,7 +265,11 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 
/* Make sure the keys are in the right order */
if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) {
-   CORRUPT("bad key order", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "bad key order, prev key (%llu %u %llu) current key (%llu %u %llu)",
+   prev_key.objectid, prev_key.type,
+   prev_key.offset, key.objectid, key.type,
+   key.offset);
return -EUCLEAN;
}
 
@@ -279,7 +284,10 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
item_end_expected = btrfs_item_offset_nr(leaf,
 slot - 1);
if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
-   CORRUPT("slot offset bad", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "discontinious item end, have %u expect %u",
+   btrfs_item_end_nr(leaf, slot),
+   item_end_expected);
return -EUCLEAN;
}
 
@@ -290,14 +298,21 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 */
if (btrfs_item_end_nr(leaf, slot) >
BTRFS_LEAF_DATA_SIZE(fs_info)) {
-   CORRUPT("slot end outside of leaf", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot end outside of leaf, have %u expect range [0, %u]",
+   btrfs_item_end_nr(leaf, slot),
+   BTRFS_LEAF_DATA_SIZE(fs_info));
return -EUCLEAN;
}
 
/* Also check if the item pointer overlaps with btrfs item. */
if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
btrfs_item_ptr_offset(leaf, slot)) {
-   CORRUPT("slot overlap with its data", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot overlap with its data, item end %lu data start %lu",
+   btrfs_item_nr_offset(slot) +
+   sizeof(struct btrfs_item),
+   btrfs_item_ptr_offset(leaf, slot));
return -EUCLEAN;
}
 
-- 
2.14.2
