Re: [PATCH] Btrfs: remove transaction from send
On Fri, Mar 14, 2014 at 10:44:04PM +0000, Hugo Mills wrote:
> On Fri, Mar 14, 2014 at 02:51:22PM -0400, Josef Bacik wrote:
> > On 03/13/2014 06:16 PM, Hugo Mills wrote:
> > > On Thu, Mar 13, 2014 at 03:42:13PM -0400, Josef Bacik wrote:
> > > > Let's try this again. We can deadlock the box if we send on a box and
> > > > try to write onto the same fs with the app that is trying to listen to
> > > > the send pipe. This is because the writer could get stuck waiting for
> > > > a transaction commit which is being blocked by the send. So fix this
> > > > by making sure looking at the commit roots is always going to be
> > > > consistent. We do this by keeping track of which roots need to have
> > > > their commit roots swapped during commit, and then taking the
> > > > commit_root_sem and swapping them all at once. Then make sure we take
> > > > a read lock on the commit_root_sem in cases where we search the commit
> > > > root to make sure we're always looking at a consistent view of the
> > > > commit roots. Previously we had problems with this because we would
> > > > swap a fs tree commit root and then swap the extent tree commit root
> > > > independently which would cause the backref walking code to screw up
> > > > sometimes. With this patch we no longer deadlock and pass all the
> > > > weird send/receive corner cases. Thanks,
> > >
> > > There's something still going on here. I managed to get about
> > > twice as far through my test as I had before, but I again got an
> > > unexpected EOF in stream, with btrfs send returning 1. As before, I
> > > have this in syslog:
> > >
> > > Mar 13 22:09:12 s_src@amelia kernel: BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040\x0a
> >
> > I just noticed that the offset you have there is freaking gigantic,
> > like 700mb, which is way larger than what an extent should be. Here
> > is a newer debug patch, just chuck the old one and put this instead
> > and re-run
> >
> > http://paste.fedoraproject.org/85486/39482301
>
> That last run, with the above patch, failed again, at approximately
> the same place again. The only output in dmesg is:
>
> [ 6488.168469] BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040, len=1294336

root@amelia:~# btrfs insp ino 1786631 /
//srv/vm/armand.img
root@amelia:~# ls -l /srv/vm/armand.img
-rw-rw-r-- 1 root kvm 40 Jan 30 08:11 /srv/vm/armand.img
root@amelia:~# filefrag /srv/vm/armand.img
/srv/vm/armand.img: 17436 extents found

This is a VM image, not currently operational. It probably has
sparse extents in it somewhere.

The full filefrag -ev output is at [1], but the offset it's
complaining about is 825257984 = 201479 4k blocks:

  ext:   logical_offset:     physical_offset:  length:  expected:  flags:
17200:   201478..  201478:  7220724.. 7220724:      1:   8923002:
17201:   201479..  201481:  8912386.. 8912388:      3:   7220725:
17202:   201482..  201482:  8923002.. 8923002:      1:   8912389:

This seems unexceptional.

Hugo.

[1] http://carfax.org.uk/files/temp/filefrag.txt

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Can I offer you anything? Tea? Seedcake? ---
Glass of Amontillado?
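The block arithmetic in the message above can be checked with a short C sketch. It assumes the 4 KiB block size that the message uses, and the three table rows are copied from the quoted filefrag output. The names `extent_row`, `offset_to_block` and `find_extent` are made up for this illustration and are not part of btrfs or filefrag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Block size used in the message ("4k blocks"); an assumption of this sketch. */
#define BLOCK_SIZE 4096ULL

/* One row of the quoted filefrag -ev output: a run of logical blocks. */
struct extent_row {
	unsigned ext;            /* "ext" column */
	uint64_t logical_start;  /* first logical block, inclusive */
	uint64_t logical_end;    /* last logical block, inclusive */
};

/* The three rows quoted in the message. */
static const struct extent_row rows[] = {
	{ 17200, 201478, 201478 },
	{ 17201, 201479, 201481 },
	{ 17202, 201482, 201482 },
};

/* Convert a byte offset within the file to a logical block index. */
static uint64_t offset_to_block(uint64_t byte_offset)
{
	return byte_offset / BLOCK_SIZE;
}

/* Return the "ext" number of the row containing the block, or -1 if none. */
static long find_extent(uint64_t block)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
		if (block >= rows[i].logical_start && block <= rows[i].logical_end)
			return (long)rows[i].ext;
	}
	return -1;
}

/* Worked example with the offset from the kernel log line. */
static void check_reported_offset(void)
{
	uint64_t block = offset_to_block(825257984ULL);

	assert(825257984ULL % BLOCK_SIZE == 0); /* offset is block-aligned */
	assert(block == 201479);                /* matches the message */
	assert(find_extent(block) == 17201);    /* first block of ext 17201 */
}
```

Calling `check_reported_offset()` confirms that the reported offset is block-aligned and is the first block of extent 17201 in the quoted table.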
Re: [PATCH] Btrfs: remove transaction from send
On Thu, Mar 13, 2014 at 10:16:28PM +0000, Hugo Mills wrote:
> On Thu, Mar 13, 2014 at 03:42:13PM -0400, Josef Bacik wrote:
> > Let's try this again. We can deadlock the box if we send on a box and
> > try to write onto the same fs with the app that is trying to listen to
> > the send pipe. This is because the writer could get stuck waiting for
> > a transaction commit which is being blocked by the send. So fix this
> > by making sure looking at the commit roots is always going to be
> > consistent. We do this by keeping track of which roots need to have
> > their commit roots swapped during commit, and then taking the
> > commit_root_sem and swapping them all at once. Then make sure we take
> > a read lock on the commit_root_sem in cases where we search the commit
> > root to make sure we're always looking at a consistent view of the
> > commit roots. Previously we had problems with this because we would
> > swap a fs tree commit root and then swap the extent tree commit root
> > independently which would cause the backref walking code to screw up
> > sometimes. With this patch we no longer deadlock and pass all the
> > weird send/receive corner cases. Thanks,
>
> There's something still going on here. I managed to get about
> twice as far through my test as I had before, but I again got an
> unexpected EOF in stream, with btrfs send returning 1. As before, I
> have this in syslog:
>
> Mar 13 22:09:12 s_src@amelia kernel: BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040\x0a
>
> So, on the evidence of one data point (I'll have another one when
> I wake up tomorrow morning), this has made the problem harder to
> trigger but it's still possible.

Data point two has arrived, and it's gone boom at about the same
point. The first failed at:

2014-03-13 22:09:11,749 INFO Read 7247356514 bytes total

and the second at:

2014-03-14 03:53:46,990 INFO Read 7247357071 bytes total

at approximately 1h45 into the process.

The boot and home subvols have been OK, and have been backing up
happily all this time, but both are smaller than the (~10 GiB) root
subvol. I can add a load of data to /home and see if the problem
happens with a larger send size, or if it's just the process writing
to a subvol that has the snapshot being sent that causes it.

The interesting thing here is that the error seems to be fairly
reliably in the same place (more or less). Before this patch, I was
seeing lockups (or EOF, with the earlier version of this patch) at
approximately 3.6-3.8 GB. Now it looks like it's going to be 7.2 GB.
At least it's not locking up any more, just dying noisily (which is
marginally preferable).

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Hail and greetings. We are a flat-pack invasion force from ---
Planet Ikea. We come in pieces.
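To put a number on "fairly reliably in the same place", here is a small C sketch that compares the two byte counts from the log lines in the message above. The helper names `byte_gap` and `whole_gib` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte counts copied from the two "Read ... bytes total" log lines. */
#define FIRST_FAILURE  7247356514ULL
#define SECOND_FAILURE 7247357071ULL

/* Absolute difference between two byte counts. */
static uint64_t byte_gap(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/* Whole gibibytes (2^30 bytes) contained in a byte count, rounded down. */
static uint64_t whole_gib(uint64_t bytes)
{
	return bytes >> 30;
}

/* The two runs stopped 557 bytes apart, within the same GiB of stream. */
static void check_failure_points(void)
{
	assert(byte_gap(FIRST_FAILURE, SECOND_FAILURE) == 557);
	assert(whole_gib(FIRST_FAILURE) == whole_gib(SECOND_FAILURE));
}
```

Both runs therefore stopped within a few hundred bytes of each other, at roughly 7.2 GB in decimal units, which is consistent with the figure given in the message.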
Re: [PATCH] Btrfs: remove transaction from send
> Let's try this again. We can deadlock the box if we send on a box and
> try to write onto the same fs with the app that is trying to listen to
> the send pipe. This is because the writer could get stuck waiting for
> a transaction commit which is being blocked by the send. So fix this
> by making sure looking at the commit roots is always going to be
> consistent. We do this by keeping track of which roots need to have
> their commit roots swapped during commit, and then taking the
> commit_root_sem and swapping them all at once. Then make sure we take
> a read lock on the commit_root_sem in cases where we search the commit
> root to make sure we're always looking at a consistent view of the
> commit roots. Previously we had problems with this because we would
> swap a fs tree commit root and then swap the extent tree commit root
> independently which would cause the backref walking code to screw up
> sometimes. With this patch we no longer deadlock and pass all the
> weird send/receive corner cases. Thanks,

Now btrfs send always searches the commit root! Your code only seems
to protect the backref code; it reduces transaction blocking but makes
it unsafe, as we have discussed before.

-Wang

> Reported-by: Hugo Mills <h...@carfax.org.uk>
> Signed-off-by: Josef Bacik <jba...@fb.com>
> ---
>  fs/btrfs/backref.c     | 33 +++
>  fs/btrfs/ctree.c       | 88 --
>  fs/btrfs/ctree.h       |  3 +-
>  fs/btrfs/disk-io.c     |  3 +-
>  fs/btrfs/extent-tree.c | 20 ++--
>  fs/btrfs/inode-map.c   | 14
>  fs/btrfs/send.c        | 57 ++--
>  fs/btrfs/transaction.c | 45 --
>  fs/btrfs/transaction.h |  1 +
>  9 files changed, 77 insertions(+), 187 deletions(-)
>
> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
> index 860f4f2..0be0e94 100644
> --- a/fs/btrfs/backref.c
> +++ b/fs/btrfs/backref.c
> @@ -329,7 +329,10 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
>  		goto out;
>  	}
>
> -	root_level = btrfs_old_root_level(root, time_seq);
> +	if (path->search_commit_root)
> +		root_level = btrfs_header_level(root->commit_root);
> +	else
> +		root_level = btrfs_old_root_level(root, time_seq);
>
>  	if (root_level + 1 == level) {
>  		srcu_read_unlock(&fs_info->subvol_srcu, index);
> @@ -1092,9 +1095,9 @@ static int btrfs_find_all_leafs(struct btrfs_trans_handle *trans,
>   *
>   * returns 0 on success, < 0 on error.
>   */
> -int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> -			 struct btrfs_fs_info *fs_info, u64 bytenr,
> -			 u64 time_seq, struct ulist **roots)
> +static int __btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> +				  struct btrfs_fs_info *fs_info, u64 bytenr,
> +				  u64 time_seq, struct ulist **roots)
>  {
>  	struct ulist *tmp;
>  	struct ulist_node *node = NULL;
> @@ -1130,6 +1133,20 @@ int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
>  	return 0;
>  }
>
> +int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> +			 struct btrfs_fs_info *fs_info, u64 bytenr,
> +			 u64 time_seq, struct ulist **roots)
> +{
> +	int ret;
> +
> +	if (!trans)
> +		down_read(&fs_info->commit_root_sem);
> +	ret = __btrfs_find_all_roots(trans, fs_info, bytenr, time_seq, roots);
> +	if (!trans)
> +		up_read(&fs_info->commit_root_sem);
> +	return ret;
> +}
> +
>  /*
>   * this makes the path point to (inum INODE_ITEM ioff)
>   */
> @@ -1509,6 +1526,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
>  		if (IS_ERR(trans))
>  			return PTR_ERR(trans);
>  		btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem);
> +	} else {
> +		down_read(&fs_info->commit_root_sem);
>  	}
>
>  	ret = btrfs_find_all_leafs(trans, fs_info, extent_item_objectid,
> @@ -1519,8 +1538,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
>
>  	ULIST_ITER_INIT(&ref_uiter);
>  	while (!ret && (ref_node = ulist_next(refs, &ref_uiter))) {
> -		ret = btrfs_find_all_roots(trans, fs_info, ref_node->val,
> -					   tree_mod_seq_elem.seq, &roots);
> +		ret = __btrfs_find_all_roots(trans, fs_info, ref_node->val,
> +					     tree_mod_seq_elem.seq, &roots);
>  		if (ret)
>  			break;
>  		ULIST_ITER_INIT(&root_uiter);
> @@ -1542,6 +1561,8 @@ out:
>  	if (!search_commit_root) {
>  		btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem);
>  		btrfs_end_transaction(trans, fs_info->extent_root);
> +	} else {
> +		up_read(&fs_info->commit_root_sem);
>  	}
>
>  	return ret;
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 88d1b1e..9d89c16 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@
Re: [PATCH] Btrfs: remove transaction from send
On 03/13/2014 06:16 PM, Hugo Mills wrote:
> On Thu, Mar 13, 2014 at 03:42:13PM -0400, Josef Bacik wrote:
> > Let's try this again. We can deadlock the box if we send on a box and
> > try to write onto the same fs with the app that is trying to listen to
> > the send pipe. This is because the writer could get stuck waiting for
> > a transaction commit which is being blocked by the send. So fix this
> > by making sure looking at the commit roots is always going to be
> > consistent. We do this by keeping track of which roots need to have
> > their commit roots swapped during commit, and then taking the
> > commit_root_sem and swapping them all at once. Then make sure we take
> > a read lock on the commit_root_sem in cases where we search the commit
> > root to make sure we're always looking at a consistent view of the
> > commit roots. Previously we had problems with this because we would
> > swap a fs tree commit root and then swap the extent tree commit root
> > independently which would cause the backref walking code to screw up
> > sometimes. With this patch we no longer deadlock and pass all the
> > weird send/receive corner cases. Thanks,
>
> There's something still going on here. I managed to get about
> twice as far through my test as I had before, but I again got an
> unexpected EOF in stream, with btrfs send returning 1. As before, I
> have this in syslog:
>
> Mar 13 22:09:12 s_src@amelia kernel: BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040\x0a

I just noticed that the offset you have there is freaking gigantic,
like 700mb, which is way larger than what an extent should be. Here
is a newer debug patch, just chuck the old one and put this instead
and re-run

http://paste.fedoraproject.org/85486/39482301

thanks,

Josef
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: remove transaction from send
On Fri, Mar 14, 2014 at 02:51:22PM -0400, Josef Bacik wrote:
> On 03/13/2014 06:16 PM, Hugo Mills wrote:
> > On Thu, Mar 13, 2014 at 03:42:13PM -0400, Josef Bacik wrote:
> > > Let's try this again. We can deadlock the box if we send on a box and
> > > try to write onto the same fs with the app that is trying to listen to
> > > the send pipe. This is because the writer could get stuck waiting for
> > > a transaction commit which is being blocked by the send. So fix this
> > > by making sure looking at the commit roots is always going to be
> > > consistent. We do this by keeping track of which roots need to have
> > > their commit roots swapped during commit, and then taking the
> > > commit_root_sem and swapping them all at once. Then make sure we take
> > > a read lock on the commit_root_sem in cases where we search the commit
> > > root to make sure we're always looking at a consistent view of the
> > > commit roots. Previously we had problems with this because we would
> > > swap a fs tree commit root and then swap the extent tree commit root
> > > independently which would cause the backref walking code to screw up
> > > sometimes. With this patch we no longer deadlock and pass all the
> > > weird send/receive corner cases. Thanks,
> >
> > There's something still going on here. I managed to get about
> > twice as far through my test as I had before, but I again got an
> > unexpected EOF in stream, with btrfs send returning 1. As before, I
> > have this in syslog:
> >
> > Mar 13 22:09:12 s_src@amelia kernel: BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040\x0a
>
> I just noticed that the offset you have there is freaking gigantic,
> like 700mb, which is way larger than what an extent should be. Here
> is a newer debug patch, just chuck the old one and put this instead
> and re-run
>
> http://paste.fedoraproject.org/85486/39482301

That last run, with the above patch, failed again, at approximately
the same place again. The only output in dmesg is:

[ 6488.168469] BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040, len=1294336

as before. Definitely no kernel WARN, no backtraces.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- You're never alone with a rubber duck... ---
[PATCH] Btrfs: remove transaction from send
Let's try this again. We can deadlock the box if we send on a box and
try to write onto the same fs with the app that is trying to listen to
the send pipe. This is because the writer could get stuck waiting for
a transaction commit which is being blocked by the send. So fix this
by making sure looking at the commit roots is always going to be
consistent. We do this by keeping track of which roots need to have
their commit roots swapped during commit, and then taking the
commit_root_sem and swapping them all at once. Then make sure we take
a read lock on the commit_root_sem in cases where we search the commit
root to make sure we're always looking at a consistent view of the
commit roots. Previously we had problems with this because we would
swap a fs tree commit root and then swap the extent tree commit root
independently which would cause the backref walking code to screw up
sometimes. With this patch we no longer deadlock and pass all the
weird send/receive corner cases. Thanks,

Reported-by: Hugo Mills <h...@carfax.org.uk>
Signed-off-by: Josef Bacik <jba...@fb.com>
---
 fs/btrfs/backref.c     | 33 +++
 fs/btrfs/ctree.c       | 88 --
 fs/btrfs/ctree.h       |  3 +-
 fs/btrfs/disk-io.c     |  3 +-
 fs/btrfs/extent-tree.c | 20 ++--
 fs/btrfs/inode-map.c   | 14
 fs/btrfs/send.c        | 57 ++--
 fs/btrfs/transaction.c | 45 --
 fs/btrfs/transaction.h |  1 +
 9 files changed, 77 insertions(+), 187 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 860f4f2..0be0e94 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -329,7 +329,10 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 		goto out;
 	}

-	root_level = btrfs_old_root_level(root, time_seq);
+	if (path->search_commit_root)
+		root_level = btrfs_header_level(root->commit_root);
+	else
+		root_level = btrfs_old_root_level(root, time_seq);

 	if (root_level + 1 == level) {
 		srcu_read_unlock(&fs_info->subvol_srcu, index);
@@ -1092,9 +1095,9 @@ static int btrfs_find_all_leafs(struct btrfs_trans_handle *trans,
  *
  * returns 0 on success, < 0 on error.
  */
-int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
-			 struct btrfs_fs_info *fs_info, u64 bytenr,
-			 u64 time_seq, struct ulist **roots)
+static int __btrfs_find_all_roots(struct btrfs_trans_handle *trans,
+				  struct btrfs_fs_info *fs_info, u64 bytenr,
+				  u64 time_seq, struct ulist **roots)
 {
 	struct ulist *tmp;
 	struct ulist_node *node = NULL;
@@ -1130,6 +1133,20 @@ int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
 	return 0;
 }

+int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
+			 struct btrfs_fs_info *fs_info, u64 bytenr,
+			 u64 time_seq, struct ulist **roots)
+{
+	int ret;
+
+	if (!trans)
+		down_read(&fs_info->commit_root_sem);
+	ret = __btrfs_find_all_roots(trans, fs_info, bytenr, time_seq, roots);
+	if (!trans)
+		up_read(&fs_info->commit_root_sem);
+	return ret;
+}
+
 /*
  * this makes the path point to (inum INODE_ITEM ioff)
  */
@@ -1509,6 +1526,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
 		if (IS_ERR(trans))
 			return PTR_ERR(trans);
 		btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem);
+	} else {
+		down_read(&fs_info->commit_root_sem);
 	}

 	ret = btrfs_find_all_leafs(trans, fs_info, extent_item_objectid,
@@ -1519,8 +1538,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,

 	ULIST_ITER_INIT(&ref_uiter);
 	while (!ret && (ref_node = ulist_next(refs, &ref_uiter))) {
-		ret = btrfs_find_all_roots(trans, fs_info, ref_node->val,
-					   tree_mod_seq_elem.seq, &roots);
+		ret = __btrfs_find_all_roots(trans, fs_info, ref_node->val,
+					     tree_mod_seq_elem.seq, &roots);
 		if (ret)
 			break;
 		ULIST_ITER_INIT(&root_uiter);
@@ -1542,6 +1561,8 @@ out:
 	if (!search_commit_root) {
 		btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem);
 		btrfs_end_transaction(trans, fs_info->extent_root);
+	} else {
+		up_read(&fs_info->commit_root_sem);
 	}

 	return ret;
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 88d1b1e..9d89c16 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -5360,7 +5360,6 @@ int btrfs_compare_trees(struct btrfs_root *left_root,
 {
 	int ret;
 	int cmp;
-	struct btrfs_trans_handle *trans = NULL;
 	struct btrfs_path
Re: [PATCH] Btrfs: remove transaction from send
On Thu, Mar 13, 2014 at 03:42:13PM -0400, Josef Bacik wrote:
> Let's try this again. We can deadlock the box if we send on a box and
> try to write onto the same fs with the app that is trying to listen to
> the send pipe. This is because the writer could get stuck waiting for
> a transaction commit which is being blocked by the send. So fix this
> by making sure looking at the commit roots is always going to be
> consistent. We do this by keeping track of which roots need to have
> their commit roots swapped during commit, and then taking the
> commit_root_sem and swapping them all at once. Then make sure we take
> a read lock on the commit_root_sem in cases where we search the commit
> root to make sure we're always looking at a consistent view of the
> commit roots. Previously we had problems with this because we would
> swap a fs tree commit root and then swap the extent tree commit root
> independently which would cause the backref walking code to screw up
> sometimes. With this patch we no longer deadlock and pass all the
> weird send/receive corner cases. Thanks,

There's something still going on here. I managed to get about
twice as far through my test as I had before, but I again got an
unexpected EOF in stream, with btrfs send returning 1. As before, I
have this in syslog:

Mar 13 22:09:12 s_src@amelia kernel: BTRFS error (device sda2): did not find backref in send_root. inode=1786631, offset=825257984, disk_byte=36504023040 found extent=36504023040\x0a

So, on the evidence of one data point (I'll have another one when
I wake up tomorrow morning), this has made the problem harder to
trigger but it's still possible.

Hugo.

> Reported-by: Hugo Mills <h...@carfax.org.uk>
> Signed-off-by: Josef Bacik <jba...@fb.com>
> ---
>  fs/btrfs/backref.c     | 33 +++
>  fs/btrfs/ctree.c       | 88 --
>  fs/btrfs/ctree.h       |  3 +-
>  fs/btrfs/disk-io.c     |  3 +-
>  fs/btrfs/extent-tree.c | 20 ++--
>  fs/btrfs/inode-map.c   | 14
>  fs/btrfs/send.c        | 57 ++--
>  fs/btrfs/transaction.c | 45 --
>  fs/btrfs/transaction.h |  1 +
>  9 files changed, 77 insertions(+), 187 deletions(-)
>
> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
> index 860f4f2..0be0e94 100644
> --- a/fs/btrfs/backref.c
> +++ b/fs/btrfs/backref.c
> @@ -329,7 +329,10 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
>  		goto out;
>  	}
>
> -	root_level = btrfs_old_root_level(root, time_seq);
> +	if (path->search_commit_root)
> +		root_level = btrfs_header_level(root->commit_root);
> +	else
> +		root_level = btrfs_old_root_level(root, time_seq);
>
>  	if (root_level + 1 == level) {
>  		srcu_read_unlock(&fs_info->subvol_srcu, index);
> @@ -1092,9 +1095,9 @@ static int btrfs_find_all_leafs(struct btrfs_trans_handle *trans,
>   *
>   * returns 0 on success, < 0 on error.
>   */
> -int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> -			 struct btrfs_fs_info *fs_info, u64 bytenr,
> -			 u64 time_seq, struct ulist **roots)
> +static int __btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> +				  struct btrfs_fs_info *fs_info, u64 bytenr,
> +				  u64 time_seq, struct ulist **roots)
>  {
>  	struct ulist *tmp;
>  	struct ulist_node *node = NULL;
> @@ -1130,6 +1133,20 @@ int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
>  	return 0;
>  }
>
> +int btrfs_find_all_roots(struct btrfs_trans_handle *trans,
> +			 struct btrfs_fs_info *fs_info, u64 bytenr,
> +			 u64 time_seq, struct ulist **roots)
> +{
> +	int ret;
> +
> +	if (!trans)
> +		down_read(&fs_info->commit_root_sem);
> +	ret = __btrfs_find_all_roots(trans, fs_info, bytenr, time_seq, roots);
> +	if (!trans)
> +		up_read(&fs_info->commit_root_sem);
> +	return ret;
> +}
> +
>  /*
>   * this makes the path point to (inum INODE_ITEM ioff)
>   */
> @@ -1509,6 +1526,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
>  		if (IS_ERR(trans))
>  			return PTR_ERR(trans);
>  		btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem);
> +	} else {
> +		down_read(&fs_info->commit_root_sem);
>  	}
>
>  	ret = btrfs_find_all_leafs(trans, fs_info, extent_item_objectid,
> @@ -1519,8 +1538,8 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
>
>  	ULIST_ITER_INIT(&ref_uiter);
>  	while (!ret && (ref_node = ulist_next(refs, &ref_uiter))) {
> -		ret = btrfs_find_all_roots(trans, fs_info, ref_node->val,
> -					   tree_mod_seq_elem.seq, &roots);
> +		ret = __btrfs_find_all_roots(trans, fs_info, ref_node->val,
> +					     tree_mod_seq_elem.seq, &roots);
>  		if (ret)