OOM killer invoked during btrfs send/receive on otherwise idle machine

2016-07-30 Thread Markus Trippelsdorf
Tonight the OOM killer got invoked during backup of /:

[Jul31 01:56] kthreadd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[  +0.04] CPU: 3 PID: 2 Comm: kthreadd Not tainted 4.7.0-06816-g797cee982eef-dirty #37
[  +0.00] Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503 04/13/2011
[  +0.02]   813c2d58 8802168e7d48 
002ec4ea
[  +0.02]  8118eb9d 01b8 0440 
03b0
[  +0.02]  8802133fe400 002ec4ea 81b8ac9c 
0006
[  +0.01] Call Trace:
[  +0.04]  [] ? dump_stack+0x46/0x6e
[  +0.03]  [] ? dump_header.isra.11+0x4c/0x1a7
[  +0.02]  [] ? oom_kill_process+0x2ab/0x460
[  +0.01]  [] ? out_of_memory+0x2e3/0x380
[  +0.02]  [] ? __alloc_pages_slowpath.constprop.124+0x1d32/0x1e40
[  +0.01]  [] ? __alloc_pages_nodemask+0x10c/0x120
[  +0.02]  [] ? copy_process.part.72+0xea/0x17a0
[  +0.02]  [] ? pick_next_task_fair+0x915/0x1520
[  +0.01]  [] ? kthread_flush_work_fn+0x20/0x20
[  +0.01]  [] ? kernel_thread+0x7a/0x1c0
[  +0.01]  [] ? kthreadd+0xd2/0x120
[  +0.02]  [] ? ret_from_fork+0x1f/0x40
[  +0.01]  [] ? kthread_stop+0x100/0x100
[  +0.01] Mem-Info:
[  +0.03] active_anon:5882 inactive_anon:60307 isolated_anon:0
   active_file:1523729 inactive_file:223965 isolated_file:0
   unevictable:1970 dirty:130014 writeback:40735 unstable:0
   slab_reclaimable:179690 slab_unreclaimable:8041
   mapped:6771 shmem:3 pagetables:592 bounce:0
   free:11374 free_pcp:54 free_cma:0
[  +0.04] Node 0 active_anon:23528kB inactive_anon:241228kB active_file:6094916kB inactive_file:895860kB unevictable:7880kB isolated(anon):0kB isolated(file):0kB mapped:27084kB dirty:520056kB writeback:162940kB shmem:12kB writeback_tmp:0kB unstable:0kB pages_scanned:32 all_unreclaimable? no
[  +0.02] DMA free:15908kB min:20kB low:32kB high:44kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  +0.01] lowmem_reserve[]: 0 3486 7953 7953
[  +0.04] DMA32 free:23456kB min:4996kB low:8564kB high:12132kB active_anon:2480kB inactive_anon:10564kB active_file:2559792kB inactive_file:478680kB unevictable:0kB writepending:365292kB present:3652160kB managed:3574264kB mlocked:0kB slab_reclaimable:437456kB slab_unreclaimable:12304kB kernel_stack:144kB pagetables:28kB bounce:0kB free_pcp:212kB local_pcp:0kB free_cma:0kB
[  +0.01] lowmem_reserve[]: 0 0 4466 4466
[  +0.03] Normal free:6132kB min:6400kB low:10972kB high:15544kB active_anon:21048kB inactive_anon:230664kB active_file:3535124kB inactive_file:417312kB unevictable:7880kB writepending:318020kB present:4718592kB managed:4574096kB mlocked:7880kB slab_reclaimable:281304kB slab_unreclaimable:19860kB kernel_stack:2944kB pagetables:2340kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  +0.00] lowmem_reserve[]: 0 0 0 0
[  +0.02] DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (M) = 15908kB
[  +0.05] DMA32: 4215*4kB (UMEH) 319*8kB (UMH) 5*16kB (H) 2*32kB (H) 2*64kB (H) 1*128kB (H) 0*256kB 1*512kB (H) 1*1024kB (H) 1*2048kB (H) 0*4096kB = 23396kB
[  +0.06] Normal: 650*4kB (UMH) 4*8kB (UH) 27*16kB (H) 23*32kB (H) 17*64kB (H) 11*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6296kB
[  +0.05] 1749526 total pagecache pages
[  +0.01] 150 pages in swap cache
[  +0.01] Swap cache stats: add 1222, delete 1072, find 2366/2401
[  +0.00] Free swap  = 4091520kB
[  +0.01] Total swap = 4095996kB
[  +0.00] 2096686 pages RAM
[  +0.01] 0 pages HighMem/MovableOnly
[  +0.00] 55619 pages reserved
[  +0.01] [ pid ]   uid  tgid total_vm   rss nr_ptes nr_pmds swapents oom_score_adj name
[  +0.04] [  153]     0   153     4087   406       9       3      104         -1000 udevd
[  +0.01] [  181]     0   181     5718  1169      15       3      143             0 syslog-ng
[  +0.01] [  187]   102   187    88789  5137      53       3      663             0 mpd
[  +0.02] [  188]     0   188    22278  1956      16       3        0             0 ntpd
[  +0.01] [  189]     0   189     4973   859      14       3      188             0 cupsd
[  +0.01] [  192]     0   192     2680   391      10       3       21             0 fcron
[  +0.01] [  219]     0   219     4449   506      13       3        0             0 login
[  +0.02] [  220]     0   220     2876   368       9       3        0             0 agetty
[  +0.01] [  222]31   2222719320995  57   
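
A note on reading the per-zone free lists in the report above: each "N*SkB" entry is N free blocks of S kB, and with 4kB pages an order=2 request needs one contiguous 16kB block. The following is a minimal Python sketch, not part of the original report, that totals one such line and counts the blocks of at least a given size. The helper names parse_buddy_line and blocks_at_least are hypothetical, and the letters in parentheses are carried along only as labels.

```python
import re

# One "N*SkB (TYPES)" entry; the parenthesised letters are optional.
ENTRY = re.compile(r"(\d+)\*(\d+)kB(?:\s+\(([A-Z]+)\))?")


def parse_buddy_line(line):
    """Return (zone, [(count, size_kb, types), ...], reported_total_kb)."""
    zone, rest = line.split(":", 1)
    zone = zone.split("]")[-1].strip()      # drop a leading "[  +0.06]" stamp
    lists, total = rest.rsplit("=", 1)
    entries = [(int(n), int(kb), t) for n, kb, t in ENTRY.findall(lists)]
    reported = int(re.search(r"(\d+)kB", total).group(1))
    return zone, entries, reported


def blocks_at_least(entries, min_kb):
    """Count free blocks whose size is at least min_kb."""
    return sum(n for n, kb, _ in entries if kb >= min_kb)


# Sample copied from the Normal zone line of the report.
normal = ("[  +0.06] Normal: 650*4kB (UMH) 4*8kB (UH) 27*16kB (H) 23*32kB (H) "
          "17*64kB (H) 11*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB "
          "0*4096kB = 6296kB")
zone, entries, reported = parse_buddy_line(normal)
computed = sum(n * kb for n, kb, _ in entries)
print(zone, computed, reported, blocks_at_least(entries, 16))
```

The computed sum should agree with the total the kernel prints at the end of the line, which is a quick check that a wrapped log line was rejoined correctly.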

Re: btrfs: relocation: Fix leaking qgroups numbers on data extents

2016-07-30 Thread Qu Wenruo



On 07/30/2016 11:57 PM, Goldwyn Rodrigues wrote:
> On 07/29/2016 08:06 PM, Qu Wenruo wrote:
>> Hi, Goldwyn,
>>
>> This patch is replaced by the following patchset:
>> https://patchwork.kernel.org/patch/9213915/
>> https://patchwork.kernel.org/patch/9213913/
>>
>> Would you mind testing the new patch?
>
> Sorry, it fails. Actually, the previous patch fails on one of the more
> aggressive tests by Mark. So, I would revoke my "Tested-by: " there.
>
> Attached is Mark's test case. Make sure your /boot and /usr directories
> have enough files.

Thanks for the info.

I'll check it further.

Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Balance and subvolume delete causes deadlock and metadata corruption

2016-07-30 Thread Hans van Kranenburg
On 07/31/2016 02:13 AM, Hans van Kranenburg wrote:
> On 07/31/2016 01:18 AM, Hans van Kranenburg wrote:
>> blahblahblahblahblalahba
>>
>> Doing a btrfs check now on the block device, will fup with the output.
>>
> 
> Output so far:
> 
> -# btrfs check /dev/xvdc 2>&1 | tee btrfs-check-xvdc
> checking extents
> ref mismatch on [3649516437504 16384] extent item 0, found 1
> Backref 3649516437504 parent 4100606377984 root 4100606377984 not found
> in extent tree
> backpointer mismatch on [3649516437504 16384]
> owner ref check failed [3649516437504 16384]
> ref mismatch on [4091477114880 16384] extent item 0, found 1
> [...]
> ref mismatch on [4103438221312 16384] extent item 0, found 1
> Backref 4103438221312 parent 4100606377984 root 4100606377984 not found
> in extent tree
> backpointer mismatch on [4103438221312 16384]
> owner ref check failed [4103438221312 16384]
> checking free space tree
> checking fs roots
> [...]
> 

Yay...

checking free space tree
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/xvdc
UUID: 37c37071-4080-418b-a27a-29d7d15ec00d
cache and super generation don't match, space cache will be invalidated
found 1877500653751 bytes used err is 0
total csum bytes: 1805473324
total tree bytes: 28709109760
total fs tree bytes: 26028736512
total extent tree bytes: 589660160
btree space waste bytes: 4777447505
file data blocks allocated: 2809995116544
 referenced 2809992523776
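
To summarize check output like the above, here is a minimal Python sketch (not from the original message). It counts the distinct extent start addresses reported as "ref mismatch" and how many of those also have an "owner ref check failed" line. The function name summarize_check is hypothetical, and the sketch assumes any mail quoting prefix has already been removed from the log.

```python
import re

MISMATCH = re.compile(r"^ref mismatch on \[(\d+) (\d+)\]")
OWNER_FAIL = re.compile(r"^owner ref check failed \[(\d+) (\d+)\]")


def summarize_check(text):
    """Count extents with a ref mismatch, and those that also fail the owner check."""
    mismatched, owner_failed = set(), set()
    for line in text.splitlines():
        line = line.strip()
        m = MISMATCH.match(line)
        if m:
            mismatched.add(int(m.group(1)))
            continue
        m = OWNER_FAIL.match(line)
        if m:
            owner_failed.add(int(m.group(1)))
    return {
        "ref_mismatch": len(mismatched),
        "owner_ref_failed": len(mismatched & owner_failed),
    }


# Two records copied from the check output in this thread.
sample = """\
ref mismatch on [3649516437504 16384] extent item 0, found 1
Backref 3649516437504 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [3649516437504 16384]
owner ref check failed [3649516437504 16384]
ref mismatch on [4091477114880 16384] extent item 0, found 1
Backref 4091477114880 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4091477114880 16384]
"""
print(summarize_check(sample))
```

Feeding it the saved btrfs-check-xvdc file instead of the sample would give the totals for the whole run.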

-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-30 Thread Chris Murphy
On Sat, Jul 30, 2016 at 2:02 PM, Chris Murphy  wrote:
> Short version: When systemd-logind logind.conf KillUserProcesses=yes,
> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and

Same thing with Xfce, so it's not DE specific. (Unsurprising.)

I inflated the size of the test volume, and it seems pretty clear that
the scrub is not completing, as the kernel threads stop sooner when
logging out vs not logging out. So the status reporting an
interruption appears to be valid for the net operation, not merely the
user space tool being interrupted.



-- 
Chris Murphy


Re: Balance and subvolume delete causes deadlock and metadata corruption

2016-07-30 Thread Hans van Kranenburg
On 07/31/2016 01:18 AM, Hans van Kranenburg wrote:
> blahblahblahblahblalahba
> 
> Doing a btrfs check now on the block device, will fup with the output.
> 

Output so far:

-# btrfs check /dev/xvdc 2>&1 | tee btrfs-check-xvdc
checking extents
ref mismatch on [3649516437504 16384] extent item 0, found 1
Backref 3649516437504 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [3649516437504 16384]
owner ref check failed [3649516437504 16384]
ref mismatch on [4091477114880 16384] extent item 0, found 1
Backref 4091477114880 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4091477114880 16384]
ref mismatch on [4092213280768 16384] extent item 0, found 1
Backref 4092213280768 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4092213280768 16384]
owner ref check failed [4092213280768 16384]
ref mismatch on [4093204430848 16384] extent item 0, found 1
Backref 4093204430848 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4093204430848 16384]
owner ref check failed [4093204430848 16384]
ref mismatch on [4093217013760 16384] extent item 0, found 1
Backref 4093217013760 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4093217013760 16384]
owner ref check failed [4093217013760 16384]
ref mismatch on [4093640835072 16384] extent item 0, found 1
Backref 4093640835072 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4093640835072 16384]
owner ref check failed [4093640835072 16384]
ref mismatch on [4094577901568 16384] extent item 0, found 1
Backref 4094577901568 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4094577901568 16384]
ref mismatch on [4095055790080 16384] extent item 0, found 1
Backref 4095055790080 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095055790080 16384]
owner ref check failed [4095055790080 16384]
ref mismatch on [4095118868480 16384] extent item 0, found 1
Backref 4095118868480 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095118868480 16384]
owner ref check failed [4095118868480 16384]
ref mismatch on [4095127683072 16384] extent item 0, found 1
Backref 4095127683072 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095127683072 16384]
owner ref check failed [4095127683072 16384]
ref mismatch on [4095172657152 16384] extent item 0, found 1
Backref 4095172657152 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095172657152 16384]
owner ref check failed [4095172657152 16384]
ref mismatch on [4095363858432 16384] extent item 0, found 1
Backref 4095363858432 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095363858432 16384]
owner ref check failed [4095363858432 16384]
ref mismatch on [4095556337664 16384] extent item 0, found 1
Backref 4095556337664 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095556337664 16384]
owner ref check failed [4095556337664 16384]
ref mismatch on [4095593791488 16384] extent item 0, found 1
Backref 4095593791488 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095593791488 16384]
ref mismatch on [4095625068544 16384] extent item 0, found 1
Backref 4095625068544 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095625068544 16384]
ref mismatch on [4095656804352 16384] extent item 0, found 1
Backref 4095656804352 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095656804352 16384]
owner ref check failed [4095656804352 16384]
ref mismatch on [4095848955904 16384] extent item 0, found 1
Backref 4095848955904 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095848955904 16384]
owner ref check failed [4095848955904 16384]
ref mismatch on [4095949471744 16384] extent item 0, found 1
Backref 4095949471744 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095949471744 16384]
owner ref check failed [4095949471744 16384]
ref mismatch on [4095954468864 16384] extent item 0, found 1
Backref 4095954468864 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095954468864 16384]
owner ref check failed [4095954468864 16384]
ref mismatch on [4095955189760 16384] extent item 0, found 1
Backref 4095955189760 parent 4100606377984 root 4100606377984 not found
in extent tree
backpointer mismatch on [4095955189760 16384]
owner ref check failed [4095955189760 16384]
ref mismatch on [4096263684096 16384] extent item 0, found 1
Backref 4096263684096 parent 4100606377984 root 4100606377984 not found
in extent tree

Balance and subvolume delete causes deadlock and metadata corruption

2016-07-30 Thread Hans van Kranenburg
Hi,

tl;dr: concurrent metadata balance and subvol delete caused deadlock and
metadata corruption. could mount ro, copied data off of the filesystem.
filesystem still available for science, can possibly mount rw, but
crashes when space_tree v2 bails out when it sees the metadata corruption.

Yesterday, I changed a btrfs filesystem to start using skinny metadata,
and did a metadata balance for fun to convert all backrefs to skinny ones.

-# btrfstune -x /dev/xvdb
-# mount /mnt

-# btrfs fi df /mnt
Data, single: total=2.05TiB, used=1.68TiB
System, single: total=32.00MiB, used=368.00KiB
Metadata, single: total=30.00GiB, used=26.70GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

-# btrfs balance start -musage=100 /mnt

Balance on '/mnt' is running
1 out of about 31 chunks balanced (299 considered),  97% left
Data, single: total=2101.01GiB, used=1722.78GiB
System, single: total=0.03GiB, used=0.00GiB
Metadata, single: total=30.00GiB, used=26.70GiB
GlobalReserve, single: total=0.50GiB, used=0.05GiB

This went fine, until somewhere on the filesystem, a btrfs subvolume
delete happened. I already try to avoid all possible non-trivial things
from happening when using balance because of bad experiences with
crashes and deadlocks with balance and subvolume operations at the same
time... but I apparently forgot a very specific cron job creeping around
in a corner that was awaking and causing a subvolume delete...

The btrfs subvol delete hung in state D. I lost the ps axfu output.

This happened with kernel 4.5.4(-1~bpo8+1):

INFO: task kworker/u8:1:16983 blocked for more than 120 seconds.
  Tainted: GW   E   4.5.0-0.bpo.2-amd64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:1D 8801f5c15e00 0 16983  2 0x
Workqueue: writeback wb_workfn (flush-btrfs-2)
 8801f29f23c0 8801f2f30440 8800179f 8800132b3cc0
 8800132b3cd8 8800179ef9f8 8800132b3cb8 0001
 815b6451 8800132b3c58 c00c6664 8800
Call Trace:
 [] ? schedule+0x31/0x80
 [] ? btrfs_tree_lock+0x74/0x210 [btrfs]
 [] ? wait_woken+0x90/0x90
 [] ? lock_extent_buffer_for_io+0x1dc/0x1f0 [btrfs]
 [] ? btree_write_cache_pages+0x285/0x3a0 [btrfs]
 [] ? __writeback_single_inode+0x3d/0x320
 [] ? writeback_sb_inodes+0x23d/0x470
 [] ? __writeback_inodes_wb+0x87/0xb0
 [] ? wb_writeback+0x280/0x310
 [] ? wb_workfn+0x213/0x3e0
 [] ? process_one_work+0x14b/0x400
 [] ? worker_thread+0x65/0x4a0
 [] ? rescuer_thread+0x340/0x340
 [] ? kthread+0xdf/0x100
 [] ? kthread_park+0x50/0x50
 [] ? ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x50/0x50
INFO: task kworker/u8:2:7762 blocked for more than 120 seconds.
  Tainted: GW   E   4.5.0-0.bpo.2-amd64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:2D 8801f5d15e00 0  7762  2 0x
Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
 880151d85140 8801f4dfe180 88012aac4000 8801c9ccf9f0
 8801c9ccf800 8801c9ccf9f0 0001 880151d85140
 815b6451 880160aad050 c0089d0d 8801
Call Trace:
 [] ? schedule+0x31/0x80
 [] ? wait_current_trans.isra.21+0xcd/0x110 [btrfs]
 [] ? wait_woken+0x90/0x90
 [] ? start_transaction+0x286/0x4d0 [btrfs]
 [] ? delayed_ref_async_start+0x13/0x80 [btrfs]
 [] ? normal_work_helper+0xc6/0x2c0 [btrfs]
 [] ? process_one_work+0x14b/0x400
 [] ? worker_thread+0x65/0x4a0
 [] ? rescuer_thread+0x340/0x340
 [] ? kthread+0xdf/0x100
 [] ? kthread_park+0x50/0x50
 [] ? ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x50/0x50
INFO: task btrfs-transacti:8607 blocked for more than 120 seconds.
  Tainted: GW   E   4.5.0-0.bpo.2-amd64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D 8801f5d95e00 0  8607  2 0x
 8800fe2ff100 8801f4e08f80 88005ad38000 88002ab9d978
 88002ab9d990 88005ad37c50 88002ab9d970 0001
 815b6451 88002ab9d910 c00c6664 8801
Call Trace:
 [] ? schedule+0x31/0x80
 [] ? btrfs_tree_lock+0x74/0x210 [btrfs]
 [] ? wait_woken+0x90/0x90
 [] ? btrfs_search_slot+0x6cd/0x9e0 [btrfs]
 [] ? btrfs_update_root+0x5e/0x340 [btrfs]
 [] ? commit_fs_roots.isra.19+0x110/0x160 [btrfs]
 [] ? btrfs_commit_transaction+0x4fc/0xa30 [btrfs]
 [] ? wait_woken+0x90/0x90
 [] ? transaction_kthread+0x1d2/0x240 [btrfs]
 [] ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]
 [] ? kthread+0xdf/0x100
 [] ? kthread_park+0x50/0x50
 [] ? ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x50/0x50
INFO: task btrfs-uuid:8608 blocked for more than 120 seconds.
  Tainted: GW   E   4.5.0-0.bpo.2-amd64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-uuid  D 8801f5c15e00 0  8608  2 0x
 8800b5744f00 81a13540 8800df94 8800df93fbe8
 8801756f6d18 

Fixup direct bi_rw modifiers

2016-07-30 Thread Shaun Tancheff
bi_rw should be set via bio_set_op_attrs rather than assigned directly.

Signed-off-by: Shaun Tancheff 

Cc: Chris Mason 
Cc: Josef Bacik 
Cc: David Sterba 
Cc: Mike Christie 
---
Patch is against linux-next tag next-20160729

NOTE: In 4.7 this was not including the 'WRITE' macro, so it may not
  have been operating as intended.
---
 fs/btrfs/extent_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f67d6a1..720e6ef 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2050,7 +2050,7 @@ int repair_io_failure(struct inode *inode, u64 start, u64 length, u64 logical,
 		return -EIO;
 	}
 	bio->bi_bdev = dev->bdev;
-	bio->bi_rw = WRITE_SYNC;
+	bio_set_op_attrs(bio, REQ_OP_WRITE, WRITE_SYNC);
 	bio_add_page(bio, page, length, pg_offset);
 
 	if (btrfsic_submit_bio_wait(bio)) {
-- 
2.8.1



systemd KillUserProcesses=yes and btrfs scrub

2016-07-30 Thread Chris Murphy
Short version: When systemd-logind logind.conf KillUserProcesses=yes,
and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and
then logs out of the shell, the user space operation is killed, and
btrfs scrub status reports that the scrub was aborted. [1]

I think what's going on is the user space stuff is what's tracking the
status and statistics, so when that process goes to status Z on
logout, all of that accounting stops. But I can't tell if the kernel
scrub code finishes. Those threads continue some time after btrfs user
space process goes Z, but seems like not quite long enough to actually
finish the scrub.

Neither 'btrfs balance &' (I have not tried the background balance
code in progs 4.7) nor 'btrfs replace start' appear to be likewise
affected. The user process remains after logout and the task appears
to complete without problems. So at the moment I'm only thinking scrub
is affected, but I'm not sure why.


[1]
systemd KillUserProcesses=yes and btrfs scrub
https://bugzilla.kernel.org/show_bug.cgi?id=150781


-- 
Chris Murphy


Re: Btrfs send to send out metadata and data separately

2016-07-30 Thread g . btrfs
On 29/07/16 13:40, Qu Wenruo wrote:
> Cons:
> 1) Not full fs clone detection
>Such clone detection is only inside the send snapshot.
> 
>For case that one extent is referred only once in the send snapshot,
>but also referred by source subvolume, then in the received
>subvolume, it will be a new extent, but not a clone.
> 
>Only extent that is referred twice by send snapshot, that extent
>will be shared.
> 
>(Although much better than disabling the whole clone detection)

Qu,

Does that mean that the following, extremely common, use of send would
be impacted?

Create many snapshots of a large and fairly busy sub-volume (say,
hourly) with few changes between each one. Send all the snapshots as
incremental sends to a second (backup) disk either as soon as they are
created, or maybe in bunches later.

With this change, would each of the snapshots require separate space
usage on the backup disk, with duplicates of unchanged files?  If so,
that would completely destroy the concept of keeping frequent snapshots
on a backup disk (and force us to keep the snapshots on the original
disk, causing **many** more problems with backref walks on the data disk).

(Does the answer change if we do non-incremental sends?)

I moved to this approach after the problems I had running balance on my
(very busy, and also large) data disk because of the number of snapshots
I was keeping on it.  My data disk has about 4TB in use, and I have just
bought a 10TB backup disk but I would need about 50 more of them if the
hourly snapshots were no longer sharing space! If that is the case, the
cure seems much worse than the disease.

Apologies if I have misunderstood the proposal.
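
The scale of the concern can be sketched with rough arithmetic. The Python below is an illustration only, assuming the worst case described above, where every received snapshot is stored as a full, unshared copy of about 4TB. The function names are hypothetical, and the figure of 51 disks is simply the one 10TB disk plus the "about 50 more" mentioned in the message.

```python
def disks_needed(snapshots, used_tb, disk_tb):
    """Disks required if every snapshot is a full, unshared copy (rounded up)."""
    return -(-snapshots * used_tb // disk_tb)


def full_copies_that_fit(disks, disk_tb, used_tb):
    """Number of full, unshared copies that fit on the given disks."""
    return (disks * disk_tb) // used_tb


copies = full_copies_that_fit(disks=51, disk_tb=10, used_tb=4)
print(copies)                       # full copies on 51 disks of 10TB
print(round(copies / 24, 1))        # days of hourly snapshots that represents
print(disks_needed(copies, used_tb=4, disk_tb=10))
```

Under these assumptions, 51 disks hold 127 full copies, which is a little over five days of hourly snapshots.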



Re: btrfs: relocation: Fix leaking qgroups numbers on data extents

2016-07-30 Thread Goldwyn Rodrigues


On 07/29/2016 08:06 PM, Qu Wenruo wrote:
> Hi, Goldwyn,
> 
> This patch is replaced by the following patchset:
> https://patchwork.kernel.org/patch/9213915/
> https://patchwork.kernel.org/patch/9213913/
> 
> Would you mind testing the new patch?
> 


Sorry, it fails. Actually, the previous patch fails on one of the more
aggressive tests by Mark. So, I would revoke my "Tested-by: " there.

Attached is Mark's test case. Make sure your /boot and /usr directories
have enough files.


-- 
Goldwyn


reproduce-1.sh
Description: application/shellscript


bad tree block start & failed to read chunk root

2016-07-30 Thread Hendrik Friedel

Hello,

Following on from https://www.spinics.net/lists/linux-btrfs/msg57405.html,
I still have a damaged btrfs file system (the partition was recovered;
thanks, Chris).


When mounting, I get:
[15681.255356] BTRFS info (device sda1): disk space caching is enabled
[15681.255690] BTRFS error (device sda1): bad tree block start 0 20987904
[15681.255786] BTRFS error (device sda1): bad tree block start 0 20987904

[15681.255805] BTRFS: failed to read chunk root on

So I ran chunk-recover (output below).
Total Chunks:   5
 Recoverable:  0
 Unrecoverable:5

Is there anything else I can do, or is that it?

Regards,
Hendrik

root@homeserver:~# btrfs rescue chunk-recover -v /dev/sda1
All Devices:
   Device: id = 1, name = /dev/sda1

Scanning: DONE in dev0
DEVICE SCAN RESULT:
Filesystem Information:
   sectorsize: 4096
   leafsize: 16384
   tree root generation: 22444670
   chunk root generation: 233419

All Devices:
   Device: id = 1, name = /dev/sda1

All Block Groups:
   Block Group: start = 7776239616, len = 1073741824, flag = 1
   Block Group: start = 61496885248, len = 536870912, flag = 24
   Block Group: start = 62033756160, len = 536870912, flag = 24
   Block Group: start = 71697432576, len = 1073741824, flag = 1
   Block Group: start = 102835945472, len = 1073741824, flag = 1

All Chunks:

All Device Extents:

CHECK RESULT:
Recoverable Chunks:
Unrecoverable Chunks:
 Chunk: start = 7776239616, len = 1073741824, type = 1, num_stripes = 0
 Stripes list:
 Block Group: start = 7776239616, len = 1073741824, flag = 1
 No device extent.
 Chunk: start = 62033756160, len = 536870912, type = 24, num_stripes = 0
 Stripes list:
 Block Group: start = 62033756160, len = 536870912, flag = 24
 No device extent.
 Chunk: start = 61496885248, len = 536870912, type = 24, num_stripes = 0
 Stripes list:
 Block Group: start = 61496885248, len = 536870912, flag = 24
 No device extent.
 Chunk: start = 71697432576, len = 1073741824, type = 1, num_stripes = 0
 Stripes list:
 Block Group: start = 71697432576, len = 1073741824, flag = 1
 No device extent.
 Chunk: start = 102835945472, len = 1073741824, type = 1, num_stripes = 0
 Stripes list:
 Block Group: start = 102835945472, len = 1073741824, flag = 1
 No device extent.

Total Chunks:   5
 Recoverable:  0
 Unrecoverable:5

Orphan Block Groups:

Orphan Device Extents:

Couldn't map the block 62034313216
No mapping for 62034313216-62034329600
Couldn't map the block 62034313216
bytenr mismatch, want=62034313216, have=0
Couldn't read tree root
open with broken chunk error
Chunk tree recovery failed




Re: [PATCH] btrfs: Change RAID stripesize to a user-configurable option

2016-07-30 Thread Sanidhya Solanki
Any comments?

On Thu, 28 Jul 2016 13:32:27 +0200
David Sterba  wrote:

> I'll comment on the overall approach and skip code-specific comments.
> 
> The changelog does not explain why there's a need for a new blockgroup
> type and what's the relation to the existing types. It seems that it
> extends the data/metadata/system group, but I think this is totally
> wrong.

I agree in principle, but I did not want to modify the existing balance
targets; instead, I wanted to piggyback on the existing balance
implementation to re-balance the data.
This approach was recommended to me by an experienced Btrfs developer
on IRC as the right way to implement the change. My previous
implementation, before asking on IRC, used a new ioctl call to
change the hard-coded values and then rewrite the data (not a good
approach in hindsight).
 
> The proposed changes modify part of the on-disk format, that would
> require an incompat bit and brings the usual load of unpleasant issues
> with backward compatibility. The current data structures should be
> enough for configurable stripe size.  If you want to make stripe size
> configurable, then replace all hardcoded values of BTRFS_STRIPE_LEN.

So no re-balance would be required after issuing the stripe size change
command? What about the on-disk metadata that relies on "stripesize" and
"stripe_len" as variables for calculations and as the basis of pre-set
metadata?

Thanks
Sanidhya


Re: btrfs on sparc64 results in kernel stack trace in 1 minute test

2016-07-30 Thread Anatoly Pugachev
On Sat, Jul 30, 2016 at 12:52 AM, Jeff Mahoney  wrote:
>> On Jul 29, 2016, at 5:11 PM, Anatoly Pugachev  wrote:
>> and in logs:
>>
>> Jul 30 00:05:48 nvg5120 kernel: BTRFS info (device loop0): inode
>> 227514 still on the orphan list
>> Jul 30 00:06:01 nvg5120 kernel: [ cut here ]
>> Jul 30 00:06:01 nvg5120 kernel: WARNING: CPU: 36 PID: 3110 at
>> fs/btrfs/inode.c:3215 btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel: Modules linked in: loop btrfs
>> zlib_deflate sg e1000e ptp pps_core n2_crypto(+) flash sha256_generic
>> des_generic n2_rng rng_core sunrpc autofs4 ext4 crc16 jbd2 mbcache
>> raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy
>> async_pq raid6_pq async_xor xor async_tx raid0 multipath linear dm_mod
>> raid1 md_mod sd_mod mptsas scsi_transport_sas mptscsih scsi_mod
>> mptbase
>> Jul 30 00:06:02 nvg5120 kernel: CPU: 36 PID: 3110 Comm:
>> btrfs-transacti Tainted: G  D 4.7.0+ #51
>> Jul 30 00:06:02 nvg5120 kernel: Call Trace:
>> Jul 30 00:06:02 nvg5120 kernel:  [00463e44] __warn+0xa4/0xc0
>> Jul 30 00:06:02 nvg5120 kernel:  [10a2ae48]
>> btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a214c0]
>> commit_fs_roots+0xa0/0x180 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a242d0]
>> btrfs_commit_transaction+0x4b0/0xd00 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a1cc30]
>> transaction_kthread+0xf0/0x1c0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [00480ff0] kthread+0xb0/0xe0
>> Jul 30 00:06:02 nvg5120 kernel:  [00406044] ret_from_fork+0x1c/0x2c
>> Jul 30 00:06:02 nvg5120 kernel:  []   (null)
>> Jul 30 00:06:02 nvg5120 kernel: ---[ end trace ee8374e54a090229 ]---
>>
> This is tainted D, which means there's an Oops above this in the log.  Can 
> you provide that?


Jeff,

It is another kernel Oops, which I need to investigate:

Jul 29 21:25:35 nvg5120 kernel: e1000e :09:00.1 enp9s0f1: renamed from eth3
Jul 29 21:25:35 nvg5120 systemd-udevd[1488]: worker [1654] terminated
by signal 9 (Killed)
Jul 29 21:25:35 nvg5120 systemd-udevd[1488]: worker [1654] failed
while handling '/devices/root/f0283a50/f028681c'
Jul 29 21:25:36 nvg5120 systemd[1]: Found device ST914602SSUN146G 1.
Jul 29 21:25:40 nvg5120 kernel: e1000e :08:00.1 enp8s0f1: renamed from eth1
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: md5 alg registration failed
Jul 29 21:25:40 nvg5120 kernel: n2cp f028681c:
/virtual-devices@100/n2cp@7: Unable to register algorithms.
Jul 29 21:25:40 nvg5120 kernel: sha1_sparc64: sparc64 sha1 opcode not available.
Jul 29 21:25:40 nvg5120 kernel: n2cp: probe of f028681c failed with error -22
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: Found NCP at
/virtual-devices@100/ncp@6
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: Registered NCS HVAPI version 2.0
Jul 29 21:25:40 nvg5120 kernel: Kernel unaligned access at TPC[577b68]
kmem_cache_alloc+0xa8/0x1a0
Jul 29 21:25:40 nvg5120 kernel: Unable to handle kernel paging request
in mna handler
Jul 29 21:25:40 nvg5120 kernel:  at virtual address 6b6aeb6f69f2cb6b
Jul 29 21:25:41 nvg5120 kernel: current->{active_,}mm->context =
07a2
Jul 29 21:25:41 nvg5120 kernel: current->{active_,}mm->pgd = 8003e9c72000
Jul 29 21:25:41 nvg5120 kernel:   \|/  \|/
  "@'/ .. \`@"
  /_| \__/ |_\
 \__U_/
Jul 29 21:25:41 nvg5120 kernel: systemd-udevd(1654): Oops [#1]
Jul 29 21:25:41 nvg5120 kernel: CPU: 56 PID: 1654 Comm: systemd-udevd
Not tainted 4.7.0+ #51
Jul 29 21:25:41 nvg5120 kernel: task: 8003ecf90a20 ti:
8003edcd4000 task.ti: 8003edcd4000
Jul 29 21:25:41 nvg5120 kernel: TSTATE: 004411e01605 TPC:
00577b68 TNPC: 00577b6c Y: Not tainted
Jul 29 21:25:41 nvg5120 kernel: TPC: 
Jul 29 21:25:41 nvg5120 kernel: g0:  g1:
6b6b6b6b6b6b6b6b g2:  g3: 
Jul 29 21:25:41 nvg5120 kernel: g4: 8003ecf90a20 g5:
8003fe876000 g6: 8003edcd4000 g7: cee0
Jul 29 21:25:41 nvg5120 kernel: o0:  o1:
03ff o2:  o3: 8003eee883c0
Jul 29 21:25:41 nvg5120 kernel: o4: 0080 o5:
0011 sp: 8003edcd6b51 ret_pc: 00577b34
Jul 29 21:25:41 nvg5120 kernel: RPC: 
Jul 29 21:25:41 nvg5120 kernel: l0: 8003ffa28040 l1:
8003ffa28030 l2: d5c0 l3: 009f4800
Jul 29 21:25:41 nvg5120 kernel: l4:  l5:
009f4c00 l6: 00ab2968 l7: 00ab296a
Jul 29 21:25:41 nvg5120 kernel: i0: 8003f1dad580 i1:
024080c0 i2: 106230e8 i3: 
Jul 29 21:25:41 nvg5120 kernel: i4: 10621d90 i5:
024080c0 i6: