Re: experiences running btrfs on external USB disks?

2018-12-03 Thread Tomasz Chmielewski

On 2018-12-04 14:59, Chris Murphy wrote:

Running 4.19.6 right now, but was experiencing the issue also with 
4.18

kernels.



# btrfs device stats /data
[/dev/sda1].write_io_errs    0
[/dev/sda1].read_io_errs     0
[/dev/sda1].flush_io_errs    0
[/dev/sda1].corruption_errs  0
[/dev/sda1].generation_errs  0



Hard to say without a complete dmesg; but errno=-5 IO failure is
pretty much some kind of hardware problem in my experience. I haven't
seen it be a bug.


It is a complete dmesg - in the sense that:

# grep -i btrfs -A5 -B5 /var/log/syslog
Dec  4 05:06:56 step snapd[747]: udevmon.go:184: udev monitor observed 
remove event for unknown device 
"/sys/skbuff_head_cache(1481:anacron.service)"
Dec  4 05:06:56 step snapd[747]: udevmon.go:184: udev monitor observed 
remove event for unknown device "/sys/buffer_head(1481:anacron.service)"
Dec  4 05:06:56 step snapd[747]: udevmon.go:184: udev monitor observed 
remove event for unknown device 
"/sys/ext4_inode_cache(1481:anacron.service)"
Dec  4 05:15:01 step CRON[9352]: (root) CMD (command -v debian-sa1 > 
/dev/null && debian-sa1 1 1)
Dec  4 05:17:01 step CRON[9358]: (root) CMD (   cd / && run-parts 
--report /etc/cron.hourly)
Dec  4 05:23:13 step kernel: [77760.444607] BTRFS error (device sdb1): 
bad tree block start, want 378372096 have 0
Dec  4 05:23:13 step kernel: [77760.550933] BTRFS error (device sdb1): 
bad tree block start, want 378372096 have 0
Dec  4 05:23:13 step kernel: [77760.550972] BTRFS: error (device sdb1) 
in __btrfs_free_extent:6804: errno=-5 IO failure
Dec  4 05:23:13 step kernel: [77760.550979] BTRFS info (device sdb1): 
forced readonly
Dec  4 05:23:13 step kernel: [77760.551003] BTRFS: error (device sdb1) 
in btrfs_run_delayed_refs:2935: errno=-5 IO failure
Dec  4 05:23:13 step kernel: [77760.553223] BTRFS error (device sdb1): 
pending csums is 4096
Dec  4 05:23:14 step postfix/pickup[8993]: 13BBE460F86: uid=0 
from=
Dec  4 05:23:14 step postfix/cleanup[9398]: 13BBE460F86: 
message-id=<20181204052314.13BBE460F86@step>
Dec  4 05:23:14 step postfix/qmgr[2745]: 13BBE460F86: from=, 
size=404, nrcpt=1 (queue active)
Dec  4 05:23:14 step postfix/pickup[8993]: 40A964603EC: uid=0 
from=


[...some emails follow, the usual CRON messages etc., but nothing at all 
generated by the kernel, no hardware issue reported...]




Tomasz Chmielewski


experiences running btrfs on external USB disks?

2018-12-03 Thread Tomasz Chmielewski

I'm trying to use btrfs on an external USB drive, without much success.

When the drive is connected for 2-3+ days, the filesystem gets remounted 
readonly, with BTRFS saying "IO failure":


[77760.444607] BTRFS error (device sdb1): bad tree block start, want 
378372096 have 0
[77760.550933] BTRFS error (device sdb1): bad tree block start, want 
378372096 have 0
[77760.550972] BTRFS: error (device sdb1) in __btrfs_free_extent:6804: 
errno=-5 IO failure

[77760.550979] BTRFS info (device sdb1): forced readonly
[77760.551003] BTRFS: error (device sdb1) in 
btrfs_run_delayed_refs:2935: errno=-5 IO failure

[77760.553223] BTRFS error (device sdb1): pending csums is 4096


Note that there are no other kernel messages (i.e. nothing that would 
indicate a problem with the disk, a cable disconnection etc.).


The load on the drive itself can be quite heavy at times (i.e. 100% IO 
for 1-2 hours or more) - could this contribute to the problem (i.e. 
btrfs hitting some timeout somewhere)?
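
One thing I'd like to rule out (a sketch only - the drive showing the 
errors is /dev/sdb here, adjust the device name as needed) is the SCSI 
command timeout, which some USB-SATA bridges can exceed under sustained 
100% load:

# cat /sys/block/sdb/device/timeout
# echo 120 > /sys/block/sdb/device/timeout

The first command shows the current timeout in seconds; the second 
raises it to 120 seconds for the current boot only (it is not persistent 
across reboots). Whether this is related to the IO failures above is 
just a guess.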


Running 4.19.6 right now, but was experiencing the issue also with 4.18 
kernels.




# btrfs device stats /data
[/dev/sda1].write_io_errs    0
[/dev/sda1].read_io_errs     0
[/dev/sda1].flush_io_errs    0
[/dev/sda1].corruption_errs  0
[/dev/sda1].generation_errs  0



Tomasz Chmielewski


Re: btrfs-cleaner 100% busy on an idle filesystem with 4.19.3

2018-11-22 Thread Tomasz Chmielewski

On 2018-11-22 21:46, Nikolay Borisov wrote:


# echo w > /proc/sysrq-trigger

# dmesg -c
[  931.585611] sysrq: SysRq : Show Blocked State
[  931.585715]   task    PC stack   pid father
[  931.590168] btrfs-cleaner   D    0  1340  2 0x8000
[  931.590175] Call Trace:
[  931.590190]  __schedule+0x29e/0x840
[  931.590195]  schedule+0x2c/0x80
[  931.590199]  schedule_timeout+0x258/0x360
[  931.590204]  io_schedule_timeout+0x1e/0x50
[  931.590208]  wait_for_completion_io+0xb7/0x140
[  931.590214]  ? wake_up_q+0x80/0x80
[  931.590219]  submit_bio_wait+0x61/0x90
[  931.590225]  blkdev_issue_discard+0x7a/0xd0
[  931.590266]  btrfs_issue_discard+0x123/0x160 [btrfs]
[  931.590299]  btrfs_discard_extent+0xd8/0x160 [btrfs]
[  931.590335]  btrfs_finish_extent_commit+0xe2/0x240 [btrfs]
[  931.590382]  btrfs_commit_transaction+0x573/0x840 [btrfs]
[  931.590415]  ? btrfs_block_rsv_check+0x25/0x70 [btrfs]
[  931.590456]  __btrfs_end_transaction+0x2be/0x2d0 [btrfs]
[  931.590493]  btrfs_end_transaction_throttle+0x13/0x20 [btrfs]
[  931.590530]  btrfs_drop_snapshot+0x489/0x800 [btrfs]
[  931.590567]  btrfs_clean_one_deleted_snapshot+0xbb/0xf0 [btrfs]
[  931.590607]  cleaner_kthread+0x136/0x160 [btrfs]
[  931.590612]  kthread+0x120/0x140
[  931.590646]  ? btree_submit_bio_start+0x20/0x20 [btrfs]
[  931.590658]  ? kthread_bind+0x40/0x40
[  931.590661]  ret_from_fork+0x22/0x40



It seems your filesystem is mounted with the DISCARD option, meaning
every delete will result in a discard; this is highly suboptimal for
SSDs.

Try remounting the fs without the discard option and see if it helps.
Generally, you want to submit discards in big batches (which is what
fstrim does), so that the FTL on the SSD can apply any optimisations it
might have up its sleeve.


Spot on!

Removed "discard" from fstab and added "ssd", rebooted - no more 
btrfs-cleaner running.
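
For reference, the fstab change was essentially replacing

/dev/sda2  /data/lxd  btrfs  defaults,noatime,discard  0 0

with

/dev/sda2  /data/lxd  btrfs  defaults,noatime,ssd  0 0

(the device and mount point shown here are only illustrative), and then 
trimming in batches instead, e.g. a one-off

# fstrim -v /data/lxd

or, on distros that ship the util-linux timer, enabling the weekly job:

# systemctl enable --now fstrim.timer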


Do you know if the issue you described ("discard ... is highly 
suboptimal for SSDs") affects other filesystems to a similar extent as 
well? I.e. when using ext4 on an SSD?




Would you finally care to share the smart data + the model and make of
the ssd?


2x these:

Model Family: Samsung based SSDs
Device Model: SAMSUNG MZ7LM1T9HCJM-5
Firmware Version: GXT1103Q
User Capacity:1,920,383,410,176 bytes [1.92 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:Solid State Device

1x this:

Device Model: Micron_5200_MTFDDAK1T9TDC
Firmware Version: D1MU004
User Capacity:1,920,383,410,176 bytes [1.92 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate:Solid State Device
Form Factor:  2.5 inches


But - it seems the issue was the unneeded discard option, so I'm not 
pasting unnecessary SMART data. Thanks for finding this out.



Tomasz Chmielewski


btrfs-cleaner 100% busy on an idle filesystem with 4.19.3

2018-11-22 Thread Tomasz Chmielewski
GlobalReserve, single: total=512.00MiB, used=0.00B


# btrfs fi show /data/lxd
Label: 'lxd5'  uuid: 2b77b498-a644-430b-9dd9-2ad3d381448a
Total devices 3 FS bytes used 987.12GiB
devid1 size 1.73TiB used 804.03GiB path /dev/sda2
devid2 size 1.73TiB used 804.06GiB path /dev/sdb2
devid3 size 1.73TiB used 804.03GiB path /dev/sdc2


Tomasz Chmielewski
https://lxadm.com


Re: unable to mount btrfs after upgrading from 4.16.1 to 4.19.1

2018-11-09 Thread Tomasz Chmielewski

On 2018-11-10 04:20, Tomasz Chmielewski wrote:

On 2018-11-10 04:15, Tomasz Chmielewski wrote:

On 2018-11-10 03:20, Roman Mamedov wrote:

On Sat, 10 Nov 2018 03:08:01 +0900
Tomasz Chmielewski  wrote:

After upgrading from kernel 4.16.1 to 4.19.1 and a clean restart, 
the fs

no longer mounts:


Did you try rebooting back to 4.16.1 to see if it still mounts there?


Yes, just did.

Interestingly, it does mount when I boot back to 4.16.1 - side note -
it takes some 50 (!) minutes and ~8 GB of reads (according to iostat
-m) to mount... device size is 16 TB, on HDD.


Also - it did mount with 4.18.17.

Way faster, in some 2 min.


A few more clean reboot cycles with 4.18.17 and got:

[  113.677829] BTRFS error (device md2): open_ctree failed
[  113.692298] BTRFS info (device md2): force zstd compression, level 0
[  113.692302] BTRFS info (device md2): using free space tree
[  113.692304] BTRFS info (device md2): has skinny extents
[  113.897681] BTRFS error (device md2): super_total_bytes 
17920974913536 mismatch with fs_devices total_rw_bytes 35841949827072

[  113.897751] BTRFS error (device md2): failed to read chunk tree: -22
[  113.935149] BTRFS error (device md2): open_ctree failed


Another "mount /data" (without rebooting) mounted it fine.

Why are btrfs mounts so irregular here?
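
One hedged guess: the "super_total_bytes ... mismatch with fs_devices 
total_rw_bytes" error above looks like the device size inconsistency 
that newer btrfs-progs can repair offline, along the lines of:

# btrfs rescue fix-device-size /dev/md2

(run only against the unmounted filesystem, and only if the installed 
btrfs-progs version has that subcommand). I'm not certain it applies to 
this exact case - here total_rw_bytes is exactly double 
super_total_bytes - so treat it as something to research rather than as 
a fix.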

# btrfs device stats /data
[/dev/md2].write_io_errs    0
[/dev/md2].read_io_errs     0
[/dev/md2].flush_io_errs    0
[/dev/md2].corruption_errs  0
[/dev/md2].generation_errs  0


# btrfs fi usage /data
Overall:
Device size:  16.30TiB
Device allocated: 14.26TiB
Device unallocated:2.04TiB
Device missing:  0.00B
Used:  7.99TiB
Free (estimated):  8.27TiB  (min: 8.27TiB)
Data ratio:   1.00
Metadata ratio:   1.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:14.15TiB, Used:7.92TiB
   /dev/md2   14.15TiB

Metadata,single: Size:111.00GiB, Used:74.78GiB
   /dev/md2  111.00GiB

System,single: Size:32.00MiB, Used:1.81MiB
   /dev/md2   32.00MiB

Unallocated:
   /dev/md22.04TiB



Tomasz Chmielewski
https://lxadm.com


Re: unable to mount btrfs after upgrading from 4.16.1 to 4.19.1

2018-11-09 Thread Tomasz Chmielewski

On 2018-11-10 04:15, Tomasz Chmielewski wrote:

On 2018-11-10 03:20, Roman Mamedov wrote:

On Sat, 10 Nov 2018 03:08:01 +0900
Tomasz Chmielewski  wrote:

After upgrading from kernel 4.16.1 to 4.19.1 and a clean restart, the 
fs

no longer mounts:


Did you try rebooting back to 4.16.1 to see if it still mounts there?


Yes, just did.

Interestingly, it does mount when I boot back to 4.16.1 - side note -
it takes some 50 (!) minutes and ~8 GB of reads (according to iostat
-m) to mount... device size is 16 TB, on HDD.


Also - it did mount with 4.18.17.

Way faster, in some 2 min.


Tomasz Chmielewski
https://lxadm.com


Re: unable to mount btrfs after upgrading from 4.16.1 to 4.19.1

2018-11-09 Thread Tomasz Chmielewski

On 2018-11-10 03:20, Roman Mamedov wrote:

On Sat, 10 Nov 2018 03:08:01 +0900
Tomasz Chmielewski  wrote:

After upgrading from kernel 4.16.1 to 4.19.1 and a clean restart, the 
fs

no longer mounts:


Did you try rebooting back to 4.16.1 to see if it still mounts there?


Yes, just did.

Interestingly, it does mount when I boot back to 4.16.1 - side note - it 
takes some 50 (!) minutes and ~8 GB of reads (according to iostat -m) to 
mount... device size is 16 TB, on HDD.




Tomasz Chmielewski
https://lxadm.com


unable to mount btrfs after upgrading from 4.16.1 to 4.19.1

2018-11-09 Thread Tomasz Chmielewski

btrfs sits on md RAID-5:

/dev/md2 /data btrfs noatime,compress-force=zstd,space_cache=v2,noauto 0 
0



After upgrading from kernel 4.16.1 to 4.19.1 and a clean restart, the fs 
no longer mounts:



# mount /data
mount: wrong fs type, bad option, bad superblock on /dev/md2,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

# dmesg
[  322.877321] BTRFS info (device md2): force zstd compression, level 0
[  322.877326] BTRFS info (device md2): enabling free space tree
[  322.877329] BTRFS info (device md2): using free space tree
[  322.877330] BTRFS info (device md2): has skinny extents
[  367.832058] BTRFS error (device md2): bad tree block start, want 
21120019922944 have 620757027

[  367.832116] BTRFS error (device md2): failed to read block groups: -5
[  367.891339] BTRFS error (device md2): open_ctree failed



The error (bad tree block start) in dmesg changes every time I re-run 
"mount /data":


[  589.425362] BTRFS error (device md2): bad tree block start, want 
22008094932992 have 3378573292323635748

[  589.425423] BTRFS error (device md2): failed to read block groups: -5
[  589.469979] BTRFS error (device md2): open_ctree failed


[  680.585908] BTRFS error (device md2): bad tree block start, want 
21058406105088 have 18446616720224032488

[  680.585991] BTRFS error (device md2): failed to read block groups: -5
[  680.625419] BTRFS error (device md2): open_ctree failed


Any advice on how to recover?
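
For the record, what I plan to try next, in roughly this order (a 
sketch only - none of these are verified to help with this particular 
failure):

# mount -o ro,usebackuproot /dev/md2 /data
(try an older tree root, read-only)

# btrfs check --readonly /dev/md2
(offline, read-only consistency check - this can take a very long time 
on a 16 TB filesystem)

# btrfs restore /dev/md2 /path/to/other/disk
(last resort: copy data off without mounting; /path/to/other/disk is a 
placeholder for a separate destination)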


Tomasz Chmielewski
https://lxadm.com


very poor performance / a lot of writes to disk with space_cache (but not with space_cache=v2)

2018-09-19 Thread Tomasz Chmielewski
I have a MySQL slave which writes to a RAID-1 btrfs filesystem (with a 
4.17.14 kernel) on 3 x ~1.9 TB SSD disks; the filesystem is around 40% full.


The slave receives around 0.5-1 MB/s of data from the master over the 
network, which is then saved to MySQL's relay log and executed. In ideal 
conditions (i.e. no filesystem overhead) we should expect some 1-3 MB/s 
of data written to disk.


The MySQL directory and the files in it are chattr +C (it was set when 
the directory was created, so all files really are +C); there are no snapshots.
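
For completeness, the directory was prepared roughly like this (paths 
are illustrative) - the +C attribute only takes effect for files 
created after it is set on an (empty) directory, which is why it was 
done before MySQL wrote anything:

# mkdir /data/mysql
# chattr +C /data/mysql
# lsattr -d /data/mysql

lsattr should show the "C" flag on the directory, and files created 
inside it afterwards inherit it.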



Now, an interesting thing.

When the filesystem is mounted with these options in fstab:

defaults,noatime,discard


We can see a *constant* write of 25-100 MB/s to each disk. The system is 
generally unresponsive, and it sometimes takes many seconds for a simple 
command executed in bash to return.



However, as soon as we remount the filesystem with space_cache=v2 - 
writes drop to just around 3-10 MB/s to each disk. If we remount to 
space_cache - lots of writes, system unresponsive. Again remount to 
space_cache=v2 - low writes, system responsive.



That's a huge, 10x overhead! Is it expected? Especially given that 
space_cache=v1 is still the default mount option?
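
For anyone who wants to reproduce the comparison: switching an existing 
filesystem to the free space tree is done at mount time (a sketch - the 
mount point is just a placeholder; if a plain remount doesn't pick the 
option up, a full umount and mount with -o space_cache=v2 will):

# mount -o remount,space_cache=v2 /path/to/fs
# dmesg | grep 'free space tree'

dmesg should then report "enabling free space tree" / "using free space 
tree" for the device, and the tree is kept on subsequent mounts.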



Tomasz Chmielewski
https://lxadm.com


Re: fatal database corruption with btrfs "out of space" with ~50 GB left

2018-02-19 Thread Tomasz Chmielewski

On 2018-02-19 13:29, Anand Jain wrote:

On 02/14/2018 10:19 PM, Tomasz Chmielewski wrote:
Just FYI, how dangerous running btrfs can be - we had a fatal, 
unrecoverable MySQL corruption when btrfs decided to do one of these 
"I have ~50 GB left, so let's do out of space (and corrupt some files 
at the same time, ha ha!)".


 Thanks for reporting.


Running btrfs RAID-1 with kernel 4.14.


 Can you please let us know:
 1. What tool/CLI reported/identified that the data is corrupted?


mysqld log - mysqld would refuse to start because of database 
corruption.


And, the database wouldn't start even when "innodb_force_recovery = " 
was set to a high/max value.



In the past, with older kernel versions, we had a similar issue with 
mongod - it wouldn't start anymore due to some corruption which happened 
when we hit "out of space" (again, with dozens of GB of free space).




 2. Disk error stats, using "btrfs dev stat"
(dev stat is stored on disk)?


# btrfs dev stat /var/lib/lxd
[/dev/sda3].write_io_errs    0
[/dev/sda3].read_io_errs     0
[/dev/sda3].flush_io_errs    0
[/dev/sda3].corruption_errs  0
[/dev/sda3].generation_errs  0
[/dev/sdb3].write_io_errs    0
[/dev/sdb3].read_io_errs     0
[/dev/sdb3].flush_io_errs    0
[/dev/sdb3].corruption_errs  0
[/dev/sdb3].generation_errs  0



 3. Whether the disk was mounted as degraded at any time before?


No. Everything healthy with the disks.


Tomasz Chmielewski
https://lxadm.com


Re: fatal database corruption with btrfs "out of space" with ~50 GB left

2018-02-14 Thread Tomasz Chmielewski

On 2018-02-15 16:02, Tomasz Chmielewski wrote:

On 2018-02-15 13:32, Qu Wenruo wrote:


Is there any kernel message, like a kernel warning or backtrace?


I see there was this one:

Feb 13 13:53:32 lxd01 kernel: [9351710.878404] [ cut here
]
Feb 13 13:53:32 lxd01 kernel: [9351710.878430] WARNING: CPU: 9 PID:
7780 at /home/kernel/COD/linux/fs/btrfs/tree-log.c:3361
log_dir_items+0x54b/0x560 [btrfs]

(...)

Feb 13 13:53:32 lxd01 kernel: [9351710.878707] ---[ end trace
81aeb3fb0c68ce00 ]---


BTW we've updated to the latest 4.15 kernel after that.


Also, we were just running a balance there and it printed this in dmesg.

Not sure if it's something serious or not. The filesystem still runs 
correctly.


[60082.349447] WARNING: CPU: 2 PID: 780 at 
/home/kernel/COD/linux/fs/btrfs/extent-tree.c:124 
btrfs_put_block_group+0x4f/0x60 [btrfs]
[60082.349449] Modules linked in: xt_nat xt_REDIRECT nf_nat_redirect 
tcp_diag inet_diag sunrpc xt_NFLOG cfg80211 xt_conntrack nfnetlink_log 
nfnetlink ipt_REJECT nf_reject_ipv4 binfmt_misc veth ebtable_filter 
ebtables ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 xt_comment nf_log_ipv4 
nf_log_common xt_LOG ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ip_vs nf_conntrack 
ip6table_filter ip6_tables iptable_filter xt_CHECKSUM xt_tcpudp 
iptable_mangle ip_tables x_tables intel_rapl sb_edac 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btrfs bridge pcbc 
zstd_compress stp llc aesni_intel aes_x86_64 crypto_simd glue_helper 
input_leds
[60082.349471]  cryptd intel_cstate intel_rapl_perf serio_raw shpchp 
lpc_ich acpi_pad mac_hid autofs4 raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 
raid0 multipath linear ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt fb_sys_fops igb drm dca ahci ptp libahci pps_core i2c_algo_bit 
wmi
[60082.349488] CPU: 2 PID: 780 Comm: btrfs-cleaner Tainted: GW   
 4.15.3-041503-generic #201802120730
[60082.349489] Hardware name: ASUSTeK COMPUTER INC. Z10PA-U8 
Series/Z10PA-U8 Series, BIOS 0601 06/26/2015

[60082.349497] RIP: 0010:btrfs_put_block_group+0x4f/0x60 [btrfs]
[60082.349497] RSP: 0018:c1468d3afe48 EFLAGS: 00010206
[60082.349498] RAX:  RBX: 9d072f74 RCX: 

[60082.349499] RDX: 0001 RSI: 0246 RDI: 
9d072bcb8c00
[60082.349499] RBP: c1468d3afe50 R08: 9d072bcbfc00 R09: 
000180200019
[60082.349500] R10: c1468d3afe38 R11: 0100 R12: 
9d072b87ce00
[60082.349500] R13: 9d072bcb8c00 R14: 9d072b87ced0 R15: 
9d072bcb8d20
[60082.349501] FS:  () GS:9d073f28() 
knlGS:

[60082.349502] CS:  0010 DS:  ES:  CR0: 80050033
[60082.349502] CR2: 7f9221375000 CR3: 0019a660a005 CR4: 
001606e0

[60082.349503] Call Trace:
[60082.349513]  btrfs_delete_unused_bgs+0x243/0x3c0 [btrfs]
[60082.349521]  cleaner_kthread+0x159/0x170 [btrfs]
[60082.349524]  kthread+0x121/0x140
[60082.349531]  ? __btree_submit_bio_start+0x20/0x20 [btrfs]
[60082.349533]  ? kthread_create_worker_on_cpu+0x70/0x70
[60082.349535]  ret_from_fork+0x35/0x40
[60082.349536] Code: e8 01 00 00 48 85 c0 75 1a 48 89 fb 48 8b bf d8 00 
00 00 e8 14 3a 66 d7 48 89 df e8 0c 3a 66 d7 5b 5d c3 0f ff eb e2 0f ff 
eb cb <0f> ff eb ce 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00

[60082.349554] ---[ end trace 9492ee1b902c858d ]---


Tomasz Chmielewski
https://lxadm.com


Re: fatal database corruption with btrfs "out of space" with ~50 GB left

2018-02-14 Thread Tomasz Chmielewski
 ORIG_RAX: 004b
Feb 13 13:53:32 lxd01 kernel: [9351710.878676] RAX: ffda 
RBX: 307d6f5d1070 RCX: 7f99461437dd
Feb 13 13:53:32 lxd01 kernel: [9351710.878677] RDX: 005c 
RSI: 0008 RDI: 005c
Feb 13 13:53:32 lxd01 kernel: [9351710.878678] RBP:  
R08:  R09: 
Feb 13 13:53:32 lxd01 kernel: [9351710.878679] R10:  
R11: 0293 R12: 1000
Feb 13 13:53:32 lxd01 kernel: [9351710.878679] R13: 307d6f550b00 
R14:  R15: 1000
Feb 13 13:53:32 lxd01 kernel: [9351710.878681] Code: 89 85 6c ff ff ff 
4c 8b 95 70 ff ff ff 74 23 4c 89 f7 e8 a9 dc f8 ff 48 8b 7d 88 e8 a0 dc 
f8 ff 8b 85 6c ff ff ff e9 d8 fb ff ff <0f> ff e9 35 fe ff ff 4c 89 55 
18 e9 56 fc ff ff e8 60 65 61 eb
Feb 13 13:53:32 lxd01 kernel: [9351710.878707] ---[ end trace 
81aeb3fb0c68ce00 ]---



BTW we've updated to the latest 4.15 kernel after that.



Not sure if the removal of 80G has anything to do with this, but it
seems that your metadata (along with data) is quite scattered.

It's really recommended to keep some unallocated device space, and one
of the methods to do that is to use balance to free such scattered space
from data/metadata usage.

And that's why a routine balance is recommended for btrfs.


The balance might work on that server - it has less than 0.5 TB on SSD disks.

However, on multi-terabyte servers with terabytes of data on HDD disks, 
running balance is not realistic. We have some servers where a balance 
was taking 2 months or so and was not even 50% done, and the IO load the 
balance was adding was slowing things down a lot.
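
What sometimes gets suggested for such big filesystems instead of a 
full balance (a sketch, not something I have benchmarked here) is a 
filtered balance that only relocates mostly-empty chunks, which moves 
far less data and can be run in small steps:

# btrfs balance start -dusage=20 /srv
# btrfs balance start -musage=20 /srv

The first command only touches data chunks that are at most 20% full, 
the second does the same for metadata; the threshold can be raised 
gradually if more unallocated space is needed. (/srv is just an example 
path.)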



Tomasz Chmielewski
https://lxadm.com


Re: fatal database corruption with btrfs "out of space" with ~50 GB left

2018-02-14 Thread Tomasz Chmielewski

On 2018-02-15 10:47, Qu Wenruo wrote:

On 2018年02月14日 22:19, Tomasz Chmielewski wrote:

Just FYI, how dangerous running btrfs can be - we had a fatal,
unrecoverable MySQL corruption when btrfs decided to do one of these 
"I

have ~50 GB left, so let's do out of space (and corrupt some files at
the same time, ha ha!)".


I'm currently looking into unexpected corruption problems with btrfs.

Would you please provide some extra info about how the corruption 
happened?


1) Was there any power reset?
   Btrfs should be bulletproof, but in fact it's not, so I'm here to
   get some clues.


No power reset.



2) Are MySQL files set with nodatacow?
   If so, data corruption is more or less expected, but it should be
   handled by MySQL's checkpointing.


Yes, MySQL files were using "nodatacow".

I've seen many cases of "filesystem full" with ext4, but none led to 
database corruption (i.e. the database would always recover after 
some space was released).


On the other hand, I've seen a handful of "out of space" cases with 
gigabytes of free space with btrfs, which led to light, heavy or 
unrecoverable MySQL or mongo corruption.



Could it be because of how differently "predictable" out of space 
situations are with btrfs and with other filesystems?


- in short, ext4 will report out of space when there are 0 bytes left 
(perhaps slightly earlier for non-root users) - the application trying to 
write data will see "out of space" at some point, and it can stay like 
this for hours (i.e. until some data is removed manually)


- on the other hand, btrfs can report out of space when there are still 
10, 50 or 100 GB left, meaning any capacity planning is close to 
impossible; also, the application trying to write data can see the fs 
transitioning between "out of space" and "data written successfully" 
many times per minute or second.



3) Is the filesystem metadata corrupted? (i.e. does btrfs check report 
errors?)

   If so, that should be the problem I'm looking into.


I don't think so, there are no scary things in dmesg. However, I didn't 
unmount the filesystem to run btrfs check.




4) Metadata/data ratio?
   "btrfs fi usage" gives a fairly good picture of it.
   "btrfs fi df" also helps.


Here it is - however, this is after removing some 80 GB of data, so it 
most likely doesn't reflect the state when the failure happened.


# btrfs fi usage /var/lib/lxd
Overall:
Device size: 846.25GiB
Device allocated:840.05GiB
Device unallocated:6.20GiB
Device missing:  0.00B
Used:498.26GiB
Free (estimated):167.96GiB  (min: 167.96GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID1: Size:411.00GiB, Used:246.14GiB
   /dev/sda3 411.00GiB
   /dev/sdb3 411.00GiB

Metadata,RAID1: Size:9.00GiB, Used:2.99GiB
   /dev/sda3   9.00GiB
   /dev/sdb3   9.00GiB

System,RAID1: Size:32.00MiB, Used:80.00KiB
   /dev/sda3  32.00MiB
   /dev/sdb3  32.00MiB

Unallocated:
   /dev/sda3   3.10GiB
   /dev/sdb3   3.10GiB



# btrfs fi df /var/lib/lxd
Data, RAID1: total=411.00GiB, used=246.15GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=9.00GiB, used=2.99GiB
GlobalReserve, single: total=512.00MiB, used=0.00B



# btrfs fi show /var/lib/lxd
Label: 'btrfs'  uuid: f5f30428-ec5b-4497-82de-6e20065e6f61
Total devices 2 FS bytes used 249.15GiB
devid1 size 423.13GiB used 420.03GiB path /dev/sda3
devid2 size 423.13GiB used 420.03GiB path /dev/sdb3



Tomasz Chmielewski
https://lxadm.com


fatal database corruption with btrfs "out of space" with ~50 GB left

2018-02-14 Thread Tomasz Chmielewski
Just FYI, how dangerous running btrfs can be - we had a fatal, 
unrecoverable MySQL corruption when btrfs decided to do one of these "I 
have ~50 GB left, so let's do out of space (and corrupt some files at 
the same time, ha ha!)".


Running btrfs RAID-1 with kernel 4.14.



Tomasz Chmielewski
https://lxadm.com


Re: again "out of space" and remount read only, with 4.14

2017-11-26 Thread Tomasz Chmielewski

On 2017-11-27 00:37, Martin Raiber wrote:

On 26.11.2017 08:46 Tomasz Chmielewski wrote:

Got this one on a 4.14-rc7 filesystem with some 400 GB left:


Hi,

I guess it is too late now, but the "btrfs fi usage" output of
the file system (especially after it went ro) would be useful.


It was more or less the same when it went ro:


# btrfs fi df /srv
Data, RAID1: total=2.21TiB, used=2.15TiB
System, RAID1: total=32.00MiB, used=352.00KiB
Metadata, RAID1: total=16.00GiB, used=13.09GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


# btrfs fi show /srv
Label: 'btrfs'  uuid: 105b2e0c-8af2-45ee-b4c8-14ff0a3ca899
Total devices 2 FS bytes used 2.16TiB
devid1 size 2.63TiB used 2.22TiB path /dev/sda4
devid2 size 2.63TiB used 2.22TiB path /dev/sdb4


# btrfs fi usage /srv
Overall:
Device size:   5.25TiB
Device allocated:  4.45TiB
Device unallocated:  823.97GiB
Device missing:  0.00B
Used:  4.33TiB
Free (estimated):471.91GiB  (min: 471.91GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID1: Size:2.21TiB, Used:2.15TiB
   /dev/sda4   2.21TiB
   /dev/sdb4   2.21TiB

Metadata,RAID1: Size:16.00GiB, Used:13.09GiB
   /dev/sda4  16.00GiB
   /dev/sdb4  16.00GiB

System,RAID1: Size:32.00MiB, Used:352.00KiB
   /dev/sda4  32.00MiB
   /dev/sdb4  32.00MiB

Unallocated:
   /dev/sda4 411.99GiB
   /dev/sdb4 411.99GiB



Tomasz Chmielewski
https://lxadm.com


again "out of space" and remount read only, with 4.14

2017-11-25 Thread Tomasz Chmielewski

Got this one on a 4.14-rc7 filesystem with some 400 GB left:

[2217513.502016] BTRFS: Transaction aborted (error -28)
[2217513.502038] [ cut here ]
[2217513.502064] WARNING: CPU: 6 PID: 29325 at 
/home/kernel/COD/linux/fs/btrfs/extent-tree.c:3089 
btrfs_run_delayed_refs+0x244/0x250 [btrfs]
[2217513.502064] Modules linked in: vhost_net vhost tap nf_log_ipv4 
nf_log_common xt_LOG xt_REDIRECT nf_nat_redirect xt_NFLOG nfnetlink_log 
nfnetlink xt_conntrack ipt_REJECT nf_reject_ipv4 binfmt_misc veth 
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_filter ip6_tables xt_comment xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc btrfs 
zstd_compress intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc kvm_intel kvm 
irqbypass aesni_intel eeepc_wmi aes_x86_64 asus_wmi sparse_keymap 
crypto_simd wmi_bmof input_leds glue_helper ie31200_edac shpchp lpc_ich
[2217513.502099]  cryptd serio_raw mac_hid intel_cstate intel_rapl_perf 
tpm_infineon nfsd auth_rpcgss nfs_acl lockd grace sunrpc lp parport 
autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e ahci 
libahci ptp pps_core wmi video
[2217513.502120] CPU: 6 PID: 29325 Comm: postdrop Not tainted 
4.14.0-041400rc7-generic #201710292231
[2217513.502121] Hardware name: System manufacturer System Product 
Name/P8B WS, BIOS 0904 10/24/2011

[2217513.502122] task: 9fc681f54500 task.stack: c313cab74000
[2217513.502137] RIP: 0010:btrfs_run_delayed_refs+0x244/0x250 [btrfs]
[2217513.502138] RSP: 0018:c313cab77be8 EFLAGS: 00010282
[2217513.502140] RAX: 0026 RBX: ffe4 RCX: 

[2217513.502141] RDX:  RSI: 9fce2fb8dc98 RDI: 
9fce2fb8dc98
[2217513.502142] RBP: c313cab77c40 R08: 0001 R09: 
41dc
[2217513.502143] R10: c313cab77ad8 R11:  R12: 
9fce0967b258
[2217513.502144] R13: 9fce0768 R14: 9fcd27aa7000 R15: 

[2217513.502145] FS:  7fd929cb9800() GS:9fce2fb8() 
knlGS:

[2217513.502146] CS:  0010 DS:  ES:  CR0: 80050033
[2217513.502147] CR2: 7ffc6a7aae98 CR3: 0001e8e10006 CR4: 
000606e0

[2217513.502148] Call Trace:
[2217513.502167]  create_pending_snapshot+0x5b7/0xf70 [btrfs]
[2217513.502183]  create_pending_snapshots+0x88/0xb0 [btrfs]
[2217513.502197]  ? create_pending_snapshots+0x88/0xb0 [btrfs]
[2217513.502212]  btrfs_commit_transaction+0x3a7/0x8d0 [btrfs]
[2217513.502215]  ? wait_woken+0x80/0x80
[2217513.502232]  btrfs_sync_file+0x348/0x410 [btrfs]
[2217513.502235]  vfs_fsync_range+0x4b/0xb0
[2217513.502236]  do_fsync+0x3d/0x70
[2217513.502238]  SyS_fsync+0x10/0x20
[2217513.502240]  do_syscall_64+0x61/0x120
[2217513.502243]  entry_SYSCALL64_slow_path+0x25/0x25
[2217513.502244] RIP: 0033:0x7fd927955ba0
[2217513.502245] RSP: 002b:7fff6576e748 EFLAGS: 0246 ORIG_RAX: 
004a
[2217513.502247] RAX: ffda RBX: 56504aaecbd0 RCX: 
7fd927955ba0
[2217513.502248] RDX: 022a RSI: 01e4 RDI: 
0004
[2217513.502249] RBP:  R08:  R09: 
0101010101010101
[2217513.502250] R10: 7fff6576e510 R11: 0246 R12: 
5650493433f8
[2217513.502251] R13: 56504aaee120 R14: 0045 R15: 
565049344d44
[2217513.502252] Code: fe ff 89 d9 ba 11 0c 00 00 48 c7 c6 40 28 8b c0 
4c 89 e7 e8 c5 bc 09 00 e9 b5 fe ff ff 89 de 48 c7 c7 f8 94 8b c0 e8 dd 
47 cd da <0f> ff eb d3 e8 3a be 09 00 0f 1f 00 66 66 66 66 90 55 48 89 
e5

[2217513.502283] ---[ end trace cf63b0489605e465 ]---
[2217513.502342] BTRFS: error (device sda4) in 
btrfs_run_delayed_refs:3089: errno=-28 No space left

[2217513.502400] BTRFS info (device sda4): forced readonly
[2217513.502402] BTRFS: error (device sda4) in 
create_pending_snapshot:1625: errno=-28 No space left
[2217513.502458] BTRFS warning (device sda4): Skipping commit of aborted 
transaction.
[2217513.502460] BTRFS: error (device sda4) in cleanup_transaction:1873: 
errno=-28 No space left
[2217524.419368] mail[2522]: segfault at c0 ip 7f0c5487c33b sp 
7ffdfac77030 error 4 in libmailutils.so.4.0.0[7f0c547f2000+b]

[2217605.435864] BTRFS error (device sda4): pending csums is 22380544



Tomasz Chmielewski
https://lxadm.com


Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!

2017-11-17 Thread Tomasz Chmielewski

On 2017-11-18 10:08, Hans van Kranenburg wrote:

On 11/18/2017 01:49 AM, Tomasz Chmielewski wrote:
I'm getting the following BUG when running balance on one of my 
systems:



[ 3458.698704] BTRFS info (device sdb3): relocating block group
306045779968 flags data|raid1
[ 3466.892933] BTRFS info (device sdb3): found 2405 extents
[ 3495.408630] BTRFS info (device sdb3): found 2405 extents
[ 3498.161144] [ cut here ]
[ 3498.161150] kernel BUG at 
/home/kernel/COD/linux/fs/btrfs/ctree.c:1856!

[ 3498.161264] invalid opcode:  [#1] SMP


(...)


[ 3498.164523] Call Trace:
[ 3498.164694]  tree_advance+0x16e/0x1d0 [btrfs]
[ 3498.164874]  btrfs_compare_trees+0x2da/0x6a0 [btrfs]
[ 3498.165078]  ? process_extent+0x1580/0x1580 [btrfs]
[ 3498.165264]  btrfs_ioctl_send+0xe94/0x1120 [btrfs]


It's using send + balance at the same time. There's something that makes
btrfs explode when you do that.

It's not new in 4.14 - I have seen it in 4.7 and 4.9 also, with various
different explosions in the kernel log. Since that happened, I've made
sure I never do those two things at the same time.


Indeed, send was started when balance was running.

Thanks for the hint.


Tomasz Chmielewski
https://lxadm.com


4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!

2017-11-17 Thread Tomasz Chmielewski

I'm getting the following BUG when running balance on one of my systems:


[ 3458.698704] BTRFS info (device sdb3): relocating block group 
306045779968 flags data|raid1

[ 3466.892933] BTRFS info (device sdb3): found 2405 extents
[ 3495.408630] BTRFS info (device sdb3): found 2405 extents
[ 3498.161144] [ cut here ]
[ 3498.161150] kernel BUG at 
/home/kernel/COD/linux/fs/btrfs/ctree.c:1856!

[ 3498.161264] invalid opcode:  [#1] SMP
[ 3498.161363] Modules linked in: nf_log_ipv6 nf_log_ipv4 nf_log_common 
xt_LOG xt_multiport xt_conntrack xt_nat binfmt_misc veth ip6table_filter 
xt_CHECKSUM iptable_mangle xt_tcpudp ip6t_MASQUERADE 
nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack iptable_filter ip_tables x_tables bridge stp llc intel_rapl 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel 
aes_x86_64 crypto_simd glue_helper cryptd intel_cstate hci_uart 
intel_rapl_perf btbcm input_leds serdev serio_raw btqca btintel 
bluetooth intel_pch_thermal intel_lpss_acpi intel_lpss mac_hid acpi_pad
[ 3498.162060]  ecdh_generic acpi_als kfifo_buf industrialio autofs4 
btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath 
linear raid1 e1000e psmouse ptp ahci pps_core libahci wmi 
pinctrl_sunrisepoint i2c_hid video pinctrl_intel hid
[ 3498.162386] CPU: 7 PID: 29041 Comm: btrfs Not tainted 
4.14.0-041400-generic #201711122031
[ 3498.162545] Hardware name: FUJITSU  /D3401-H2, BIOS V5.0.0.12 R1.5.0 
for D3401-H2x 02/27/2017

[ 3498.162723] task: 8d7858e82f00 task.stack: b4ee47d5c000
[ 3498.162890] RIP: 0010:read_node_slot+0xd7/0xe0 [btrfs]
[ 3498.163027] RSP: 0018:b4ee47d5fb88 EFLAGS: 00010246
[ 3498.163156] RAX: 8d78c8bb7000 RBX: 8d8124abd380 RCX: 
0001
[ 3498.163290] RDX: 0048 RSI: 8d7ae1fef6f8 RDI: 
8d8124aa
[ 3498.163422] RBP: b4ee47d5fba8 R08: 0001 R09: 
8d8124abd384
[ 3498.163555] R10: 0001 R11: 00114000 R12: 
0002
[ 3498.163689] R13: b4ee47d5fc66 R14: b4ee47d5fc50 R15: 

[ 3498.163825] FS:  7fa4c9a998c0() GS:8d816e5c() 
knlGS:

[ 3498.163990] CS:  0010 DS:  ES:  CR0: 80050033
[ 3498.164120] CR2: 56410155a028 CR3: 0009c194c002 CR4: 
003606e0
[ 3498.164255] DR0:  DR1:  DR2: 

[ 3498.164390] DR3:  DR6: fffe0ff0 DR7: 
0400

[ 3498.164523] Call Trace:
[ 3498.164694]  tree_advance+0x16e/0x1d0 [btrfs]
[ 3498.164874]  btrfs_compare_trees+0x2da/0x6a0 [btrfs]
[ 3498.165078]  ? process_extent+0x1580/0x1580 [btrfs]
[ 3498.165264]  btrfs_ioctl_send+0xe94/0x1120 [btrfs]
[ 3498.165450]  btrfs_ioctl+0x93c/0x1f00 [btrfs]
[ 3498.165587]  ? enqueue_task_fair+0xa8/0x6c0
[ 3498.165724]  do_vfs_ioctl+0xa5/0x600
[ 3498.165854]  ? do_vfs_ioctl+0xa5/0x600
[ 3498.165979]  ? _do_fork+0x144/0x3a0
[ 3498.166103]  SyS_ioctl+0x79/0x90
[ 3498.166234]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 3498.166368] RIP: 0033:0x7fa4c8b17f07
[ 3498.166488] RSP: 002b:7ffd33644e38 EFLAGS: 0202 ORIG_RAX: 
0010
[ 3498.166653] RAX: ffda RBX: 7fa4c8a1a700 RCX: 
7fa4c8b17f07
[ 3498.166787] RDX: 7ffd33644f30 RSI: 40489426 RDI: 
0004
[ 3498.166921] RBP: 7ffd33644dc0 R08:  R09: 
7fa4c8a1a700
[ 3498.167055] R10: 7fa4c8a1a9d0 R11: 0202 R12: 

[ 3498.167190] R13: 7ffd33644dbf R14: 7fa4c8a1a9c0 R15: 
0129f020
[ 3498.167326] Code: 48 c7 c3 fb ff ff ff e8 f8 5c 05 00 48 89 d8 5b 41 
5c 41 5d 41 5e 5d c3 48 c7 c3 fe ff ff ff 48 89 d8 5b 41 5c 41 5d 41 5e 
5d c3 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
[ 3498.167690] RIP: read_node_slot+0xd7/0xe0 [btrfs] RSP: 
b4ee47d5fb88

[ 3498.167892] ---[ end trace 6a751a3020dd3086 ]---
[ 3499.572729] BTRFS info (device sdb3): relocating block group 
304972038144 flags data|raid1

[ 3504.068432] BTRFS info (device sdb3): found 2037 extents
[ 3538.281808] BTRFS info (device sdb3): found 2037 extents



Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-11-09 Thread Tomasz Chmielewski

On 2017-11-07 23:49, E V wrote:


Hmm, I used to see these phantom no-space issues quite a bit on older
4.x kernels, and haven't seen them since switching to space_cache=v2.
So it could be space cache corruption. You might try either clearing
your space cache, or mounting with nospace_cache, or converting to
space_cache=v2 after reading up on its caveats.


We have space_cache=v2.

Unfortunately, yet another system running 4.14-rc8 hit "No space left" 
during balance:



[68443.535664] BTRFS info (device sdb3): relocating block group 
591771009024 flags data|raid1

[68463.203330] BTRFS info (device sdb3): found 8578 extents
[68492.238676] BTRFS info (device sdb3): found 8559 extents
[68500.751792] BTRFS info (device sdb3): 1 enospc errors during balance


# btrfs balance start /var/lib/lxd
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the balanced data.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/var/lib/lxd': No space left on device
There may be more info in syslog - try dmesg | tail


# btrfs fi usage /var/lib/lxd
Overall:
Device size: 846.26GiB
Device allocated:622.27GiB
Device unallocated:  223.99GiB
Device missing:  0.00B
Used:606.40GiB
Free (estimated):116.68GiB  (min: 116.68GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID1: Size:306.00GiB, Used:301.31GiB
   /dev/sda3 306.00GiB
   /dev/sdb3 306.00GiB

Metadata,RAID1: Size:5.10GiB, Used:1.89GiB
   /dev/sda3   5.10GiB
   /dev/sdb3   5.10GiB

System,RAID1: Size:32.00MiB, Used:80.00KiB
   /dev/sda3  32.00MiB
   /dev/sdb3  32.00MiB

Unallocated:
   /dev/sda3 112.00GiB
   /dev/sdb3 112.00GiB


# btrfs fi show /var/lib/lxd
Label: 'btrfs'  uuid: 6340f5de-f635-4d09-bbb2-1e03b1e1b160
Total devices 2 FS bytes used 303.20GiB
devid1 size 423.13GiB used 311.13GiB path /dev/sda3
devid2 size 423.13GiB used 311.13GiB path /dev/sdb3


# btrfs fi df /var/lib/lxd
Data, RAID1: total=306.00GiB, used=301.32GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=5.10GiB, used=1.89GiB
GlobalReserve, single: total=512.00MiB, used=0.00B



So far, out of all the systems which were giving us "No space left on device" 
with 4.13.x, all but one are still giving us "No space left on device" 
during balance with 4.14-rc7 and later.
We've seen it on a mix of servers with SSD or HDD disks, with 
filesystems ranging from 0.5 TB to 20 TB and usage from 30% to 90%.


Combined with the evidence that "No space left on device" during balance 
can lead to various kinds of file corruption (we've witnessed it with 
MySQL), I'd say btrfs balance is a dangerous operation, and the decision 
to use it should be considered very thoroughly.



Shouldn't "Balance" be marked as "mostly OK" or "Unstable" here? Giving 
it "OK" status is misleading.


https://btrfs.wiki.kernel.org/index.php/Status


Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-11-06 Thread Tomasz Chmielewski

On 2017-10-31 23:18, Tomasz Chmielewski wrote:

On 2017-09-18 17:20, Tomasz Chmielewski wrote:

# df -h /var/lib/lxd

FWIW, standard (aka util-linux) df is effectively useless in a 
situation
such as this, as it really doesn't give you the information you need 
(it

can say you have lots of space available, but if btrfs has all of it
allocated into chunks, even if the chunks have space in them still, 
there

can be problems).


I see here on RAID-1 that "df -h" shows pretty much the same amount of
free space as "btrfs fi show":

- "df -h" shows 105G free
- "btrfs fi show" says: Free (estimated): 104.28GiB (min: 104.28GiB)




But chances are pretty good that once you get that patch integrated,
whether by integrating it yourself into what you have currently, or by
trying 4.14-rc1, or by waiting until it hits release or stable, that bug
will have been squashed! =:^)


OK, will wait for 4.14.


So I've tried to run balance with 4.14-rc6.


I've also tried with 4.14-rc7 on a server which was failing with "no 
space left" - unfortunately, it's still failing:



# time btrfs balance start /srv
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the scope of balance.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/srv': No space left on device
There may be more info in syslog - try dmesg | tail

real8731m13.424s
user0m0.000s
sys 560m36.363s



# dmesg -c
(...)
[546228.496902] BTRFS info (device sda4): relocating block group 
297455845376 flags data|raid1

[546251.393541] BTRFS info (device sda4): found 107799 extents
[546512.346360] BTRFS info (device sda4): found 107799 extents
[546529.407077] BTRFS info (device sda4): relocating block group 
296382103552 flags metadata|raid1

[546692.465746] BTRFS info (device sda4): found 35202 extents
[546733.294172] BTRFS info (device sda4): found 2586 extents
[546738.487556] BTRFS info (device sda4): relocating block group 
295308361728 flags data|raid1

[546770.474409] BTRFS info (device sda4): found 140906 extents
[547037.744023] BTRFS info (device sda4): found 140906 extents
[547065.840993] BTRFS info (device sda4): 117 enospc errors during 
balance



# btrfs fi df /srv
Data, RAID1: total=2.46TiB, used=2.35TiB
System, RAID1: total=32.00MiB, used=416.00KiB
Metadata, RAID1: total=19.00GiB, used=12.92GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


# btrfs fi show /srv
Label: 'btrfs'  uuid: 105b2e0c-8af2-45ee-b4c8-14ff0a3ca899
Total devices 2 FS bytes used 2.36TiB
devid1 size 2.63TiB used 2.48TiB path /dev/sda4
devid2 size 2.63TiB used 2.48TiB path /dev/sdb4


# btrfs fi usage /srv
Overall:
Device size:   5.25TiB
Device allocated:  4.96TiB
Device unallocated:  302.00GiB
Device missing:  0.00B
Used:  4.72TiB
Free (estimated):268.66GiB  (min: 268.66GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID1: Size:2.46TiB, Used:2.35TiB
   /dev/sda4   2.46TiB
   /dev/sdb4   2.46TiB

Metadata,RAID1: Size:19.00GiB, Used:12.92GiB
   /dev/sda4  19.00GiB
   /dev/sdb4  19.00GiB

System,RAID1: Size:32.00MiB, Used:416.00KiB
   /dev/sda4  32.00MiB
   /dev/sdb4  32.00MiB

Unallocated:
   /dev/sda4 151.00GiB
   /dev/sdb4 151.00GiB


Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-10-31 Thread Tomasz Chmielewski

On 2017-10-31 23:18, Tomasz Chmielewski wrote:


On a different server, however, it failed badly:

# time btrfs balance start /srv
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the scope of balance.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/srv': Read-only file system
There may be more info in syslog - try dmesg | tail

[312304.050731] BTRFS info (device sda4): found 15073 extents
[313555.971253] BTRFS info (device sda4): relocating block group
1208022466560 flags data|raid1
[314963.506580] BTRFS: Transaction aborted (error -28)
[314963.506608] [ cut here ]
[314963.506639] WARNING: CPU: 2 PID: 27854 at
/home/kernel/COD/linux/fs/btrfs/extent-tree.c:3089
btrfs_run_delayed_refs+0x244/0x250 [btrfs]


(...)


[314963.506955] BTRFS: error (device sda4) in
btrfs_run_delayed_refs:3089: errno=-28 No space left
[314963.507032] BTRFS info (device sda4): forced readonly
[314963.510570] BTRFS warning (device sda4): Skipping commit of
aborted transaction.
[314963.510577] BTRFS: error (device sda4) in
cleanup_transaction:1873: errno=-28 No space left
[314970.954768] mail[32290]: segfault at c0 ip 7f6b507ae33b sp
7ffec4849ac0 error 4 in libmailutils.so.4.0.0[7f6b50724000+b]
[314983.475988] BTRFS error (device sda4): pending csums is 167936



And btrfs balance can be a real database killer :(


root@backupslave01:/var/log/mysql# tail -f mysql-error.log
InnoDB: Doing recovery: scanned up to log sequence number 2206178343424
InnoDB: Doing recovery: scanned up to log sequence number 2206183586304
InnoDB: Doing recovery: scanned up to log sequence number 2206188829184
InnoDB: Doing recovery: scanned up to log sequence number 2206194072064
InnoDB: Doing recovery: scanned up to log sequence number 2206199314944
InnoDB: Doing recovery: scanned up to log sequence number 2206204557824
InnoDB: Doing recovery: scanned up to log sequence number 2206209800704
InnoDB: Doing recovery: scanned up to log sequence number 2206215043584
InnoDB: Doing recovery: scanned up to log sequence number 2206220286464
InnoDB: Doing recovery: scanned up to log sequence number 2206220752384

InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 21145843968
2017-10-31 14:46:59 4359 [Note] InnoDB: Starting an apply batch of log 
records to the database...

InnoDB: Progress in percent: 14:46:59 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this 
binary
or one of the libraries it was linked against is corrupt, improperly 
built,
or misconfigured. This error can also be caused by malfunctioning 
hardware.

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=0
max_threads=502
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 
232495 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x4
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0x8d444b]
/usr/sbin/mysqld(handle_fatal_signal+0x49a)[0x649b0a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f74b90bf390]
/usr/sbin/mysqld[0x99fcae]
/usr/sbin/mysqld[0x9a17ed]
/usr/sbin/mysqld[0x9881ea]
/usr/sbin/mysqld[0x989fc7]
/usr/sbin/mysqld[0xa6dd87]
/usr/sbin/mysqld[0xab8cd8]
/usr/sbin/mysqld[0xa08300]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f74b90b56ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f74b854a3dd]



Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-10-31 Thread Tomasz Chmielewski

On 2017-09-18 17:20, Tomasz Chmielewski wrote:

# df -h /var/lib/lxd

FWIW, standard (aka util-linux) df is effectively useless in a 
situation
such as this, as it really doesn't give you the information you need 
(it

can say you have lots of space available, but if btrfs has all of it
allocated into chunks, even if the chunks have space in them still, 
there

can be problems).


I see here on RAID-1 that "df -h" shows pretty much the same amount of
free space as "btrfs fi show":

- "df -h" shows 105G free
- "btrfs fi show" says: Free (estimated): 104.28GiB (min: 104.28GiB)




But chances are pretty good that once you get that patch integrated,
whether by integrating it yourself into what you have currently, or by
trying 4.14-rc1, or by waiting until it hits release or stable, that bug
will have been squashed! =:^)


OK, will wait for 4.14.


So I've tried to run balance with 4.14-rc6.

It succeeded on one server where it was failing with 4.13.x.


On a different server, however, it failed badly:

# time btrfs balance start /srv
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the scope of balance.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/srv': Read-only file system
There may be more info in syslog - try dmesg | tail

real5194m41.749s
user0m0.000s
sys 301m10.928s


[312304.050731] BTRFS info (device sda4): found 15073 extents
[313555.971253] BTRFS info (device sda4): relocating block group 
1208022466560 flags data|raid1

[314963.506580] BTRFS: Transaction aborted (error -28)
[314963.506608] [ cut here ]
[314963.506639] WARNING: CPU: 2 PID: 27854 at 
/home/kernel/COD/linux/fs/btrfs/extent-tree.c:3089 
btrfs_run_delayed_refs+0x244/0x250 [btrfs]
[314963.506640] Modules linked in: vhost_net vhost tap xt_REDIRECT 
nf_nat_redirect xt_NFLOG nfnetlink_log nfnetlink xt_conntrack veth 
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_filter ip6_tables xt_comment xt_CHECKSUM 
binfmt_misc iptable_mangle nf_log_ipv4 nf_log_common xt_LOG 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 
xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc btrfs 
zstd_compress shpchp intel_rapl lpc_ich x86_pkg_temp_thermal 
intel_powerclamp input_leds tpm_infineon ie31200_edac serio_raw coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm_intel pcbc kvm 
aesni_intel irqbypass aes_x86_64 mac_hid crypto_simd glue_helper cryptd 
intel_cstate
[314963.506684]  eeepc_wmi asus_wmi sparse_keymap intel_rapl_perf 
wmi_bmof nfsd auth_rpcgss nfs_acl lockd grace sunrpc lp parport autofs4 
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e ahci 
libahci ptp pps_core wmi video
[314963.506710] CPU: 2 PID: 27854 Comm: sadc Tainted: GW   
4.14.0-041400rc6-generic #201710230731
[314963.506711] Hardware name: System manufacturer System Product 
Name/P8B WS, BIOS 0904 10/24/2011

[314963.506713] task: 8bc0fd39ae00 task.stack: b28d4949
[314963.506732] RIP: 0010:btrfs_run_delayed_refs+0x244/0x250 [btrfs]
[314963.506734] RSP: 0018:b28d49493d30 EFLAGS: 00010286
[314963.506736] RAX: 0026 RBX: ffe4 RCX: 

[314963.506737] RDX:  RSI: 8bc8afa8dc98 RDI: 
8bc8afa8dc98
[314963.506738] RBP: b28d49493d88 R08: 0001 R09: 
242b
[314963.506740] R10: b28d49493c20 R11:  R12: 
8bc883a81078
[314963.506741] R13: 8bc887eb R14: 8bc1876ec400 R15: 
0018ba90
[314963.506743] FS:  7f62a12d9700() GS:8bc8afa8() 
knlGS:

[314963.506744] CS:  0010 DS:  ES:  CR0: 80050033
[314963.506746] CR2: 7f25f6f53880 CR3: 0003cf4f7004 CR4: 
000626e0

[314963.506747] Call Trace:
[314963.506773]  btrfs_commit_transaction+0x9b/0x8d0 [btrfs]
[314963.506799]  ? btrfs_wait_ordered_range+0x9c/0x110 [btrfs]
[314963.506821]  btrfs_sync_file+0x348/0x410 [btrfs]
[314963.506826]  vfs_fsync_range+0x4b/0xb0
[314963.506828]  do_fsync+0x3d/0x70
[314963.506831]  SyS_fdatasync+0x13/0x20
[314963.506834]  do_syscall_64+0x61/0x120
[314963.506838]  entry_SYSCALL64_slow_path+0x25/0x25
[314963.506840] RIP: 0033:0x7f62a0dfec30
[314963.506841] RSP: 002b:7fffca89f288 EFLAGS: 0246 ORIG_RAX: 
004b
[314963.506844] RAX: ffda RBX: 0001 RCX: 
7f62a0dfec30
[314963.506845] RDX: 00

Re: yet another "out of space" on a filesystem with >100 GB free space, and strange files which exist but don't exist

2017-10-04 Thread Tomasz Chmielewski

On 2017-10-04 20:20, Austin S. Hemmelgarn wrote:

On 2017-10-04 07:13, Tomasz Chmielewski wrote:

Kernel: 4.13.4, btrfs RAID-1.

Disk usage more or less like below (yes, I know about btrfs fi df / 
show / usage):


Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   424G  262G  161G  62% /var/lib/lxd


Balance would exit immediately with "out of space", but it continues to 
run after I've removed a few gigabytes from the filesystem.



Now, I'm seeing some files which exist, but don't. Strange, I know.


root@lxd02 
/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb # ls *set

ls: cannot access 'WiredTiger.turtle.set': No such file or directory

root@lxd02 
/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb # ls 
-l|grep set

ls: cannot access 'WiredTiger.turtle.set': No such file or directory
-? ? ?  ?    ?    ? 
WiredTiger.turtle.set


root@lxd02 
/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb # mv 
WiredTiger.turtle.set WiredTiger.turtle.set.Ghost.File

mv: cannot stat 'WiredTiger.turtle.set': No such file or directory

root@lxd02 
/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb # rm -v 
WiredTiger.turtle.set

rm: cannot remove 'WiredTiger.turtle.set': No such file or directory



What is this file, and why does it exist if it doesn't? How do I 
remove it?

It's got corrupted metadata, probably the inode itself (IIRC, the
dentry in BTRFS just matches the inode to the file name, and all the
other data reported by ls -l is stored in the inode).  If you're
running with a replicated metadata profile (dup, raid1, or raid10),
run a scrub, and it may fix things.  If not, you will likely have to
run a check in repair mode (though I would suggest waiting to hear
from one of the developers before doing so).  Alternatively, if that's
in a subvolume, and you can afford to just nuke the subvolume and
recreate it, deleting the subvolume should get rid of it (though you
should still run a check).

Either way, this is likely related to the balance issues you're seeing.


Unfortunately scrub didn't help:

# btrfs scrub status /var/lib/lxd
scrub status for 6340f5de-f635-4d09-bbb2-1e03b1e1b160
scrub started at Wed Oct  4 14:12:29 2017 and finished after 
00:10:32

total bytes scrubbed: 525.57GiB with 0 errors


"Ghost file" is still there:

# ls -l 
/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb|grep set
ls: cannot access 
'/var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb/WiredTiger.turtle.set': 
No such file or directory
-? ? ?  ??? 
WiredTiger.turtle.set




Tomasz Chmielewski
https://lxadm.com


yet another "out of space" on a filesystem with >100 GB free space, and strange files which exist but don't exist

2017-10-04 Thread Tomasz Chmielewski

Kernel: 4.13.4, btrfs RAID-1.

Disk usage more or less like below (yes, I know about btrfs fi df / show 
/ usage):


Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   424G  262G  161G  62% /var/lib/lxd


Balance would exit immediately with "out of space", but continues to run 
after I've removed a few gigabytes from the filesystem.



Now, I'm seeing some files which exist, but don't. Strange, I know.


root@lxd02 /var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb 
# ls *set

ls: cannot access 'WiredTiger.turtle.set': No such file or directory

root@lxd02 /var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb 
# ls -l|grep set

ls: cannot access 'WiredTiger.turtle.set': No such file or directory
-? ? ?  ??? 
WiredTiger.turtle.set


root@lxd02 /var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb 
# mv WiredTiger.turtle.set WiredTiger.turtle.set.Ghost.File

mv: cannot stat 'WiredTiger.turtle.set': No such file or directory

root@lxd02 /var/lib/lxd/containers/mongo-repl04b/rootfs/var/lib/mongodb 
# rm -v WiredTiger.turtle.set

rm: cannot remove 'WiredTiger.turtle.set': No such file or directory



What is this file, and why does it exist if it doesn't? How do I remove 
it?



Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-09-18 Thread Tomasz Chmielewski

On 2017-09-18 22:44, Peter Becker wrote:

I'm not sure if it would help, but maybe you could try adding an 8 GB
(or larger) USB flash drive to the pool and then starting the balance.
If it works out, you can remove the drive from the pool afterwards.


I really can't, it's an "online server".

But I've removed some 65 GB of data, so now there is 171 GB free, i.e. the 
filesystem is 60% used.


The balance still fails.


Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-09-18 Thread Tomasz Chmielewski

On 2017-09-18 17:29, Andrei Borzenkov wrote:
On Mon, Sep 18, 2017 at 11:20 AM, Tomasz Chmielewski <man...@wpkg.org> 
wrote:

# df -h /var/lib/lxd

FWIW, standard (aka util-linux) df is effectively useless in a situation
such as this, as it really doesn't give you the information you need (it
can say you have lots of space available, but if btrfs has all of it
allocated into chunks, even if the chunks have space in them still, there
can be problems).



I see here on RAID-1 that "df -h" shows pretty much the same amount of free
space as "btrfs fi show":

- "df -h" shows 105G free
- "btrfs fi show" says: Free (estimated): 104.28GiB (min: 104.28GiB)



I think both use the same algorithm to compute free space (df at the
end just shows what the kernel returns). The problem is that this
algorithm itself is just an approximation in the general case. For a
uniform RAID1 profile it should be correct, though.


And perhaps more importantly - can I assume that right now, with the 
latest stable kernel (4.13.2 at the moment), running "btrfs balance" is not 
safe and can lead to data corruption or loss?



Consider the following case:

- system admin runs btrfs balance on a filesystem with 100 GB free and 
assumes it is enough space to complete successfully


- btrfs balance fails due to some bug with "No space left on device"

- at the same time, a database using this filesystem will fail with "No 
space left on device", apt/rpm will fail a package upgrade, some program 
using temp space will fail, a log collector will fail to catch some data 
because of "No space left on device", and so on?




Tomasz Chmielewski
https://lxadm.com


Re: how to run balance successfully (No space left on device)?

2017-09-18 Thread Tomasz Chmielewski

# df -h /var/lib/lxd

FWIW, standard (aka util-linux) df is effectively useless in a situation
such as this, as it really doesn't give you the information you need (it
can say you have lots of space available, but if btrfs has all of it
allocated into chunks, even if the chunks have space in them still, there
can be problems).


I see here on RAID-1 that "df -h" shows pretty much the same amount of 
free space as "btrfs fi show":


- "df -h" shows 105G free
- "btrfs fi show" says: Free (estimated): 104.28GiB (min: 104.28GiB)
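For reference, the allocation picture the quoted paragraph refers to shows up 
in the "Unallocated" section of "btrfs fi usage" rather than in anything df 
reports (a sketch, same mount point as above):

# btrfs fi usage /var/lib/lxd    # the "Unallocated:" lines are what new chunk allocation needs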





But chances are pretty good that once you get that patch integrated,
whether by integrating it yourself into what you have currently, or by
trying 4.14-rc1, or by waiting until it hits release or stable, that bug
will have been squashed! =:^)


OK, will wait for 4.14.


Tomasz Chmielewski
https://lxadm.com


how to run balance successfully (No space left on device)?

2017-09-17 Thread Tomasz Chmielewski

I'm trying to run balance on a 4.13.2 kernel without much luck:

# time btrfs balance start -v /var/lib/lxd -dusage=5 -musage=5
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=5
  METADATA (flags 0x2): balancing, usage=5
  SYSTEM (flags 0x2): balancing, usage=5
Done, had to relocate 1 out of 353 chunks

real    0m2.356s
user    0m0.005s
sys     0m0.175s


# time btrfs balance start -v /var/lib/lxd -dusage=0 -musage=0
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=0
  METADATA (flags 0x2): balancing, usage=0
  SYSTEM (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 353 chunks

real    0m0.076s
user    0m0.004s
sys     0m0.008s
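For reference, a common way to compact mostly-empty chunks without a full 
balance is to raise the usage filter step by step (a sketch only, same mount 
point as above):

# for u in 10 25 50 75; do btrfs balance start -dusage=$u /var/lib/lxd || break; done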


# time btrfs balance start -v /var/lib/lxd
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x0): balancing
  METADATA (flags 0x0): balancing
  SYSTEM (flags 0x0): balancing
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the balanced data.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/var/lib/lxd': No space left on device
There may be more info in syslog - try dmesg | tail

real    284m58.541s
user    0m0.000s
sys     47m39.037s




# df -h /var/lib/lxd
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda3   424G  318G  105G  76% /var/lib/lxd


# btrfs fi df /var/lib/lxd
Data, RAID1: total=318.00GiB, used=313.82GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=5.00GiB, used=3.17GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


# btrfs fi show /var/lib/lxd
Label: 'btrfs'  uuid: f5f30428-ec5b-4497-82de-6e20065e6f61
Total devices 2 FS bytes used 316.98GiB
devid1 size 423.13GiB used 323.03GiB path /dev/sda3
devid2 size 423.13GiB used 323.03GiB path /dev/sdb3


# btrfs fi usage /var/lib/lxd
Overall:
Device size: 846.25GiB
Device allocated:646.06GiB
Device unallocated:  200.19GiB
Device missing:  0.00B
Used:633.97GiB
Free (estimated):104.28GiB  (min: 104.28GiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,RAID1: Size:318.00GiB, Used:313.82GiB
   /dev/sda3 318.00GiB
   /dev/sdb3 318.00GiB

Metadata,RAID1: Size:5.00GiB, Used:3.17GiB
   /dev/sda3   5.00GiB
   /dev/sdb3   5.00GiB

System,RAID1: Size:32.00MiB, Used:80.00KiB
   /dev/sda3  32.00MiB
   /dev/sdb3  32.00MiB

Unallocated:
   /dev/sda3 100.10GiB
   /dev/sdb3 100.10GiB


Mount flags in /etc/fstab are:

LABEL=btrfs /var/lib/lxd btrfs 
defaults,noatime,space_cache=v2,device=/dev/sda3,device=/dev/sdb3,discard 
0 0




Last pieces logged in dmesg:

[46867.225334] BTRFS info (device sda3): relocating block group 
2996254998528 flags data|raid1

[46874.563631] BTRFS info (device sda3): found 9250 extents
[46894.827895] BTRFS info (device sda3): found 9250 extents
[46898.463053] BTRFS info (device sda3): found 201 extents
[46898.562564] BTRFS info (device sda3): relocating block group 
2995181256704 flags data|raid1

[46903.555976] BTRFS info (device sda3): found 7299 extents
[46914.188044] BTRFS info (device sda3): found 7299 extents
[46914.303476] BTRFS info (device sda3): relocating block group 
2947936616448 flags metadata|raid1

[46939.570810] BTRFS info (device sda3): found 42022 extents
[46945.053488] BTRFS info (device sda3): 2 enospc errors during balance



Tomasz Chmielewski
https://lxadm.com


Re: 4.13: No space left with plenty of free space (/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989 __btrfs_free_extent.isra.62+0xc2c/0xdb0)

2017-09-08 Thread Tomasz Chmielewski

So the numbers that matter are:


Data,single: Size:12.84TiB, Used:7.13TiB
/dev/md2   12.84TiB
Metadata,DUP: Size:79.00GiB, Used:77.87GiB
/dev/md2  158.00GiB
Unallocated:
/dev/md23.31TiB




* If you are using the 'space_cache' it has a known issue:
  https://btrfs.wiki.kernel.org/index.php/Gotchas#Free_space_cache


# mount | grep btrfs
/dev/md2 on /data type btrfs 
(rw,noatime,compress-force=zlib,space_cache,subvolid=5,subvol=/)



Citing from the URL you pasted:

 Free space cache

Currently sometimes the free space cache v1 and v2 lose track of 
free space and a volume can be reported as not having free space when it 
obviously does.
Fix: disable use of the free space cache with mount option 
nospace_cache.

Fix: remount the volume with -o remount,clear_cache.
Switch to the new free space tree.


What does "switch to to new free space tree" mean / how to do it?



I also notice that your volume's data free space seems to be
extremely fragmented, as the large difference here shows
"Data,single: Size:12.84TiB, Used:7.13TiB".


Yes, it's quite possible it is very fragmented: lots of rsync with --inplace 
and many snapshots. Also - not sure if it matters - IO load is at or close to 
100% for most of the day.




Which may mean that it is mounted with 'ssd' and/or has gone a
long time without a 'balance', and conceivably this can make it
easier for the free space cache to fail finding space (some
handwaving here).


It's using HDDs, not mounted with "ssd" option.

I think there wasn't ever a balance run there. Since full balance may 
take a few months to finish (!) and causes even more IO, I'm not a big 
fan of running it.


Still, it does seem like a bug to me to error with "no space left", when 
there is a lot of space left?



Tomasz Chmielewski
https://lxadm.com


Re: 4.13: No space left with plenty of free space (/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989 __btrfs_free_extent.isra.62+0xc2c/0xdb0)

2017-09-07 Thread Tomasz Chmielewski

On 2017-09-08 13:33, Tomasz Chmielewski wrote:
Just got this one in dmesg with btrfs RAID-1 on top of Linux software 
RAID-5.


Should say: with btrfs _single_ on top of Linux software RAID-5.




Why does it say "No space left" if we have 9 TB free there?

[233787.920933] BTRFS: Transaction aborted (error -28)
[233787.920953] [ cut here ]
[233787.920971] WARNING: CPU: 1 PID: 2235 at
/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989
__btrfs_free_extent.isra.62+0xc2c/0xdb0 [btrfs]
[233787.920971] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6
xt_NFLOG xt_conntrack ip6table_filter ip6_tables xt_CHECKSUM
iptable_mangle xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4
xt_comment iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_filter ip_tables x_tables nfnetlink_log
nfnetlink rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache
sunrpc bluetooth ecdh_generic binfmt_misc veth bridge stp llc btrfs
intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
ppdev aesni_intel aes_x86_64 crypto_simd glue_helper cryptd eeepc_wmi
intel_cstate intel_rapl_perf input_leds asus_wmi sparse_keymap
serio_raw wmi_bmof parport_pc shpchp ie31200_edac tpm_infineon lpc_ich
parport
[233787.920992]  mac_hid autofs4 raid0 multipath linear raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid10
raid6_pq libcrc32c raid1 ahci r8169 libahci mii wmi video
[233787.921001] CPU: 1 PID: 2235 Comm: btrfs-transacti Not tainted
4.13.0-041300-generic #201709031731
[233787.921002] Hardware name: System manufacturer System Product
Name/P8H77-M PRO, BIOS 9002 05/30/2014
[233787.921002] task: 943b0a779740 task.stack: b1c4491a4000
[233787.921012] RIP: 0010:__btrfs_free_extent.isra.62+0xc2c/0xdb0 
[btrfs]

[233787.921013] RSP: 0018:b1c4491a7b08 EFLAGS: 00010286
[233787.921013] RAX: 0026 RBX: 0cdf3dddc000 RCX:

[233787.921014] RDX:  RSI: 943b5fa4dc78 RDI:
943b5fa4dc78
[233787.921014] RBP: b1c4491a7bb0 R08: 0001 R09:
04c5
[233787.921015] R10: 0013dd7ec000 R11:  R12:
943b0d1c
[233787.921015] R13: ffe4 R14:  R15:
943b0cadcee0
[233787.921016] FS:  () GS:943b5fa4()
knlGS:
[233787.921016] CS:  0010 DS:  ES:  CR0: 80050033
[233787.921017] CR2: 7ffc9230dde8 CR3: 00075a009000 CR4:
001406e0
[233787.921018] Call Trace:
[233787.921031]  ? btrfs_merge_delayed_refs+0x62/0x550 [btrfs]
[233787.921039]  __btrfs_run_delayed_refs+0x6f0/0x1380 [btrfs]
[233787.921047]  btrfs_run_delayed_refs+0x6b/0x250 [btrfs]
[233787.921054]  btrfs_write_dirty_block_groups+0x158/0x390 [btrfs]
[233787.921063]  commit_cowonly_roots+0x221/0x2c0 [btrfs]
[233787.921071]  btrfs_commit_transaction+0x46e/0x8d0 [btrfs]
[233787.921079]  transaction_kthread+0x1a2/0x1c0 [btrfs]
[233787.921081]  kthread+0x125/0x140
[233787.921088]  ? btrfs_cleanup_transaction+0x500/0x500 [btrfs]
[233787.921089]  ? kthread_create_on_node+0x70/0x70
[233787.921091]  ret_from_fork+0x25/0x30
[233787.921092] Code: 3e d3 0f ff eb d0 44 89 ee 48 c7 c7 40 53 7a c0
e8 0b ba 3e d3 0f ff e9 76 fb ff ff 44 89 ee 48 c7 c7 40 53 7a c0 e8
f5 b9 3e d3 <0f> ff e9 f7 f4 ff ff 8b 55 20 48 89 c1 49 89 d8 48 c7 c6
20 54
[233787.921107] ---[ end trace f4e71e70fbc200d2 ]---
[233787.921132] BTRFS: error (device md2) in __btrfs_free_extent:6989:
errno=-28 No space left
[233787.921189] BTRFS info (device md2): forced readonly
[233787.921191] BTRFS: error (device md2) in
btrfs_run_delayed_refs:3009: errno=-28 No space left
[233789.507669] BTRFS warning (device md2): Skipping commit of aborted
transaction.
[233789.507672] BTRFS: error (device md2) in cleanup_transaction:1873:
errno=-28 No space left




# df -h /data
Filesystem  Size  Used Avail Use% Mounted on
/dev/md2 17T  7.3T  9.1T  45% /data


# btrfs fi show /data
Label: 'data'  uuid: fddbd057-4fa6-4b2e-a9ca-993829bab4b9
Total devices 1 FS bytes used 7.21TiB
devid1 size 16.30TiB used 12.99TiB path /dev/md2

# btrfs fi df /data
Data, single: total=12.84TiB, used=7.13TiB
System, DUP: total=8.00MiB, used=1.48MiB
Metadata, DUP: total=79.00GiB, used=77.87GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


root@srv8 ~ # btrfs fi usage /data
Overall:
Device size:  16.30TiB
Device allocated: 12.99TiB
Device unallocated:3.31TiB
Device missing:  0.00B
Used:  7.29TiB
Free (estimated):  9.01TiB  (min: 7.36TiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:12.84TiB, Used:7.13TiB
   /dev/md2   12.84TiB

Metadata,DUP: Size:79.00GiB, Used:77.87GiB
   /dev/md2  158.00GiB

4.13: No space left with plenty of free space (/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989 __btrfs_free_extent.isra.62+0xc2c/0xdb0)

2017-09-07 Thread Tomasz Chmielewski
System,DUP: Size:8.00MiB, Used:1.48MiB
   /dev/md2   16.00MiB

Unallocated:
   /dev/md2    3.31TiB



Tomasz Chmielewski
https://lxadm.com


btrfs hang with 4.11.1 kernel

2017-05-17 Thread Tomasz Chmielewski
007f4d0effca38 RDI: 0003
May 17 07:47:53 lxd02 kernel: [43865.595777] RBP: 7f4d0effc9d0 R08: 
7f4d0effca00 R09: 7f4c39f5
May 17 07:47:53 lxd02 kernel: [43865.595778] R10: 7f4d0effc9e0 R11: 
0293 R12: 
May 17 07:47:53 lxd02 kernel: [43865.595779] R13: 51eb851f R14: 
0001 R15: 01207fe0
May 17 07:47:53 lxd02 kernel: [43865.595782] INFO: task mysqld:7292 
blocked for more than 120 seconds.
May 17 07:47:53 lxd02 kernel: [43865.595820]   Not tainted 
4.11.1-041101-generic #201705140931
May 17 07:47:53 lxd02 kernel: [43865.595857] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 17 07:47:53 lxd02 kernel: [43865.595915] mysqld  D0  
7292   5295 0x0100

May 17 07:47:53 lxd02 kernel: [43865.595918] Call Trace:
May 17 07:47:53 lxd02 kernel: [43865.595921]  __schedule+0x3c6/0x8c0
May 17 07:47:53 lxd02 kernel: [43865.595941]  ? 
btrfs_releasepage+0x20/0x20 [btrfs]

May 17 07:47:53 lxd02 kernel: [43865.595945]  schedule+0x36/0x80
May 17 07:47:53 lxd02 kernel: [43865.595965]  
wait_current_trans+0xc2/0x100 [btrfs]
May 17 07:47:53 lxd02 kernel: [43865.595968]  ? 
wake_atomic_t_function+0x60/0x60
May 17 07:47:53 lxd02 kernel: [43865.595987]  
start_transaction+0x2d4/0x460 [btrfs]
May 17 07:47:53 lxd02 kernel: [43865.596006]  
btrfs_start_transaction+0x1e/0x20 [btrfs]
May 17 07:47:53 lxd02 kernel: [43865.596027]  
btrfs_sync_file+0x24b/0x3e0 [btrfs]

May 17 07:47:53 lxd02 kernel: [43865.596031]  vfs_fsync_range+0x4b/0xb0
May 17 07:47:53 lxd02 kernel: [43865.596034]  do_fsync+0x3d/0x70
May 17 07:47:53 lxd02 kernel: [43865.596037]  SyS_fsync+0x10/0x20
May 17 07:47:53 lxd02 kernel: [43865.596040]  do_syscall_64+0x5b/0xc0
May 17 07:47:53 lxd02 kernel: [43865.596042]  
entry_SYSCALL64_slow_path+0x25/0x25

May 17 07:47:53 lxd02 kernel: [43865.596043] RIP: 0033:0x7f869cad5b2d
May 17 07:47:53 lxd02 kernel: [43865.596044] RSP: 002b:7f86700ac550 
EFLAGS: 0293 ORIG_RAX: 004a
May 17 07:47:53 lxd02 kernel: [43865.596046] RAX: ffda RBX: 
7f86700ac630 RCX: 7f869cad5b2d
May 17 07:47:53 lxd02 kernel: [43865.596047] RDX: 7f86700ada30 RSI: 
000a RDI: 043c
May 17 07:47:53 lxd02 kernel: [43865.596049] RBP: 7f86700ac620 R08: 
 R09: 3df7938e
May 17 07:47:53 lxd02 kernel: [43865.596050] R10: 7f4cf401ca80 R11: 
0293 R12: 01207fe0
May 17 07:47:53 lxd02 kernel: [43865.596051] R13: 043c R14: 
0030 R15: 0013





Tomasz Chmielewski
https://lxadm.com


Re: how to understand "btrfs fi show" output? "No space left" issues

2016-11-13 Thread Tomasz Chmielewski
block group offset 868720050176 len 1073741824 used 434302976 
chunk_objectid 256 flags 17 usage 0.40
block group offset 869793792000 len 1073741824 used 668499968 
chunk_objectid 256 flags 17 usage 0.62
block group offset 870867533824 len 1073741824 used 221671424 
chunk_objectid 256 flags 17 usage 0.21
block group offset 871941275648 len 1073741824 used 272408576 
chunk_objectid 256 flags 17 usage 0.25
block group offset 873015017472 len 1073741824 used 386215936 
chunk_objectid 256 flags 17 usage 0.36
block group offset 874088759296 len 1073741824 used 118960128 
chunk_objectid 256 flags 17 usage 0.11
block group offset 875162501120 len 1073741824 used 238624768 
chunk_objectid 256 flags 17 usage 0.22
block group offset 876236242944 len 1073741824 used 268787712 
chunk_objectid 256 flags 17 usage 0.25
block group offset 877309984768 len 1073741824 used 461127680 
chunk_objectid 256 flags 17 usage 0.43
block group offset 878383726592 len 1073741824 used 245559296 
chunk_objectid 256 flags 17 usage 0.23
block group offset 879457468416 len 1073741824 used 552534016 
chunk_objectid 256 flags 17 usage 0.51
block group offset 880531210240 len 1073741824 used 492670976 
chunk_objectid 256 flags 17 usage 0.46
block group offset 881604952064 len 1073741824 used 607686656 
chunk_objectid 256 flags 17 usage 0.57
block group offset 882678693888 len 1073741824 used 425488384 
chunk_objectid 256 flags 17 usage 0.40
block group offset 883752435712 len 1073741824 used 259645440 
chunk_objectid 256 flags 17 usage 0.24
block group offset 884826177536 len 1073741824 used 425963520 
chunk_objectid 256 flags 17 usage 0.40
block group offset 885899919360 len 1073741824 used 232914944 
chunk_objectid 256 flags 17 usage 0.22
block group offset 886973661184 len 1073741824 used 170930176 
chunk_objectid 256 flags 17 usage 0.16
block group offset 888047403008 len 1073741824 used 247267328 
chunk_objectid 256 flags 17 usage 0.23
block group offset 889121144832 len 1073741824 used 205602816 
chunk_objectid 256 flags 17 usage 0.19
block group offset 890194886656 len 1073741824 used 323842048 
chunk_objectid 256 flags 17 usage 0.30
block group offset 891268628480 len 1073741824 used 646483968 
chunk_objectid 256 flags 17 usage 0.60
block group offset 892342370304 len 1073741824 used 335949824 
chunk_objectid 256 flags 17 usage 0.31
block group offset 893416112128 len 1073741824 used 247644160 
chunk_objectid 256 flags 17 usage 0.23
block group offset 894489853952 len 1073741824 used 393486336 
chunk_objectid 256 flags 17 usage 0.37
block group offset 895563595776 len 1073741824 used 352370688 
chunk_objectid 256 flags 17 usage 0.33
block group offset 896637337600 len 1073741824 used 563159040 
chunk_objectid 256 flags 17 usage 0.52
block group offset 897711079424 len 1073741824 used 290377728 
chunk_objectid 256 flags 17 usage 0.27
block group offset 898784821248 len 1073741824 used 483008512 
chunk_objectid 256 flags 17 usage 0.45
block group offset 899858563072 len 1073741824 used 312786944 
chunk_objectid 256 flags 17 usage 0.29
block group offset 900932304896 len 1073741824 used 248545280 
chunk_objectid 256 flags 17 usage 0.23
block group offset 902006046720 len 1073741824 used 189079552 
chunk_objectid 256 flags 17 usage 0.18
block group offset 904153530368 len 1073741824 used 115830784 
chunk_objectid 256 flags 17 usage 0.11
block group offset 905227272192 len 1073741824 used 350433280 
chunk_objectid 256 flags 17 usage 0.33
block group offset 906301014016 len 1073741824 used 306683904 
chunk_objectid 256 flags 17 usage 0.29
block group offset 907374755840 len 1073741824 used 471134208 
chunk_objectid 256 flags 17 usage 0.44
block group offset 908448497664 len 1073741824 used 230105088 
chunk_objectid 256 flags 17 usage 0.21
block group offset 909522239488 len 1073741824 used 363417600 
chunk_objectid 256 flags 17 usage 0.34
block group offset 910595981312 len 1073741824 used 302993408 
chunk_objectid 256 flags 17 usage 0.28
block group offset 912743464960 len 1040187392 used 102686720 
chunk_objectid 256 flags 17 usage 0.10
block group offset 913783652352 len 1073741824 used 206684160 
chunk_objectid 256 flags 17 usage 0.19
block group offset 914993512448 len 107806720 used 30445568 
chunk_objectid 256 flags 17 usage 0.28
block group offset 915101319168 len 19922944 used 692224 chunk_objectid 
256 flags 17 usage 0.03
block group offset 915121242112 len 8388608 used 409600 chunk_objectid 
256 flags 17 usage 0.05
total_free 255213842432 min_used 409600 free_of_min_used 7979008 
block_group_of_min_used 915121242112
balance block group (915121242112) can reduce the number of data block 
group
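For reference, a single block group like the one pointed out above can be 
relocated on its own with the vrange balance filter (a sketch; the offsets 
come from the dump above, the mount point is a placeholder):

# btrfs balance start -dvrange=915121242112..915129630720 /path/to/mountpoint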




Tomasz Chmielewski
https://lxadm.com



Re: how to understand "btrfs fi show" output? "No space left" issues

2016-09-26 Thread Tomasz Chmielewski

On 2016-09-21 11:51, Chris Murphy wrote:



So if it happens again, first capture the above two bits of
information, and then if  you feel like testing kernel 4.8rc7 do that.
It has a massive pile of enospc-related rework and I bet Josef would
like to know if the problem reproduces with that kernel. As in, just
change kernels, don't try to fix it with balance first.


Looks like 4.8 helped (running 4.8rc8 now).

With 4.7, after balance, the "used" value continued to grow, to around 
300 GB, although used space shown by "df" was more or less constant at 
130-140 GB:


# btrfs fi show /var/lib/lxd
Label: 'btrfs'  uuid: f5f30428-ec5b-4497-82de-6e20065e6f61
Total devices 2 FS bytes used 135.40GiB <- was growing
devid1 size 423.13GiB used 277.03GiB path /dev/sda3
devid2 size 423.13GiB used 277.03GiB path /dev/sdb3


After upgrading to 4.8rc8, "used" value dropped, so hopefully it's fixed 
now.



Tomasz Chmielewski
https://lxadm.com


WARNING: CPU: 7 PID: 23122 at /home/kernel/COD/linux/lib/list_debug.c:59 btrfs_clear_bit_hook+0x2b9/0x350 [btrfs]

2016-08-24 Thread Tomasz Chmielewski
s+0x4c/0x140 [btrfs]
Aug 24 23:37:42 srv8 kernel: [15833.456865]  [] ? 
finish_wait+0x3b/0x70
Aug 24 23:37:42 srv8 kernel: [15833.456874]  [] ? 
btrfs_mksubvol+0x26f/0x5b0 [btrfs]
Aug 24 23:37:42 srv8 kernel: [15833.456876]  [] ? 
prepare_to_wait_event+0xe0/0xe0
Aug 24 23:37:42 srv8 kernel: [15833.456885]  [] ? 
btrfs_ioctl_snap_create_transid+0x17e/0x190 [btrfs]
Aug 24 23:37:42 srv8 kernel: [15833.456894]  [] ? 
btrfs_ioctl_snap_create_v2+0x113/0x170 [btrfs]
Aug 24 23:37:42 srv8 kernel: [15833.456903]  [] ? 
btrfs_ioctl+0x5ed/0x1fd0 [btrfs]
Aug 24 23:37:42 srv8 kernel: [15833.456904]  [] ? 
handle_pte_fault+0x8dc/0x16f0
Aug 24 23:37:42 srv8 kernel: [15833.456906]  [] ? 
cp_new_stat+0x14d/0x180
Aug 24 23:37:42 srv8 kernel: [15833.456908]  [] ? 
do_vfs_ioctl+0x9e/0x5e0
Aug 24 23:37:42 srv8 kernel: [15833.456909]  [] ? 
handle_mm_fault+0x29a/0x5a0
Aug 24 23:37:42 srv8 kernel: [15833.456910]  [] ? 
SyS_ioctl+0x74/0x80
Aug 24 23:37:42 srv8 kernel: [15833.456912]  [] ? 
entry_SYSCALL_64_fastpath+0x1e/0xa8
Aug 24 23:37:42 srv8 kernel: [15833.456922] ---[ end trace 
36156eeb236e9c12 ]---




Tomasz Chmielewski
https://lxadm.com


Re: Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski

On 2016-08-06 00:45, Tomasz Chmielewski wrote:


And, miracle cure O_o

# file ./2016-08-02/serverX/syslog.log
ERROR: cannot read `./2016-08-02/serverX/syslog.log' (Input/output 
error)


# echo 3 > /proc/sys/vm/drop_caches

# file 2016-08-02/serverX/syslog.log
2016-08-02/serverX/syslog.log: ASCII text, with very long lines

# cat 2016-08-02/serverX/syslog.log
(...)


A few mins after the previous "echo 3 > /proc/sys/vm/drop_caches" (this 
file is around 1.5 MB and wasn't touched since 2016-06-21):


# file ./2016-06-21/serverY/nginx-dashboard-error.log
./2016-06-21/serverY/nginx-dashboard-error.log: ERROR: cannot read 
`./2016-06-21/serverY/nginx-dashboard-error.log' (Input/output error)


# echo 3 > /proc/sys/vm/drop_caches

# file ./2016-06-21/serverY/nginx-dashboard-error.log
./2016-06-21/serverY/nginx-dashboard-error.log: ASCII text, with very 
long lines


# cat ./2016-06-21/serverY/nginx-dashboard-error.log
(...works OK, no corruption...)


Tomasz Chmielewski
https://lxadm.com


Re: Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski

On 2016-08-06 00:40, Chris Mason wrote:

Too big for the known problem though.  Still, can you btrfs-debug-tree
and just make sure it doesn't have inline items?


Hmmm

# btrfs-debug-tree /dev/xvdb > /root/debug.tree
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301

Ignoring transid failure
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303

Ignoring transid failure
print-tree.c:1105: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418d99]
btrfs-debug-tree(btrfs_print_tree+0x26a)[0x41acf6]
btrfs-debug-tree(main+0x9a5)[0x432589]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f2369de0f45]
btrfs-debug-tree[0x4070e9]


Looks like the FS is mounted?


It is mounted, yes. Does btrfs-debug-tree need an unmounted FS?

I'm not able to unmount it unfortunately (in the sense that the system has 
to keep running).



Tomasz Chmielewski
https://lxadm.com


Re: Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski

On 2016-08-06 00:38, Tomasz Chmielewski wrote:


Too big for the known problem though.  Still, can you btrfs-debug-tree
and just make sure it doesn't have inline items?


Hmmm

# btrfs-debug-tree /dev/xvdb > /root/debug.tree
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301

Ignoring transid failure
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303

Ignoring transid failure
print-tree.c:1105: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418d99]
btrfs-debug-tree(btrfs_print_tree+0x26a)[0x41acf6]
btrfs-debug-tree(main+0x9a5)[0x432589]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f2369de0f45]
btrfs-debug-tree[0x4070e9]


And, miracle cure O_o

# file ./2016-08-02/serverX/syslog.log
ERROR: cannot read `./2016-08-02/serverX/syslog.log' (Input/output 
error)


# echo 3 > /proc/sys/vm/drop_caches

# file 2016-08-02/serverX/syslog.log
2016-08-02/serverX/syslog.log: ASCII text, with very long lines

# cat 2016-08-02/serverX/syslog.log
(...)


Tomasz Chmielewski
https://lxadm.com


Re: Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski

On 2016-08-06 00:15, Chris Mason wrote:


# cat 2016-08-02/serverX/syslog.log
cat: 2016-08-02/serverX/syslog.log: Input/output error


How big is the file?  We had one bug with inline files that might have
caused this.


This one's tiny, 158137 bytes.


Too big for the known problem though.  Still, can you btrfs-debug-tree
and just make sure it doesn't have inline items?


Hmmm

# btrfs-debug-tree /dev/xvdb > /root/debug.tree
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301
parent transid verify failed on 355229302784 wanted 49943295 found 
49943301

Ignoring transid failure
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303
parent transid verify failed on 355233251328 wanted 49943299 found 
49943303

Ignoring transid failure
print-tree.c:1105: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418d99]
btrfs-debug-tree(btrfs_print_tree+0x26a)[0x41acf6]
btrfs-debug-tree(main+0x9a5)[0x432589]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f2369de0f45]
btrfs-debug-tree[0x4070e9]



Tomasz Chmielewski
https://lxadm.com


Re: Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski

On 2016-08-05 23:26, Chris Mason wrote:

On 08/05/2016 07:42 AM, Tomasz Chmielewski wrote:
I'm getting occasional (every few weeks) input/output errors on a btrfs
filesystem with compress-force=zlib, running on Amazon EC2, with 4.5.2
kernel:

# cat 2016-08-02/serverX/syslog.log
cat: 2016-08-02/serverX/syslog.log: Input/output error


How big is the file?  We had one bug with inline files that might have
caused this.


This one's tiny, 158137 bytes.



Tomasz Chmielewski
https://lxadm.com


Input/output error, nothing appended in dmesg

2016-08-05 Thread Tomasz Chmielewski
I'm getting occasional (every few weeks) input/output errors on a btrfs 
filesystem with compress-force=zlib, running on Amazon EC2, with 4.5.2 
kernel:


# cat 2016-08-02/serverX/syslog.log
cat: 2016-08-02/serverX/syslog.log: Input/output error


Strangely, nothing gets appended in dmesg:

# dmesg -c
#


The filesystem stores mostly remote syslog files (so, all text files, 
appended to).


Expected?



# btrfs fi show /var/log/remote/
Label: none  uuid: 5cec93a8-7894-41f6-94a4-9d9b58216dd4
Total devices 1 FS bytes used 146.55GiB
devid1 size 200.00GiB used 153.01GiB path /dev/xvdb


# btrfs fi df /var/log/remote/
Data, single: total=149.00GiB, used=144.50GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=4.01GiB, used=2.05GiB
GlobalReserve, single: total=512.00MiB, used=0.00B



Tomasz Chmielewski
https://lxadm.com



Re: btrfs won't mount to /home

2016-07-11 Thread Tomasz Chmielewski

On 2016-07-11 22:56, Roman Mamedov wrote:

On Mon, 11 Jul 2016 22:45:13 +0900
Tomasz Chmielewski <t...@virtall.com> wrote:


So, weird, isn't it?

What's wrong there?


Your systemd unmounts it immediately from /home; search the archives,
there's been a funny story like that recently.


Yes, could be similar to this one: https://bugs.archlinux.org/task/44658

In my case, the entry did exist in /etc/fstab, but was different than at 
the time of booting the system.
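For reference, on a systemd system a quick way to check and recover from 
this (a sketch only):

# systemctl status home.mount    # shows why systemd unmounted it
# systemctl daemon-reload        # pick up the edited /etc/fstab
# mount /home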




Tomasz Chmielewski
https://wpkg.org


btrfs won't mount to /home

2016-07-11 Thread Tomasz Chmielewski

This is kind of strange (kernel 4.6.3, Ubuntu 16.04):

# mount -t btrfs /dev/sda4 /home

# echo $?
0

# dmesg -c
[382148.588847] BTRFS info (device sda4): disk space caching is enabled
[382148.588851] BTRFS: has skinny extents

So it worked?

# ls /home

<empty, should be showing data>

# df | grep home

# mount | grep /home

# lsof -n | grep /home


All give no output.


So, weird, isn't it?

Now, let's try to mount to /home2:

# mkdir /home2

# mount /dev/sda4 /home2

# mount | grep home
/dev/sda4 on /home2 type btrfs 
(rw,relatime,space_cache,subvolid=5,subvol=/)


# dmesg -c
[382190.199363] BTRFS info (device sda4): disk space caching is enabled
[382190.199370] BTRFS: has skinny extents



What's wrong there?


Tomasz Chmielewski
https://wpkg.org


Re: filesystem read only after power outage

2016-07-09 Thread Tomasz Chmielewski

On 2016-07-05 15:55, Tomasz Chmielewski wrote:

On 2016-07-05 14:56, Tomasz Chmielewski wrote:

Getting this lengthy output logged, and the fs mounted read-only after
a power outage.


Tried 4.6.3 as well, but it ends just the same.

Jul  5 02:04:20 bkp011 kernel: [  799.298303] [ cut here
]
Jul  5 02:04:20 bkp011 kernel: [  799.298335] WARNING: CPU: 0 PID:
1896 at /home/kernel/COD/linux/fs/btrfs/extent-tree.c:6599
__btrfs_free_extent.isra.70+0x877/0xd50 [btrfs]



FYI, 4.7-rc6 breaks just like 4.6.0 and 4.6.3.


I've tried running btrfs check --repair -p, but after some 2 days it 
ended with:


reset nbytes for ino 2418526 root 12989
reset nbytes for ino 2634616 root 12989
reset nbytes for ino 2746243 root 12989
reset nbytes for ino 2824005 root 12989
ctree.c:195: update_ref_for_cow: Assertion `ret` failed.
btrfs(__btrfs_cow_block+0x399)[0x43a719]
btrfs(btrfs_cow_block+0x102)[0x43afac]
btrfs(btrfs_search_slot+0x1cc)[0x43da5d]
btrfs[0x423daa]
btrfs(cmd_check+0x29bf)[0x42c68c]
btrfs(main+0x155)[0x409f63]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f96b3271830]
btrfs(_start+0x29)[0x409b59]
reset nbytes for ino 3217513 root 12989
Failed to find [90569834496, 168, 16384]



Tomasz Chmielewski
https://wpkg.org


Re: filesystem read only after power outage

2016-07-05 Thread Tomasz Chmielewski

On 2016-07-05 14:56, Tomasz Chmielewski wrote:

Getting this lengthy output logged, and the fs mounted read-only after
a power outage.


Tried 4.6.3 as well, but it ends just the same.

Jul  5 02:04:20 bkp011 kernel: [  799.298303] [ cut here
]
Jul  5 02:04:20 bkp011 kernel: [  799.298335] WARNING: CPU: 0 PID:
1896 at /home/kernel/COD/linux/fs/btrfs/extent-tree.c:6599
__btrfs_free_extent.isra.70+0x877/0xd50 [btrfs]



FYI, 4.7-rc6 breaks just like 4.6.0 and 4.6.3.


Tomasz Chmielewski
https://wpkg.org


filesystem read only after power outage

2016-07-05 Thread Tomasz Chmielewski
Getting this lengthy output logged, and the fs mounted read-only after a 
power outage.



Tried 4.6.3 as well, but it ends just the same.

Jul  5 02:04:20 bkp011 kernel: [  799.298303] [ cut here 
]
Jul  5 02:04:20 bkp011 kernel: [  799.298335] WARNING: CPU: 0 PID: 1896 
at /home/kernel/COD/linux/fs/btrfs/extent-tree.c:6599 
__btrfs_free_extent.isra.70+0x877/0xd50 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298337] Modules linked in: 
xt_CHECKSUM iptable_mangle xt_nat xt_tcpudp ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables bridge 
stp llc btrfs intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp 
kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel aesni_intel ppdev eeepc_wmi asus_wmi aes_x86_64 
sparse_keymap lrw gf128mul input_leds serio_raw glue_helper lpc_ich 
ablk_helper shpchp 8250_fintek cryptd mac_hid tpm_infineon parport_pc lp 
parport autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq 
async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 
ahci r8169 libahci mii wmi video fjes
Jul  5 02:04:20 bkp011 kernel: [  799.298388] CPU: 0 PID: 1896 Comm: 
btrfs-cleaner Not tainted 4.6.0-040600-generic #201605151930
Jul  5 02:04:20 bkp011 kernel: [  799.298389] Hardware name: System 
manufacturer System Product Name/P8H67-M PRO, BIOS 1106 10/17/2011
Jul  5 02:04:20 bkp011 kernel: [  799.298392]  0286 
a9a73565 880418a97b20 813f1dd3
Jul  5 02:04:20 bkp011 kernel: [  799.298395]   
 880418a97b60 810827eb
Jul  5 02:04:20 bkp011 kernel: [  799.298398]  19c7c02d62c6 
fffe 02ab7a2a 02d091318000

Jul  5 02:04:20 bkp011 kernel: [  799.298401] Call Trace:
Jul  5 02:04:20 bkp011 kernel: [  799.298407]  [] 
dump_stack+0x63/0x90
Jul  5 02:04:20 bkp011 kernel: [  799.298412]  [] 
__warn+0xcb/0xf0
Jul  5 02:04:20 bkp011 kernel: [  799.298415]  [] 
warn_slowpath_null+0x1d/0x20
Jul  5 02:04:20 bkp011 kernel: [  799.298432]  [] 
__btrfs_free_extent.isra.70+0x877/0xd50 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298456]  [] ? 
btrfs_merge_delayed_refs+0x5f/0x6a0 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298474]  [] 
__btrfs_run_delayed_refs+0xa7d/0x11e0 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298492]  [] 
btrfs_run_delayed_refs+0x8e/0x2b0 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298512]  [] 
btrfs_should_end_transaction+0x5a/0x60 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298529]  [] 
btrfs_drop_snapshot+0x43c/0x810 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298547]  [] 
btrfs_clean_one_deleted_snapshot+0xbb/0x110 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298565]  [] 
cleaner_kthread+0xcb/0x1c0 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298583]  [] ? 
btrfs_destroy_pinned_extent+0xe0/0xe0 [btrfs]
Jul  5 02:04:20 bkp011 kernel: [  799.298586]  [] 
kthread+0xd8/0xf0
Jul  5 02:04:20 bkp011 kernel: [  799.298590]  [] 
ret_from_fork+0x22/0x40
Jul  5 02:04:20 bkp011 kernel: [  799.298593]  [] ? 
kthread_create_on_node+0x1c0/0x1c0
Jul  5 02:04:20 bkp011 kernel: [  799.298596] ---[ end trace 
9957b92aaab5820e ]---
Jul  5 02:04:20 bkp011 kernel: [  799.298600] BTRFS info (device sda4): 
leaf 361415409664 total ptrs 131 free space 3195
Jul  5 02:04:20 bkp011 kernel: [  799.298602] 	item 0 key (2935511810048 
169 0) itemoff 16025 itemsize 258
Jul  5 02:04:20 bkp011 kernel: [  799.298604] 		extent refs 26 gen 4082 
flags 258
Jul  5 02:04:20 bkp011 kernel: [  799.298606] 		shared block backref 
parent 3786607509504
Jul  5 02:04:20 bkp011 kernel: [  799.298607] 		shared block backref 
parent 3786134093824
Jul  5 02:04:20 bkp011 kernel: [  799.298608] 		shared block backref 
parent 3785932996608
Jul  5 02:04:20 bkp011 kernel: [  799.298610] 		shared block backref 
parent 3785484910592
Jul  5 02:04:20 bkp011 kernel: [  799.298611] 		shared block backref 
parent 3785154478080
Jul  5 02:04:20 bkp011 kernel: [  799.298612] 		shared block backref 
parent 3784106180608
Jul  5 02:04:20 bkp011 kernel: [  799.298613] 		shared block backref 
parent 3783058407424
Jul  5 02:04:20 bkp011 kernel: [  799.298615] 		shared block backref 
parent 3782999293952
Jul  5 02:04:20 bkp011 kernel: [  799.298616] 		shared block backref 
parent 3404037406720
Jul  5 02:04:20 bkp011 kernel: [  799.298617] 		shared block backref 
parent 3095471308800
Jul  5 02:04:20 bkp011 kernel: [  799.298618] 		shared block backref 
parent 3095125835776
Jul  5 02:04:20 bkp011 kernel: [  799.298620] 		shared block backref 
parent 3095058219008
Jul  5 02:04:20 bkp011 kernel: [  799.298621] 		shared block backref 
parent 3094976118784
Jul  5 02:04:20 bkp011 kernel: [  799.298622] 		shared block backref 
parent 3094686679040
Jul  5 02:04:20 bkp011 kernel: [  799.298623] 		shared block backref 
parent 3094557196288
Jul  5 02:04:20 bkp011 kernel: [  799.298624] 		shared block 

can't use btrfs on USB-stick (write errors)

2016-06-21 Thread Tomasz Chmielewski
Jun 14 07:50:26 ativ kernel: [57362.033870] Call Trace:
Jun 14 07:50:26 ativ kernel: [57362.033884]  [] 
dump_stack+0x63/0x90
Jun 14 07:50:26 ativ kernel: [57362.033893]  [] 
warn_slowpath_common+0x82/0xc0
Jun 14 07:50:26 ativ kernel: [57362.033899]  [] 
warn_slowpath_fmt+0x5c/0x80
Jun 14 07:50:26 ativ kernel: [57362.033937]  [] 
cleanup_transaction+0x92/0x300 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.033945]  [] ? 
wake_atomic_t_function+0x60/0x60
Jun 14 07:50:26 ativ kernel: [57362.033983]  [] 
btrfs_commit_transaction+0x2e4/0xa90 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.034020]  [] 
btrfs_commit_super+0x8f/0xa0 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.034056]  [] 
close_ctree+0x2ae/0x360 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.034088]  [] 
btrfs_put_super+0x19/0x20 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.034097]  [] 
generic_shutdown_super+0x6f/0x100
Jun 14 07:50:26 ativ kernel: [57362.034102]  [] 
kill_anon_super+0x12/0x20
Jun 14 07:50:26 ativ kernel: [57362.034133]  [] 
btrfs_kill_super+0x18/0x120 [btrfs]
Jun 14 07:50:26 ativ kernel: [57362.034140]  [] 
deactivate_locked_super+0x43/0x70
Jun 14 07:50:26 ativ kernel: [57362.034145]  [] 
deactivate_super+0x5c/0x60
Jun 14 07:50:26 ativ kernel: [57362.034151]  [] 
cleanup_mnt+0x3f/0x90
Jun 14 07:50:26 ativ kernel: [57362.034156]  [] 
__cleanup_mnt+0x12/0x20
Jun 14 07:50:26 ativ kernel: [57362.034163]  [] 
task_work_run+0x73/0x90
Jun 14 07:50:26 ativ kernel: [57362.034170]  [] 
exit_to_usermode_loop+0xc2/0xd0
Jun 14 07:50:26 ativ kernel: [57362.034176]  [] 
syscall_return_slowpath+0x4e/0x60
Jun 14 07:50:26 ativ kernel: [57362.034185]  [] 
int_ret_from_sys_call+0x25/0x8f
Jun 14 07:50:26 ativ kernel: [57362.034189] ---[ end trace 
43a0a9df7507537d ]---
Jun 14 07:50:26 ativ kernel: [57362.034194] BTRFS: error (device sdb1) 
in cleanup_transaction:1746: errno=-5 IO failure
Jun 14 07:50:26 ativ kernel: [57362.034200] BTRFS info (device sdb1): 
delayed_refs has NO entry
Jun 14 07:50:26 ativ kernel: [57362.034220] BTRFS error (device sdb1): 
commit super ret -5
Jun 14 07:50:26 ativ kernel: [57362.034339] BTRFS error (device sdb1): 
cleaner transaction attach returned -30




Tomasz Chmielewski
http://wpkg.org


btrfs RAID-1 vs md RAID-1?

2016-05-15 Thread Tomasz Chmielewski
I'm trying to read two large files in parallel from a 2-disk RAID-1 
btrfs setup (using kernel 4.5.3).


According to iostat, one of the disks is 100% saturated, while the other 
disk is around 0% busy.


Is it expected?

With two readers from the same disk, each file is being read with ~50 
MB/s from disk (with just one reader from disk, the speed goes up to 
around ~150 MB/s).



In md RAID, with many readers, it will try to distribute the reads - quoting 
the md manual at http://linux.die.net/man/4/md:


Raid1
(...)
Data is read from any one device. The driver attempts to distribute 
read requests across all devices

to maximise performance.

Raid5
(...)
This also allows more parallelism when reading, as read requests are 
distributed over all the devices

in the array instead of all but one.


Are there any plans to improve this in btrfs?
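A rough way to reproduce the observation (file names are hypothetical, any 
two large files on the same btrfs RAID-1 mount will do):

# cat /data/big-file-1 > /dev/null &
# cat /data/big-file-2 > /dev/null &
# iostat -x 1 sda sdb    # watch whether reads spread over both members or pile up on one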


Tomasz Chmielewski
http://wpkg.org



Re: 4.4.0 - no space left with >1.7 TB free space left

2016-05-12 Thread Tomasz Chmielewski

On 2016-05-12 15:03, Tomasz Chmielewski wrote:


FYI, I'm still getting this with 4.5.3, which probably means the fix
was not yet included ("No space left" at snapshot time):

/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
LOG:  could not close temporary statistics file
"pg_stat_tmp/db_0.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
LOG:  could not close temporary statistics file
"pg_stat_tmp/global.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
LOG:  could not close temporary statistics file
"pg_stat_tmp/db_0.tmp": No space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC
LOG:  could not close temporary statistics file
"pg_stat_tmp/global.tmp": No space left on device


I've tried mounting with space_cache=v2, but it didn't help.


On the good side, I see it's in 4.6-rc7.


Tomasz Chmielewski
http://wpkg.org



Re: 4.4.0 - no space left with >1.7 TB free space left

2016-05-12 Thread Tomasz Chmielewski

On 2016-04-08 20:53, Roman Mamedov wrote:


> Do you snapshot the parent subvolume which holds the databases? Can you
> correlate that perhaps ENOSPC occurs at the time of snapshotting? If
> yes, then
> you should try the patch https://patchwork.kernel.org/patch/7967161/
>
> (Too bad this was not included into 4.4.1.)

By the way - was it included in any later kernel? I'm running 4.4.5 on
that server, but still hitting the same issue.


It's not in 4.4.6 either. I don't know why it doesn't get included, or what
we need to do. Last time I asked, it was queued:
http://www.spinics.net/lists/linux-btrfs/msg52478.html
But maybe that meant 4.5 or 4.6 only? While the bug is affecting people on
4.4.x today.


FYI, I'm still getting this with 4.5.3, which probably means the fix was 
not yet included ("No space left" at snapshot time):


/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG: 
 could not close temporary statistics file "pg_stat_tmp/db_0.tmp": No 
space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG: 
 could not close temporary statistics file "pg_stat_tmp/global.tmp": No 
space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG: 
 could not close temporary statistics file "pg_stat_tmp/db_0.tmp": No 
space left on device
/var/log/postgresql/postgresql-9.3-main.log:2016-05-11 06:06:10 UTC LOG: 
 could not close temporary statistics file "pg_stat_tmp/global.tmp": No 
space left on device



I've tried mounting with space_cache=v2, but it didn't help.



Tomasz Chmielewski
http://wpkg.org



Re: 4.4.0 - no space left with >1.7 TB free space left

2016-04-08 Thread Tomasz Chmielewski

On 2016-04-08 20:53, Roman Mamedov wrote:


> Do you snapshot the parent subvolume which holds the databases? Can you
> correlate that perhaps ENOSPC occurs at the time of snapshotting? If
> yes, then
> you should try the patch https://patchwork.kernel.org/patch/7967161/
>
> (Too bad this was not included into 4.4.1.)

By the way - was it included in any later kernel? I'm running 4.4.5 on
that server, but still hitting the same issue.


It's not in 4.4.6 either. I don't know why it doesn't get included, or what
we need to do. Last time I asked, it was queued:
http://www.spinics.net/lists/linux-btrfs/msg52478.html
But maybe that meant 4.5 or 4.6 only? While the bug is affecting people on
4.4.x today.


Does it mean 4.5 also doesn't have it yet?


Tomasz Chmielewski
http://wpkg.org



Re: 4.4.0 - no space left with >1.7 TB free space left

2016-04-08 Thread Tomasz Chmielewski

On 2016-02-08 20:24, Roman Mamedov wrote:


Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
often snapshots, and at times, there is heavy IO in many of them for
extended periods of time. btrfs is on HDDs.


Every few days I'm getting "no space left" in a container running mongo
3.2.1 database. Interestingly, haven't seen this issue in containers
with MySQL. All databases have chattr +C set on their directories.


Hello,

Do you snapshot the parent subvolume which holds the databases? Can you
correlate that perhaps ENOSPC occurs at the time of snapshotting? If yes,
then you should try the patch https://patchwork.kernel.org/patch/7967161/

(Too bad this was not included into 4.4.1.)


By the way - was it included in any later kernel? I'm running 4.4.5 on 
that server, but still hitting the same issue.



Tomasz Chmielewski
http://wpkg.org



Re: btrfs and containers

2016-03-10 Thread Tomasz Chmielewski

I have been running systemd-nspawn containers on top of a btrfs
filesystem for a while now.

This works great: Snapshots are a huge help to manage containers!

But today I ran btrfs subvol list . *inside* a container. To my
surprise I got a list of *all* subvolumes on that drive. That is
basically a complete list of containers running on the machine. I do
not want to have that kind of information exposed to my containers.


You seem to be running a privileged container, i.e. the container's root is 
the same UID as the host's root. This is typically undesired and means that 
your containers have full access to data on the host and in other 
containers.


For the record, with a privileged container you can not only list the 
subvolumes, but also read raw disk data (i.e. dd if=/dev/sda) or even 
destroy that data (dd if=/dev/zero of=/dev/sda).


So, think twice if the container setup you have is what you want!

LXD makes it particularly easy to run unprivileged containers: 
https://linuxcontainers.org/ (it starts containers as unprivileged by 
default, and has lots of goodies in general).
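As a quick check (assuming a reasonably recent LXD; the container name is 
hypothetical):

# lxc config get mycontainer security.privileged    # empty or "false" means unprivileged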



Tomasz Chmielewski
http://wpkg.org




4.4.0 - no space left with >1.7 TB free space left

2016-02-08 Thread Tomasz Chmielewski
Linux 4.4.0 - btrfs is mainly used to host lots of test containers, 
often snapshots, and at times, there is heavy IO in many of them for 
extended periods of time. btrfs is on HDDs.



Every few days I'm getting "no space left" in a container running mongo 
3.2.1 database. Interestingly, haven't seen this issue in containers 
with MySQL. All databases have chattr +C set on their directories.
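For reference, +C (nocow) only affects newly created files, so it is 
typically set on the empty database directory before the data files are 
created (a sketch, path hypothetical):

# chattr +C /var/lib/mongodb    # files created here afterwards inherit nocow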


Why would it fail, if there is so much space left?


2016-02-07T06:06:14.648+ E STORAGE  [thread1] WiredTiger (28) 
[1454825174:633585][9105:0x7f2b7e33e700], 
file:collection-33-7895599108848542105.wt, WT_SESSION.checkpoint: 
collection-33-7895599108848542105.wt write error: failed to write 4096 
bytes at offset 20480: No space left on device
2016-02-07T06:06:14.648+ E STORAGE  [thread1] WiredTiger (28) 
[1454825174:648740][9105:0x7f2b7e33e700], checkpoint-server: checkpoint 
server error: No space left on device
2016-02-07T06:06:14.648+ E STORAGE  [thread1] WiredTiger (-31804) 
[1454825174:648766][9105:0x7f2b7e33e700], checkpoint-server: the process 
must exit and restart: WT_PANIC: WiredTiger library panic

2016-02-07T06:06:14.648+ I -[thread1] Fatal Assertion 28558
2016-02-07T06:06:14.648+ I -[thread1]

***aborting after fassert() failure


2016-02-07T06:06:14.694+ I -[WTJournalFlusher] Fatal 
Assertion 28559

2016-02-07T06:06:14.694+ I -[WTJournalFlusher]

***aborting after fassert() failure


2016-02-07T06:06:15.203+ F -[WTJournalFlusher] Got signal: 6 
(Aborted).







# df -h /srv
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda4   2.7T  1.1T  1.7T  39% /srv

# btrfs fi df /srv
Data, RAID1: total=1.25TiB, used=1014.01GiB
System, RAID1: total=32.00MiB, used=240.00KiB
Metadata, RAID1: total=15.00GiB, used=13.13GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs fi show /srv
Label: 'btrfs'  uuid: 105b2e0c-8af2-45ee-b4c8-14ff0a3ca899
Total devices 2 FS bytes used 1.00TiB
devid1 size 2.63TiB used 1.26TiB path /dev/sda4
devid2 size 2.63TiB used 1.26TiB path /dev/sdb4

btrfs-progs v4.0.1



Tomasz Chmielewski
http://wpkg.org



Re: 4.4.0 - no space left with >1.7 TB free space left

2016-02-08 Thread Tomasz Chmielewski

On 2016-02-08 20:24, Roman Mamedov wrote:

On Mon, 08 Feb 2016 18:22:34 +0900
Tomasz Chmielewski <t...@virtall.com> wrote:


Linux 4.4.0 - btrfs is mainly used to host lots of test containers,
often snapshots, and at times, there is heavy IO in many of them for
extended periods of time. btrfs is on HDDs.


Every few days I'm getting "no space left" in a container running mongo
3.2.1 database. Interestingly, haven't seen this issue in containers
with MySQL. All databases have chattr +C set on their directories.


Hello,

Do you snapshot the parent subvolume which holds the databases? Can you
correlate that perhaps ENOSPC occurs at the time of snapshotting?


Not sure.

With the last error, a snapshot was made at around 06:06, while "no 
space left" was reported at 06:14. Suspiciously close to each other, but 
still a few minutes apart.


Unfortunately I don't have error log for previous cases.



If yes, then
you should try the patch https://patchwork.kernel.org/patch/7967161/

(Too bad this was not included into 4.4.1.)


I'll keep an eye on it, thanks.


Tomasz Chmielewski
http://www.ptraveler.com



compression disk space saving - what are your results?

2015-12-02 Thread Tomasz Chmielewski

What are your disk space savings when using btrfs with compression?

I have a 200 GB btrfs filesystem which uses compress=zlib and only stores 
text files (logs), mostly multi-gigabyte files.



It's a "single" filesystem, so "df" output matches "btrfs fi df":

# df -h
Filesystem  Size  Used Avail Use% Mounted on
(...)
/dev/xvdb   200G  124G   76G  62% /var/log/remote


# du -sh /var/log/remote/
153G/var/log/remote/


From these numbers (124 GB used where the data size is 153 GB), it appears 
that we save around 20% with zlib compression enabled.
Is 20% a reasonable saving for zlib? Typically text compresses much better 
with that algorithm, although I understand that there are several 
limitations when applying it at the filesystem level.
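
The saving above follows directly from the df/du numbers - a quick 
sanity check:

# fraction of space saved = 1 - (on-disk usage / logical data size)
echo 'scale=2; 1 - 124/153' | bc
# -> .19, i.e. roughly 20% saved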



Tomasz Chmielewski
http://wpkg.org



Re: compression disk space saving - what are your results?

2015-12-02 Thread Tomasz Chmielewski

On 2015-12-02 22:03, Austin S Hemmelgarn wrote:

From these numbers (124 GB used where the data size is 153 GB), it appears
that we save around 20% with zlib compression enabled.
Is 20% a reasonable saving for zlib? Typically text compresses much better
with that algorithm, although I understand that there are several
limitations when applying it at the filesystem level.


This is actually an excellent question.  A couple of things to note
before I share what I've seen:
1. Text compresses better with any compression algorithm.  It is by
nature highly patterned and moderately redundant data, which is what
benefits the most from compression.


It looks like compress=zlib does not compress very well. Following 
Duncan's suggestion, I've changed it to compress-force=zlib, and 
re-copied the data to make sure the files are compressed.


Compression ratio is much much better now (on a slightly changed data 
set):


# df -h
/dev/xvdb   200G   24G  176G  12% /var/log/remote


# du -sh /var/log/remote/
138G/var/log/remote/


So, 138 GB of files use just 24 GB on disk - nice!

However, I would still expect compress=zlib to have almost the same 
effect as compress-force=zlib for 100% text files/logs.
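
For anyone wanting to recompress in place rather than re-copying, 
something along these lines should also work (a sketch - note that 
defragmenting rewrites the data and breaks sharing with existing 
snapshots):

# switch the mount option
mount -o remount,compress-force=zlib /var/log/remote

# recompress existing files by rewriting them with zlib
btrfs filesystem defragment -r -czlib /var/log/remote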



Tomasz Chmielewski
http://wpkg.org



Re: compression disk space saving - what are your results?

2015-12-02 Thread Tomasz Chmielewski

On 2015-12-02 23:03, Wang Shilong wrote:

Compression ratio is much much better now (on a slightly changed data 
set):


# df -h
/dev/xvdb   200G   24G  176G  12% /var/log/remote


# du -sh /var/log/remote/
138G/var/log/remote/


So, 138 GB files use just 24 GB on disk - nice!

However, I would still expect compress=zlib to have almost the same
effect as compress-force=zlib for 100% text files/logs.


btw, what is your kernel version? there was a bug that detected inode
compression
ration wrong.

http://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?id=68bb462d42a963169bf7acbe106aae08c17129a5
http://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?id=4bcbb33255131adbe481c0467df26d654ce3bc78


Linux 4.3.0.


Tomasz Chmielewski
http://wpkg.org/



Re: kernel crashes with btrfs and busy database IO - how to debug?

2015-06-15 Thread Tomasz Chmielewski

Hello,

Do you want me to produce one more crash / hang? I had to restart the 
hung server (via echo b > /proc/sysrq-trigger).



Tomasz


On 2015-06-15 17:10, Qu Wenruo wrote:

Now we can get the full backtrace.
That's a step forward

[45705.854778] BUG: unable to handle kernel NULL pointer dereference
at 0008
[45705.854824] IP: [c0158b8e]
btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]
[45705.855615] Call Trace:
[45705.855637]  [c015addb]
btrfs_commit_transaction+0x40b/0xb60 [btrfs]
[45705.855671]  [810c0700] ? 
prepare_to_wait_event+0x100/0x100
[45705.855698]  [c0171973] btrfs_sync_file+0x313/0x380 
[btrfs]

[45705.855721]  [81236c26] vfs_fsync_range+0x46/0xc0
[45705.855740]  [81236cbc] vfs_fsync+0x1c/0x20
[45705.855758]  [81236cf8] do_fsync+0x38/0x70
[45705.855777]  [812370d0] SyS_fsync+0x10/0x20
[45705.855796]  [8180cbb2] system_call_fastpath+0x16/0x75

Also the hang seems to be highly related to the bug,
would you please send a new mail reporting the hang?

Thanks,
Qu

On 2015-06-14 15:58, Tomasz Chmielewski wrote:

On 2015-06-14 09:30, Tomasz Chmielewski wrote:

On 2015-06-13 08:23, Tomasz Chmielewski wrote:

I did get it from /var/crash/ though - is it more useful? I don't have
vmlinux for this kernel though, but have just built 4.1-rc7 with the
same config, can try to get the crash there.


I've uploaded a crash dump and vmlinux here:

http://www.virtall.com/files/temp/201506132321/

Let me know if it's anything useful or if you need more info.


I've tried running the same procedure to get one more crash, but it
didn't crash this time.

Instead, btrfs hangs on any writes - any processes trying to write
get into D state and never return; there is no write activity when
checking for example with iostat. The sync command does not return.

Reads from this btrfs filesystem are OK.

I've uploaded the output of echo w > /proc/sysrq-trigger here:

http://www.virtall.com/files/temp/dmesg.txt


Tomasz Chmielewski


Hello,

sorry, not sure what you mean.

Do you want me to produce one more crash or hang?



Re: kernel crashes with btrfs and busy database IO - how to debug?

2015-06-14 Thread Tomasz Chmielewski

On 2015-06-14 09:30, Tomasz Chmielewski wrote:

On 2015-06-13 08:23, Tomasz Chmielewski wrote:


I did get it from /var/crash/ though - is it more useful? I don't have
vmlinux for this kernel though, but have just built 4.1-rc7 with the
same config, can try to get the crash there.


I've uploaded a crash dump and vmlinux here:

http://www.virtall.com/files/temp/201506132321/

Let me know if it's anything useful or if you need more info.


I've tried running the same procedure to get one more crash, but it 
didn't crash this time.


Instead, btrfs hangs on any writes - any processes trying to write 
get into D state and never return; there is no write activity when 
checking for example with iostat. The sync command does not return.


Reads from this btrfs filesystem are OK.

I've uploaded the output of echo w > /proc/sysrq-trigger here:

http://www.virtall.com/files/temp/dmesg.txt


Tomasz Chmielewski



Re: kernel crashes with btrfs and busy database IO - how to debug?

2015-06-13 Thread Tomasz Chmielewski

On 2015-06-13 08:23, Tomasz Chmielewski wrote:


I did get it from /var/crash/ though - is it more useful? I don't have
vmlinux for this kernel though, but have just built 4.1-rc7 with the
same config, can try to get the crash there.


I've uploaded a crash dump and vmlinux here:

http://www.virtall.com/files/temp/201506132321/

Let me know if it's anything useful or if you need more info.


--
Tomasz Chmielewski




Re: kernel crashes with btrfs and busy database IO - how to debug?

2015-06-12 Thread Tomasz Chmielewski

On 2015-06-12 16:13, Qu Wenruo wrote:


Remote syslog does not capture anything.

No backtrace?


No (nothing saved on disk, don't have VNC access).

The only way to capture anything is:

while true; do dmesg -c ; done

but that's usually incomplete.



Without backtrace, it's much harder to debug for us.
It's quite possible that some codes go mad and pass a NULL pointer,
and then wait_event() is called on the NULL-some_member.

Anyway, backtrace is needed to debug this.

If syslog can't help, what about kdump + crash to get the backtrace?


I'll try to get a kdump + crash.
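
For the record, a minimal kdump setup on Ubuntu looks roughly like this 
(a sketch - package names and paths differ between distributions, and 
the debug vmlinux has to match the running kernel):

# install and enable kdump, then reboot so the crash kernel is reserved
apt-get install linux-crashdump
reboot

# confirm the crash kernel is reserved and kdump is armed
grep crashkernel /proc/cmdline
kdump-config show

# after a crash, inspect the dump from /var/crash with the crash utility
crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/<date>/dump.<timestamp>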

--
Tomasz Chmielewski
http://wpkg.org



Re: kernel crashes with btrfs and busy database IO - how to debug?

2015-06-12 Thread Tomasz Chmielewski
[45705.855326] RAX:  RBX: 88000e1e0078 RCX: 
322e
[45705.855347] RDX:  RSI: 322e RDI: 
8808068aa838
[45705.855368] RBP: 8800a0623d88 R08:  R09: 

[45705.855389] R10: 0001 R11:  R12: 
880806d67800
[45705.855410] R13: 8808068aa838 R14: 88000e1e R15: 
8800b3c5fc20
[45705.855431] FS:  7f6fc5f37700() GS:88082fa4() 
knlGS:

[45705.855463] CS:  0010 DS:  ES:  CR0: 80050033
[45705.855482] CR2: 0008 CR3: 00046293a000 CR4: 
000407e0

[45705.855503] Stack:
[45705.855516]  8800a0623d48 8800b3c5fcd0 8808094fe800 
880806d67800
[45705.855549]  88080a70ec28 a0623db0 0283 
88080a6f1c60
[45705.855582]  880806d67800 88080a6f1c60 880806d67800 


[45705.855615] Call Trace:
[45705.855637]  [c015addb] 
btrfs_commit_transaction+0x40b/0xb60 [btrfs]

[45705.855671]  [810c0700] ? prepare_to_wait_event+0x100/0x100
[45705.855698]  [c0171973] btrfs_sync_file+0x313/0x380 [btrfs]
[45705.855721]  [81236c26] vfs_fsync_range+0x46/0xc0
[45705.855740]  [81236cbc] vfs_fsync+0x1c/0x20
[45705.855758]  [81236cf8] do_fsync+0x38/0x70
[45705.855777]  [812370d0] SyS_fsync+0x10/0x20
[45705.855796]  [8180cbb2] system_call_fastpath+0x16/0x75
[45705.855815] Code: 45 98 48 39 d8 0f 84 ad 00 00 00 48 8d 45 a8 48 83 
c0 18 48 89 45 90 66 0f 1f 44 00 00 48 8b 13 48 8b 43 08 4c 89 ef 4c 8d 
73 88 48 89 42 08 48 89 10 48 89 1b 48 89 5b 08 e8 4f 3b 6b c1 e8 3a
[45705.855906] RIP  [c0158b8e] 
btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]

[45705.855944]  RSP 8800a0623d18
[45705.855959] CR2: 0008


--
Tomasz Chmielewski
http://wpkg.org




kernel crashes with btrfs and busy database IO - how to debug?

2015-06-11 Thread Tomasz Chmielewski
I have a server where I've installed a couple of LXC guests, btrfs - so 
easy to test things with snapshots. Or so it seems.


Unfortunately the box crashes when I put too much IO load - with too 
much load being these two running at the same time:


- quite busy MySQL database (doing up to 100% IO wait when running 
alone)

- busy mongo database (doing up to 100% IO wait when running alone)

With both mongo and mysql running at the same time, it crashes after 1-2 
days (tried kernels 4.0.4, 4.0.5, 4.1-rc7 from Ubuntu kernel-ppa). It 
does not crash if I only run mongo, or only mysql. There is plenty of 
memory available (just around 2-4 GB used out of 32 GB) when it crashes.


As the box is only reachable remotely, I'm not able to catch a crash.
Sometimes, I'm able to get a bit of it printed via remote SSH, like 
here:


[162276.341030] BUG: unable to handle kernel NULL pointer dereference at 
0008
[162276.341069] IP: [810c06cd] 
prepare_to_wait_event+0xcd/0x100

[162276.341096] PGD 80a15e067 PUD 6e08c2067 PMD 0
[162276.341116] Oops: 0002 [#1] SMP
[162276.341133] Modules linked in: xfs libcrc32c xt_conntrack veth 
xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc 
intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp 
kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
aesni_intel aes_x86_64 lrw eeepc_wmi gf128mul asus_wmi glue_helper 
sparse_keymap ablk_helper cryptd ie31200_edac shpchp lpc_ich edac_core 
mac_hid 8250_fintek tpm_infineon wmi serio_raw video lp parport btrfs 
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq e1000e raid1 raid0 ahci ptp libahci multipath 
pps_core linear [last unloaded: xfs]
[162276.341394] CPU: 6 PID: 12853 Comm: mysqld Not tainted 
4.1.0-040100rc7-generic #201506080035
[162276.341428] Hardware name: System manufacturer System Product 
Name/P8B WS, BIOS 0904 10/24/2011
[162276.341463] task: 8800730d8a10 ti: 88047a0f8000 task.ti: 
88047a0f8000
[162276.341495] RIP: 0010:[810c06cd]  [810c06cd] 
prepare_to_wait_event+0xcd/0x100

[162276.341532] RSP: 0018:88047a0fbcd8  EFLAGS: 00010046
[162276.341583] RDX: 88047a0fbd48 RSI: 8800730d8a10 RDI: 
8801e2f96ee8
[162276.341615] RBP: 88047a0fbd08 R08:  R09: 
0001
[162276.341646] R10: 0001 R11:  R12: 
8801e2f96ee8
[162276.341678] R13: 0002 R14: 8801e2f96e60 R15: 
8806b513f248
[162276.341709] FS:  7f9f2bbd3700() GS:88082fb8() 
knlGS:


Remote syslog does not capture anything.

The above crash does not point at btrfs - although the box does not 
crash with the same tests done on ext4. The box passes memtests and is 
generally stable otherwise.


How can I debug this further?


prepare_to_wait_event can be found here in 4.1-rc7 kernel:

include/linux/wait.h:   long __int = prepare_to_wait_event(wq, 
__wait, state);\
include/linux/wait.h:long prepare_to_wait_event(wait_queue_head_t *q, 
wait_queue_t *wait, int state);
kernel/sched/wait.c:long prepare_to_wait_event(wait_queue_head_t *q, 
wait_queue_t *wait, int state)

kernel/sched/wait.c:EXPORT_SYMBOL(prepare_to_wait_event);



--
Tomasz Chmielewski
http://wpkg.org



4.1-rc6 - kernel crash after doing chattr +C

2015-06-06 Thread Tomasz Chmielewski

4.1-rc6, busy filesystem.

I was running mongo import which made quite a lot of IO.
During the import, I did chattr +C /var/lib/mongodb - shortly after, I 
saw this in dmesg and the server died:


[57860.149839] BUG: unable to handle kernel NULL pointer dereference at 
0008
[57860.149877] IP: [c0158b8e] 
btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]

[57860.149923] PGD 5d1ac6067 PUD 5d40fc067 PMD 0
[57860.149943] Oops: 0002 [#1] SMP
[57860.149960] Modules linked in: xt_conntrack veth xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc intel_rapl 
iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm 
crct10dif_pclmul eeepc_wmi asus_wmi crc32_pclmul ghash_clmulni_intel 
sparse_keymap aesni_intel aes_x86_64 ie31200_edac lpc_ich lrw gf128mul 
edac_core glue_helper ablk_helper shpchp cryptd serio_raw wmi video 
tpm_infineon 8250_fintek mac_hid btrfs lp parport raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
e1000e raid1 ahci raid0 ptp libahci pps_core multipath linear
[57860.150203] CPU: 4 PID: 14111 Comm: mongod Not tainted 
4.1.0-040100rc6-generic #201506010235
[57860.150237] Hardware name: System manufacturer System Product 
Name/P8B WS, BIOS 0904 10/24/2011
[57860.150271] task: 88007901bc60 ti: 8805d5c38000 task.ti: 
8805d5c38000
[57860.150303] RIP: 0010:[c0158b8e]  [c0158b8e] 
btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]

[57860.150346] RSP: 0018:8805d5c3bd18  EFLAGS: 00010206
[57860.150364] RAX:  RBX: 880103c9d950 RCX: 
3d44
[57860.150386] RDX:  RSI: 3d44 RDI: 
880806a74838
[57860.150407] RBP: 8805d5c3bd88 R08:  R09: 

[57860.150428] R10: 0001 R11:  R12: 
880806bcb800
[57860.150450] R13: 880806a74838 R14: 880103c9d8d8 R15: 
88080a7e3518
[57860.150471] FS:  7f5f4e6dc700() GS:88082fb0() 
knlGS:

[57860.150504] CS:  0010 DS:  ES:  CR0: 80050033
[57860.150523] CR2: 0008 CR3: 00062a584000 CR4: 
000407e0

[57860.150544] Stack:
[57860.150558]  8805d5c3bd48 88080a7e35c8 880806bcb000 
880806bcb800
[57860.150592]  8800070da638 d5c3bdb0 0287 
88080a72a4d0
[57860.150626]  880806bcb800 88080a72a4d0 880806bcb800 


[57860.150659] Call Trace:
[57860.150682]  [c015addb] 
btrfs_commit_transaction+0x40b/0xb60 [btrfs]

[57860.150717]  [810c0700] ? prepare_to_wait_event+0x100/0x100
[57860.150745]  [c0171973] btrfs_sync_file+0x313/0x380 [btrfs]
[57860.150768]  [81236bf6] vfs_fsync_range+0x46/0xc0
[57860.150788]  [81236c8c] vfs_fsync+0x1c/0x20
[57860.150806]  [81236cc8] do_fsync+0x38/0x70
[57860.150825]  [812370c3] SyS_fdatasync+0x13/0x20
[57860.150846]  [8180cb32] system_call_fastpath+0x16/0x75
[57860.150866] Code: 45 98 48 39 d8 0f 84 ad 00 00 00 48 8d 45 a8 48 83 
c0 18 48 89 45 90 66 0f 1f 44 00 00 48 8b 13 48 8b 43 08 4c 89 ef 4c 8d 
73 88 48 89 42 08 48 89 10 48 89 1b 48 89 5b 08 e8 bf 3a 6b c1 e8 aa
[57860.150959] RIP  [c0158b8e] 
btrfs_wait_pending_ordered+0x5e/0x110 [btrfs]

[57860.150998]  RSP 8805d5c3bd18
[57860.151014] CR2: 0008
[57860.151186] ---[ end trace f41cd52aa31494ac ]---


--
Tomasz Chmielewski
http://wpkg.org



3.19.3: fs/btrfs/super.c:260 __btrfs_abort_transaction+0x4c/0x10e [btrfs]()

2015-04-03 Thread Tomasz Chmielewski

Got this one for some reason; this fs did not have any issues before:


[64343.060291] perf interrupt took too long (2544  2500), lowering 
kernel.perf_event_max_sample_rate to 5

[67344.630449] [ cut here ]
[67344.630509] WARNING: CPU: 3 PID: 21885 at fs/btrfs/super.c:260 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]()

[67344.630626] BTRFS: Transaction aborted (error -22)
[67344.630627] Modules linked in: xt_nat xt_tcpudp ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables tun 
cpufreq_conservative cpufreq_powersave cpufreq_stats ipv6 btrfs xor 
raid6_pq zlib_deflate coretemp hwmon loop parport_pc parport 8250_fintek 
tpm_infineon tpm_tis tpm pcspkr i2c_i801 i2c_core battery video ehci_pci 
ehci_hcd lpc_ich mfd_core button acpi_cpufreq ext4 crc16 jbd2 mbcache 
raid1 sg sd_mod ahci libahci r8169 mii libata scsi_mod

[67344.631027] CPU: 3 PID: 21885 Comm: btrfs Not tainted 3.19.3 #1
[67344.631074] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[67344.631164]  0009 8800ab38b928 813bd016 
88081facd801
[67344.631253]  8800ab38b978 8800ab38b968 8103b025 

[67344.631342]  a029235e ffea 880042a59800 
8807f1e7dd10

[67344.631431] Call Trace:
[67344.631477]  [813bd016] dump_stack+0x45/0x57
[67344.631525]  [8103b025] warn_slowpath_common+0x97/0xb1
[67344.631575]  [a029235e] ? 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]

[67344.631663]  [8103b0d3] warn_slowpath_fmt+0x41/0x43
[67344.631713]  [a029235e] 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]
[67344.632914]  [a02b658c] create_pending_snapshot+0x6a3/0x6e4 
[btrfs]
[67344.632967]  [a029f93b] ? 
block_rsv_release_bytes.isra.54+0xb0/0xc1 [btrfs]
[67344.633060]  [a02b6633] create_pending_snapshots+0x66/0x89 
[btrfs]
[67344.633114]  [a02b78c4] 
btrfs_commit_transaction+0x44f/0x9bb [btrfs]
[67344.633208]  [a02e03f4] btrfs_mksubvol.isra.61+0x2eb/0x418 
[btrfs]

[67344.633257]  [81065b9d] ? add_wait_queue+0x44/0x44
[67344.633310]  [a02e0670] 
btrfs_ioctl_snap_create_transid+0x14f/0x180 [btrfs]
[67344.633404]  [a02e07d4] 
btrfs_ioctl_snap_create_v2+0xc7/0x112 [btrfs]

[67344.633497]  [a02e43a3] btrfs_ioctl+0x6cc/0x22f5 [btrfs]
[67344.633545]  [810d8bc5] ? handle_mm_fault+0x452/0x9fc
[67344.633594]  [8117870f] ? avc_has_perm+0x2e/0xf7
[67344.633641]  [89cf] do_vfs_ioctl+0x418/0x460
[67344.633688]  [8a65] SyS_ioctl+0x4e/0x7d
[67344.633735]  [81031c5b] ? do_page_fault+0xc/0x11
[67344.633782]  [813c19f2] system_call_fastpath+0x12/0x17
[67344.633829] ---[ end trace ea2b25b59548e150 ]---
[67344.633875] BTRFS: error (device sdb5) in 
create_pending_snapshot:1392: errno=-22 unknown

[67344.633964] BTRFS info (device sdb5): forced readonly
[67344.634010] BTRFS warning (device sdb5): Skipping commit of aborted 
transaction.
[67344.634098] BTRFS: error (device sdb5) in cleanup_transaction:1670: 
errno=-22 unknown

[67344.634186] BTRFS info (device sdb5): delayed_refs has NO entry



--
Tomasz Chmielewski
http://www.sslrack.com



BTRFS: unable to add free space :-17

2015-03-23 Thread Tomasz Chmielewski

Got this with 4.0.0-rc5 when doing a degraded mount:

Mar 23 13:09:22 server1 kernel: [  665.197957] BTRFS info (device sdb4): 
allowing degraded mounts
Mar 23 13:09:22 server1 kernel: [  665.198030] BTRFS info (device sdb4): 
disk space caching is enabled
Mar 23 13:09:22 server1 kernel: [  665.213163] BTRFS warning (device 
sdb4): devid 2 missing
Mar 23 13:09:22 server1 kernel: [  665.260077] BTRFS: bdev (null) errs: 
wr 1, rd 1, flush 0, corrupt 0, gen 0
Mar 23 13:10:01 server1 kernel: [  704.310874] [ cut here 
]
Mar 23 13:10:01 server1 kernel: [  704.310936] WARNING: CPU: 1 PID: 4706 
at fs/btrfs/free-space-cache.c:1349 tree_insert_offset+0x7d/0xc3 
[btrfs]()
Mar 23 13:10:01 server1 kernel: [  704.310989] Modules linked in: ipv6 
cpufreq_stats cpufreq_powersave cpufreq_conservative btrfs xor raid6_pq 
zlib_deflate ext3 jbd loop 8250_fintek i2c_i801 parport_pc tpm_infineon 
tpm_tis tpm lpc_ich ehci_pci ehci_hcd mfd_core i2c_core parport pcspkr 
acpi_cpufreq button video ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci 
libahci libata scsi_mod r8169 mii
Mar 23 13:10:01 server1 kernel: [  704.312632] CPU: 1 PID: 4706 Comm: 
btrfs-transacti Not tainted 4.0.0-rc5 #1
Mar 23 13:10:01 server1 kernel: [  704.312680] Hardware name: System 
manufacturer System Product Name/P8H67-M PRO, BIOS 1106 10/17/2011
Mar 23 13:10:01 server1 kernel: [  704.312732]  0009 
880819917c18 813c2f57 88083fa4d801
Mar 23 13:10:01 server1 kernel: [  704.312928]   
880819917c58 8103b031 880036eb9540
Mar 23 13:10:01 server1 kernel: [  704.313124]  a03126f6 
ffef 880036eb9540 01dd651b4000

Mar 23 13:10:01 server1 kernel: [  704.313321] Call Trace:
Mar 23 13:10:01 server1 kernel: [  704.313394]  [813c2f57] 
dump_stack+0x45/0x57
Mar 23 13:10:01 server1 kernel: [  704.313469]  [8103b031] 
warn_slowpath_common+0x97/0xb1
Mar 23 13:10:01 server1 kernel: [  704.313551]  [a03126f6] ? 
tree_insert_offset+0x7d/0xc3 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.313627]  [8103b060] 
warn_slowpath_null+0x15/0x17
Mar 23 13:10:01 server1 kernel: [  704.313707]  [a03126f6] 
tree_insert_offset+0x7d/0xc3 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.313788]  [a0313142] 
link_free_space+0x27/0x3c [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.313868]  [a03143cb] 
__btrfs_add_free_space+0x354/0x39d [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.313952]  [a02f2864] ? 
free_extent_state.part.29+0x34/0x39 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314062]  [a02f2864] ? 
free_extent_state.part.29+0x34/0x39 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314170]  [a02c16b4] ? 
block_group_cache_tree_search+0x8e/0xbd [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314279]  [a02c5518] 
unpin_extent_range.isra.78+0xa6/0x199 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314387]  [a02c9aa2] 
btrfs_finish_extent_commit+0xcb/0xea [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314497]  [a02dbc57] 
btrfs_commit_transaction+0x850/0x9e1 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314606]  [a02d9bf0] 
transaction_kthread+0xef/0x1c3 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314687]  [a02d9b01] ? 
open_ctree+0x1d47/0x1d47 [btrfs]
Mar 23 13:10:01 server1 kernel: [  704.314763]  [810501a3] 
kthread+0xcd/0xd5
Mar 23 13:10:01 server1 kernel: [  704.314836]  [810500d6] ? 
kthread_freezable_should_stop+0x43/0x43
Mar 23 13:10:01 server1 kernel: [  704.314913]  [813c7848] 
ret_from_fork+0x58/0x90
Mar 23 13:10:01 server1 kernel: [  704.314987]  [810500d6] ? 
kthread_freezable_should_stop+0x43/0x43
Mar 23 13:10:01 server1 kernel: [  704.315063] ---[ end trace 
695f505b58a81c8d ]---
Mar 23 13:10:01 server1 kernel: [  704.315135] BTRFS: unable to add free 
space :-17
Mar 23 13:10:33 server1 kernel: [  736.510895] BTRFS: unable to add free 
space :-17
Mar 23 13:11:03 server1 kernel: [  765.780069] BTRFS: unable to add free 
space :-17
Mar 23 13:11:13 server1 kernel: [  776.030671] BTRFS: unable to add free 
space :-17
Mar 23 13:12:39 server1 kernel: [  861.791031] BTRFS: unable to add free 
space :-17
Mar 23 13:13:29 server1 kernel: [  911.761852] BTRFS: unable to add free 
space :-17
Mar 23 13:13:53 server1 kernel: [  936.124674] BTRFS: unable to add free 
space :-17




--
Tomasz Chmielewski
http://www.sslrack.com



Re: BTRFS: unable to add free space :-17

2015-03-23 Thread Tomasz Chmielewski

On 2015-03-23 22:48, Chris Mason wrote:

On Mon, Mar 23, 2015 at 8:35 AM, Chris Mason c...@fb.com wrote:



On Mon, Mar 23, 2015 at 8:19 AM, Tomasz Chmielewski t...@virtall.com 
wrote:

Got this with 4.0.0-rc5 when doing a degraded mount:

Do you get this every time, even after going back to rc4?


Actually, I didn't try yet (going back to 4.0.0-rc4).

Shortly after, it errored with:

[ 4450.519046] [ cut here ]
[ 4450.519066] WARNING: CPU: 4 PID: 4734 at fs/btrfs/super.c:260 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]()

[ 4450.519081] BTRFS: Transaction aborted (error -17)
[ 4450.519082] Modules linked in: ipv6 cpufreq_stats cpufreq_powersave 
cpufreq_conservative btrfs xor raid6_pq zlib_deflate ext3 jbd loop 
tpm_infineon tpm_tis i2c_i801 parport_pc parport 8250_fintek lpc_ich 
pcspkr tpm video button acpi_cpufreq i2c_core mfd_core ehci_pci ehci_hcd 
ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod 
r8169 mii
[ 4450.519157] CPU: 4 PID: 4734 Comm: kworker/u16:5 Not tainted 
4.0.0-rc5 #1
[ 4450.519167] Hardware name: System manufacturer System Product 
Name/P8H67-M PRO, BIOS 1106 10/17/2011
[ 4450.519190] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper 
[btrfs]
[ 4450.519201]  0009 88074e0a7c58 813c2f57 

[ 4450.519215]  88074e0a7ca8 88074e0a7c98 8103b031 
8807b2638e68
[ 4450.519255]  a02b6356 ffef 88081bb7e000 
880701ba3210

[ 4450.519323] Call Trace:
[ 4450.519357]  [813c2f57] dump_stack+0x45/0x57
[ 4450.519394]  [8103b031] warn_slowpath_common+0x97/0xb1
[ 4450.519434]  [a02b6356] ? 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]

[ 4450.519501]  [8103b0df] warn_slowpath_fmt+0x41/0x43
[ 4450.519540]  [a02b6356] 
__btrfs_abort_transaction+0x4c/0x10e [btrfs]
[ 4450.519610]  [a02cd6d7] btrfs_run_delayed_refs+0x90/0x21b 
[btrfs]
[ 4450.519653]  [a02cda60] delayed_ref_async_start+0x37/0x76 
[btrfs]
[ 4450.519699]  [a0302c67] normal_work_helper+0xb5/0x16a 
[btrfs]
[ 4450.519742]  [a0302e28] btrfs_extent_refs_helper+0xd/0xf 
[btrfs]

[ 4450.519782]  [8104c16d] process_one_work+0x187/0x2b2
[ 4450.519819]  [8104c503] worker_thread+0x241/0x33e
[ 4450.519856]  [8104c2c2] ? process_scheduled_works+0x2a/0x2a
[ 4450.519894]  [810501a3] kthread+0xcd/0xd5
[ 4450.519930]  [810500d6] ? 
kthread_freezable_should_stop+0x43/0x43

[ 4450.519969]  [813c7848] ret_from_fork+0x58/0x90
[ 4450.520005]  [810500d6] ? 
kthread_freezable_should_stop+0x43/0x43

[ 4450.520044] ---[ end trace 1aff120928ea0fd5 ]---
[ 4450.520079] BTRFS: error (device sdb4) in 
btrfs_run_delayed_refs:2790: errno=-17 Object already exists

[ 4450.520148] BTRFS info (device sdb4): forced readonly
[ 4450.641829] BTRFS: error (device sdb4) in 
btrfs_run_delayed_refs:2790: errno=-17 Object already exists




After rebooting to 4.0.0-rc5, I didn't see "unable to add free space :-17" 
anymore. However, the filesystem was forced readonly again some time later.


I'll try going back to 4.0.0-rc4 now.

--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.19 - unable to replace a failed drive - 100% CPU usage in kworker and btrfs-transacti

2015-02-16 Thread Tomasz Chmielewski

On 2015-02-16 19:40, Liu Bo wrote:


  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEMTIME+  COMMAND
 6269 root  20   0 000 R  92.5  0.0   2769:33
btrfs-transacti
22247 root  20   0 000 R  92.5  0.0  42:38.65
kworker/u16:16


Can you cat /proc/22247/stack and /proc/6269/stack?


22247 no longer exists - there is a new kworker now:

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEMTIME+  COMMAND
23570 root  20   0 000 R  93.8  0.0 108:46.78 
kworker/u16:15
 6269 root  20   0 000 R  93.5  0.0   3029:41 
btrfs-transacti


# cat /proc/22247/stack
cat: /proc/22247/stack: No such file or directory

# cat /proc/6269/stack
[a02a0b0d] transaction_kthread+0x197/0x1c2 [btrfs]
[81050067] kthread+0xcd/0xd5
[813c12ac] ret_from_fork+0x7c/0xb0
[] 0x

# cat /proc/23570/stack
[] 0x


In case a new sysrq-w was needed, too:

[428113.007373] SysRq : Show Blocked State
[428113.007427]   taskPC stack   pid father
[428113.007493] btrfs   D 8802d67ef948 0  7611   6275 
0x
[428113.007549]  8802d67ef948 8803fe0409d8 88081917e040 
000113c0
[428113.007648]  4000 88081be31810 88081917e040 
8802d67ef8b8
[428113.007748]  8105f3c4 88083fa8 88083fa513c0 
88083fa513c0

[428113.007847] Call Trace:
[428113.007900]  [8105f3c4] ? enqueue_task_fair+0x3e5/0x44f
[428113.007955]  [81054ab6] ? resched_curr+0x45/0x55
[428113.008008]  [81055122] ? check_preempt_curr+0x3e/0x6d
[428113.008062]  [81055163] ? ttwu_do_wakeup+0x12/0x7f
[428113.008115]  [8105526e] ? 
ttwu_do_activate.constprop.73+0x57/0x5c

[428113.008172]  [813be618] schedule+0x65/0x67
[428113.008224]  [813c03b8] schedule_timeout+0x26/0x18d
[428113.008277]  [81057d1e] ? wake_up_process+0x30/0x34
[428113.008331]  [8104b1a3] ? wake_up_worker+0x1f/0x21
[428113.008383]  [8104b3ea] ? insert_work+0x87/0x94
[428113.008447]  [a02e183d] ? free_block_list+0x1f/0x34 
[btrfs]

[428113.008501]  [813bef76] wait_for_common+0x10d/0x13e
[428113.008554]  [81057cdf] ? try_to_wake_up+0x250/0x250
[428113.008608]  [813befbf] wait_for_completion+0x18/0x1a
[428113.008666]  [a028dedc] 
btrfs_async_run_delayed_refs+0xc1/0xe4 [btrfs]
[428113.008770]  [a02a3189] 
__btrfs_end_transaction+0x315/0x33b [btrfs]
[428113.008872]  [a02a31bd] 
btrfs_end_transaction_throttle+0xe/0x10 [btrfs]
[428113.008977]  [a02e5cf1] relocate_block_group+0x2ad/0x4de 
[btrfs]
[428113.009037]  [a02e607a] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[428113.009141]  [a02c37e1] 
btrfs_relocate_chunk.isra.69+0x35/0xa5 [btrfs]
[428113.009244]  [a02c41da] btrfs_shrink_device+0x235/0x408 
[btrfs]
[428113.009304]  [a02c6ba9] btrfs_rm_device+0x2a9/0x704 
[btrfs]
[428113.009359]  [810f5441] ? 
__kmalloc_track_caller+0x40/0x178

[428113.009419]  [a02cf7a7] btrfs_ioctl+0xa9c/0x22f5 [btrfs]
[428113.009473]  [8110f142] ? putname+0x23/0x2c
[428113.009525]  [8110f6d3] ? user_path_at_empty+0x60/0x90
[428113.009579]  [811783bf] ? avc_has_perm+0x2e/0xf7
[428113.009632]  [871b] do_vfs_ioctl+0x418/0x460
[428113.009684]  [81106745] ? vfs_stat+0x16/0x18
[428113.009736]  [87b1] SyS_ioctl+0x4e/0x7d
[428113.009789]  [81031c5b] ? do_page_fault+0xc/0x11
[428113.009841]  [813c1352] system_call_fastpath+0x12/0x17
[428113.009896] Sched Debug Version: v0.11, 3.19.0 #1
[428113.009947] ktime   : 
428338904.250198
[428113.010001] sched_clk   : 
428113009.895839
[428113.010054] cpu_clk : 
428113009.895853

[428113.010107] jiffies : 4337771187
[428113.010160] sched_clock_stable(): 1
[428113.010212]
[428113.010257] sysctl_sched
[428113.010305]   .sysctl_sched_latency: 24.00
[428113.010358]   .sysctl_sched_min_granularity: 3.00
[428113.010411]   .sysctl_sched_wakeup_granularity : 4.00
[428113.010463]   .sysctl_sched_child_runs_first   : 0
[428113.010515]   .sysctl_sched_features   : 11899
[428113.010568]   .sysctl_sched_tunable_scaling: 1 
(logaritmic)

[428113.010621]
[428113.010667] cpu#0, 3411.800 MHz
[428113.010715]   .nr_running: 0
[428113.010766]   .load  : 0
[428113.010817]   .nr_switches   : 141930054
[428113.010869]   .nr_load_updates   : 24939776
[428113.010920]   .nr_uninterruptible: -739558
[428113.010972]   .next_balance  : 4337.771095
[428113.011023]   .curr-pid : 0
[428113.011074]   

3.19 - unable to replace a failed drive - 100% CPU usage in kworker and btrfs-transacti

2015-02-15 Thread Tomasz Chmielewski
I had a failed drive in RAID-1, so it was replaced with a good one, 
followed by:


btrfs device add /dev/sdb4 /home
btrfs device delete missing /home
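
As a side note, kernels that support it can also do this in one step 
with btrfs replace, which is usually faster than add + delete - a 
sketch, assuming the new disk is /dev/sdb4 and the missing device had 
devid 2:

# one-step replacement of the missing device
btrfs replace start 2 /dev/sdb4 /home

# watch progress
btrfs replace status /home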


4 days later, it got to a state where there is no IO anymore (according 
to iostat), and "btrfs device delete missing" has not completed:


# uptime
 03:29:03 up 4 days, 14:38,  1 user,  load average: 2.36, 2.43, 2.54

# btrfs fi show
Label: none  uuid: 84d087aa-3a32-46da-844f-a233237cf04f
Total devices 3 FS bytes used 206.53GiB
devid3 size 1.71TiB used 214.03GiB path /dev/sda4
devid4 size 1.71TiB used 177.00GiB path /dev/sdb4
*** Some devices missing

Btrfs v3.18.2

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEMTIME+  COMMAND
 6269 root  20   0 000 R  92.5  0.0   2769:33 
btrfs-transacti
22247 root  20   0 000 R  92.5  0.0  42:38.65 
kworker/u16:16



# dmesg
[397948.321324] SysRq : Show Blocked State
[397948.321386]   taskPC stack   pid father
[397948.321465] btrfs   D 8802d67ef948 0  7611   6275 
0x
[397948.321521]  8802d67ef948 8804d2a9ba40 88081917e040 
000113c0
[397948.321621]  4000 88081be3 88081917e040 
8802d67ef8b8
[397948.321720]  8105f3c4 88083fa4 88083fa913c0 
88083fa913c0

[397948.321819] Call Trace:
[397948.321872]  [8105f3c4] ? enqueue_task_fair+0x3e5/0x44f
[397948.321927]  [81054ab6] ? resched_curr+0x45/0x55
[397948.321979]  [81055122] ? check_preempt_curr+0x3e/0x6d
[397948.322032]  [81055163] ? ttwu_do_wakeup+0x12/0x7f
[397948.322085]  [8105526e] ? 
ttwu_do_activate.constprop.73+0x57/0x5c

[397948.322141]  [813be618] schedule+0x65/0x67
[397948.322193]  [813c03b8] schedule_timeout+0x26/0x18d
[397948.322245]  [81057d1e] ? wake_up_process+0x30/0x34
[397948.322299]  [8104b1a3] ? wake_up_worker+0x1f/0x21
[397948.322352]  [8104b3ea] ? insert_work+0x87/0x94
[397948.322414]  [a02e183d] ? free_block_list+0x1f/0x34 
[btrfs]

[397948.322468]  [813bef76] wait_for_common+0x10d/0x13e
[397948.322521]  [81057cdf] ? try_to_wake_up+0x250/0x250
[397948.322574]  [813befbf] wait_for_completion+0x18/0x1a
[397948.322631]  [a028dedc] 
btrfs_async_run_delayed_refs+0xc1/0xe4 [btrfs]
[397948.322734]  [a02a3189] 
__btrfs_end_transaction+0x315/0x33b [btrfs]
[397948.322836]  [a02a31bd] 
btrfs_end_transaction_throttle+0xe/0x10 [btrfs]
[397948.322939]  [a02e5cf1] relocate_block_group+0x2ad/0x4de 
[btrfs]
[397948.323039]  [a02e607a] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[397948.323152]  [a02c37e1] 
btrfs_relocate_chunk.isra.69+0x35/0xa5 [btrfs]
[397948.323256]  [a02c41da] btrfs_shrink_device+0x235/0x408 
[btrfs]
[397948.323316]  [a02c6ba9] btrfs_rm_device+0x2a9/0x704 
[btrfs]
[397948.323371]  [810f5441] ? 
__kmalloc_track_caller+0x40/0x178

[397948.323431]  [a02cf7a7] btrfs_ioctl+0xa9c/0x22f5 [btrfs]
[397948.323485]  [8110f142] ? putname+0x23/0x2c
[397948.323537]  [8110f6d3] ? user_path_at_empty+0x60/0x90
[397948.323593]  [811783bf] ? avc_has_perm+0x2e/0xf7
[397948.323654]  [871b] do_vfs_ioctl+0x418/0x460
[397948.323706]  [81106745] ? vfs_stat+0x16/0x18
[397948.323758]  [87b1] SyS_ioctl+0x4e/0x7d
[397948.323809]  [81031c5b] ? do_page_fault+0xc/0x11
[397948.323862]  [813c1352] system_call_fastpath+0x12/0x17
[397948.323917] Sched Debug Version: v0.11, 3.19.0 #1
[397948.323968] ktime   : 
398158300.945440
[397948.324021] sched_clk   : 
397948323.916497
[397948.324074] cpu_clk : 
397948323.916513

[397948.324127] jiffies : 4334753127
[397948.324180] sched_clock_stable(): 1
[397948.324231]
[397948.324277] sysctl_sched
[397948.324325]   .sysctl_sched_latency: 24.00
[397948.324378]   .sysctl_sched_min_granularity: 3.00
[397948.324430]   .sysctl_sched_wakeup_granularity : 4.00
[397948.324483]   .sysctl_sched_child_runs_first   : 0
[397948.324534]   .sysctl_sched_features   : 11899
[397948.324586]   .sysctl_sched_tunable_scaling: 1 
(logaritmic)

[397948.324639]
[397948.324686] cpu#0, 3411.800 MHz
[397948.324734]   .nr_running: 0
[397948.324784]   .load  : 0
[397948.324834]   .nr_switches   : 137328510
[397948.324885]   .nr_load_updates   : 23262071
[397948.324936]   .nr_uninterruptible: -738004
[397948.324987]   .next_balance  : 4334.753128
[397948.325039]   .curr-pid : 0
[397948.325089]   .clock : 397948323.134939
[397948.325141]   .cpu_load[0]

Re: WARNING: CPU: 1 PID: 2436 at fs/btrfs/qgroup.c:1414 btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]()

2015-02-12 Thread Tomasz Chmielewski
]  [8104ff9a] ? 
kthread_freezable_should_stop+0x43/0x43

[197051.338628] ---[ end trace 5d57d07bb94831a1 ]---
[197051.340365] [ cut here ]
[197051.340450] WARNING: CPU: 0 PID: 26314 at fs/btrfs/qgroup.c:1414 
btrfs_delayed_qgroup_accounting+0x9f3/0xa0d [btrfs]()
[197051.340568] Modules linked in: xt_nat xt_tcpudp ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables tun 
cpufreq_conservative cpufreq_powersave cpufreq_stats ipv6 btrfs xor 
raid6_pq zlib_deflate coretemp hwmon loop pcspkr i2c_i801 i2c_core 
battery parport_pc parport tpm_infineon tpm_tis tpm video 8250_fintek 
lpc_ich mfd_core ehci_pci ehci_hcd button acpi_cpufreq ext4 crc16 jbd2 
mbcache raid1 sg sd_mod ahci libahci libata scsi_mod r8169 mii
[197051.341053] CPU: 0 PID: 26314 Comm: kworker/u16:5 Tainted: G
W  3.19.0 #1
[197051.341165] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[197051.341297] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper 
[btrfs]
[197051.341360]  0009 8802c511bc18 813bc952 

[197051.341475]   8802c511bc58 8103b015 
8807f9281c90
[197051.341590]  a0332a33 8807f9281000 8807f2745300 


[197051.341705] Call Trace:
[197051.341763]  [813bc952] dump_stack+0x45/0x57
[197051.341825]  [8103b015] warn_slowpath_common+0x97/0xb1
[197051.341901]  [a0332a33] ? 
btrfs_delayed_qgroup_accounting+0x9f3/0xa0d [btrfs]

[197051.342015]  [8103b044] warn_slowpath_null+0x15/0x17
[197051.342089]  [a0332a33] 
btrfs_delayed_qgroup_accounting+0x9f3/0xa0d [btrfs]
[197051.342214]  [a02d1449] btrfs_run_delayed_refs+0x1e4/0x21b 
[btrfs]
[197051.342287]  [a02d1a2d] delayed_ref_async_start+0x37/0x76 
[btrfs]
[197051.342365]  [a03069d7] normal_work_helper+0xb5/0x16a 
[btrfs]
[197051.342441]  [a0306b98] btrfs_extent_refs_helper+0xd/0xf 
[btrfs]

[197051.342506]  [8104c0cb] process_one_work+0x187/0x2a9
[197051.342568]  [8104c458] worker_thread+0x241/0x33e
[197051.342630]  [8104c217] ? 
process_scheduled_works+0x2a/0x2a

[197051.342694]  [81050067] kthread+0xcd/0xd5
[197051.342755]  [8104ff9a] ? 
kthread_freezable_should_stop+0x43/0x43

[197051.342819]  [813c12ac] ret_from_fork+0x7c/0xb0
[197051.342880]  [8104ff9a] ? 
kthread_freezable_should_stop+0x43/0x43

[197051.342944] ---[ end trace 5d57d07bb94831a2 ]---


Tomasz Chmielewski
http://www.sslrack.com

On 2015-01-04 07:58, Tomasz Chmielewski wrote:

Got this with 3.18.1 and qgroups enabled. Not sure how to reproduce.


[1262648.802286] [ cut here ]
[1262648.802350] WARNING: CPU: 1 PID: 2436 at fs/btrfs/qgroup.c:1414
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]()
[1262648.802441] Modules linked in: ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand
cpufreq_conservative cpufreq_powersave cpufreq_stats nfsd auth_rpcgss
oid_registry exportfs nfs_acl nfs lockd grace fscache sunrpc ipv6
btrfs xor raid6_pq zlib_deflate coretemp hwmon loop pcspkr i2c_i801
i2c_core battery parport_pc parport 8250_fintek tpm_infineon tpm_tis
tpm video ehci_pci ehci_hcd lpc_ich mfd_core acpi_cpufreq button ext4
crc16 jbd2 mbcache raid1 sg sd_mod r8169 mii ahci libahci libata
scsi_mod
[1262648.802854] CPU: 1 PID: 2436 Comm: btrfs-cleaner Not tainted 
3.18.1 #1

[1262648.802902] Hardware name: System manufacturer System Product
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[1262648.802992]  0009 8800c805fbe8 813b1128

[1262648.803081]   8800c805fc28 81039b39
8807f341c000
[1262648.803170]  a0300be2 8807f341c000 8807f3fd8b40

[1262648.803259] Call Trace:
[1262648.803305]  [813b1128] dump_stack+0x46/0x58
[1262648.803352]  [81039b39] warn_slowpath_common+0x77/0x91
[1262648.803406]  [a0300be2] ?
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]
[1262648.803495]  [81039b68] warn_slowpath_null+0x15/0x17
[1262648.803548]  [a0300be2]
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]
[1262648.803642]  [a02a12a5]
btrfs_run_delayed_refs+0x1e4/0x21b [btrfs]
[1262648.803734]  [a02aec96]
btrfs_should_end_transaction+0x4a/0x53 [btrfs]
[1262648.803826]  [a029fc34] btrfs_drop_snapshot+0x379/0x68f 
[btrfs]

[1262648.803881]  [a02be478] ?
btrfs_run_defrag_inodes+0x2fa/0x30e [btrfs]
[1262648.803974]  [a02b0401]
btrfs_clean_one_deleted_snapshot+0xb3/0xc2 [btrfs]
[1262648.804067]  [a02a84ff] cleaner_kthread+0x134/0x16c 
[btrfs]
[1262648.804119]  [a02a83cb] ? btrfs_alloc_root+0x2c/0x2c 
[btrfs]

[1262648.804169]  [8104ebe7] kthread

Re: Kernel bug in 3.19-rc4

2015-01-15 Thread Tomasz Chmielewski
I just started some btrfs stress testing on latest linux kernel 
3.19-rc4:

A few hours later, filesystem stopped working - the kernel bug report
can be found below.


Hi,

your "kernel BUG at fs/btrfs/inode.c:3142!" from 3.19-rc4 corresponds to 
http://marc.info/?l=linux-btrfs&m=141903172106342&w=2 - it was "kernel 
BUG at /home/apw/COD/linux/fs/btrfs/inode.c:3123!" in 3.18.1, and it is 
exactly the same code in both cases:



	/* grab metadata reservation from transaction handle */
	if (reserve) {
		ret = btrfs_orphan_reserve_metadata(trans, inode);
		BUG_ON(ret); /* -ENOSPC in reservation; Logic error? JDM */
	}


--
Tomasz Chmielewski
http://www.sslrack.com



Re: kernel BUG at /home/apw/COD/linux/fs/btrfs/inode.c:3123!

2015-01-07 Thread Tomasz Chmielewski
  [c0458ec9] btrfs_orphan_add+0x1a9/0x1c0 
[btrfs]

[15948.236017]  RSP 88007b97fc98
[15948.761942] ---[ end trace 0ccd21c265dce56b ]---

# ls
bigfile2.img  bigfile.img

# touch 1
(...never returned...)



Tomasz Chmielewski
http://www.sslrack.com



WARNING: CPU: 1 PID: 2436 at fs/btrfs/qgroup.c:1414 btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]()

2015-01-03 Thread Tomasz Chmielewski

Got this with 3.18.1 and qgroups enabled. Not sure how to reproduce.


[1262648.802286] [ cut here ]
[1262648.802350] WARNING: CPU: 1 PID: 2436 at fs/btrfs/qgroup.c:1414 
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]()
[1262648.802441] Modules linked in: ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand 
cpufreq_conservative cpufreq_powersave cpufreq_stats nfsd auth_rpcgss 
oid_registry exportfs nfs_acl nfs lockd grace fscache sunrpc ipv6 btrfs 
xor raid6_pq zlib_deflate coretemp hwmon loop pcspkr i2c_i801 i2c_core 
battery parport_pc parport 8250_fintek tpm_infineon tpm_tis tpm video 
ehci_pci ehci_hcd lpc_ich mfd_core acpi_cpufreq button ext4 crc16 jbd2 
mbcache raid1 sg sd_mod r8169 mii ahci libahci libata scsi_mod
[1262648.802854] CPU: 1 PID: 2436 Comm: btrfs-cleaner Not tainted 3.18.1 
#1
[1262648.802902] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[1262648.802992]  0009 8800c805fbe8 813b1128 

[1262648.803081]   8800c805fc28 81039b39 
8807f341c000
[1262648.803170]  a0300be2 8807f341c000 8807f3fd8b40 


[1262648.803259] Call Trace:
[1262648.803305]  [813b1128] dump_stack+0x46/0x58
[1262648.803352]  [81039b39] warn_slowpath_common+0x77/0x91
[1262648.803406]  [a0300be2] ? 
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]

[1262648.803495]  [81039b68] warn_slowpath_null+0x15/0x17
[1262648.803548]  [a0300be2] 
btrfs_delayed_qgroup_accounting+0x9f1/0xa0b [btrfs]
[1262648.803642]  [a02a12a5] 
btrfs_run_delayed_refs+0x1e4/0x21b [btrfs]
[1262648.803734]  [a02aec96] 
btrfs_should_end_transaction+0x4a/0x53 [btrfs]
[1262648.803826]  [a029fc34] btrfs_drop_snapshot+0x379/0x68f 
[btrfs]
[1262648.803881]  [a02be478] ? 
btrfs_run_defrag_inodes+0x2fa/0x30e [btrfs]
[1262648.803974]  [a02b0401] 
btrfs_clean_one_deleted_snapshot+0xb3/0xc2 [btrfs]
[1262648.804067]  [a02a84ff] cleaner_kthread+0x134/0x16c 
[btrfs]
[1262648.804119]  [a02a83cb] ? btrfs_alloc_root+0x2c/0x2c 
[btrfs]

[1262648.804169]  [8104ebe7] kthread+0xcd/0xd5
[1262648.804215]  [8104eb1a] ? 
kthread_freezable_should_stop+0x43/0x43

[1262648.804264]  [813b59ec] ret_from_fork+0x7c/0xb0
[1262648.804311]  [8104eb1a] ? 
kthread_freezable_should_stop+0x43/0x43

[1262648.804360] ---[ end trace b76fd72b4be63515 ]---


--
Tomasz Chmielewski
http://www.sslrack.com



Re: Btrfs progs release 3.18

2014-12-30 Thread Tomasz Chmielewski
* filesystem usage - give an overview of fs usage in a way that's more

* device usage - more detailed information about per-device allocations
  * same restrictions as for 'fi usage'


Interesting.

I used the following to create a filesystem, with btrfs-progs v3.17.3:

# mkfs.btrfs -O skinny-metadata -d raid1 -m raid1 /dev/sda4 /dev/sdb4 -f


Now, with btrfs-progs 3.18 and these new options I can see that the fs 
is partially single, not RAID-1 - how come?


# btrfs fil us /srv
Overall:
Device size:   5.25TiB
Device allocated:510.04GiB
Device unallocated:4.76TiB
Used:505.39GiB
Free (estimated):  2.38TiB  (min: 2.38TiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:8.00MiB, Used:0.00B
   /dev/sda4   8.00MiB

Data,RAID1: Size:252.00GiB, Used:250.56GiB
   /dev/sda4 252.00GiB
   /dev/sdb4 252.00GiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sda4   8.00MiB

Metadata,RAID1: Size:3.00GiB, Used:2.13GiB
   /dev/sda4   3.00GiB
   /dev/sdb4   3.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sda4   4.00MiB

System,RAID1: Size:8.00MiB, Used:64.00KiB
   /dev/sda4   8.00MiB
   /dev/sdb4   8.00MiB

Unallocated:
   /dev/sda4   2.38TiB
   /dev/sdb4   2.38TiB


root@backup01 ~ # btrfs dev us /srv
/dev/sda4, ID: 1
   Device size: 2.63TiB
   Data,single: 8.00MiB
   Data,RAID1:252.00GiB
   Metadata,single: 8.00MiB
   Metadata,RAID1:  3.00GiB
   System,single:   4.00MiB
   System,RAID1:8.00MiB
   Unallocated: 2.38TiB

/dev/sdb4, ID: 2
   Device size: 2.63TiB
   Data,RAID1:252.00GiB
   Metadata,RAID1:  3.00GiB
   System,RAID1:8.00MiB
   Unallocated: 2.38TiB
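
If those small single chunks are just leftovers from mkfs, a conversion 
balance should be able to fold them into RAID1 - something along these 
lines (a sketch, not tried here yet):

# convert any chunks that are not already RAID1 ("soft" skips the rest);
# the System,single chunk would additionally need -sconvert=raid1 with -f
btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /srv

# verify
btrfs fi us /srv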


--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-19 Thread Tomasz Chmielewski

On 2014-12-19 22:47, Josef Bacik wrote:

On 12/12/2014 09:37 AM, Tomasz Chmielewski wrote:
FYI, still seeing this with 3.18 (scrub passes fine on this 
filesystem).


# time btrfs balance start /mnt/lxc2
Segmentation fault



Ok now I remember why I haven't fix this yet, the images you gave me
restore but then they don't mount because the extent tree is corrupted
for some reason.  Could you re-image this fs and send it to me and I
promise to spend all of my time on the problem until its fixed.


(un)fortunately one filesystem stopped crashing on balance with some 
kernel update, and the other one I had crashing on balance was fixed 
with btrfsck - so I'm not able to reproduce anymore / produce an image 
which is crashing.


--
Tomasz Chmielewski
http://www.sslrack.com


kernel BUG at /home/apw/COD/linux/fs/btrfs/inode.c:3123!

2014-12-19 Thread Tomasz Chmielewski
]  [c044ac40] ? 
btrfs_destroy_all_delalloc_inodes+0x120/0x120 [btrfs]

[15948.235844]  [81093a09] kthread+0xc9/0xe0
[15948.235872]  [81093940] ? flush_kthread_worker+0x90/0x90
[15948.235900]  [817b36bc] ret_from_fork+0x7c/0xb0
[15948.235919]  [81093940] ? flush_kthread_worker+0x90/0x90
[15948.235933] Code: e8 7d a1 fc ff 8b 45 c8 e9 6d ff ff ff 0f 1f 44 00 
00 f0 41 80 65 80 fd 4c 89 ef 89 45 c8 e8 cf 20 fe ff 8b 45 c8 e9 48 ff 
ff ff 0f 0b 4c 89 f7 45 31 f6 e8 8a a2 35 c1 e9 f9 fe ff ff 0f 1f 44
[15948.236017] RIP  [c0458ec9] btrfs_orphan_add+0x1a9/0x1c0 
[btrfs]

[15948.236017]  RSP 88007b97fc98
[15948.761942] ---[ end trace 0ccd21c265dce56b ]---

# ls
bigfile2.img  bigfile.img

# touch 1
(...never returned...)

--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-15 Thread Tomasz Chmielewski

On 2014-12-15 21:07, Josef Bacik wrote:

On 12/12/2014 09:37 AM, Tomasz Chmielewski wrote:
FYI, still seeing this with 3.18 (scrub passes fine on this 
filesystem).


# time btrfs balance start /mnt/lxc2
Segmentation fault

real322m32.153s
user0m0.000s
sys 16m0.930s




Sorry Tomasz, you are now at the top of the list.  I assume the images
you sent me before are still good for reproducing this?  Thanks,


I've sent you two URLs back then; they should still work. One of these 
filesystems did not crash the 3.18.0 kernel anymore (though many files 
were changed / added / removed since I uploaded the images); the other 
still did.



Tomasz



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-13 Thread Tomasz Chmielewski

On 2014-12-12 23:58, Robert White wrote:


I don't have the history to answer this definitively, but I don't
think you have a choice. Nothing else is going to touch that error.

I have not seen any oh my god, btrfsck just ate my filesystem errors
since I joined the list -- but I am a relative newcomer.

I know that you, of course, as a contentious and well-traveled system
administrator, already have a current backup since you are doing
storage maintenance... right? 8-)


Who needs backups with btrfs, right? :)

So apparently btrfsck --repair fixed some issues, the fs is still 
mountable and looks fine.


Running balance again, but that will take many days there.

# btrfsck --repair /dev/sdc1
fixing root item for root 8681, current bytenr 5568935395328, current 
gen 70315, current level 2, new bytenr 5569014104064, new gen 70316, new 
level 2

Fixed 1 roots.
checking extents
checking free space cache
checking fs roots
root 696 inode 2765103 errors 400, nbytes wrong
root 696 inode 2831256 errors 400, nbytes wrong
root 9466 inode 2831256 errors 400, nbytes wrong
root 9505 inode 2831256 errors 400, nbytes wrong
root 10139 inode 2831256 errors 400, nbytes wrong
root 10525 inode 2831256 errors 400, nbytes wrong
root 10561 inode 2831256 errors 400, nbytes wrong
root 10633 inode 2765103 errors 400, nbytes wrong
root 10633 inode 2831256 errors 400, nbytes wrong
root 10650 inode 2765103 errors 400, nbytes wrong
root 10650 inode 2831256 errors 400, nbytes wrong
root 10680 inode 2765103 errors 400, nbytes wrong
root 10680 inode 2831256 errors 400, nbytes wrong
root 10681 inode 2765103 errors 400, nbytes wrong
root 10681 inode 2831256 errors 400, nbytes wrong
root 10701 inode 2765103 errors 400, nbytes wrong
root 10701 inode 2831256 errors 400, nbytes wrong
root 10718 inode 2765103 errors 400, nbytes wrong
root 10718 inode 2831256 errors 400, nbytes wrong
root 10735 inode 2765103 errors 400, nbytes wrong
root 10735 inode 2831256 errors 400, nbytes wrong
enabling repair mode
Checking filesystem on /dev/sdc1
UUID: 371af1dc-d88b-4dee-90ba-91fec2bee6c3
cache and super generation don't match, space cache will be invalidated
found 942113871627 bytes used err is 1
total csum bytes: 2445349244
total tree bytes: 28743073792
total fs tree bytes: 22880043008
total extent tree bytes: 2890547200
btree space waste bytes: 5339534781
file data blocks allocated: 2779865800704
 referenced 3446026993664
Btrfs v3.17.3

real76m27.845s
user19m1.470s
sys 2m55.690s


--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-13 Thread Tomasz Chmielewski

On 2014-12-13 10:39, Robert White wrote:


Might I ask why you are running balance? After a persistent error I'd
understand going straight to scrub, but balance is usually for
transformation or to redistribute things after atypical use.


There were several reasons for running balance on this system:

1) I was getting "no space left", even though there were hundreds of GBs 
left. Not sure if this still applies to the current kernels (3.18 and 
later), but it was certainly a problem in the past.


2) The system was regularly freezing, I'd say once a week was the norm. 
Sometimes I was getting btrfs traces logged in syslog.
After a few freezes the fs was getting corrupted to different degrees. At 
some point, it was so bad that it was only possible to use it read-only. 
So I had to get the data off, reformat, copy back... It would start 
crashing again after a few weeks of usage.


My use case is quite simple:

- skinny extents, extended inode refs
- mount compress-force=zlib
- rsync many remote data sources (-a -H --inplace --partial) + snapshot
- around 500 snapshots in total, from 20 or so subvolumes

Especially rsync's --inplace option combined with many snapshots and 
large fragmentation was deadly for btrfs - I was seeing system freezes 
right when rsyncing a highly fragmented, large file.
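
Each backup cycle is roughly of this shape (a sketch with made-up paths 
and snapshot names):

# pull the remote data into a subvolume, updating files in place
rsync -a -H --inplace --partial remote:/data/ /backup/data/

# then take a read-only snapshot of the result
btrfs subvolume snapshot -r /backup/data /backup/snapshots/data-$(date +%Y%m%d)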


Then, running balance on the corrupted filesystem was more of an exercise 
(if scrub passes fine, I would expect balance to pass as well). The BUGs 
it was triggering were sometimes fixed in newer kernels, sometimes not 
(btrfsck was not really usable a few months back).


3) I had mixed luck with recovering btrfs after a failed drive (in 
RAID-1). Sometimes it worked as expected; sometimes the fs was getting 
broken so badly that I had to rsync data off it and format from scratch 
(mdraid would kick a drive after getting write errors - that's not the 
case with btrfs, and weird things can happen).
Sometimes, running btrfs device delete missing (it's balance in 
principle, I think) would take weeks, during which a second drive could 
easily die.
Again, running balance would be more of an exercise there, to see if the 
newer kernel still crashes.




An entire generation of folks have grown used to defraging windows
boxes and all, but if you've already got an array that is going to
take many days to balance what benefit do you actually expect to
receive?


For me - it's a good test to see if btrfs is finally getting stable 
(some cases explained above).




Defrag -- used for I think I'm getting a lot of unnecessary head seek
in this application, these files need to be brought into closer
order.


Fragmentation was an issue for btrfs, at least a few kernels back (as 
explained above, with rsync's --inplace).
However, I'm not running autodefrag anywhere - not sure how it affects 
snapshots.




Scrub -- used for defensive checking a-la checkdisk. I suspect that
after that unexpected power outage something may be a little off, or
alternately I think my disks are giving me bitrot, I better check.


For me, it was passing fine, where balance was crashing the kernel.


Again, my main rationale for running balance is to see if btrfs is behaving 
stably. While I have systems with btrfs which have been running fine for 
months, I also have ones which will crash after 1-2 weeks (once the system 
grows in size / complexity).


So hopefully btrfsck has fixed that fs - once it has been running stable for 
a week or two, I might be brave enough to re-enable btrfs quotas (another 
thing that used to freeze the system, at least a few kernels back).
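
For reference, re-enabling them would just be (mountpoint here is only an 
example):

btrfs quota enable /mnt/lxc2
btrfs qgroup show /mnt/lxc2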



--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-13 Thread Tomasz Chmielewski

On 2014-12-13 21:54, Robert White wrote:

- rsync many remote data sources (-a -H --inplace --partial) + 
snapshot


Using --inplace on a Copy On Write filesystem has only one effect, it
increases fragmentation... a lot...


...if the file was changed.



Every new block is going to get
written to a new area anyway,


Exactly - every new block. But that's true with and without --inplace. The 
difference is that without --inplace every block is "new": the file is 
rewritten by rsync into a new temporary file, and CoW sharing is lost (more 
below).




so if you have enough slack space to
keep the one new copy of the new file, which you will probably use up
anyway in the COW event, laying in the fresh copy in a likely more
contiguous way will tend to make things cleaner over time.

--inplace is doubly useless with compression as compression is
perturbed by default if one byte changes in the original file.


No. If you change 1 byte in a 100 MB file, or perhaps a 1 GB file, you will 
likely lose only a few kB of CoW sharing. The whole file is certainly not 
rewritten if you use --inplace. However, it will be wholly rewritten if you 
don't use --inplace.



The only time --inplace might be helpful is if the file is NOCOW... 
except...


No, you're wrong.
By default, rsync creates a new file if it detects any file modification - 
even just a timestamp change ("touch file").


Consider this experiment:

# create a large file
dd if=/dev/urandom of=bigfile bs=1M count=3000

# copy it with rsync
rsync -a -v --progress bigfile bigfile2

# copy it again - blazing fast, no change
rsync -a -v --progress bigfile bigfile2

# touch the original file
touch bigfile

# try copying again with rsync - notice rsync creates a temp file, like
# .bigfile2.J79ta2

# No change to the file except the timestamp, but goodbye your CoW.
rsync -a -v --progress bigfile bigfile2

# Now try the same with --inplace; compare data written to disk with
# iostat -m in both cases.
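
For completeness, the --inplace rerun could look like this (note that rsync 
defaults to --whole-file for local copies, so --no-whole-file is needed to 
exercise the delta algorithm the way a remote transfer would):

# hypothetical continuation of the experiment above
rsync -a -v --progress --inplace --no-whole-file bigfile bigfile2

# meanwhile, in another terminal, watch how much data actually hits the disk
iostat -m 5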



The same goes for append-only files - even if they are compressed, most CoW 
will remain shared. I'd say it will be similar for lightly modified files 
(the changed data will be CoW-unshared, along with some compression 
overhead, but the rest will stay untouched / shared by CoW between the 
snapshots).





- around 500 snapshots in total, from 20 or so subvolumes


That's a lot of snapshots and subvolumes. Not an impossibly high
number, but a lot. That needs its own use-case evaluation. But
regardless...

Even if you set the NOCOW option on a file to make the --inplace rsync
work, if that file is snapshotted (snapshot?) between the rsync
modification events it will be in 1COW mode because of the snapshot
anyway and you are back to the default anti-optimal conditions.


Again - if the file was changed a lot, it doesn't matter if it's 
--inplace or not. If the file data was not changed, or changed little - 
--inplace will help preserve CoW.
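
For reference, NOCOW is set with chattr, and it only really takes effect for 
files created after the flag is set - e.g. on the containing directory 
(the path below is made up):

chattr +C /mnt/backup/nocow-dir   # new files created inside inherit NOCOW
lsattr -d /mnt/backup/nocow-dir   # should show the 'C' attribute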




Especially rsync's --inplace option combined with many snapshots and
large fragmentation was deadly for btrfs - I was seeing system freezes
right when rsyncing a highly fragmented, large file.


You are kind of doing all that to yourself.


To clarify - by freezes I mean kernel bugs being exposed and the machine 
freezing. I think we all agree that whatever userspace is doing in the 
filesystem, it should not result in a kernel BUG / freeze.




Combining _forced_
compression with denying the natural opportunity for the re-write of
the file to move it to nicely contiguous new locations and then
pinning it all in place with multiple snapshots you've created the
worst of all possible worlds.


I disagree. It's quite compact for my data usage. If I needed blazing fast 
file access, I wouldn't be using a CoW filesystem or snapshots in the first 
place. For data mostly stored and rarely read, it is OK.



(...)


And keep repeating this to yourself :: balance does not reorganize
anything, it just moves the existing disorder to a new location. This
is not a perfect summation, and it's clearly wrong if you are using
convert, but it's the correct way to view what's happening while
asking yourself "should I balance?".


I agree - I don't run it unless I need to (or I'm curious to see if it 
would expose some more bugs).
It would be quite a step back for a filesystem to need some periodic 
maintenance like that after all.


Also, I'm of the opinion that balance should not cause the kernel to BUG - 
it should abort, possibly remount the fs read-only etc. (and suggest running 
btrfsck, if there is enough confidence in this tool), but definitely not 
BUG.



--
Tomasz Chmielewski
http://www.sslrack.com



Re: Balance scrub defrag

2014-12-12 Thread Tomasz Chmielewski

I use SMART (smartmontools etc) and its tests to keep track of and warn
me of such issues. It's way more likely to catch incipient media
failures long before scrub would. It's also more likely to correct
situations before they become visible to userspace. Its also a way
better full-platter scan that involves less real time delay and won't
bog down a running system.


Don't put too much trust in SMART - sectors can rot unexpectedly even when 
SMART thinks everything is fine with the drive.


I had exactly this issue recently:

1) one of the drives in the server failed and was replaced

2) btrfs device delete missing (which basically moves data from the 
remaining drive to the new one) was failing with IO error


3) according to SMART, the drive with IO error was fine (no reallocated 
sectors, no warnings etc.)



So, scrub to the rescue - it printed the broken files; after removing them 
manually, it was possible to finish "btrfs device delete missing".


It probably makes sense to run scrub periodically (just like the mdraid 
checks most distributions schedule).
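
For example, a simple cron entry (schedule and mountpoint are just an 
example) could look like:

# /etc/cron.d/btrfs-scrub - monthly scrub; -B stays in the foreground so
# cron mails the summary when it finishes, -d prints per-device stats
0 3 1 * * root /sbin/btrfs scrub start -B -d /home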



--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-12 Thread Tomasz Chmielewski

FYI, still seeing this with 3.18 (scrub passes fine on this filesystem).

# time btrfs balance start /mnt/lxc2
Segmentation fault

real    322m32.153s
user    0m0.000s
sys     16m0.930s


[20182.461873] BTRFS info (device sdd1): relocating block group 
6915027369984 flags 17

[20194.050641] BTRFS info (device sdd1): found 4819 extents
[20286.243576] BTRFS info (device sdd1): found 4819 extents
[20287.143471] BTRFS info (device sdd1): relocating block group 
6468350771200 flags 17

[20295.756934] BTRFS info (device sdd1): found 3613 extents
[20306.981773] BTRFS (device sdd1): parent transid verify failed on 
5568935395328 wanted 70315 found 102416
[20306.983962] BTRFS (device sdd1): parent transid verify failed on 
5568935395328 wanted 70315 found 102416
[20307.029841] BTRFS (device sdd1): parent transid verify failed on 
5568935395328 wanted 70315 found 102416

[20307.030037] [ cut here ]
[20307.030083] kernel BUG at fs/btrfs/relocation.c:242!
[20307.030130] invalid opcode:  [#1] SMP
[20307.030175] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative 
cpufreq_powersave cpufreq_stats nfsd auth_rpcgss oid_registry exportfs 
nfs_acl nfs lockd grace fscache sunrpc ipv6 btrfs xor raid6_pq 
zlib_deflate coretemp hwmon loop pcspkr i2c_i801 i2c_core lpc_ich 
mfd_core 8250_fintek battery parport_pc parport tpm_infineon tpm_tis tpm 
ehci_pci ehci_hcd video button acpi_cpufreq ext4 crc16 jbd2 mbcache 
raid1 sg sd_mod r8169 mii ahci libahci libata scsi_mod

[20307.030587] CPU: 3 PID: 4218 Comm: btrfs Not tainted 3.18.0 #1
[20307.030634] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[20307.030724] task: 8807f2cac830 ti: 8807e9198000 task.ti: 
8807e9198000
[20307.030811] RIP: 0010:[a02e8240]  [a02e8240] 
relocate_block_group+0x432/0x4de [btrfs]

[20307.030914] RSP: 0018:8807e919bb18  EFLAGS: 00010202
[20307.030960] RAX: 8805f06c40f8 RBX: 8805f06c4000 RCX: 
00018023
[20307.031008] RDX: 8805f06c40d8 RSI: 8805f06c40e8 RDI: 
8807ff403900
[20307.031056] RBP: 8807e919bb88 R08: 0001 R09: 

[20307.031105] R10: 0003 R11: a02e43a6 R12: 
8807e637f090
[20307.031153] R13: 8805f06c4108 R14: fff4 R15: 
8805f06c4020
[20307.031201] FS:  7f1bdb4ba880() GS:88081fac() 
knlGS:

[20307.031289] CS:  0010 DS:  ES:  CR0: 80050033
[20307.031336] CR2: 7f5672e18070 CR3: 0007e99cc000 CR4: 
001407e0

[20307.031384] Stack:
[20307.031426]  ea0016296680 8805f06c40e8 ea0016296380 

[20307.031515]  ea0016296400 00ffea0016296440 a805e22b2a30 
1000
[20307.031604]  8804d86963f0 8805f06c4000  
8807f2d785a8

[20307.031693] Call Trace:
[20307.031743]  [a02e8444] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[20307.031838]  [a02c5fd4] 
btrfs_relocate_chunk.isra.70+0x35/0xa5 [btrfs]

[20307.031931]  [a02c75d4] btrfs_balance+0xa66/0xc6b [btrfs]
[20307.031981]  [810bd63a] ? 
__alloc_pages_nodemask+0x137/0x702
[20307.032036]  [a02cd485] btrfs_ioctl_balance+0x220/0x29f 
[btrfs]

[20307.032089]  [a02d2586] btrfs_ioctl+0x1134/0x22f6 [btrfs]
[20307.032138]  [810d5d83] ? handle_mm_fault+0x44d/0xa00
[20307.032186]  [81175862] ? avc_has_perm+0x2e/0xf7
[20307.032234]  [810d889d] ? __vm_enough_memory+0x25/0x13c
[20307.032282]  [8110f05d] do_vfs_ioctl+0x3f2/0x43c
[20307.032329]  [8110f0f5] SyS_ioctl+0x4e/0x7d
[20307.032376]  [81030ab3] ? do_page_fault+0xc/0x11
[20307.032424]  [813b5992] system_call_fastpath+0x12/0x17
[20307.032488] Code: 00 00 00 48 39 83 f8 00 00 00 74 02 0f 0b 4c 39 ab 
08 01 00 00 74 02 0f 0b 48 83 7b 20 00 74 02 0f 0b 83 bb 20 01 00 00 00 
74 02 0f 0b 83 bb 24 01 00 00 00 74 02 0f 0b 48 8b 73 18 48 8b 7b 08
[20307.032660] RIP  [a02e8240] 
relocate_block_group+0x432/0x4de [btrfs]

[20307.032754]  RSP 8807e919bb18
[20307.033068] ---[ end trace 18be77360e49d59d ]---



On 2014-11-25 23:33, Tomasz Chmielewski wrote:

I'm still seeing this when running balance with 3.18-rc6:

[95334.066898] BTRFS info (device sdd1): relocating block group
6468350771200 flags 17
[95344.384279] BTRFS info (device sdd1): found 5371 extents
[95373.555640] BTRFS (device sdd1): parent transid verify failed on
5568935395328 wanted 70315 found 89269
[95373.574208] BTRFS (device sdd1): parent transid verify failed on
5568935395328 wanted 70315 found 89269
[95373.574483] [ cut here ]
[95373.574542] kernel BUG at fs/btrfs/relocation.c:242!
[95373.574601] invalid opcode:  [#1] SMP
[95373.574661] Modules linked in: ipt_MASQUERADE
nf_nat_masquerade_ipv4

Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-12 Thread Tomasz Chmielewski

On 2014-12-12 22:36, Robert White wrote:


In another thread [that was discussing SMART] you talked about
replacing a drive and then needing to do some patching-up of the
result because of drive failures. Is this the same filesystem where
that happened?


Nope, it was on a different server.

--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!

2014-12-12 Thread Tomasz Chmielewski

On 2014-12-12 23:34, Robert White wrote:

On 12/12/2014 01:46 PM, Tomasz Chmielewski wrote:

On 2014-12-12 22:36, Robert White wrote:


In another thread [that was discussing SMART] you talked about
replacing a drive and then needing to do some patching-up of the
result because of drive failures. Is this the same filesystem where
that happened?


Nope, it was on a different server.



okay, so how did the btrfsck turn out?


# time btrfsck /dev/sdc1 > /root/btrfsck.log

real    22m0.140s
user    0m3.090s
sys     0m6.120s

root@bkp010 /usr/src/btrfs-progs # echo $?
1

# cat /root/btrfsck.log
root item for root 8681, current bytenr 5568935395328, current gen 
70315, current level 2, new bytenr 5569014104064, new gen 70316, new 
level 2

Found 1 roots with an outdated root item.
Please run a filesystem check with the option --repair to fix them.


Now, I'm a bit afraid to run --repair - as far as I remember, some time ago 
it used to do all sorts of weird things except the actual repair.
Is it better nowadays? I'm using the latest clone from 
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git



--
Tomasz Chmielewski
http://www.sslrack.com



Re: btrfs device delete missing - Input/output error

2014-12-07 Thread Tomasz Chmielewski

On 2014-12-07 06:26, Chris Murphy wrote:
On Sat, Dec 6, 2014 at 2:17 AM, Tomasz Chmielewski t...@virtall.com 
wrote:


After we run again btrfs device delete missing /home, the newly 
created
directory eventually (/home/backup/ma-int/weekly.tmp) is being 
detected as

csum failed ino 



If that comes up clean then chances are this is file corruption and
you can mount and do:

btrfs scrub start -r MP


Actually, this seemed to be enough.

I didn't realize that scrub would print the names of corrupted files.
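
For anyone else hitting this: the affected paths end up in the kernel log, 
so something along these lines should list them (exact messages may differ 
between kernel versions):

btrfs scrub start -r /home
dmesg | grep -i "path:"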

After removing them, btrfs device delete missing is finally making 
some progress (previously, it would break at 300+GB; still running now):


# btrfs fi show /home
Label: none  uuid: 92e93437-cc9f-475d-a739-085f3270896b
Total devices 3 FS bytes used 1.40TiB
devid2 size 1.70TiB used 1.42TiB path /dev/sdb4
devid3 size 1.70TiB used 807.00GiB path /dev/sda4
*** Some devices missing

Btrfs v3.14.2


Thanks for your hints!

--
Tomasz Chmielewski
http://www.sslrack.com



Re: btrfs device delete missing - Input/output error

2014-12-06 Thread Tomasz Chmielewski

On 2014-12-06 05:57, Chris Murphy wrote:


Right - so we're getting these:

# parted /dev/sdb u s p
Model: ATA ST3000DM001-9YN1 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B


OK it's a 4096 byte physical sector drive so you have to use the
bs=4096 command with the proper seek value (which is based on the bs
value).


Yep.
So here is what I did:

# echo 2262535088 / 8|bc   # (512 * 8 = 4096)
282816886


* verify if it's this place

# dd if=/dev/sdb of=/dev/null bs=4096 count=1 skip=282816886
dd: reading `/dev/sdb': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 2.83002 s, 0.0 kB/s


* overwrite it:

dd if=/dev/zero of=/dev/sdb bs=4096 count=1 seek=282816886
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 3.0277e-05 s, 135 MB/s


* try to read it:

# dd if=/dev/sdb of=/dev/null bs=4096 count=1 skip=282816886
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 2.6204e-05 s, 156 MB/s


* try to read with the old skip value (repeat for 2262535088 - 
2262535095, or use a different count value - this is where we were 
getting the errors, it's also 8 * 512):


dd if=/dev/sdb of=/dev/null count=1 skip=2262535088
...
dd if=/dev/sdb of=/dev/null count=1 skip=2262535095
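
Or, as an equivalent sketch of the "repeat" step, a small loop:

for s in $(seq 2262535088 2262535095); do
    dd if=/dev/sdb of=/dev/null count=1 skip=$s
done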


* Unfortunately this is still an error for btrfs, because the checksum 
does not match:


# time btrfs device delete missing /home
ERROR: error removing the device 'missing' - Input/output error

# dmesg -c
[84200.109774] BTRFS info (device sda4): relocating block group 
1375492636672 flags 17
[84203.635559] BTRFS info (device sda4): csum failed ino 261 off 
384262144 csum 2566472073 expected csum 4193010590
[84203.650980] BTRFS info (device sda4): csum failed ino 261 off 
384262144 csum 2566472073 expected csum 4193010590
[84203.651444] BTRFS info (device sda4): csum failed ino 261 off 
384262144 csum 2566472073 expected csum 4193010590



* remounting with nodatacow (nodatasum) does not help (since it's for 
new data)


* let's find the inode printing the error - turned out to be a directory 
- so created a new one, moved the data from the corrupt one, removed 
the directory with that inode:


# find /home -mount -inum 261
/home/backup/ma-int/weekly


* repeat for any other found:

# dmesg -c
[85197.300494] BTRFS info (device sda4): relocating block group 
1375492636672 flags 17
[85200.448713] BTRFS info (device sda4): csum failed ino 267 off 
384262144 csum 2566472073 expected csum 4193010590
[85200.472581] BTRFS info (device sda4): csum failed ino 267 off 
384262144 csum 2566472073 expected csum 4193010590
[85200.473551] BTRFS info (device sda4): csum failed ino 267 off 
384262144 csum 2566472073 expected csum 4193010590



* unfortunately it never ends - let's say we have 
/home/backup/ma-int/weekly, which we fixed with:


mkdir /home/backup/ma-int/weekly.tmp
mv /home/backup/ma-int/weekly/* /home/backup/ma-int/weekly.tmp
rmdir /home/backup/ma-int/weekly

After we run "btrfs device delete missing /home" again, the newly created 
directory (/home/backup/ma-int/weekly.tmp) is eventually detected with 
"csum failed ino " as well.



--
Tomasz Chmielewski
http://www.sslrack.com



Re: btrfs device delete missing - Input/output error

2014-12-05 Thread Tomasz Chmielewski

The first way to try to recover the current volume would be to
overwrite LBA 2262535088 which you should only do with the filesystem
unmounted. That's the sector causing the read error. If this is a 512
byte drive:

dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088


It is a 512 byte drive.

Unfortunately overwriting that fails - which is quite weird, given that 
there are 0 reallocated sectors, according to SMART.


Is there a way to determine what's stored there (i.e. if it's a file?).


# dd if=/dev/zero of=/dev/sdb count=1 seek=2262535088
dd: writing to `/dev/sdb': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 2.88439 s, 0.0 kB/s

# dmesg -c
[ 8177.713212] ata4.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 
0x0

[ 8177.713285] ata4.00: irq_stat 0x4008
[ 8177.713349] ata4.00: failed command: READ FPDMA QUEUED
[ 8177.713419] ata4.00: cmd 60/08:50:b0:8b:db/00:00:86:00:00/40 tag 10 
ncq 4096 in
[ 8177.713419]  res 41/40:08:b0:8b:db/00:00:86:00:00/00 Emask 
0x409 (media error) F

[ 8177.713662] ata4.00: status: { DRDY ERR }
[ 8177.713725] ata4.00: error: { UNC }
[ 8177.755099] ata4.00: configured for UDMA/133
[ 8177.755175] sd 3:0:0:0: [sdb]
[ 8177.755252] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 8177.755318] sd 3:0:0:0: [sdb]
[ 8177.755380] Sense Key : Medium Error [current] [descriptor]
[ 8177.755449] Descriptor sense data with sense descriptors (in hex):
[ 8177.755516] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 8177.755600] 86 db 8b b0
[ 8177.755667] sd 3:0:0:0: [sdb]
[ 8177.755729] Add. Sense: Unrecovered read error - auto reallocate 
failed

[ 8177.757316] sd 3:0:0:0: [sdb] CDB:
[ 8177.757377] Read(16): 88 00 00 00 00 00 86 db 8b b0 00 00 00 08 00 00
[ 8177.757463] end_request: I/O error, dev sdb, sector 2262535088
[ 8177.757542] ata4: EH complete

--
Tomasz Chmielewski
http://www.sslrack.com



Re: btrfs device delete missing - Input/output error

2014-12-05 Thread Tomasz Chmielewski

On 2014-12-05 17:41, Chris Murphy wrote:


You're getting a read error when writing. This is expected when
writing 512 bytes to a 4k sector.

What do you get for
 parted /dev/sdb u s p


Right - so we're getting these:

# parted /dev/sdb u s p
Model: ATA ST3000DM001-9YN1 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  StartEnd  Size File system  Name  Flags
 5  2048s4095s2048s   
bios_grub

 1  4096s67112959s67108864s   raid
 2  67112960s68161535s1048576sraid
 3  68161536s2215645183s  2147483648s raid
 4  2215645184s  5860533134s  3644887951s  btrfs  raid


--
Tomasz Chmielewski
http://www.sslrack.com



btrfs device delete missing - Input/output error

2014-12-04 Thread Tomasz Chmielewski

[26691.003053] sd 3:0:0:0: [sdb] CDB:
[26691.003101] Read(16): 88 00 00 00 00 00 86 db 8b b0 00 00 00 08 00 00
[26691.003163] end_request: I/O error, dev sdb, sector 2262535088
[26691.003215] btrfs_dev_stat_print_on_error: 36 callbacks suppressed
[26691.003268] BTRFS: bdev /dev/sdb4 errs: wr 0, rd 60, flush 0, corrupt 
0, gen 0

[26691.003383] ata4: EH complete



--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.17.0-rc7: kernel BUG at fs/btrfs/relocation.c:931!

2014-11-25 Thread Tomasz Chmielewski

I'm still seeing this when running balance with 3.18-rc6:

[95334.066898] BTRFS info (device sdd1): relocating block group 
6468350771200 flags 17

[95344.384279] BTRFS info (device sdd1): found 5371 extents
[95373.555640] BTRFS (device sdd1): parent transid verify failed on 
5568935395328 wanted 70315 found 89269
[95373.574208] BTRFS (device sdd1): parent transid verify failed on 
5568935395328 wanted 70315 found 89269

[95373.574483] [ cut here ]
[95373.574542] kernel BUG at fs/btrfs/relocation.c:242!
[95373.574601] invalid opcode:  [#1] SMP
[95373.574661] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative 
cpufreq_powersave cpufreq_stats nfsd auth_rpcgss oid_registry exportfs 
nfs_acl nfs lockd grace fscache sunrpc ipv6 btrfs xor raid6_pq 
zlib_deflate coretemp hwmon loop pcspkr i2c_i801 i2c_core battery 
tpm_infineon tpm_tis tpm 8250_fintek video parport_pc parport ehci_pci 
lpc_ich ehci_hcd mfd_core button acpi_cpufreq ext4 crc16 jbd2 mbcache 
raid1 sg sd_mod ahci libahci libata scsi_mod r8169 mii

[95373.576506] CPU: 1 PID: 6089 Comm: btrfs Not tainted 3.18.0-rc6 #1
[95373.576568] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[95373.576683] task: 8807e9b91810 ti: 8807da1b8000 task.ti: 
8807da1b8000
[95373.576794] RIP: 0010:[a0323144]  [a0323144] 
relocate_block_group+0x432/0x4de [btrfs]

[95373.576933] RSP: 0018:8807da1bbb18  EFLAGS: 00010202
[95373.576993] RAX: 8806327a70f8 RBX: 8806327a7000 RCX: 
00018020
[95373.577056] RDX: 8806327a70d8 RSI: 8806327a70e8 RDI: 
8807ff403900
[95373.577118] RBP: 8807da1bbb88 R08: 0001 R09: 

[95373.577181] R10: 0003 R11: a031f2aa R12: 
8804601de5a0
[95373.577243] R13: 8806327a7108 R14: fff4 R15: 
8806327a7020
[95373.577307] FS:  7f9ccfa99840() GS:88081fa4() 
knlGS:

[95373.577418] CS:  0010 DS:  ES:  CR0: 80050033
[95373.577479] CR2: 7f98c4133000 CR3: 0007dd7bf000 CR4: 
001407e0

[95373.577540] Stack:
[95373.577594]  ea0004962e80 8806327a70e8 ea000c7fdb80 

[95373.577708]  ea000d289600 00ffea000d289640 a805e22b2a30 
1000
[95373.577822]  8802eb7b0240 8806327a7000  
8807f3b5a5a8

[95373.577937] Call Trace:
[95373.578009]  [a0323348] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[95373.578137]  [a0300fd4] 
btrfs_relocate_chunk.isra.70+0x35/0xa5 [btrfs]

[95373.578263]  [a03025d4] btrfs_balance+0xa66/0xc6b [btrfs]
[95373.578329]  [810bd63a] ? 
__alloc_pages_nodemask+0x137/0x702
[95373.578407]  [a0308485] btrfs_ioctl_balance+0x220/0x29f 
[btrfs]

[95373.578483]  [a030d586] btrfs_ioctl+0x1134/0x22f6 [btrfs]
[95373.578547]  [810d5d90] ? handle_mm_fault+0x44d/0xa00
[95373.578610]  [81175856] ? avc_has_perm+0x2e/0xf7
[95373.578672]  [810d88a9] ? __vm_enough_memory+0x25/0x13c
[95373.578736]  [8110f05d] do_vfs_ioctl+0x3f2/0x43c
[95373.578798]  [8110f0f5] SyS_ioctl+0x4e/0x7d
[95373.578859]  [81030ab3] ? do_page_fault+0xc/0x11
[95373.578920]  [813b58d2] system_call_fastpath+0x12/0x17
[95373.578981] Code: 00 00 00 48 39 83 f8 00 00 00 74 02 0f 0b 4c 39 ab 
08 01 00 00 74 02 0f 0b 48 83 7b 20 00 74 02 0f 0b 83 bb 20 01 00 00 00 
74 02 0f 0b 83 bb 24 01 00 00 00 74 02 0f 0b 48 8b 73 18 48 8b 7b 08
[95373.579226] RIP  [a0323144] 
relocate_block_group+0x432/0x4de [btrfs]

[95373.579352]  RSP 8807da1bbb18




On 2014-10-04 00:06, Tomasz Chmielewski wrote:

On 2014-10-03 20:17 (Fri), Josef Bacik wrote:

On 10/02/2014 03:27 AM, Tomasz Chmielewski wrote:

Got this when running balance with 3.17.0-rc7:



Give these two patches a try

https://patchwork.kernel.org/patch/4938281/
https://patchwork.kernel.org/patch/4939761/


With these two patches applied on top of 3.17-rc7, it BUGs somewhere 
else now:


[ 2030.858792] BTRFS info (device sdd1): relocating block group
6469424513024 flags 17
[ 2039.674077] BTRFS info (device sdd1): found 20937 extents
[ 2066.726661] BTRFS info (device sdd1): found 20937 extents
[ 2068.048208] BTRFS info (device sdd1): relocating block group
6468350771200 flags 17
[ 2080.796412] BTRFS info (device sdd1): found 46927 extents
[ 2092.703850] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.714622] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.725269] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.725680] [ cut here ]
[ 2092.725740] kernel BUG at fs/btrfs/relocation.c:242!
[ 2092.725800] invalid opcode:  [#1] SMP
[ 2092.725860] Modules linked

Re: 5 _thousand_ snapshots? even 160? (was: device balance times)

2014-10-22 Thread Tomasz Chmielewski

But 5000 snapshots?

Why?  Are you *TRYING* to test btrfs until it breaks, or TRYING to
demonstrate a balance taking an entire year?


Remember a given btrfs filesystem is not necessarily a backup 
destination for data from one source.


It can be, say, 30 or 60 daily snapshots plus several monthly ones per data 
source, multiplied by the number of data sources.


So while it probably will make a difference (5000 snapshots from one 
source, vs 5000 snapshots made from many sources) for balance times, I 
wouldn't call a large number of snapshots that unusual.


--
Tomasz Chmielewski
http://www.sslrack.com



device balance times

2014-10-21 Thread Tomasz Chmielewski
FYI - after a disk failed and was replaced, I've run a balance; it took 
almost 3 weeks to complete, for 120 GB of data:


# time btrfs balance start -v /home
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x0): balancing
  METADATA (flags 0x0): balancing
  SYSTEM (flags 0x0): balancing
Done, had to relocate 124 out of 124 chunks

real    30131m52.873s
user    0m0.000s
sys     74m59.180s


Kernel is 3.17.0-rc7.


Filesystem is not that big, merely 124 GB used out of 1.8 TB:

/dev/sdb4   1.8T  124G  1.6T   8% /home


# btrfs fi df /home
Data, RAID1: total=121.00GiB, used=117.56GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=8.00GiB, used=4.99GiB
GlobalReserve, single: total=512.00MiB, used=0.00

# btrfs fi show /home
Label: none  uuid: 84d087aa-3a32-46da-844f-a233237cf04f
Total devices 2 FS bytes used 122.56GiB
devid2 size 1.71TiB used 129.03GiB path /dev/sdb4
devid3 size 1.71TiB used 129.03GiB path /dev/sda4


The only special thing about this filesystem is that there are ~250 
snapshots there:


# btrfs sub list /home|wc -l
253

It's using compression:

/dev/sdb4 on /home type btrfs (rw,noatime,compress=lzo,space_cache)


Other than taking occasional backups from remote, the server is idle.

# hdparm -t /dev/sda /dev/sdb

/dev/sda:
 Timing buffered disk reads: 394 MB in  3.01 seconds = 131.03 MB/sec

/dev/sdb:
 Timing buffered disk reads: 402 MB in  3.00 seconds = 133.86 MB/sec



How long does the balance take for others with many snapshots?




--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.17.0-rc7: kernel BUG at fs/btrfs/relocation.c:931!

2014-10-04 Thread Tomasz Chmielewski

Hi,

is btrfs-image with a single -s flag OK? I.e.

btrfs-image -s -c 9 -t 32 /dev/sdc1 /root/btrfs-2.img

?

Tomasz Chmielewski


On 2014-10-04 00:09 (Sat), Josef Bacik wrote:

Can you make a btrfs-image of this fs and send it to me?  Thanks,

Josef

Tomasz Chmielewski t...@virtall.com wrote:


On 2014-10-03 20:17 (Fri), Josef Bacik wrote:

On 10/02/2014 03:27 AM, Tomasz Chmielewski wrote:

Got this when running balance with 3.17.0-rc7:



Give these two patches a try

https://patchwork.kernel.org/patch/4938281/
https://patchwork.kernel.org/patch/4939761/


With these two patches applied on top of 3.17-rc7, it BUGs somewhere
else now:

[ 2030.858792] BTRFS info (device sdd1): relocating block group
6469424513024 flags 17
[ 2039.674077] BTRFS info (device sdd1): found 20937 extents
[ 2066.726661] BTRFS info (device sdd1): found 20937 extents
[ 2068.048208] BTRFS info (device sdd1): relocating block group
6468350771200 flags 17
[ 2080.796412] BTRFS info (device sdd1): found 46927 extents
[ 2092.703850] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.714622] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.725269] parent transid verify failed on 5568935395328 wanted
70315 found 71183
[ 2092.725680] [ cut here ]
[ 2092.725740] kernel BUG at fs/btrfs/relocation.c:242!
[ 2092.725800] invalid opcode:  [#1] SMP
[ 2092.725860] Modules linked in: ipt_MASQUERADE iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
ip_tables x_tables cpufreq_ondemand cpufreq_conservative
cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq
zlib_deflate coretemp hwmon loop i2c_i801 parport_pc pcspkr i2c_core
parport video battery tpm_infineon tpm_tis tpm lpc_ich mfd_core 
ehci_pci

ehci_hcd acpi_cpufreq button ext4 crc16 jbd2 mbcache raid1 sg sd_mod
ahci libahci libata scsi_mod r8169 mii
[ 2092.727740] CPU: 3 PID: 3937 Comm: btrfs Not tainted 3.17.0-rc7 #3
[ 2092.727801] Hardware name: System manufacturer System Product
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[ 2092.727917] task: 8800c7883020 ti: 8800c7d04000 task.ti:
8800c7d04000
[ 2092.728029] RIP: 0010:[a0322a4a]  [a0322a4a]
relocate_block_group+0x432/0x4de [btrfs]
[ 2092.728169] RSP: 0018:8800c7d07a58  EFLAGS: 00010206
[ 2092.728229] RAX: 8806c69a18f8 RBX: 8806c69a1800 RCX:
00018020
[ 2092.728292] RDX: 8806c69a18d8 RSI: 8806c69a18e8 RDI:
8807ff403900
[ 2092.728356] RBP: 8800c7d07ac8 R08: 0001 R09:

[ 2092.728419] R10: 0003 R11: a031eb54 R12:
8805d515c240
[ 2092.728482] R13: 8806c69a1908 R14: fff4 R15:
8806c69a1820
[ 2092.728546] FS:  7f4f251d0840() GS:88081fac()
knlGS:
[ 2092.728660] CS:  0010 DS:  ES:  CR0: 80050033
[ 2092.728721] CR2: ff600400 CR3: c7fb CR4:
001407e0
[ 2092.728783] Stack:
[ 2092.728837]  ea0002edf300 8806c69a18e8 ea0002edf000

[ 2092.728952]  ea0002edf080 00ffea0002edf0c0 a805e22b2a30
1000
[ 2092.729067]  8807d969f870 8806c69a1800 
8807f3f285b0
[ 2092.729183] Call Trace:
[ 2092.729256]  [a0322c4e]
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[ 2092.729385]  [a02ff79c]
btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[ 2092.729512]  [a030e99f] ?
btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[ 2092.729632]  [a02bfb04] ? 
btrfs_set_path_blocking+0x23/0x54

[btrfs]
[ 2092.729704]  [a02c4517] ? btrfs_search_slot+0x7bc/0x816
[btrfs]
[ 2092.729782]  [a02fbc81] ? free_extent_buffer+0x6f/0x7c
[btrfs]
[ 2092.729859]  [a0302679] btrfs_balance+0xa7b/0xc80 [btrfs]
[ 2092.729935]  [a0308177] btrfs_ioctl_balance+0x220/0x29f
[btrfs]
[ 2092.730012]  [a030d1e4] btrfs_ioctl+0x10bd/0x2281 [btrfs]
[ 2092.730076]  [810d5152] ? handle_mm_fault+0x44d/0xa00
[ 2092.730140]  [81173e76] ? avc_has_perm+0x2e/0xf7
[ 2092.730202]  [810d7c6d] ? __vm_enough_memory+0x25/0x13c
[ 2092.730266]  [8110d72d] do_vfs_ioctl+0x3f2/0x43c
[ 2092.730328]  [8110d7c5] SyS_ioctl+0x4e/0x7d
[ 2092.730389]  [81030a71] ? do_page_fault+0xc/0xf
[ 2092.730452]  [813b0652] system_call_fastpath+0x16/0x1b
[ 2092.730512] Code: 00 00 00 48 39 83 f8 00 00 00 74 02 0f 0b 4c 39 ab
08 01 00 00 74 02 0f 0b 48 83 7b 20 00 74 02 0f 0b 83 bb 20 01 00

Re: 3.17.0-rc7: kernel BUG at fs/btrfs/relocation.c:931!

2014-10-03 Thread Tomasz Chmielewski

On 2014-10-03 20:17 (Fri), Josef Bacik wrote:

On 10/02/2014 03:27 AM, Tomasz Chmielewski wrote:

Got this when running balance with 3.17.0-rc7:



Give these two patches a try

https://patchwork.kernel.org/patch/4938281/
https://patchwork.kernel.org/patch/4939761/


With these two patches applied on top of 3.17-rc7, it BUGs somewhere 
else now:


[ 2030.858792] BTRFS info (device sdd1): relocating block group 
6469424513024 flags 17

[ 2039.674077] BTRFS info (device sdd1): found 20937 extents
[ 2066.726661] BTRFS info (device sdd1): found 20937 extents
[ 2068.048208] BTRFS info (device sdd1): relocating block group 
6468350771200 flags 17

[ 2080.796412] BTRFS info (device sdd1): found 46927 extents
[ 2092.703850] parent transid verify failed on 5568935395328 wanted 
70315 found 71183
[ 2092.714622] parent transid verify failed on 5568935395328 wanted 
70315 found 71183
[ 2092.725269] parent transid verify failed on 5568935395328 wanted 
70315 found 71183

[ 2092.725680] [ cut here ]
[ 2092.725740] kernel BUG at fs/btrfs/relocation.c:242!
[ 2092.725800] invalid opcode:  [#1] SMP
[ 2092.725860] Modules linked in: ipt_MASQUERADE iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
ip_tables x_tables cpufreq_ondemand cpufreq_conservative 
cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq 
zlib_deflate coretemp hwmon loop i2c_i801 parport_pc pcspkr i2c_core 
parport video battery tpm_infineon tpm_tis tpm lpc_ich mfd_core ehci_pci 
ehci_hcd acpi_cpufreq button ext4 crc16 jbd2 mbcache raid1 sg sd_mod 
ahci libahci libata scsi_mod r8169 mii

[ 2092.727740] CPU: 3 PID: 3937 Comm: btrfs Not tainted 3.17.0-rc7 #3
[ 2092.727801] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[ 2092.727917] task: 8800c7883020 ti: 8800c7d04000 task.ti: 
8800c7d04000
[ 2092.728029] RIP: 0010:[a0322a4a]  [a0322a4a] 
relocate_block_group+0x432/0x4de [btrfs]

[ 2092.728169] RSP: 0018:8800c7d07a58  EFLAGS: 00010206
[ 2092.728229] RAX: 8806c69a18f8 RBX: 8806c69a1800 RCX: 
00018020
[ 2092.728292] RDX: 8806c69a18d8 RSI: 8806c69a18e8 RDI: 
8807ff403900
[ 2092.728356] RBP: 8800c7d07ac8 R08: 0001 R09: 

[ 2092.728419] R10: 0003 R11: a031eb54 R12: 
8805d515c240
[ 2092.728482] R13: 8806c69a1908 R14: fff4 R15: 
8806c69a1820
[ 2092.728546] FS:  7f4f251d0840() GS:88081fac() 
knlGS:

[ 2092.728660] CS:  0010 DS:  ES:  CR0: 80050033
[ 2092.728721] CR2: ff600400 CR3: c7fb CR4: 
001407e0

[ 2092.728783] Stack:
[ 2092.728837]  ea0002edf300 8806c69a18e8 ea0002edf000 

[ 2092.728952]  ea0002edf080 00ffea0002edf0c0 a805e22b2a30 
1000
[ 2092.729067]  8807d969f870 8806c69a1800  
8807f3f285b0

[ 2092.729183] Call Trace:
[ 2092.729256]  [a0322c4e] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[ 2092.729385]  [a02ff79c] 
btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[ 2092.729512]  [a030e99f] ? 
btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[ 2092.729632]  [a02bfb04] ? btrfs_set_path_blocking+0x23/0x54 
[btrfs]
[ 2092.729704]  [a02c4517] ? btrfs_search_slot+0x7bc/0x816 
[btrfs]
[ 2092.729782]  [a02fbc81] ? free_extent_buffer+0x6f/0x7c 
[btrfs]

[ 2092.729859]  [a0302679] btrfs_balance+0xa7b/0xc80 [btrfs]
[ 2092.729935]  [a0308177] btrfs_ioctl_balance+0x220/0x29f 
[btrfs]

[ 2092.730012]  [a030d1e4] btrfs_ioctl+0x10bd/0x2281 [btrfs]
[ 2092.730076]  [810d5152] ? handle_mm_fault+0x44d/0xa00
[ 2092.730140]  [81173e76] ? avc_has_perm+0x2e/0xf7
[ 2092.730202]  [810d7c6d] ? __vm_enough_memory+0x25/0x13c
[ 2092.730266]  [8110d72d] do_vfs_ioctl+0x3f2/0x43c
[ 2092.730328]  [8110d7c5] SyS_ioctl+0x4e/0x7d
[ 2092.730389]  [81030a71] ? do_page_fault+0xc/0xf
[ 2092.730452]  [813b0652] system_call_fastpath+0x16/0x1b
[ 2092.730512] Code: 00 00 00 48 39 83 f8 00 00 00 74 02 0f 0b 4c 39 ab 
08 01 00 00 74 02 0f 0b 48 83 7b 20 00 74 02 0f 0b 83 bb 20 01 00 00 00 
74 02 0f 0b 83 bb 24 01 00 00 00 74 02 0f 0b 48 8b 73 18 48 8b 7b 08
[ 2092.730759] RIP  [a0322a4a] 
relocate_block_group+0x432/0x4de [btrfs]

[ 2092.730885]  RSP 8800c7d07a58
[ 2092.731233] ---[ end trace 16c7709ebf2c379c ]---


--
Tomasz Chmielewski
http://www.sslrack.com



Re: kernel BUG at fs/btrfs/extent-tree.c:7727! with 3.17-rc3

2014-10-03 Thread Tomasz Chmielewski
]  [81110948] __dentry_kill+0x145/0x1a4
[ 7120.708996]  [81110f32] dput+0x130/0x148
[ 7120.709073]  [810fff55] __fput+0x183/0x1ab
[ 7120.709149]  [810fffab] fput+0x9/0xb
[ 7120.709226]  [8104d8a9] task_work_run+0x79/0x90
[ 7120.709304]  [8103b9bd] do_exit+0x3a0/0x8f6
[ 7120.709381]  [810053ae] oops_end+0xa2/0xab
[ 7120.709457]  [810054de] die+0x55/0x5f
[ 7120.709535]  [81002c97] do_trap+0x6a/0x12c
[ 7120.709612]  [8104f593] ? 
__atomic_notifier_call_chain+0xd/0xf

[ 7120.709691]  [81002e1b] do_error_trap+0xc2/0xd4
[ 7120.709772]  [a027eca6] ? walk_down_proc+0xdc/0x22b [btrfs]
[ 7120.709851]  [81063b6e] ? __wake_up+0x3f/0x48
[ 7120.709935]  [a02adc81] ? free_extent_buffer+0x6f/0x7c 
[btrfs]
[ 7120.710016]  [a0271bf7] ? btrfs_release_path+0x6c/0x8b 
[btrfs]

[ 7120.710095]  [81002e9c] do_invalid_op+0x1b/0x1d
[ 7120.710173]  [813b1928] invalid_op+0x18/0x20
[ 7120.710252]  [a0271c38] ? btrfs_free_path+0x22/0x26 [btrfs]
[ 7120.710335]  [a027eca6] ? walk_down_proc+0xdc/0x22b [btrfs]
[ 7120.710416]  [a027ec8f] ? walk_down_proc+0xc5/0x22b [btrfs]
[ 7120.710500]  [a0291d20] ? 
join_transaction.isra.30+0x24/0x309 [btrfs]

[ 7120.710622]  [a0280e92] walk_down_tree+0x45/0xd5 [btrfs]
[ 7120.710704]  [a028389a] btrfs_drop_snapshot+0x2f5/0x68f 
[btrfs]
[ 7120.710789]  [a02d4325] merge_reloc_roots+0x139/0x23f 
[btrfs]
[ 7120.710873]  [a02d4a7e] relocate_block_group+0x466/0x4de 
[btrfs]
[ 7120.710957]  [a02d4c4e] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[ 7120.711081]  [a02b179c] 
btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[ 7120.711205]  [a02c099f] ? 
btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[ 7120.711326]  [a0271b04] ? btrfs_set_path_blocking+0x23/0x54 
[btrfs]
[ 7120.711408]  [a0276517] ? btrfs_search_slot+0x7bc/0x816 
[btrfs]
[ 7120.711492]  [a02adc81] ? free_extent_buffer+0x6f/0x7c 
[btrfs]

[ 7120.711577]  [a02b4679] btrfs_balance+0xa7b/0xc80 [btrfs]
[ 7120.711660]  [a02ba177] btrfs_ioctl_balance+0x220/0x29f 
[btrfs]

[ 7120.711744]  [a02bf1e4] btrfs_ioctl+0x10bd/0x2281 [btrfs]
[ 7120.711823]  [810d5152] ? handle_mm_fault+0x44d/0xa00
[ 7120.711901]  [81173e76] ? avc_has_perm+0x2e/0xf7
[ 7120.711979]  [810d7c6d] ? __vm_enough_memory+0x25/0x13c
[ 7120.712057]  [8110d72d] do_vfs_ioctl+0x3f2/0x43c
[ 7120.712134]  [8110d7c5] SyS_ioctl+0x4e/0x7d
[ 7120.712212]  [81030a71] ? do_page_fault+0xc/0xf
[ 7120.712289]  [813b0652] system_call_fastpath+0x16/0x1b
[ 7120.712367] Code: 55 41 54 49 89 fc 53 48 83 ec 18 48 85 ff 44 89 45 
cc 44 89 4d c8 48 8b 98 58 07 00 00 0f 84 eb 00 00 00 48 85 db 74 12 48 
8b 03 48 39 38 74 02 0f 0b ff 43 14 e9 f3 00 00 00 48 8b 3d 18 9c 00
[ 7120.714556] RIP  [a01188ba] jbd2__journal_start+0x3d/0x151 
[jbd2]

[ 7120.714668]  RSP 8807beddf3d8
[ 7120.714742] CR2: 0001f9c6
[ 7120.714816] ---[ end trace 5ede8b32160a5ba0 ]---
[ 7120.714892] Fixing recursive fault but reboot is needed!



On 2014-09-08 20:04 (Mon), Tomasz Chmielewski wrote:

On 2014-09-03 19:42 (Wed), Tomasz Chmielewski wrote:

Got the following with 3.17-rc3 and running balance (had to power
cycle after that):


I'm seeing similar BUG with 3.17-rc4:

[ 1049.755843] BTRFS info (device sdb5): found 35715 extents
[ 1050.257075] BTRFS info (device sdb5): relocating block group
7091332251648 flags 20
[ 2087.976104] BTRFS info (device sdb5): found 40911 extents
[ 2091.357102] BTRFS info (device sdb5): relocating block group
7006506647552 flags 20
[ 2518.793237] INFO: task btrfs-balance:5688 blocked for more than 120 
seconds.

[ 2518.793263]   Not tainted 3.17.0-rc4 #1
[ 2518.793269] echo 0  /proc/sys/kernel/hung_task_timeout_secs
disables this message.
[ 2518.793279] btrfs-balance   D 88081fad15c0 0  5688  2 
0x

[ 2518.793291]  8807dceebac8 0046 8807dceeb9c8
8807f0f83020
[ 2518.793303]  000115c0 4000 8807f4153020
8807f0f83020
[ 2518.793315]  8807dceeba08 8105639d 
8807f3cd8050
[ 2518.793404] Call Trace:
[ 2518.793452]  [8105639d] ? wake_up_process+0x31/0x35
[ 2518.793500]  [81049baf] ? wake_up_worker+0x1f/0x21
[ 2518.793547]  [81049df6] ? insert_work+0x87/0x94
[ 2518.793605]  [a031724b] ? free_block_list+0x1f/0x34 
[btrfs]

[ 2518.793655]  [813ad75e] ? wait_for_common+0x118/0x13e
[ 2518.793703]  [813acf2e] schedule+0x65/0x67
[ 2518.793755]  [a02d934d]
wait_current_trans.isra.32+0x94/0xe2 [btrfs]
[ 2518.793843]  [81063cdc] ? add_wait_queue+0x44/0x44
[ 2518.793895]  [a02da82f] start_transaction+0x427/0x472 
[btrfs]
[ 2518.793948]  [a02dab02] btrfs_start_transaction+0x16/0x18 
[btrfs]
[ 2518.794002]  [a031b51a

3.17.0-rc7: kernel BUG at fs/btrfs/relocation.c:931!

2014-10-02 Thread Tomasz Chmielewski

Got this when running balance with 3.17.0-rc7:

(...)
[173394.571080] BTRFS info (device sdd1): relocating block group 
4391666974720 flags 17

[173405.407779] BTRFS info (device sdd1): found 52296 extents
[173441.235837] BTRFS info (device sdd1): found 52296 extents
[173442.266918] BTRFS info (device sdd1): relocating block group 
4390593232896 flags 17

[173451.515002] BTRFS info (device sdd1): found 22314 extents
[173473.761612] BTRFS info (device sdd1): found 22314 extents
[173474.498414] BTRFS info (device sdd1): relocating block group 
4389519491072 flags 20

[173475.410657] [ cut here ]
[173475.410717] kernel BUG at fs/btrfs/relocation.c:931!
[173475.410774] invalid opcode:  [#1] SMP
[173475.410829] Modules linked in: ipt_MASQUERADE iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
ip_tables x_tables cpufreq_ondemand cpufreq_conservative 
cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq 
zlib_deflate coretemp hwmon loop i2c_i801 i2c_core pcspkr battery 
tpm_infineon tpm_tis tpm parport_pc parport video lpc_ich mfd_core 
ehci_pci ehci_hcd button acpi_cpufreq ext4 crc16 jbd2 mbcache raid1 sg 
sd_mod ahci libahci libata scsi_mod r8169 mii

[173475.411284] CPU: 1 PID: 5512 Comm: btrfs Not tainted 3.17.0-rc7 #1
[173475.411341] Hardware name: System manufacturer System Product 
Name/P8H77-M PRO, BIOS 1101 02/04/2013
[173475.411450] task: 8807f1744830 ti: 88076e9b task.ti: 
88076e9b
[173475.411555] RIP: 0010:[a02ef1ae]  [a02ef1ae] 
build_backref_tree+0x64a/0xe77 [btrfs]

[173475.411684] RSP: 0018:88076e9b3888  EFLAGS: 00010287
[173475.411740] RAX: 8805abb30480 RBX: 880589dfcf00 RCX: 
0003
[173475.411845] RDX: 0510a31b8000 RSI: 880589dfcac0 RDI: 
8804c69f8800
[173475.411949] RBP: 88076e9b3988 R08: 000143e0 R09: 

[173475.412053] R10: 8807c97366f0 R11:  R12: 
8804c69f8800
[173475.412157] R13: 880589dfca80 R14:  R15: 
88065a3b
[173475.412262] FS:  7f320e446840() GS:88081fa4() 
knlGS:

[173475.413687] CS:  0010 DS:  ES:  CR0: 80050033
[173475.413744] CR2: ff600400 CR3: 0007ea08f000 CR4: 
001407e0

[173475.413849] Stack:
[173475.413899]  8807c820a000  880589dfcf00 
8801a22bf7c0
[173475.414007]  8801a22bfb60 880589dfcac0 88065a3b0124 
0001
[173475.414114]  88065a3b0120 0003 8805abb30480 
8805dbbb7240

[173475.414223] Call Trace:
[173475.414291]  [a02f0480] relocate_tree_blocks+0x1b7/0x532 
[btrfs]
[173475.414364]  [a02cac81] ? free_extent_buffer+0x6f/0x7c 
[btrfs]

[173475.414434]  [a02ebcf3] ? tree_insert+0x49/0x50 [btrfs]
[173475.414501]  [a02eea1b] ? add_tree_block+0x13a/0x162 
[btrfs]
[173475.414570]  [a02f1861] relocate_block_group+0x275/0x4de 
[btrfs]
[173475.414640]  [a02f1c22] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[173475.414763]  [a02ce79c] 
btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[173475.414884]  [a02dd99f] ? 
btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[173475.414995]  [a028eb04] ? 
btrfs_set_path_blocking+0x23/0x54 [btrfs]
[173475.415107]  [a0293517] ? btrfs_search_slot+0x7bc/0x816 
[btrfs]
[173475.415177]  [a02cac81] ? free_extent_buffer+0x6f/0x7c 
[btrfs]

[173475.415248]  [a02d1679] btrfs_balance+0xa7b/0xc80 [btrfs]
[173475.415318]  [a02d7177] btrfs_ioctl_balance+0x220/0x29f 
[btrfs]

[173475.415388]  [a02dc1e4] btrfs_ioctl+0x10bd/0x2281 [btrfs]
[173475.415448]  [810d5152] ? handle_mm_fault+0x44d/0xa00
[173475.415507]  [81173e76] ? avc_has_perm+0x2e/0xf7
[173475.415566]  [810d7c6d] ? __vm_enough_memory+0x25/0x13c
[173475.415625]  [8110d72d] do_vfs_ioctl+0x3f2/0x43c
[173475.415682]  [8110d7c5] SyS_ioctl+0x4e/0x7d
[173475.415740]  [81030a71] ? do_page_fault+0xc/0xf
[173475.415798]  [813b0652] system_call_fastpath+0x16/0x1b
[173475.415856] Code: ff ff 01 e9 50 02 00 00 48 63 8d 48 ff ff ff 48 8b 
85 50 ff ff ff 48 83 3c c8 00 75 46 48 8b 53 18 49 39 94 24 d8 00 00 00 
74 02 0f 0b 4c 89 e7 e8 9b e8 ff ff 85 c0 74 21 48 8b 55 98 48 8d 43
[173475.416080] RIP  [a02ef1ae] build_backref_tree+0x64a/0xe77 
[btrfs]

[173475.416151]  RSP 88076e9b3888
[173475.416482] ---[ end trace 17e512e0d6dc91d7 ]---



--
Tomasz Chmielewski
http://www.sslrack.com



Re: 3.17.0-rc6 system freeze

2014-09-29 Thread Tomasz Chmielewski

On 2014-09-29 16:10 (Mon), Liu Bo wrote:

Hi Tomasz,

On Mon, Sep 29, 2014 at 02:00:18PM +0200, Tomasz Chmielewski wrote:

System froze under 3.17.0-rc6 with btrfs. It had to be hard rebooted.



How does this happen?  A stressful write with compression?


Rsync (with --inplace - can be stressful / lead to fragmentation on 
large files which change) + snapshot + remove old snapshot, for several 
remote sources (rsync server1, snapshot, remove old snap; rsync server2, 
snapshot, remove old snap etc.).


Filesystem is RAID-1 with compress-force=zlib:

/dev/sdc1 on /mnt/lxc2 type btrfs 
(rw,noatime,compress-force=zlib,space_cache)


The last operation written to the logfile was snapshot removal.


--
Tomasz Chmielewski
http://www.sslrack.com



how long should btrfs device delete missing ... take?

2014-09-11 Thread Tomasz Chmielewski
After a disk died and was replaced, btrfs device delete missing is 
taking more than 10 days on an otherwise idle server:


# btrfs fi show /home
Label: none  uuid: 84d087aa-3a32-46da-844f-a233237cf04f
Total devices 3 FS bytes used 362.44GiB
devid2 size 1.71TiB used 365.03GiB path /dev/sdb4
devid3 size 1.71TiB used 58.00GiB path /dev/sda4
*** Some devices missing

Btrfs v3.16



So far, it has copied 58 GB out of 365 GB - and it took 10 days. At this 
speed, the whole operation will take 2-3 months (assuming that the only 
healthy disk doesn't die in the meantime).

Is this expected time for btrfs RAID-1?

There are no errors in dmesg/smart, performance of both disks is fine:

# hdparm -t /dev/sda /dev/sdb

/dev/sda:
 Timing buffered disk reads: 442 MB in  3.01 seconds = 146.99 MB/sec

/dev/sdb:
 Timing buffered disk reads: 402 MB in  3.39 seconds = 118.47 MB/sec


# btrfs fi df /home
Data, RAID1: total=352.00GiB, used=351.02GiB
System, RAID1: total=32.00MiB, used=96.00KiB
Metadata, RAID1: total=13.00GiB, used=11.38GiB
unknown, single: total=512.00MiB, used=67.05MiB

# btrfs sub list /home | wc -l
260

# uptime
 17:21:53 up 10 days,  6:01,  2 users,  load average: 3.22, 3.53, 3.55


I've tried running this on the latest 3.16.x kernel earlier, but since 
the progress was so slow, rebooted after about a week to see if the 
latest RC will be any faster.



--
Tomasz Chmielewski
http://www.sslrack.com





Re: how long should btrfs device delete missing ... take?

2014-09-11 Thread Tomasz Chmielewski

After a disk died and was replaced, btrfs device delete missing is
taking more than 10 days on an otherwise idle server:


Something isn't right though, because it's clearly neither reading nor
writing at anywhere close to 1/2 the drive read throughput. I'm curious
what 'iotop -d30 -o' shows (during the replace, before cancel), which
should be pretty consistent by averaging 30 seconds worth of io. And then
try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet there's
a lot of nothing going on, with occasional spikes, rather than a constant
trickle.


That's more or less what I'm seeing with both. The numbers will go up or 
down slightly, but it's counted in kilobytes per second:


Total DISK READ:   0.00 B/s | Total DISK WRITE: 545.82 B/s
  TID  PRIO  USER DISK READ  DISK WRITE  SWAPIN IOCOMMAND
  940 be/3 root0.00 B/s  136.46 B/s  0.00 %  0.10 % [jbd2/md2-8]
 4714 be/4 root0.00 B/s  329.94 K/s  0.00 %  0.00 % 
[btrfs-transacti]
25534 be/4 root0.00 B/s  402.97 K/s  0.00 %  0.00 % 
[kworker/u16:0]



The bottleneck may be here - one CPU core is mostly 100% busy (kworker). 
Not sure what it's really busy with though:


  PID USER  PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
25546 root   20   0 0 0 0 R 93.0  0.0 18:22.94 
kworker/u16:7
14473 root   20   0 0 0 0 S  5.0  0.0 25:00.14 
kworker/0:0



[912979.063432] SysRq : Show Blocked State
[912979.063485]   taskPC stack   pid father
[912979.063545] btrfs   D 88083fa515c0 0  4793   4622 
0x
[912979.063601]  88061a29b878 0086  
88003683e040
[912979.063701]  000115c0 4000 880813e3 
88003683e040
[912979.063800]  88061a29b7e8 8105d8e9 88083fa4 
88083fa115c0

[912979.063899] Call Trace:
[912979.063951]  [8105d8e9] ? enqueue_task_fair+0x3e5/0x44f
[912979.064006]  [81053484] ? resched_curr+0x47/0x57
[912979.064058]  [81053aed] ? check_preempt_curr+0x3e/0x6d
[912979.064111]  [81053b2e] ? ttwu_do_wakeup+0x12/0x7f
[912979.064164]  [81053c3c] ? 
ttwu_do_activate.constprop.74+0x57/0x5c

[912979.064220]  [813acc1e] schedule+0x65/0x67
[912979.064272]  [813aed0c] schedule_timeout+0x26/0x198
[912979.064324]  [8105639d] ? wake_up_process+0x31/0x35
[912979.064378]  [81049baf] ? wake_up_worker+0x1f/0x21
[912979.064431]  [81049df6] ? insert_work+0x87/0x94
[912979.064493]  [a02d524b] ? free_block_list+0x1f/0x34 
[btrfs]

[912979.064548]  [813ad443] wait_for_common+0x10d/0x13e
[912979.064600]  [8105635d] ? try_to_wake_up+0x251/0x251
[912979.064653]  [813ad48c] wait_for_completion+0x18/0x1a
[912979.064710]  [a0283a01] 
btrfs_async_run_delayed_refs+0xc1/0xe4 [btrfs]
[912979.064814]  [a02983c5] 
__btrfs_end_transaction+0x2bb/0x2e1 [btrfs]
[912979.064916]  [a02983f9] 
btrfs_end_transaction_throttle+0xe/0x10 [btrfs]
[912979.065020]  [a02d973d] relocate_block_group+0x2ad/0x4de 
[btrfs]
[912979.065079]  [a02d9ac6] 
btrfs_relocate_block_group+0x158/0x278 [btrfs]
[912979.065184]  [a02b66f0] 
btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[912979.065286]  [a02c58d7] ? 
btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[912979.065387]  [a0276b04] ? 
btrfs_set_path_blocking+0x23/0x54 [btrfs]
[912979.065486]  [a027b517] ? btrfs_search_slot+0x7bc/0x816 
[btrfs]
[912979.065546]  [a02b2bd5] ? free_extent_buffer+0x6f/0x7c 
[btrfs]
[912979.065605]  [a02b89e9] btrfs_shrink_device+0x23c/0x3a5 
[btrfs]
[912979.065679]  [a02bb2c7] btrfs_rm_device+0x2a1/0x759 
[btrfs]

[912979.065747]  [a02c3ab3] btrfs_ioctl+0xa52/0x227f [btrfs]
[912979.065811]  [81107182] ? putname+0x23/0x2c
[912979.065863]  [8110b3cb] ? user_path_at_empty+0x60/0x90
[912979.065918]  [81173b1a] ? avc_has_perm+0x2e/0xf7
[912979.065978]  [810d7ad5] ? __vm_enough_memory+0x25/0x13c
[912979.066032]  [8110d3c1] do_vfs_ioctl+0x3f2/0x43c
[912979.066084]  [811026fd] ? vfs_stat+0x16/0x18
[912979.066136]  [8110d459] SyS_ioctl+0x4e/0x7d
[912979.066188]  [81030a71] ? do_page_fault+0xc/0xf
[912979.066240]  [813afd92] system_call_fastpath+0x16/0x1b
[912979.066296] Sched Debug Version: v0.11, 3.17.0-rc3 #1
[912979.066347] ktime   : 
913460840.666210
[912979.066401] sched_clk   : 
912979066.295474
[912979.066454] cpu_clk : 
912979066.295485

[912979.066507] jiffies : 4386283381
[912979.066560] sched_clock_stable(): 1
[912979.066610]
[912979.066656] sysctl_sched
[912979.066703]   .sysctl_sched_latency: 24.00
[912979.066756]   .sysctl_sched_min_granularity 
