Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-09 Thread Koen Kooi
Hi,

Today the kernel got wedged during shutdown (4.11.x tends to do that, haven't
debugged) and I pressed the reset button. The next boot btrfs won't mount:

[Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[Fri Jun  9 20:46:07 2017] BTRFS error (device md0): failed to read block groups: -5
[Fri Jun  9 20:46:08 2017] BTRFS error (device md0): open_ctree failed

I tried repair, but that didn't work either:

# btrfsck --repair /dev/md0
enabling repair mode
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system
enabling repair mode

Googling around, I found a suggestion to clear the v2 space cache:

# btrfsck --mode=lowmem --clear-space-cache v2 /dev/md0
parent transid verify failed on 5840011722752 wanted 170755 found 170832
parent transid verify failed on 5840011722752 wanted 170755 found 170832
parent transid verify failed on 5840011722752 wanted 170755 found 170832
parent transid verify failed on 5840011722752 wanted 170755 found 170832
Ignoring transid failure
leaf parent key incorrect 5840011722752
parent transid verify failed on 5367057465344 wanted 170755 found 170828
parent transid verify failed on 5367057465344 wanted 170755 found 170828
parent transid verify failed on 5367057465344 wanted 170755 found 170828
parent transid verify failed on 5367057465344 wanted 170755 found 170828
Ignoring transid failure
leaf parent key incorrect 72105984
btrfs unable to find ref byte nr 4628577484800 parent 0 root 10  owner 0 offset 1
parent transid verify failed on 5366993256448 wanted 170755 found 170827
parent transid verify failed on 5366993256448 wanted 170755 found 170827
parent transid verify failed on 5366993256448 wanted 170755 found 170827
parent transid verify failed on 5366993256448 wanted 170755 found 170827
Ignoring transid failure
leaf parent key incorrect 41287680
ERROR: failed to clear free space cache v2: -1
transaction.h:41: btrfs_start_transaction: BUG_ON `root->commit_root` triggered, value 22938400
btrfs check[0x411674]
btrfs check(close_ctree_fs_info+0x125)[0x41368c]
btrfs check(cmd_check+0x36d8)[0x45e8e8]
btrfs check(main+0x15d)[0x40ac5c]
/lib/libc.so.6(__libc_start_main+0xf0)[0x7f9b4cb060d0]
btrfs check[0x40a729]
Clear free space cache v2

The underlying md0 (raid6) doesn't report any errors, and trying different
kernels makes no difference: 4.10.17, 4.11.4 and 4.12.0-rc4 all give the same
errors. Everything above was done with btrfs-progs 4.11.

Any hints on how I can fix the errors in the filesystem? I don't mind losing 
today's changes, but I would like to keep all the older data :)

regards,

Koen

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-09 Thread Koen Kooi
On 09-06-17 at 21:57, Hugo Mills wrote:
> On Fri, Jun 09, 2017 at 09:12:16PM +0200, Koen Kooi wrote:
>> Hi,
>>
>> Today the kernel got wedged during shutdown (4.11.x tends to do that, haven't
>> debugged) and I pressed the reset button. The next boot btrfs won't mount:
>>
>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify 
>> failed on 5840011722752 wanted 170755 found 170832
>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify 
>> failed on 5840011722752 wanted 170755 found 170832
>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): failed to read block 
>> groups: -5
>> [Fri Jun  9 20:46:08 2017] BTRFS error (device md0): open_ctree failed
> 
>   With a transid failure on mount, about the only thing that's likely
> to work is mounting with -o usebackuproot. If that doesn't work, then
> a rebuild of the FS is almost certainly needed.

Hrm, that is also a no-go:

# mount /dev/md0 /media/data/  -o usebackuproot

[  740.294141] BTRFS info (device md0): trying to use backup root at mount time
[  740.294145] BTRFS info (device md0): using free space tree
[  740.294146] BTRFS info (device md0): has skinny extents
[  754.248228] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[  754.449435] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[  754.449527] BTRFS error (device md0): failed to read block groups: -5
[  754.609960] BTRFS error (device md0): open_ctree failed

So, any more suggestions of things to try?

regards,

Koen

> 
>   Hugo.
> 
>> I tried repair, but that didn't work either:
>>
>> # btrfsck --repair /dev/md0
>> enabling repair mode
>> couldn't open RDWR because of unsupported option features (3).
>> ERROR: cannot open file system
>> enabling repair mode
>>
>> Googling around it was suggested clearing the v2 space cache:
>>
>> # btrfsck --mode=lowmem --clear-space-cache v2 /dev/md0
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> Ignoring transid failure
>> leaf parent key incorrect 5840011722752
>> parent transid verify failed on 5367057465344 wanted 170755 found 170828
>> parent transid verify failed on 5367057465344 wanted 170755 found 170828
>> parent transid verify failed on 5367057465344 wanted 170755 found 170828
>> parent transid verify failed on 5367057465344 wanted 170755 found 170828
>> Ignoring transid failure
>> leaf parent key incorrect 72105984
>> btrfs unable to find ref byte nr 4628577484800 parent 0 root 10  owner 0 
>> offset 1
>> parent transid verify failed on 5366993256448 wanted 170755 found 170827
>> parent transid verify failed on 5366993256448 wanted 170755 found 170827
>> parent transid verify failed on 5366993256448 wanted 170755 found 170827
>> parent transid verify failed on 5366993256448 wanted 170755 found 170827
>> Ignoring transid failure
>> leaf parent key incorrect 41287680
>> ERROR: failed to clear free space cache v2: -1
>> transaction.h:41: btrfs_start_transaction: BUG_ON `root->commit_root` 
>> triggered, value 22938400
>> btrfs check[0x411674]
>> btrfs check(close_ctree_fs_info+0x125)[0x41368c]
>> btrfs check(cmd_check+0x36d8)[0x45e8e8]
>> btrfs check(main+0x15d)[0x40ac5c]
>> /lib/libc.so.6(__libc_start_main+0xf0)[0x7f9b4cb060d0]
>> btrfs check[0x40a729]
>> Clear free space cache v2
>>
>> The underlying md0 (raid6) doesn't report any errors, trying different 
>> kernels makes no difference, 4.10.17, 4.11.4 and 4.12.0-rc4 all give the 
>> same errors. Everything above was
>> done with btrfs-progs 4.11.
>>
>> Any hints on how I can fix the errors in the filesystem? I don't mind 
>> losing today's changes, but I would like to keep all the older data :)
>>
>> regards,
>>
>> Koen
>>
> 




Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-11 Thread Koen Kooi

> On 11 Jun 2017, at 06:20, Chris Murphy wrote:
> 
> On Fri, Jun 9, 2017 at 1:57 PM, Hugo Mills  wrote:
>> On Fri, Jun 09, 2017 at 09:12:16PM +0200, Koen Kooi wrote:
>>> Hi,
>>> 
>>> Today the kernel got wedged during shutdown (4.11.x tends to do that, 
>>> haven't
>>> debugged) and I pressed the reset button. The next boot btrfs won't mount:
>>> 
>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify 
>>> failed on 5840011722752 wanted 170755 found 170832
>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid verify 
>>> failed on 5840011722752 wanted 170755 found 170832
>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): failed to read block 
>>> groups: -5
>>> [Fri Jun  9 20:46:08 2017] BTRFS error (device md0): open_ctree failed
>> 
>>   With a transid failure on mount, about the only thing that's likely
>> to work is mounting with -o usebackuproot. If that doesn't work, then
>> a rebuild of the FS is almost certainly needed.
> 
> Weird that it wants almost 80 generations back from what's found.
> Sounds like betrayal somewhere…
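
The "almost 80 generations" figure can be checked directly from the logged messages. What follows is a minimal sketch, not part of btrfs-progs: the function name `transid_gaps` and the sample text are illustrative assumptions, and the regular expression only assumes the "wanted N found M" wording seen in the logs in this thread.

```python
import re

# Matches the message wording seen in this thread; \s+ tolerates the
# line wrapping that mail clients introduce in the middle of a message.
TRANSID_RE = re.compile(
    r"parent transid verify\s+failed on\s+(\d+)\s+wanted\s+(\d+)\s+found\s+(\d+)"
)


def transid_gaps(log_text):
    """Map each block's byte number to (wanted, found, found - wanted)."""
    gaps = {}
    for bytenr, wanted, found in TRANSID_RE.findall(log_text):
        wanted, found = int(wanted), int(found)
        gaps[int(bytenr)] = (wanted, found, found - wanted)
    return gaps


# Sample lines copied from the btrfsck output earlier in the thread.
SAMPLE = """\
parent transid verify failed on 5840011722752 wanted 170755 found 170832
parent transid verify failed on 5367057465344 wanted 170755 found 170828
parent transid verify failed on 5366993256448 wanted 170755 found 170827
"""

if __name__ == "__main__":
    for bytenr, (wanted, found, gap) in transid_gaps(SAMPLE).items():
        print(f"block {bytenr}: wanted {wanted}, found {found}, gap {gap}")
```

For the three blocks reported here the gaps are 77, 73 and 72 generations, all in the direction of the on-disk block being newer than what its parent expects.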

The issue I have with 4.11.x is that after a few days “kworker” starts to take 
100% cpu which only a reboot will fix. I’m not sure what caused this btrfs 
corruption, since multiple things changed: 

1) kernel 4.10.x -> 4.11.x
2) A journal was added to /dev/md0
3) Force blk-mq mode

4.12rc would hard lock, but rc4 looks a lot better: it’s still up after 2 days, 
but then again, /dev/md0 hasn’t been used :)

I now fully understand what “RAID is not a backup” is all about :/

> I'd say take a btrfs-image and put it up somewhere and also file a
> bug. The fsck should not crash.

I’ll create a bugzilla account and file a bug for that.

> 
> What are these showing?

Output attached inline, see below.

> # btrfs insp dump-s -f /dev/

superblock: bytenr=65536, device=/dev/md0
---------------------------------------------------------
csum_type               0 (crc32c)
csum_size               4
csum                    0x0fb18762 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    e00f9fd0-8b57-43c1-8d8b-1c27a40aef28
label
generation              170816
root                    4355118137344
sys_array_size          129
chunk_root_generation   170426
root_level              1
chunk_root              29519205384192
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             20003257057280
bytes_used              14730770751488
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x3
                        ( FREE_SPACE_TREE |
                          FREE_SPACE_TREE_VALID )
incompat_flags          0x169
                        ( MIXED_BACKREF |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
cache_generation        74920
uuid_tree_generation    170816
dev_item.uuid           acbd8ddf-5bf8-4b08-9cdd-a750c2cb7bc6
dev_item.fsid           e00f9fd0-8b57-43c1-8d8b-1c27a40aef28 [match]
dev_item.type           0
dev_item.total_bytes    20003257057280
dev_item.bytes_used     14871399759872
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0
sys_chunk_array[2048]:
        item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 29519205367808)
                length 33554432 owner 2 stripe_len 65536 type SYSTEM|DUP
                io_align 65536 io_width 65536 sector_size 4096
                num_stripes 2 sub_stripes 1
                        stripe 0 devid 1 offset 288874299392
                        dev_uuid acbd8ddf-5bf8-4b08-9cdd-a750c2cb7bc6
                        stripe 1 devid 1 offset 288907853824
                        dev_uuid acbd8ddf-5bf8-4b08-9cdd-a750c2cb7bc6
backup_roots[4]:
        backup 0:
                backup_tree_root:       4354691088384   gen: 170813     level: 1
                backup_chunk_root:      29519205384192  gen: 170426     level: 1
                backup_extent_root:     4354711486464   gen: 170814     level: 2
                backup_fs_root:         1667614900224   gen: 110362     level: 0
                backup_dev_root:        2324468563968   gen: 170426     level: 1
                backup_csum_root:       3920799973376   gen: 170813

Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-11 Thread Koen Kooi

> On 11 Jun 2017, at 12:05, Koen Kooi wrote:
> 
>> 
>> On 11 Jun 2017, at 06:20, Chris Murphy wrote:
>> 
>> On Fri, Jun 9, 2017 at 1:57 PM, Hugo Mills  wrote:

[..]

>> I'd say take a btrfs-image and put it up somewhere and also file a
>> bug. The fsck should not crash.
> 
> I’ll create a bugzilla account and file a bug for that.

Done: https://bugzilla.kernel.org/show_bug.cgi?id=196031

Btrfs-image is still running, will put it online when it has finished. It’s 
already 1.2G:

koen@beast:~$ du -ms /data/backup/btrfs.img 
1205	/data/backup/btrfs.img

regards,

Koen


Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-11 Thread Koen Kooi

> On 12 Jun 2017, at 00:58, Chris Murphy wrote:
> 
> On Sun, Jun 11, 2017 at 4:05 AM, Koen Kooi  wrote:
>> 
>>> On 11 Jun 2017, at 06:20, Chris Murphy wrote:
>>> 
>>> On Fri, Jun 9, 2017 at 1:57 PM, Hugo Mills  wrote:
>>>> On Fri, Jun 09, 2017 at 09:12:16PM +0200, Koen Kooi wrote:
>>>>> Hi,
>>>>> 
>>>>> Today the kernel got wedged during shutdown (4.11.x tends to do that, 
>>>>> haven't
>>>>> debugged) and I pressed the reset button. The next boot btrfs won't mount:
>>>>> 
>>>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid 
>>>>> verify failed on 5840011722752 wanted 170755 found 170832
>>>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): parent transid 
>>>>> verify failed on 5840011722752 wanted 170755 found 170832
>>>>> [Fri Jun  9 20:46:07 2017] BTRFS error (device md0): failed to read block 
>>>>> groups: -5
>>>>> [Fri Jun  9 20:46:08 2017] BTRFS error (device md0): open_ctree failed
> 
> 
> Superblock shows gen 170816 and the backups have nothing newer. So why
> is it finding generation 170832? It's confused.
> 
> 
>> 
>> 1) kernel 4.10.x -> 4.11.x
>> 2) A journal was added to /dev/md0
>> 3) Force blk-mq mode
> 
> Could be blk-mq + md bug, *shrug* not sure. It's not 4.11 on its own,
> I've been running all of those version since rc1 and haven't had any
> problems, although I also haven't had any forced shutdowns either.
> 
> 
> 
>>> # btrfs rescue super /dev/
>> 
>> All supers are valid, no need to recover
> 
> So it's not like the supers were in the middle of being updated at the
> time of the failure.
> 
> 
>> 
>> 
>>> # btrfs-find-root /dev/
>> 
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> parent transid verify failed on 5840011722752 wanted 170755 found 170832
>> Ignoring transid failure
>> leaf parent key incorrect 5840011722752
>> Superblock thinks the generation is 170816
>> Superblock thinks the level is 1
>> Found tree root at 4355118137344 gen 170816 level 1
>> Well block 4354996797440(gen: 170815 level: 1) seems good, but 
>> generation/level doesn't match, want gen: 170816 level: 1
>> Well block 4354823323648(gen: 170814 level: 1) seems good, but 
>> generation/level doesn't match, want gen: 170816 level: 1
>> Well block 4354691088384(gen: 170813 level: 1) seems good, but 
>> generation/level doesn't match, want gen: 170816 level: 1
> 
> Try mounting with -o ro,usebackuproot and report back on dmesg. At
> least that's faster to make a backup than scraping with btrfs restore.
> Although I think what you have should be possible to scrape with btrfs
> restore if ro,usebackuproot doesn't work.
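
When scraping with btrfs restore, the older tree roots that btrfs-find-root reports as "seems good" are the obvious candidates to try. The sketch below only extracts them from the output format quoted above; the function name `candidate_roots` and the sample text are illustrative assumptions. To the best of my knowledge the resulting byte numbers are what `btrfs restore -t <bytenr>` takes, but check the manual page of the installed btrfs-progs before relying on that.

```python
import re

# Matches the "Well block <bytenr>(gen: <n> level: <n>)" lines printed by
# btrfs-find-root in the output quoted in this thread.
ROOT_RE = re.compile(r"Well block (\d+)\(gen: (\d+) level: (\d+)\)")


def candidate_roots(find_root_output):
    """Return (bytenr, generation, level) tuples, newest generation first."""
    roots = [tuple(int(x) for x in m) for m in ROOT_RE.findall(find_root_output)]
    return sorted(roots, key=lambda r: r[1], reverse=True)


# Sample lines copied from the thread (unwrapped).
SAMPLE = """\
Found tree root at 4355118137344 gen 170816 level 1
Well block 4354996797440(gen: 170815 level: 1) seems good, but generation/level doesn't match, want gen: 170816 level: 1
Well block 4354823323648(gen: 170814 level: 1) seems good, but generation/level doesn't match, want gen: 170816 level: 1
Well block 4354691088384(gen: 170813 level: 1) seems good, but generation/level doesn't match, want gen: 170816 level: 1
"""

if __name__ == "__main__":
    for bytenr, gen, level in candidate_roots(SAMPLE):
        print(f"bytenr {bytenr} gen {gen} level {level}")
```

Sorting newest first means the first candidate loses the fewest transactions if it turns out to be readable.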


root@beast:~# mount /dev/md0 /data/media/ -o ro,usebackuproot

[Mon Jun 12 07:15:47 2017] BTRFS info (device md0): trying to use backup root at mount time
[Mon Jun 12 07:15:47 2017] BTRFS info (device md0): using free space tree
[Mon Jun 12 07:15:47 2017] BTRFS info (device md0): has skinny extents
[Mon Jun 12 07:15:48 2017] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[Mon Jun 12 07:15:48 2017] BTRFS error (device md0): parent transid verify failed on 5840011722752 wanted 170755 found 170832
[Mon Jun 12 07:15:48 2017] BTRFS error (device md0): failed to read block groups: -5
[Mon Jun 12 07:15:48 2017] BTRFS error (device md0): open_ctree failed

> 
> Also worth trying btrfs check --mode=lowmem. This doesn't repair but
> is a whole new implementation so it might find the source of the
> problem better than the current fsck. There are patches that can be
> applied to fix some of the found problems but of course it could make
> things worse.

That runs for a few hours and segfaults at the end; I’ll run it again and post 
the log.

regards,

Koen


Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-11 Thread Koen Kooi
On 12-06-17 at 00:58, Chris Murphy wrote:

[..]

> Also worth trying btrfs check --mode=lowmem. This doesn't repair but
> is a whole new implementation so it might find the source of the
> problem better than the current fsck.

I ran it under 'catchsegv' to get more data on where it segfaults; here's the log:

https://dominion.thruhere.net/btrfsck-lowmem.txt.gz

It's 688K compressed and 16MiB uncompressed.

regards,

Koen




Re: Filesystem won't mount (open_ctree failed) or repair (BUG_ON)

2017-06-11 Thread Koen Kooi
On 12-06-17 at 01:00, Chris Murphy wrote:
> On Sun, Jun 11, 2017 at 4:13 AM, Koen Kooi  wrote:
>>
>>> On 11 Jun 2017, at 12:05, Koen Kooi wrote:
>>>
>>>>
>>>> On 11 Jun 2017, at 06:20, Chris Murphy wrote:
>>>>
>>>> On Fri, Jun 9, 2017 at 1:57 PM, Hugo Mills  wrote:
>>
>> [..]
>>
>>>> I'd say take a btrfs-image and put it up somewhere and also file a
>>>> bug. The fsck should not crash.
>>>
>>> I’ll create a bugzilla account and file a bug for that.
>>
>> Done: https://bugzilla.kernel.org/show_bug.cgi?id=196031
>>
>> Btrfs-image is still running, will put it online when it has finished. It’s 
>> already 1.2G:
>>
>> koen@beast:~$ du -ms /data/backup/btrfs.img
>> 1205	/data/backup/btrfs.img
> 
> Hopefully you're using -s -t4 -c9 but if not you can at least compress
> it after the fact but that takes even longer.

I wasn't using '-s'; after adding it and running it again, the image shrank 
from 14GiB to 733MiB:

https://dominion.thruhere.net/btrfs.img

xz -9 -e only compressed that 14GiB file to 13.99GiB, so I wonder why 
'-s' seems to make such a difference.

regards,

Koen




Re: [PATCH] Btrfs: fix regression when running delayed references

2015-10-22 Thread Koen Kooi
On 22-10-15 at 10:47, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> In the kernel 4.2 merge window we had a refactoring/rework of the delayed
> references implementation in order to fix certain problems with qgroups.
> However that rework introduced one more regression that leads to the
> following trace when running delayed references for metadata:
> 
> [35908.064664] kernel BUG at fs/btrfs/extent-tree.c:1832!
> [35908.065201] invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [35908.065201] Modules linked in: dm_flakey dm_mod btrfs crc32c_generic xor 
> raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc 
> loop fuse parport_pc psmouse i2
> [35908.065201] CPU: 14 PID: 15014 Comm: kworker/u32:9 Tainted: GW 
>   4.3.0-rc5-btrfs-next-17+ #1
> [35908.065201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
> [35908.065201] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
> [35908.065201] task: 880114b7d780 ti: 88010c4c8000 task.ti: 
> 88010c4c8000
> [35908.065201] RIP: 0010:[]  [] 
> insert_inline_extent_backref+0x52/0xb1 [btrfs]
> [35908.065201] RSP: 0018:88010c4cbb08  EFLAGS: 00010293
> [35908.065201] RAX:  RBX: 88008a661000 RCX: 
> 
> [35908.065201] RDX: a04dd58f RSI: 0001 RDI: 
> 
> [35908.065201] RBP: 88010c4cbb40 R08: 1000 R09: 
> 88010c4cb9f8
> [35908.065201] R10:  R11: 002c R12: 
> 
> [35908.065201] R13: 88020a74c578 R14:  R15: 
> 
> [35908.065201] FS:  () GS:88023edc() 
> knlGS:
> [35908.065201] CS:  0010 DS:  ES:  CR0: 8005003b
> [35908.065201] CR2: 015e8708 CR3: 000102185000 CR4: 
> 06e0
> [35908.065201] Stack:
> [35908.065201]  88010c4cbb18 0f37 88020a74c578 
> 88015a408000
> [35908.065201]  880154a44000  0005 
> 88010c4cbbd8
> [35908.065201]  a0492b9a 0005  
> 
> [35908.065201] Call Trace:
> [35908.065201]  [] __btrfs_inc_extent_ref+0x8b/0x208 [btrfs]
> [35908.065201]  [] ? __btrfs_run_delayed_refs+0x4d4/0xd33 
> [btrfs]
> [35908.065201]  [] __btrfs_run_delayed_refs+0xafa/0xd33 
> [btrfs]
> [35908.065201]  [] ? join_transaction.isra.10+0x25/0x41f 
> [btrfs]
> [35908.065201]  [] ? join_transaction.isra.10+0xa8/0x41f 
> [btrfs]
> [35908.065201]  [] btrfs_run_delayed_refs+0x75/0x1dd [btrfs]
> [35908.065201]  [] delayed_ref_async_start+0x3c/0x7b [btrfs]
> [35908.065201]  [] normal_work_helper+0x14c/0x32a [btrfs]
> [35908.065201]  [] btrfs_extent_refs_helper+0x12/0x14 
> [btrfs]
> [35908.065201]  [] process_one_work+0x24a/0x4ac
> [35908.065201]  [] worker_thread+0x206/0x2c2
> [35908.065201]  [] ? rescuer_thread+0x2cb/0x2cb
> [35908.065201]  [] ? rescuer_thread+0x2cb/0x2cb
> [35908.065201]  [] kthread+0xef/0xf7
> [35908.065201]  [] ? kthread_parkme+0x24/0x24
> [35908.065201]  [] ret_from_fork+0x3f/0x70
> [35908.065201]  [] ? kthread_parkme+0x24/0x24
> [35908.065201] Code: 6a 01 41 56 41 54 ff 75 10 41 51 4d 89 c1 49 89 c8 48 8d 
> 4d d0 e8 f6 f1 ff ff 48 83 c4 28 85 c0 75 2c 49 81 fc ff 00 00 00 77 02 <0f> 
> 0b 4c 8b 45 30 8b 4d 28 45 31
> [35908.065201] RIP  [] 
> insert_inline_extent_backref+0x52/0xb1 [btrfs]
> [35908.065201]  RSP 
> [35908.310885] ---[ end trace fe4299baf0666457 ]---

Would this also solve this:

Oct 22 12:03:20 beast kernel: WARNING: CPU: 5 PID: 323 at lib/list_debug.c:62
__list_del_entry+0x5a/0x98()
Oct 22 12:03:20 beast kernel: list_del corruption. next->prev should be
88033f864500, but was 88033f8642c0
Oct 22 12:03:20 beast kernel: Modules linked in: arc4 md4 nls_utf8 cifs
dns_resolver fscache ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack veth loop b43
mac80211 cfg80211 ssb mmc_core kvm_intel kvm crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel serio_raw sb_edac edac_core i2c_i801 btusb btrtl btintel
btbcm bluetooth joydev bcma rfkill cp210x tpm_infineon tpm_tis tpm
sch_fq_codel radeon crc32c_intel ttm drm_kms_helper
Oct 22 12:03:20 beast kernel: CPU: 5 PID: 323 Comm: kworker/u16:12 Tainted: G
   W   4.2.2 #50
Oct 22 12:03:20 beast kernel: Hardware name: System manufacturer System
Product Name/X79-DELUXE, BIOS 0901 06/20/2014
Oct 22 12:03:20 beast kernel: Workqueue: btrfs-delalloc btrfs_delalloc_helper
Oct 22 12:03:20 beast kernel:  0009 88013993fb98
8170b663 0006
Oct 22 12:03:20 beast kernel:  88013993fbe8 88013993fbd8
8106aa40 88013993fc78
Oct 22 12:03:20 beast kernel:  813392c3 88033f864480
88033f864500 880ba968d510
Oct 22 12:03:20 beast kernel: Call Trace:
Oct 22 12:03:20 beast kernel

Re: [PATCH RESEND] btrfs-progs: allow "no" to disable compression for convenience

2017-10-14 Thread Koen Kooi
On 14-10-17 at 14:54, Satoru Takeuchi wrote:
> It's messy to use "" to disable compression. Introduce the new value "no"
> which can also be used for this purpose.

Wouldn't 'none' be a better fit?

regards,

Koen



Re: Btrfs raid1 array has issues with rtorrent usage pattern.

2014-10-30 Thread Koen Kooi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Dan Merillat wrote on 30-10-14 04:17:
> It's specifically BTRFS related, I was able to reproduce it on a bare 
> drive (no lvm, no md, no bcache).  It's not bad RAM, I was able to 
> reproduce it on multiple machines running either 3.17 or late RCs.
> 
> I've tested 3.18-rc2 for about 2 hours now, can't get any failures, so 
> that's good.  If anyone else can reproduce this it'll probably need to be
> sent to 3.17-stable.

3.17.2 has a lot of btrfs backports queued[1] already; could you see if the
fix for your problem is already present?

regards,

Koen

[1]
https://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/commit/queue-3.17/btrfs-fix-a-deadlock-in-btrfs_dev_replace_finishing.patch?id=2792dbfd1e02a70a8eef7e0cc3f44cb77d6c100f

> 
> On Wed, Oct 29, 2014 at 7:24 PM, Alec Blayne  wrote:
>> Really nice to know it's already getting handled :)
>> 
>> I'm already "downgrading" to 3.16.6 now that I know I won't have that 
>> issue. I was already planning to because of the read-only snapshots
>> issue.
>> 
>> Thank you and good luck debugging!
>> 
>> On 29-10-2014 21:50, Dan Merillat wrote:
>>> I'm in the middle of debugging the exact same thing.  3.17.0 - 
>>> rtorrent dies with SIGBUS.
>>> 
>>> I've done some debugging, the sequence is something like this:
>>>   - open a new file
>>>   - fallocate() to the final size
>>>   - mmap() all (or a portion) of the file
>>>   - write to the region
>>>   - run SHA1 on that mmap'd region to validate the chunk
>>>   - crash, eventually. Generally not at the same point.
>>> 
>>> Reading that file (cat > /dev/null) returns -EIO.
>>> 
>>> Looking up the process maps, the SIGBUS appears to be happening in
>>> the middle of a mapped region of a pre-allocated file - I.E. it
>>> shouldn't be.  I'm not completely ruling out a rtorrent bug but it
>>> appears sane to me.
>>> 
>>> Weirder: "old" files, that have been around a while, work just fine
>>> for seeding. I've re-hashed my entire collection without an error.
>>> 
>>> Seeing this on both inherit-COW and no-inherit-COW files, and the 
>>> filesystem is not using compression.
>>> 
>>> The interesting part is going back and attempting to read the files 
>>> later they sometimes don't throw an IO error.
>>> 
>>> Absolutely nothing in dmesg.
>>> 
>>> Working on a testcase that triggers it reliably but no luck so far.
>>> I thought I had bad RAM but two people upgrading to 3.17 and seeing
>>> the same bug at around the same time can't be a coincidence.  I
>>> rebooted to 3.17 on the 25th, the first new download was on the 28th
>>> and that failed.
>>> 
>>> Working on a testcase for it that's more reproducible than "go grab 
>>> torrent files with rtorrent".
>>> 
>>> On Tue, Oct 28, 2014 at 12:49 PM, Alec Blayne  wrote:
>>>> Hi, it seems that when using rtorrent to download into a btrfs
>>>> system, it leads to the creation of files that fail to read
>>>> properly. For instance, I get rtorrent to crash, but if I try to
>>>> rsync the file it was writing into someplace else, rsync also
>>>> fails with the message "can't map file "$file": Input/Output error
>>>> (5)". If I give it time, eventually the file gets into a good state
>>>> and I can rsync it somewhere else (as long as rtorrent doesn't keep
>>>> writing into it). This doesn't happen using ext4 on the same
>>>> system.
>>>>
>>>> No btrfs errors, or any other errors, show up in any log. Scrubbing
>>>> or balancing don't turn up any issues. I've tried using a subvolume
>>>> mounted with nodatacow and/or flushoncommit, which didn't help. I'm
>>>> not using quotas and at some point had a single snapshot that I
>>>> deleted. The filesystem was originally created recently (on a
>>>> 3.16.4+ kernel).
>>>>
>>>> Here's what the array looks like:
>>>>
>>>> Label: 'data'  uuid: ffe83a3d-f4ba-46b7-8424-4ec3380cb811
>>>>     Total devices 4 FS bytes used 3.14TiB
>>>>     devid    4 size 2.73TiB used 2.36TiB path /dev/sdd1
>>>>     devid    5 size 1.82TiB used 1.45TiB path /dev/sdc1
>>>>     devid    6 size 1.82TiB used 1.45TiB path /dev/sdb1
>>>>     devid    7 size 1.82TiB used 1.45TiB path /dev/sda1
>>>>
>>>> Btrfs v3.17
>>>>
>>>> Data, RAID1: total=3.34TiB, used=3.13TiB
>>>> System, RAID1: total=32.00MiB, used=512.00KiB
>>>> Metadata, RAID1: total=10.00GiB, used=7.31GiB
>>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>>>
>>>> On linux 3.17.1: Linux 3.17.1-gentoo-r1 #3 SMP PREEMPT Tue Oct 28
>>>> 02:43:11 WET 2014 x86_64 AMD Athlon(tm) 5350 APU with Radeon(tm)
>>>> R3 AuthenticAMD GNU/Linux
>>>>
>>>> I'm utterly puzzled and clueless at how to dig into this issue.
>> 
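
The access pattern Dan describes in the quoted message above (open a new file, fallocate() to the final size, mmap(), write to the region, SHA1 the region) can be sketched in a few lines. This is an illustration of the sequence only, not a confirmed reproducer: the thread itself says no reliable testcase had been found. The function name `write_and_hash`, the file name, the size and the fill byte are all hypothetical choices.

```python
import hashlib
import mmap
import os
import tempfile


def write_and_hash(path, size, fill_byte=0xAB):
    """Preallocate `size` bytes, fill them through a shared mapping and
    return the SHA1 hex digest of the mapped region."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)      # preallocate to the final size
        with mmap.mmap(fd, size) as region:  # shared, read/write by default on Unix
            region[:] = bytes([fill_byte]) * size  # write to the region
            region.flush()
            # Hash straight from the mapping, as a torrent client would
            # when validating a downloaded chunk.
            return hashlib.sha1(region).hexdigest()
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name and size, chosen only for the demonstration.
        print(write_and_hash(os.path.join(tmp, "chunk.bin"), 1 << 20))
```

To exercise a particular filesystem, point the path at a directory on that filesystem instead of the temporary directory used here.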

Re: btrfs-prog: improve build-system by autoconf

2014-12-22 Thread Koen Kooi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Karel Zak wrote on 18-12-14 14:31:
> On Wed, Dec 17, 2014 at 03:07:26PM +0100, David Sterba wrote:
>> On Fri, Dec 12, 2014 at 01:35:14PM +0100, Karel Zak wrote:
>>> This is the first step to make btrfs-progs build system more
>>> conventional for userspace users and developers. All is implemented
>>> by small incremental patches to keep things review-able.
>> 
>> Thanks. I went through the patches and haven't found major problems.
>> The changes are affecting build system and this will need a longer
>> period before all distros have a chance to adapt to that, so I'm
>> postponing it to 3.19.
> 
> Cool, I'll try to prepare next set of patches with automake.
> 
> BTW, I have good experience with build-system changes -- downstream 
> distributions (maintainers) are usually pretty flexible :-)

Especially when switching from homegrown makefiles to autofoo!

regards,

Koen

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)
Comment: GPGTools - http://gpgtools.org

iD8DBQFUmBLqMkyGM64RGpERApjpAJ9Agir/DDJiFYUR8qPDcNmx7pnLnQCgoKsD
1HuCreKom9ZYzZevIbqWz08=
=g1VP
-END PGP SIGNATURE-



Re: price to pay for nocow file bit?

2015-01-08 Thread Koen Kooi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Chris Murphy wrote on 08-01-15 at 09:24:
> On Wed, Jan 7, 2015 at 1:10 PM, Josef Bacik  wrote:
>> On 01/07/2015 12:43 PM, Lennart Poettering wrote:
>>> 
>>> Heya!
>>> 
>>> Currently, systemd-journald's disk access patterns (appending to the 
>>> end of files, then updating a few pointers in the front) result in 
>>> awfully fragmented journal files on btrfs, which has a pretty 
>>> negative effect on performance when accessing them.
>>> 
>> 
>> I've been wondering if mount -o autodefrag would deal with this problem
>> but I haven't had the chance to look into it.
> 
> I've been using autodefrag and haven't run into journal corruptions that
> I can attribute to btrfs since the last one was fixed over a year ago.
> Chris Mason has suggested preference to use of autodefrag for this use
> case rather than xattr +C. But I don't know the time frame for autodefrag
> by default, it's come up a couple times but it's not the default yet.

Same here, no issues with using autodefrag and journals.

regards,

Koen

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)
Comment: GPGTools - http://gpgtools.org

iD8DBQFUrkFVMkyGM64RGpERAgGKAJ9pmXA4STYx6sUJP5HBALcUCkfMqwCeNhzR
8v4u6bvhtFZYxYbGDiHghps=
=4MPU
-END PGP SIGNATURE-



Re: BTRFS setup advice for laptop performance ?

2014-04-12 Thread Koen Kooi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Marc MERLIN wrote on 12-04-14 15:17:
> On Fri, Apr 04, 2014 at 04:09:06PM +0100, Hugo Mills wrote:
>>> - Generally speaking, does LZO compression improve or degrade
>>> performance ? I'm not able to figure it out clearly.
>> 
>> Yes, it improves or degrades performance. :)
>> 
>> It'll depend entirely on what you're doing with it. If you're storing
>> lots of zeroes (Phoronix, I'm looking at you), then you'll get huge
>> speedups. If you're storing video data, you'll get a (very) slight
>> performance drop as it compresses the first few blocks of the file and
>> then gives up. I suspect that in general, the performance differences
>> won't be noticeable unless you have highly compressible large files, but
>> if you _really_ care about it, benchmark it(*).
>> 
>> Hugo.
>> 
>> (*) If you don't want to go through the effort of benchmarking, you 
>> don't care enough about it, and should just pick something at random.
> 
> Speaking of this bit, I once tried to use zlib instead of lzo, and
> somehow it felt that my laptop on SSD booted noticeably slower after
> that, which felt weird since decompression speed should be about the
> same.
> 
> Has anyone else noticed anything like this?

LZO should decompress a lot faster than zlib; I know that's the case on ARM
and 32-bit x86.

regards,

Koen

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)
Comment: GPGTools - http://gpgtools.org

iD8DBQFTSXQQMkyGM64RGpERAtTRAJ9WQg0xA3s3AA+jMryzn6PVWpyEegCbBZTR
IzOZtgJvMbLT2fXdw0fOCxQ=
=2FXK
-END PGP SIGNATURE-



Re: [PATCH 2/2 v3] btrfs: usage error should not be logged into system log

2014-05-22 Thread Koen Kooi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Anand Jain wrote on 22-05-14 12:41:
> From: Anand Jain 
> 
> I have an opinion that system logs /var/log/messages are valuable info to
> investigate the real system issues at the data center. People handling
> data center issues do spend a lot of time and effort analyzing messages 
> files. Having usage error logged into /var/log/messages is something we
> should avoid.

Do you mean 'syslog' when you say '/var/log/messages'? There's no
/var/log/messages on my machines.

regards,

Koen

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)
Comment: GPGTools - http://gpgtools.org

iD8DBQFTfd2kMkyGM64RGpERAsd6AKCZxfhjjtYWUZLJwS0NnghuCb9lBQCfYye2
L3z3JmZqj9TTb+355MMB6w8=
=SfnW
-END PGP SIGNATURE-
