Kernel BUG: __tree_mod_log_rewind

2013-05-07 Thread Elladan
I can get btrfs to throw a kernel bug easily by running btrfs fi
defrag on some files in 3.9.0:

May  7 01:57:33 caper kernel: [0.00] Linux version
3.9.0-030900-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro
4.6.3-1ubuntu5) ) #201304291257 SMP Mon Apr 29 16:58:15 UTC 2013
...
May  7 02:09:21 caper kernel: [  726.745485] [ cut here ]
May  7 02:09:21 caper kernel: [  726.745567] Kernel BUG at
a00ea503 [verbose debug info unavailable]
May  7 02:09:21 caper kernel: [  726.745643] invalid opcode:  [#1] SMP
May  7 02:09:21 caper kernel: [  726.745807] Modules linked in:
snd_hrtimer zram(C) bnep rfcomm bluetooth parport_pc ppdev nfsd
nfs_acl auth_rpcgss nfs fscache binfmt_misc lockd sunrpc
snd_hda_codec_hdmi joydev hid_gaff ff_memless snd_usb_audio
snd_usbmidi_lib uvcvideo snd_seq_midi videobuf2_core videodev
snd_rawmidi videobuf2_vmalloc videobuf2_memops snd_seq_midi_event
dm_multipath snd_hda_codec_realtek snd_seq scsi_dh kvm_amd
snd_seq_device snd_hda_intel kvm snd_hda_codec snd_hwdep microcode
snd_pcm snd_timer k10temp edac_core edac_mce_amd
serio_raw snd sp5100_tco i2c_piix4 soundcore snd_page_alloc mac_hid
wmi it87 hwmon_vid lp parport xfs btrfs raid6_pq zlib_deflate xor
libcrc32c ses enclosure dm_crypt hid_generic usbhid hid usb_storage
firewire_ohci firewire_core crc_itu_t ahci pata_acpi pata_atiixp
libahci r8169
May  7 02:09:21 caper kernel: [  726.749841] CPU 3
May  7 02:09:21 caper kernel: [  726.749900] Pid: 1703, comm:
btrfs-endio-wri Tainted: G C   3.9.0-030900-generic
#201304291257 Gigabyte Technology Co., Ltd.
GA-MA790GP-UD4H/GA-MA790GP-UD4H
May  7 02:09:21 caper kernel: [  726.750069] RIP:
0010:[]  []
__tree_mod_log_rewind+0x253/0x260 [btrfs]
May  7 02:09:21 caper kernel: [  726.750244] RSP:
0018:88011a2e1838  EFLAGS: 00010293
May  7 02:09:21 caper kernel: [  726.750316] RAX: 
RBX: 88004b2798f0 RCX: 88011a2e17d8
May  7 02:09:21 caper kernel: [  726.750390] RDX: 13f3a75c
RSI: 05e8 RDI: 8800172ea880
May  7 02:09:21 caper kernel: [  726.750463] RBP: 88011a2e1868
R08: 1000 R09: 88011a2e17e8
May  7 02:09:21 caper kernel: [  726.750536] R10: 000103db
R11:  R12: 880098cf4d80
May  7 02:09:21 caper kernel: [  726.750609] R13: 002b
R14: 8800172ea700 R15: 0009c7a7
May  7 02:09:21 caper kernel: [  726.750683] FS:
7fa2bc594700() GS:88014fd8()
knlGS:
May  7 02:09:21 caper kernel: [  726.750770] CS:  0010 DS:  ES:
 CR0: 8005003b
May  7 02:09:21 caper kernel: [  726.750841] CR2: fd82c000
CR3: 00014654d000 CR4: 07e0
May  7 02:09:21 caper kernel: [  726.750914] DR0: 
DR1:  DR2: 
May  7 02:09:21 caper kernel: [  726.750987] DR3: 
DR6: 0ff0 DR7: 0400
May  7 02:09:21 caper kernel: [  726.751061] Process btrfs-endio-wri
(pid: 1703, threadinfo 88011a2e, task 88004a6b2ea0)
May  7 02:09:21 caper kernel: [  726.751147] Stack:
May  7 02:09:21 caper kernel: [  726.751212]  88011a2e1858
880104c8de30 0009c7a7 8800
May  7 02:09:21 caper kernel: [  726.751488]  a8598000
880148278000 88011a2e18b8 a00ea5ef
May  7 02:09:21 caper kernel: [  726.751763]  880098cf4d80
88004b2798f0 8800338d3000 0001
May  7 02:09:21 caper kernel: [  726.752038] Call Trace:
May  7 02:09:21 caper kernel: [  726.752135]  []
tree_mod_log_rewind+0xdf/0x240 [btrfs]
May  7 02:09:21 caper kernel: [  726.752237]  []
btrfs_search_old_slot+0x4cb/0x670 [btrfs]
May  7 02:09:21 caper kernel: [  726.752351]  []
__resolve_indirect_ref+0xc8/0x150 [btrfs]
May  7 02:09:21 caper kernel: [  726.752462]  []
__resolve_indirect_refs+0x9e/0x200 [btrfs]
May  7 02:09:21 caper kernel: [  726.752573]  []
find_parent_nodes+0x45d/0x6b0 [btrfs]
May  7 02:09:21 caper kernel: [  726.752684]  []
btrfs_find_all_roots+0x99/0x100 [btrfs]
May  7 02:09:21 caper kernel: [  726.752792]  [] ?
btrfs_submit_direct+0x190/0x190 [btrfs]
May  7 02:09:21 caper kernel: [  726.752901]  [] ?
btrfs_submit_direct+0x190/0x190 [btrfs]
May  7 02:09:21 caper kernel: [  726.753012]  []
iterate_extent_inodes+0x177/0x2c0 [btrfs]
May  7 02:09:21 caper kernel: [  726.753123]  []
iterate_inodes_from_logical+0x92/0xb0 [btrfs]
May  7 02:09:21 caper kernel: [  726.753244]  [] ?
btrfs_submit_direct+0x190/0x190 [btrfs]
May  7 02:09:21 caper kernel: [  726.753353]  []
record_extent_backrefs+0x78/0xf0 [btrfs]
May  7 02:09:21 caper kernel: [  726.753462]  []
relink_file_extents+0x44/0x180 [btrfs]
May  7 02:09:21 caper kernel: [  726.753571]  []
btrfs_finish_ordered_io+0x135/0x4d0 [btrfs]
May  7 02:09:21 caper kernel: [  726.753681]  []
finish_ordered_fn+0x15/0x20 [btrfs]
May  7 02:09:21 caper kernel: [  726.753791]  []
worker_loop+0xa0/0x320 [btrfs]
May  7 02:09:21 caper kernel: [  726.753901
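
For anyone reading the trace above: each frame has the form
symbol+0xOFFSET/0xSIZE [module], and a "?" before a frame marks an address
that was found on the stack but is not a verified part of the call chain.
The following is only a rough sketch of pulling those fields out of a pasted
trace; the function name parse_frames and the dictionary keys are made up
for illustration, and the regex tolerates the line wrapping seen in this
mail.

```python
import re

# One stack frame: optional "?" marker, symbol, hex offset, hex size, optional [module].
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(text):
    """Return one dict per frame found in a (possibly line-wrapped) oops trace."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),  # bytes into the function
            "size": int(m.group("size"), 16),      # total function size in bytes
            "module": m.group("module"),           # None for core kernel symbols
            "reliable": m.group("unreliable") is None,
        })
    return frames

if __name__ == "__main__":
    sample = (
        "[]\ntree_mod_log_rewind+0xdf/0x240 [btrfs]\n"
        "[] ?\nbtrfs_submit_direct+0x190/0x190 [btrfs]\n"
    )
    for frame in parse_frames(sample):
        print(frame)
```

Filtering on the "reliable" key gives the frames that are most likely the real
call path, which here runs from finish_ordered_fn down to
__tree_mod_log_rewind.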

Re: A couple bugs with btrfs and 3.5.0 kernel

2013-01-20 Thread Elladan
On Sun, Jan 20, 2013 at 5:51 PM, Liu Bo  wrote:
> On Sun, Jan 20, 2013 at 05:39:57PM -0800, Elladan wrote:
>> Any ideas?  I guess I could try to mount in degraded mode or try a 3.6
>> kernel or something, but this all seems like I should probably just
>> restore from backups and move on.
>
> Hi Elladan,
>
> For 'bio too big' issue, this patch is helpful,
>
> https://patchwork.kernel.org/patch/1619691/
>
> thanks,
> liubo

Hi,

After poking around, I determined that the 3.8 kernel is the first one
with this patch.  I installed it, and re-ran btrfs device delete.  The
delete ran to completion successfully.

However, "btrfs fi show" still indicated that the deleted device was
part of the filesystem.  I don't know if that was a bug in my older
btrfs binary or not.  It mounts fine without the deleted device.

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


A couple bugs with btrfs and 3.5.0 kernel

2013-01-20 Thread Elladan
I upgraded to Ubuntu 12.10 and thought, "Hey, that 3.5 kernel is
relatively recent.  And they seem to finally have implemented
restriping.  Maybe it's time to try btrfs again!"

So, first off, I backed up all my data.

Next, I decided I would attempt to use btrfs's features for my benefit.

Specifically (this part is less interesting except as setup):

1. I put a btrfs filesystem on top of dm-crypt on an external USB drive.
2. I copied data to it.
3. I unmounted the original partition, and then immediately mounted
the btrfs partition in its place.

Ok, now to the interesting bits:

My goal here is to delete the USB device and just leave myself with my
data, migrated back to the internal disk (with minimal downtime).

So, I figured I could use restriping/device delete to live-migrate
back onto the internal hard disk.

4. I did a btrfs device add on a partition (over lvm/dm-crypt) on the
internal disk.  Now I have 2 partitions in the fs.

5. I attempted to btrfs device delete the USB disk, and it errored out
(with somewhat inscrutable information), telling me that I can't reduce
raid1 to dup this way.

Note: Arguably, this is a bug.  It really ought to be allowed, perhaps
with a -f option, automatically reducing the chunks appropriately.

Note: Arguably, this is also a bug, because it should not have
changed the metadata profile from dup to raid1 without asking me.
Maybe I don't want raid1.

Anyway, I figure I can fix this up with a balance filter (this is
primarily what made me think btrfs might be more usable now).

6. I attempt to balance with a filter -mconvert=dup.  This immediately
errors out with no real indication as to why.

In the dmesg log I found:

[52656.153908] btrfs: unable to start balance with target metadata profile 32

Clearly a bug.
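
The "32" in that message is a block group flags bitmask rather than a
profile name.  Here is a minimal sketch of decoding it, assuming the flag
bit values from the kernel's btrfs headers of that era; the short names and
the helper decode_profile are illustrative only.

```python
# Block group flag bits (assumed from the btrfs on-disk format headers of
# the 3.x kernels; the short names are for illustration only).
BLOCK_GROUP_FLAGS = {
    1 << 0: "data",
    1 << 1: "system",
    1 << 2: "metadata",
    1 << 3: "raid0",
    1 << 4: "raid1",
    1 << 5: "dup",
    1 << 6: "raid10",
}

def decode_profile(value):
    """Return the names of all flag bits set in a block group flags value."""
    names = [name for bit, name in sorted(BLOCK_GROUP_FLAGS.items()) if value & bit]
    # No bits set means the plain "single" profile.
    return names or ["single"]

if __name__ == "__main__":
    print(decode_profile(32))  # the value from the dmesg line above
```

So profile 32 is dup, which matches the -mconvert=dup request.  The sketch
only decodes the number; it says nothing about why the kernel refused that
target.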

7. After some random trial and error, I find that it accepts
-mconvert=single, and the result appears to be metadata in dup state.
Maybe.

Ok now that's done, it's time to delete.

8. btrfs device delete /dev/dm-11 /btrfs

Some hours later, it fails.  I find stuff like this all over my dmesg log:

[113936.300109] bio too big device dm-11 (1024 > 240)
[113936.297242] btrfs: bdev /dev/dm-11 errs: wr 101, rd 10247, flush
0, corrupt 109, gen 0
[113935.425960] btrfs_dev_stat_print_on_error: 38 callbacks suppressed
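
To put numbers on these messages: the two values in "bio too big" are
counted in 512-byte sectors (the size of the request versus the device
queue's limit), and the "errs:" line carries the per-device error counters.
This is a rough sketch of extracting both from dmesg text; the function
names and dictionary keys are hypothetical, chosen for illustration.

```python
import re

SECTOR_BYTES = 512  # the block layer counts these values in 512-byte sectors

BIO_RE = re.compile(r"bio too big device (\S+) \((\d+) > (\d+)\)")
# \s+ between tokens so a line-wrapped log message still matches.
STATS_RE = re.compile(
    r"errs:\s+wr\s+(\d+),\s+rd\s+(\d+),\s+flush\s+(\d+),"
    r"\s+corrupt\s+(\d+),\s+gen\s+(\d+)"
)

def parse_bio_too_big(line):
    """Return device name plus requested and allowed request sizes in KiB."""
    m = BIO_RE.search(line)
    if m is None:
        return None
    requested, limit = int(m.group(2)), int(m.group(3))
    return {
        "device": m.group(1),
        "requested_kib": requested * SECTOR_BYTES // 1024,
        "limit_kib": limit * SECTOR_BYTES // 1024,
    }

def parse_dev_stats(text):
    """Return the btrfs per-device error counters as a dict of ints."""
    m = STATS_RE.search(text)
    if m is None:
        return None
    keys = ("write", "read", "flush", "corrupt", "generation")
    return dict(zip(keys, (int(g) for g in m.groups())))

if __name__ == "__main__":
    print(parse_bio_too_big("bio too big device dm-11 (1024 > 240)"))
    print(parse_dev_stats("errs: wr 101, rd 10247, flush\n0, corrupt 109, gen 0"))
```

Read this way, btrfs submitted 512 KiB requests to a device whose queue
accepts at most 120 KiB per request.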

It also found 2 files with csum errors, which were left on the USB device.

[92750.052638] btrfs csum failed ino 257 off 49278976 csum 948519347
private 2127080388
[95692.348662] btrfs: checksum error at logical 94682349568 on dev
/dev/mapper/tempusb, sector 224788736, root 256, inode 114815, offset
14360576, length 4096, links 1 (path:...path to file)

The csum errors appeared to have caused it to stop.

Googling around seemed to indicate that someone had once experienced a
similar problem with an external drive around the 3.0 kernel era.
They suggested something about the filesystem not working when dealing
with devices mixed between SATA and USB, which sounded a bit wacky to
me.  I initially assumed that maybe the USB drive was a bit flaky, but
this sounds to me like the csum errors were probably btrfs causing
silent corruption.

I tried deleting the files with the csum errors and running the device
delete again, but it immediately failed with invalid argument errors
and nothing in the dmesg log.  Clearly a bug.

Then, I tried unmounting, remounting, and then re-running the delete.
This time it started, but it's been running for a long time and
spamming my kernel logs with the bio too big for device errors.  I'm
guessing I'll probably need to sysrq reboot or something.

This is with Ubuntu's standard 3.5.0-22 generic kernel.

Any ideas?  I guess I could try to mount in degraded mode or try a 3.6
kernel or something, but this all seems like I should probably just
restore from backups and move on.

Thanks!


Kernel oops with 2.6.31 in btrfs_set_acl

2009-09-18 Thread Elladan
Hi,

I got an oops this morning with btrfs on the Ubuntu 2.6.31 kernel.  I mounted
with compress and thread_pool=4.  This is on a 64 bit quad AMD machine.

Sep 18 04:00:07 caper kernel: [614439.131866] BUG: unable to handle kernel NULL 
pointer dereference at 0004
Sep 18 04:00:07 caper kernel: [614439.131882] IP: [] 
posix_acl_equiv_mode+0x1/0xa0
Sep 18 04:00:07 caper kernel: [614439.131900] PGD 126d98067 PUD 11e572067 PMD 0 
Sep 18 04:00:07 caper kernel: [614439.131911] Oops:  [#4] SMP 
Sep 18 04:00:07 caper kernel: [614439.131918] last sysfs file: 
/sys/devices/virtual/net/pan0/statistics/collisions
Sep 18 04:00:07 caper kernel: [614439.131925] CPU 1 
Sep 18 04:00:07 caper kernel: [614439.131928] Modules linked in: joydev
hid_gaff ff_memless btrfs zlib_deflate crc32c libcrc32c isofs udf crc_itu_t
nls_iso8859_1 nls_cp437 vfat fat usb_storage binfmt_misc ppdev bnep
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge
stp kvm_amd kvm tun video output nfsd nfs lockd nfs_acl auth_rpcgss sunrpc
it87 hwmon_vid lp parport snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_usb_audio snd_usb_lib snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event
snd_seq snd_timer snd_seq_device uvcvideo usblp pcspkr amd64_edac_mod
edac_core i2c_piix4 snd soundcore snd_page_alloc videodev v4l1_compat
v4l2_compat_ioctl32 nvidia(P) xfs exportfs ohci1394 ieee1394 r8169 mii
usbhid
Sep 18 04:00:07 caper kernel: [614439.132057] Pid: 23795, comm: rdiff-backup 
Tainted: P  D W  2.6.31-020631-generic #020631 GA-MA790GP-UD4H
Sep 18 04:00:07 caper kernel: [614439.132064] RIP: 0010:[]  
[] posix_acl_equiv_mode+0x1/0xa0
Sep 18 04:00:07 caper kernel: [614439.132077] RSP: 0018:88007cc61c58  
EFLAGS: 00010246
Sep 18 04:00:07 caper kernel: [614439.132082] RAX: 8180 RBX: 
ffea RCX: 0004
Sep 18 04:00:07 caper kernel: [614439.132088] RDX: 8000 RSI: 
88007cc61c8c RDI: 
Sep 18 04:00:07 caper kernel: [614439.132094] RBP: 88007cc61cb8 R08: 
 R09: a0f5e9a0
Sep 18 04:00:07 caper kernel: [614439.132099] R10:  R11: 
 R12: 
Sep 18 04:00:07 caper kernel: [614439.132104] R13: 8000 R14: 
88005aa4b9b0 R15: 8000
Sep 18 04:00:07 caper kernel: [614439.132112] FS:  7fc423a2e6f0() 
GS:88002803f000() knlGS:f72c1750
Sep 18 04:00:07 caper kernel: [614439.132117] CS:  0010 DS:  ES:  CR0: 
80050033
Sep 18 04:00:07 caper kernel: [614439.132123] CR2: 0004 CR3: 
00011b806000 CR4: 06a0
Sep 18 04:00:07 caper kernel: [614439.132129] DR0:  DR1: 
 DR2: 
Sep 18 04:00:07 caper kernel: [614439.132134] DR3:  DR6: 
0ff0 DR7: 0400
Sep 18 04:00:07 caper kernel: [614439.132141] Process rdiff-backup (pid: 23795, 
threadinfo 88007cc6, task 88007f432d60)
Sep 18 04:00:07 caper kernel: [614439.132146] Stack:
Sep 18 04:00:07 caper kernel: [614439.132149]  88007cc61cb8 
a0f3c157 88007cc61ce8 0008
Sep 18 04:00:07 caper kernel: [614439.132158] <0> 0eb6 
88003e0af110 81807cc61c98 
Sep 18 04:00:07 caper kernel: [614439.132168] <0> 0004 
8000 88005aa4b9b0 880079484240
Sep 18 04:00:07 caper kernel: [614439.132178] Call Trace:
Sep 18 04:00:07 caper kernel: [614439.132228]  [] ? 
btrfs_set_acl+0x87/0x230 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132266]  [] 
btrfs_xattr_set_acl+0x84/0xa0 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132300]  [] 
btrfs_xattr_acl_access_set+0xe/0x10 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132311]  [] 
generic_setxattr+0x6f/0x90
Sep 18 04:00:07 caper kernel: [614439.132349]  [] 
btrfs_setxattr+0x4a/0xa0 [btrfs]
Sep 18 04:00:07 caper kernel: [614439.132359]  [] 
vfs_setxattr+0xbe/0x210
Sep 18 04:00:07 caper kernel: [614439.132368]  [] 
setxattr+0xb5/0x110
Sep 18 04:00:07 caper kernel: [614439.132375]  [] ? 
path_put+0x2c/0x40
Sep 18 04:00:07 caper kernel: [614439.132383]  [] ? 
putname+0x31/0x50
Sep 18 04:00:07 caper kernel: [614439.132390]  [] ? 
user_path_at+0x59/0x90
Sep 18 04:00:07 caper kernel: [614439.132400]  [] ? 
_atomic_dec_and_lock+0x4d/0x70
Sep 18 04:00:07 caper kernel: [614439.132407]  [] ? 
path_put+0x2c/0x40
Sep 18 04:00:07 caper kernel: [614439.132415]  [] ? 
sys_fchmodat+0x71/0xe0
Sep 18 04:00:07 caper kernel: [614439.132423]  [] ? 
_atomic_dec_and_lock+0x4d/0x70
Sep 18 04:00:07 caper kernel: [614439.132432]  [] 
sys_setxattr+0x92/0xa0
Sep 18 04:00:07 caper kernel: [614439.132442]  [] 
system_call_fastpath+0x16/0x1b
Sep 18 04:00:07 caper kernel: [614439.132447] Code: 08 0f 85 6a ff ff ff 85 ff 
90 0f 85 61 ff ff ff e9 77 f