Re: Periodic kernel freezes

2015-10-30 Thread David Goodwin


On 30/10/2015 16:25, Alex Adriaanse wrote:

I have an EC2 instance on AWS that tends to freeze several times per
week. When it freezes it stops responding to network traffic, disk
I/O stops, and CPU goes to 100%. The system comes back fine after a
reboot. I was finally able to get a kernel backtrace from when this
happened today, which I have attached to this email.

The VM in question runs Debian Jessie, and has 3 BTRFS filesystems,
including the root filesystem. Details are included below.

Any ideas?



Hi Alex -

I kept experiencing problems with the Jessie 3.16.x kernel on EC2 (and 
elsewhere) with BTRFS.


Out of 8 nodes, one managed an uptime of 90 days, while the average was 
about 21 days.


Crashes were seemingly random, and it was difficult to get stack traces.

For the stack traces I did get, it wasn't always obvious that the 
problem lay with BTRFS.


Reboots normally needed to be forceful.

I'd suggest upgrading to a backports kernel (I compiled various 4.1.x 
kernels, but there's now 4.2.x in jessie-backports).


You might also want to turn off compression...
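
A rough sketch of both suggestions as shell, for anyone following along. It rests on several assumptions not stated in this thread: that the jessie-backports repository is already enabled in the APT sources, that `linux-image-amd64` is the desired metapackage, and that 4.1 is a sensible cut-off (it is only the oldest series mentioned above, not a confirmed fix version). The helper name `kernel_older_than_4_1` is made up for illustration. The script only prints advice and changes nothing.

```shell
#!/bin/sh
# Sketch only: the 4.1 threshold comes from the kernels mentioned in this
# mail, not from a confirmed fix version.

# Hypothetical helper: succeed if kernel release "$1" sorts before 4.1.
# Relies on GNU sort -V (version sort), available on Debian.
kernel_older_than_4_1() {
    oldest=$(printf '%s\n' "$1" "4.1" | sort -V | head -n 1)
    [ "$oldest" != "4.1" ]
}

if kernel_older_than_4_1 "$(uname -r)"; then
    echo "Kernel $(uname -r) predates 4.1; consider a jessie-backports kernel:"
    # Assumes jessie-backports is already listed in the APT sources.
    echo "  apt-get update && apt-get -t jessie-backports install linux-image-amd64"
fi

# Show btrfs mounts that currently carry a compress (or compress-force) option.
grep -E ' btrfs .*compress' /proc/mounts || echo "no btrfs mount uses compression"
```

If the last command lists a mount, dropping the `compress=...` option from its /etc/fstab entry and remounting stops new writes from being compressed; data already written stays compressed until it is rewritten.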

David.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Periodic kernel freezes

2015-10-30 Thread Alex Adriaanse
I have an EC2 instance on AWS that tends to freeze several times per week. When 
it freezes it stops responding to network traffic, disk I/O stops, and CPU goes 
to 100%. The system comes back fine after a reboot. I was finally able to get a 
kernel backtrace from when this happened today, which I have attached to this 
email.

The VM in question runs Debian Jessie, and has 3 BTRFS filesystems, including 
the root filesystem. Details are included below.

Any ideas?

Thanks,

Alex



# uname -a
Linux prod-docker-1-a 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u5 (2015-10-09) x86_64 GNU/Linux

# btrfs --version
Btrfs v3.17

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda       8.0G  1.3G  6.4G  17% /
udev             10M     0   10M   0% /dev
tmpfs           3.0G  8.6M  3.0G   1% /run
tmpfs           7.5G   12K  7.5G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.5G     0  7.5G   0% /sys/fs/cgroup
/dev/xvdb        50G  3.9G   45G   9% /var/lib/docker
/dev/xvdc       200G   70G  130G  35% /srv/volumes


# btrfs fi show
Label: none  uuid: 8a293966-5c19-485c-a819-a6b801a1085d
        Total devices 1 FS bytes used 1.21GiB
        devid    1 size 8.00GiB used 3.28GiB path /dev/xvda

Label: 'docker'  uuid: 5bf935e0-4519-43d9-b2e9-b3fb19374b72
        Total devices 1 FS bytes used 3.70GiB
        devid    1 size 50.00GiB used 6.04GiB path /dev/xvdb

Label: 'volumes'  uuid: 2d121370-7879-4485-8fd5-1fe0db5a0c12
        Total devices 1 FS bytes used 68.82GiB
        devid    1 size 200.00GiB used 124.04GiB path /dev/xvdc

Btrfs v3.17


# btrfs fi df /
Data, single: total=2.85GiB, used=1.17GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=204.75MiB, used=38.03MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=16.00MiB, used=0.00B

# btrfs fi df /var/lib/docker
Data, single: total=4.01GiB, used=3.52GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.00GiB, used=179.58MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=64.00MiB, used=0.00B

# btrfs fi df /srv/volumes
Data, single: total=122.01GiB, used=68.55GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.00GiB, used=277.20MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=96.00MiB, used=0.00B

[344317.872151] ------------[ cut here ]------------
[344317.876091] kernel BUG at /build/linux-xkTWug/linux-3.16.7-ckt11/mm/page_alloc.c:1011!
[344317.876091] invalid opcode:  [#1] SMP 
[344317.876091] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack 
ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
xt_addrtype iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp 
llc crc32_pclmul ppdev ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul 
glue_helper ablk_helper cryptd evdev psmouse serio_raw parport_pc parport ttm 
drm_kms_helper drm i2c_piix4 i2c_core processor thermal_sys button autofs4 
btrfs xor raid6_pq ata_generic xen_blkfront crct10dif_pclmul crct10dif_common 
crc32c_intel ata_piix libata scsi_mod ixgbevf(O)
[344317.876091] CPU: 0 PID: 9842 Comm: kworker/u30:7 Tainted: G   O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u5
[344317.876091] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
[344317.876091] Workqueue: btrfs-delalloc btrfs_delalloc_helper [btrfs]
[344317.876091] task: 8800eb30b630 ti: 880001a08000 task.ti: 880001a08000
[344317.876091] RIP: 0010:[]  [] move_freepages+0x107/0x110
[344317.876091] RSP: 0018:880001a0b918  EFLAGS: 00010006
[344317.876091] RAX: 8803e08fb000 RBX:  RCX: 0001
[344317.876091] RDX: ea000d922fc8 RSI: ea000d91c000 RDI: 8803e08fbe00
[344317.876091] RBP: 0001 R08: 8803e08fbe00 R09: 
[344317.876091] R10:  R11: 8803e08fbeb0 R12: ea000d91cbd0
[344317.876091] R13:  R14:  R15: 8803e08fbe00
[344317.876091] FS:  () GS:8803e040() knlGS:
[344317.876091] CS:  0010 DS:  ES:  CR0: 80050033
[344317.876091] CR2: 7fd0a085fc00 CR3: 00035f1d5000 CR4: 001406f0
[344317.876091] Stack:
[344317.876091]  81143c1c  0002115a8000 ea000d91cbf0
[344317.876091]  8803e08fbe90 8803e0412f78 8800eb30b698 8803e08fbe00
[344317.876091]  001f  0001 001a
[344317.876091] Call Trace:
[344317.876091]  [] ? __rmqueue+0x37c/0x460
[344317.876091]  [] ? get_page_from_freelist+0x685/0x910
[344317.876091]  [] ? __alloc_pages_nodemask+0x16d/0xb30
[344317.876091]  [] ? __alloc_pages_nodemask+0x16d/0xb30
[344317.876091]  [] ? btrfs_find_space_for_alloc+0x22a/0x270 [btrfs]
[344317.8