I can consistently reproduce a kernel paging error with one of our data files.
Two large XML files (100 MB and 200 MB) are loading at the same time.
There is a single application with several open connections to the SEDNA
instance.
There is nothing unusual in the SEDNA event.log.
We have more than 200 Amazon instances using the same configuration without a
problem. Many of the SEDNA instances store much larger files.
SEDNA: 3.5.161
OS: Ubuntu 11.10 (GNU/Linux 3.0.0-14-virtual x86_64)
free -t -m
             total       used       free     shared    buffers     cached
Mem:           592        576         15          0          4        464
-/+ buffers/cache:        106        485
Swap:         1183          1       1182
Total:        1776        577       1198
ps ux
USER       PID %CPU %MEM     VSZ    RSS TTY   STAT START  TIME COMMAND
ubuntu     624  0.0  0.3   26308   2212 ?     Ssl  14:54  0:01 /home/ubuntu/sedna/bin/se_gov -back
ubuntu     705  0.5 35.7  240060 216700 ?     Ssl  14:54  0:48 /home/ubuntu/sedna/bin/se_sm -backg
ubuntu     776  0.1  9.9 1104856  60400 ?     Sl   16:43  0:02 /home/ubuntu/sedna/bin/se_trn
ubuntu     785  2.7  6.7 1104888  41148 ?     Sl   16:43  1:13 /home/ubuntu/sedna/bin/se_trn
ubuntu     788  0.8  9.8 1104792  59812 ?     Sl   16:43  0:22 /home/ubuntu/sedna/bin/se_trn
ubuntu     792  0.8  0.0       0      0 ?     Zl   16:43  0:23 [se_trn] <defunct>
ubuntu     795  3.4 39.8 1105280 241416 ?     Sl   16:43  1:34 /home/ubuntu/sedna/bin/se_trn
ubuntu     798  1.2  8.8 1104952  53368 ?     Sl   16:43  0:34 /home/ubuntu/sedna/bin/se_trn
ubuntu     826  0.0  0.2   73080   1560 ?     S    17:10  0:00 sshd: ubuntu@pts/0
ubuntu     827  0.0  1.1   26824   7244 pts/0 Ss   17:10  0:00 -bash
ubuntu     949  0.0  0.4   74028   2440 ?     S    17:11  0:00 sshd: ubuntu@notty
ubuntu     950  0.0  0.1   12788   1120 ?     Ss   17:11  0:00 /usr/lib/openssh/sftp-server
ubuntu     954  0.0  0.2   16748   1240 pts/0 R+   17:27  0:00 ps ux
Partial kern.log showing the BUG:
Jun 13 14:54:50 ip-10-244-165-112 kernel: [ 27.612603] init: plymouth-upstart-bridge main process (533) killed by TERM signal
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.235823] BUG: unable to handle kernel paging request at ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] IP: [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] PGD 1c04067 PUD 1c08067 PMD f37067 PTE 80100000113ac065
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Oops: 0003 [#1] SMP
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CPU 0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Modules linked in: acpiphp
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Pid: 792, comm: se_trn Not tainted 3.0.0-14-virtual #23-Ubuntu
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP: e030:[<ffffffff81006c25>] [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RSP: e02b:ffff880023899cb8 EFLAGS: 00010297
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RAX: 0000000000000000 RBX: ffff8800113ace00 RCX: 800000025f83d027
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RDX: 0000000000000000 RSI: 800000025f83d027 RDI: ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RBP: ffff880023899cd8 R08: ffffea00003c4db0 R09: 00003ffffffff000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R10: 0000000000000008 R11: 0000000000000293 R12: 800000025f83d027
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R13: 800000025f83d027 R14: 00007fd3189c0000 R15: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] FS: 00007fd326f7a740(0000) GS:ffff8800266a9000(0000) knlGS:0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2: ffff8800113ace00 CR3: 0000000023873000 CR4: 0000000000002620
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Process se_trn (pid: 792, threadinfo ffff880023898000, task ffff88002344ae00)
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Stack:
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] 0000000000000008 00003ffffffff000 ffff8800236ed2c0 ffffea000070fc38
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] ffff880023899ce8 ffffffff81006cf4 ffff880023899d78 ffffffff8112b22b
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] 000000000000000a 0000000000000000 0000020000000000 ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Call Trace:
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81006cf4>] xen_set_pte_at+0x14/0x20
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112b22b>] __do_fault+0x22b/0x510
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112e74a>] handle_pte_fault+0xfa/0x210
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81005cce>] ? xen_pmd_val+0xe/0x10
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81004759>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112ec18>] handle_mm_fault+0x1f8/0x350
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81073d1b>] ? set_current_blocked+0x5b/0x70
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8160850e>] do_page_fault+0x14e/0x530
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8100122a>] ? hypercall_page+0x22a/0x1000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81605215>] page_fault+0x25/0x30
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Code: 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d f0 4c 89 65 f8 66 66 66 66 90 48 89 fb 49 89 f4 e8 60 ba 02 00 83 f8 01 74 13 <4c> 89 23 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 44 00 00 ff 14
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RSP <ffff880023899cb8>
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2: ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] ---[ end trace b231ecb3aa501510 ]---
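In case it helps whoever reads the trace, here is a small Python sketch for decoding the error code on the "Oops: 0003" line. The bit meanings are the standard x86 page-fault error code bits as I understand them; the helper name decode_oops is made up for this post and is not part of any tool. If I have the bits right, 0003 means a write from kernel mode to a page that was present, so it looks like a protection fault rather than a missing page.

```python
# Decode the x86 page-fault error code printed on the kernel "Oops:" line.
# Each entry: (bit mask, meaning if set, meaning if clear or None to skip).
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops(code_hex):
    """Return a list of human-readable meanings for an Oops error code.

    decode_oops is a hypothetical helper name used only for illustration.
    """
    code = int(code_hex, 16)
    meanings = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


# Error code taken from the "Oops: 0003 [#1] SMP" line in the log above.
for meaning in decode_oops("0003"):
    print(meaning)
```

This only interprets the error code; it does not say which component is at fault.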
I don't know if the problem is with SEDNA, Linux, or a simple configuration
issue.
Any ideas on the root cause or where I should start?
Thanks,
Malcolm
_______________________________________________
Sedna-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sedna-discussion