Hi Malcolm,
"Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.235823] BUG: unable to
handle kernel paging request at ffff8800113ace00" is almost certainly a bug
in the kernel, the hardware, or the virtualization software, not in Sedna.
There is, though, a small chance that some bug in Sedna triggers it.
Have you tried to reproduce this on a different machine (not on Amazon)?
Is it reproducible on another Amazon machine?
Are you using the latest Amazon image and the latest kernel?
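If you do get an oops on another machine, it helps to compare the same few fields each time. Below is a rough Python sketch for pulling them out of kern.log. The function names are hypothetical, the patterns assume the oops format shown in the log quoted below, and the error-code decoding uses the standard x86 page-fault bits (bit 0: protection violation, bit 1: write, bit 2: user mode).

```python
import re

# Patterns assume the oops layout seen in the kern.log excerpt in this thread.
BUG_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")
IP_RE = re.compile(r"\bIP:\s*\[<[0-9a-f]+>\]\s*(\S+)")
PID_RE = re.compile(r"Pid: (\d+), comm:\s*(\S+)")
OOPS_RE = re.compile(r"Oops: ([0-9a-f]{4})")


def decode_error_code(code):
    """Decode the three low bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 1),  # False means page not present
        "write": bool(code & 2),                 # False means read access
        "user_mode": bool(code & 4),             # False means kernel mode
    }


def summarize_oops(log_text):
    """Return key fields of the first paging-request oops, or None if absent."""
    # Mail clients wrap long lines and add "> " quoting; undo both first.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    bug = BUG_RE.search(flat)
    if bug is None:
        return None
    summary = {"address": bug.group(1)}
    ip = IP_RE.search(flat)
    if ip:
        summary["symbol"] = ip.group(1)
    pid = PID_RE.search(flat)
    if pid:
        summary["pid"] = int(pid.group(1))
        summary["comm"] = pid.group(2)
    oops = OOPS_RE.search(flat)
    if oops:
        summary["error"] = decode_error_code(int(oops.group(1), 16))
    return summary
```

For example, summarize_oops(open("/var/log/kern.log").read()) would report the first such oops in the file, assuming that is where your kernel log lives. Only the first match is reported, so a log with several oopses would need to be split first.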
Ivan
> I am able to reproduce the paging error consistently with one of our data files.
>
> There are 2 large XML files loading at the same time (100MB & 200MB).
>
> There is a single application with several open connections to the SEDNA
> instance.
>
> There is nothing unusual in the SEDNA event.log.
>
> We have over 200 Amazon instances using the same configuration without a
> problem. Many of the SEDNA instances store much larger files.
>
> SEDNA: 3.5.161
>
> OS: Ubuntu 11.10 (GNU/Linux 3.0.0-14-virtual x86_64)
>
> free -t -m
>              total       used       free     shared    buffers     cached
> Mem:           592        576         15          0          4        464
> -/+ buffers/cache:        106        485
> Swap:         1183          1       1182
> Total:        1776        577       1198
>
> ps ux
> USER       PID %CPU %MEM     VSZ    RSS TTY   STAT START TIME COMMAND
> ubuntu     624  0.0  0.3   26308   2212 ?     Ssl  14:54 0:01 /home/ubuntu/sedna/bin/se_gov -back
> ubuntu     705  0.5 35.7  240060 216700 ?     Ssl  14:54 0:48 /home/ubuntu/sedna/bin/se_sm -backg
> ubuntu     776  0.1  9.9 1104856  60400 ?     Sl   16:43 0:02 /home/ubuntu/sedna/bin/se_trn
> ubuntu     785  2.7  6.7 1104888  41148 ?     Sl   16:43 1:13 /home/ubuntu/sedna/bin/se_trn
> ubuntu     788  0.8  9.8 1104792  59812 ?     Sl   16:43 0:22 /home/ubuntu/sedna/bin/se_trn
> ubuntu     792  0.8  0.0       0      0 ?     Zl   16:43 0:23 [se_trn] <defunct>
> ubuntu     795  3.4 39.8 1105280 241416 ?     Sl   16:43 1:34 /home/ubuntu/sedna/bin/se_trn
> ubuntu     798  1.2  8.8 1104952  53368 ?     Sl   16:43 0:34 /home/ubuntu/sedna/bin/se_trn
> ubuntu     826  0.0  0.2   73080   1560 ?     S    17:10 0:00 sshd: ubuntu@pts/0
> ubuntu     827  0.0  1.1   26824   7244 pts/0 Ss   17:10 0:00 -bash
> ubuntu     949  0.0  0.4   74028   2440 ?     S    17:11 0:00 sshd: ubuntu@notty
> ubuntu     950  0.0  0.1   12788   1120 ?     Ss   17:11 0:00 /usr/lib/openssh/sftp-server
> ubuntu     954  0.0  0.2   16748   1240 pts/0 R+   17:27 0:00 ps ux
>
>
> Partial from kern.log with the BUG.
>
> Jun 13 14:54:50 ip-10-244-165-112 kernel: [ 27.612603] init: plymouth-upstart-bridge main process (533) killed by TERM signal
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.235823] BUG: unable to handle kernel paging request at ffff8800113ace00
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] IP: [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] PGD 1c04067 PUD 1c08067 PMD f37067 PTE 80100000113ac065
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Oops: 0003 [#1] SMP
>
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CPU 0
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Modules linked in: acpiphp
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Pid: 792, comm: se_trn Not tainted 3.0.0-14-virtual #23-Ubuntu
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP: e030:[<ffffffff81006c25>] [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RSP: e02b:ffff880023899cb8 EFLAGS: 00010297
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RAX: 0000000000000000 RBX: ffff8800113ace00 RCX: 800000025f83d027
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RDX: 0000000000000000 RSI: 800000025f83d027 RDI: ffff8800113ace00
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RBP: ffff880023899cd8 R08: ffffea00003c4db0 R09: 00003ffffffff000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R10: 0000000000000008 R11: 0000000000000293 R12: 800000025f83d027
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R13: 800000025f83d027 R14: 00007fd3189c0000 R15: 0000000000000000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] FS: 00007fd326f7a740(0000) GS:ffff8800266a9000(0000) knlGS:0000000000000000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2: ffff8800113ace00 CR3: 0000000023873000 CR4: 0000000000002620
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Process se_trn (pid: 792, threadinfo ffff880023898000, task ffff88002344ae00)
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Stack:
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] 0000000000000008 00003ffffffff000 ffff8800236ed2c0 ffffea000070fc38
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] ffff880023899ce8 ffffffff81006cf4 ffff880023899d78 ffffffff8112b22b
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] 000000000000000a 0000000000000000 0000020000000000 ffff8800113ace00
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Call Trace:
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81006cf4>] xen_set_pte_at+0x14/0x20
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112b22b>] __do_fault+0x22b/0x510
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112e74a>] handle_pte_fault+0xfa/0x210
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81005cce>] ? xen_pmd_val+0xe/0x10
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81004759>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8112ec18>] handle_mm_fault+0x1f8/0x350
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81073d1b>] ? set_current_blocked+0x5b/0x70
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8160850e>] do_page_fault+0x14e/0x530
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff8100122a>] ? hypercall_page+0x22a/0x1000
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] [<ffffffff81605215>] page_fault+0x25/0x30
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Code: 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d f0 4c 89 65 f8 66 66 66 66 90 48 89 fb 49 89 f4 e8 60 ba 02 00 83 f8 01 74 13 <4c> 89 23 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 44 00 00 ff 14
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RSP <ffff880023899cb8>
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2: ffff8800113ace00
> Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] ---[ end trace b231ecb3aa501510 ]---
>
> I don't know if the problem is with SEDNA, Linux, or a simple configuration
> issue.
>
> Any ideas on the root cause or where I should start?
>
> Thanks,
> Malcolm
>
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Sedna-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>