I am able to reproduce a kernel paging error consistently with one of our data files.

Two large XML files (100 MB and 200 MB) are loading at the same time.

There is a single application with several open connections to the SEDNA
instance.

There is nothing unusual in the SEDNA event.log.

We have more than 200 Amazon instances using the same configuration without a
problem.  Many of those SEDNA instances store much larger files.

SEDNA: 3.5.161

OS: Ubuntu 11.10 (GNU/Linux 3.0.0-14-virtual x86_64)

free -t -m
             total       used       free     shared    buffers     cached
Mem:           592        576         15          0          4        464
-/+ buffers/cache:        106        485
Swap:         1183          1       1182
Total:        1776        577       1198
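
As a reading aid for the numbers above, here is a minimal Python sketch that
parses this older `free` layout and adds up the memory the kernel could hand
back on demand (free + buffers + cached). The function name parse_free and
the variable names are hypothetical, and the sketch assumes the old procps
format that still prints a "-/+ buffers/cache" row.

```python
def parse_free(text):
    """Parse old-style `free -t -m` output into {row_label: [ints]}."""
    rows = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the column header line has no row label
        label, _, rest = line.partition(":")
        rows[label.strip()] = [int(tok) for tok in rest.split()]
    return rows


# Sample copied from the output above.
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:           592        576         15          0          4        464
-/+ buffers/cache:        106        485
Swap:         1183          1       1182
Total:        1776        577       1198
"""

rows = parse_free(FREE_OUTPUT)
total, used, free, shared, buffers, cached = rows["Mem"]

# Buffers and page cache are reclaimable, so count them as available.
available_mb = free + buffers + cached
print("available (MB):", available_mb)
```

The sum lands within rounding of the "free" figure on the "-/+ buffers/cache"
row, so the instance is mostly holding page cache rather than being out of
memory outright.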

ps ux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
ubuntu     624  0.0  0.3  26308  2212 ?        Ssl  14:54   0:01 /home/ubuntu/sedna/bin/se_gov -back
ubuntu     705  0.5 35.7 240060 216700 ?       Ssl  14:54   0:48 /home/ubuntu/sedna/bin/se_sm -backg
ubuntu     776  0.1  9.9 1104856 60400 ?       Sl   16:43   0:02 /home/ubuntu/sedna/bin/se_trn
ubuntu     785  2.7  6.7 1104888 41148 ?       Sl   16:43   1:13 /home/ubuntu/sedna/bin/se_trn
ubuntu     788  0.8  9.8 1104792 59812 ?       Sl   16:43   0:22 /home/ubuntu/sedna/bin/se_trn
ubuntu     792  0.8  0.0      0     0 ?        Zl   16:43   0:23 [se_trn] <defunct>
ubuntu     795  3.4 39.8 1105280 241416 ?      Sl   16:43   1:34 /home/ubuntu/sedna/bin/se_trn
ubuntu     798  1.2  8.8 1104952 53368 ?       Sl   16:43   0:34 /home/ubuntu/sedna/bin/se_trn
ubuntu     826  0.0  0.2  73080  1560 ?        S    17:10   0:00 sshd: ubuntu@pts/0
ubuntu     827  0.0  1.1  26824  7244 pts/0    Ss   17:10   0:00 -bash
ubuntu     949  0.0  0.4  74028  2440 ?        S    17:11   0:00 sshd: ubuntu@notty
ubuntu     950  0.0  0.1  12788  1120 ?        Ss   17:11   0:00 /usr/lib/openssh/sftp-server
ubuntu     954  0.0  0.2  16748  1240 pts/0    R+   17:27   0:00 ps ux
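
Note that one se_trn process (PID 792) is in state "Zl" (defunct). A minimal
Python sketch for picking such entries out of `ps ux` output follows. The
function name find_zombies is hypothetical, and the sketch assumes one
process per line, i.e. with any mail line-wrapping undone.

```python
def find_zombies(ps_text):
    """Return (pid, command) for every process whose STAT starts with 'Z'."""
    zombies = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        # `ps ux` prints 10 fixed columns, then the free-form COMMAND.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # blank or truncated line
        pid, stat, command = fields[1], fields[7], fields[10]
        if stat.startswith("Z"):
            zombies.append((int(pid), command))
    return zombies


# Three rows taken from the listing above.
PS_OUTPUT = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
ubuntu     788  0.8  9.8 1104792 59812 ?       Sl   16:43   0:22 /home/ubuntu/sedna/bin/se_trn
ubuntu     792  0.8  0.0      0     0 ?        Zl   16:43   0:23 [se_trn] <defunct>
ubuntu     795  3.4 39.8 1105280 241416 ?      Sl   16:43   1:34 /home/ubuntu/sedna/bin/se_trn
"""

zombies = find_zombies(PS_OUTPUT)
print(zombies)
```

The PID it reports for this sample is the same PID named in the kernel trace
("Pid: 792, comm: se_trn").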


Partial excerpt from kern.log containing the BUG:

Jun 13 14:54:50 ip-10-244-165-112 kernel: [   27.612603] init:
plymouth-upstart-bridge main process (533) killed by TERM signal
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.235823] BUG: unable to
handle kernel paging request at ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] IP:
[<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] PGD 1c04067 PUD
1c08067 PMD f37067 PTE 80100000113ac065
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Oops: 0003 [#1] SMP

Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CPU 0 
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Modules linked in:
acpiphp
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] 
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Pid: 792, comm:
se_trn Not tainted 3.0.0-14-virtual #23-Ubuntu  
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP:
e030:[<ffffffff81006c25>]  [<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RSP:
e02b:ffff880023899cb8  EFLAGS: 00010297
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RAX:
0000000000000000 RBX: ffff8800113ace00 RCX: 800000025f83d027
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RDX:
0000000000000000 RSI: 800000025f83d027 RDI: ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RBP:
ffff880023899cd8 R08: ffffea00003c4db0 R09: 00003ffffffff000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R10:
0000000000000008 R11: 0000000000000293 R12: 800000025f83d027
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] R13:
800000025f83d027 R14: 00007fd3189c0000 R15: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] FS:
00007fd326f7a740(0000) GS:ffff8800266a9000(0000) knlGS:0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CS:  e033 DS: 0000
ES: 0000 CR0: 000000008005003b
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2:
ffff8800113ace00 CR3: 0000000023873000 CR4: 0000000000002620
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Process se_trn
(pid: 792, threadinfo ffff880023898000, task ffff88002344ae00)
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Stack:
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]  0000000000000008
00003ffffffff000 ffff8800236ed2c0 ffffea000070fc38
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]  ffff880023899ce8
ffffffff81006cf4 ffff880023899d78 ffffffff8112b22b
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]  000000000000000a
0000000000000000 0000020000000000 ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Call Trace:
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff81006cf4>] xen_set_pte_at+0x14/0x20
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff8112b22b>] __do_fault+0x22b/0x510
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff8112e74a>] handle_pte_fault+0xfa/0x210
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff81005cce>] ? xen_pmd_val+0xe/0x10
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff81004759>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff8112ec18>] handle_mm_fault+0x1f8/0x350
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff81073d1b>] ? set_current_blocked+0x5b/0x70
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff8160850e>] do_page_fault+0x14e/0x530
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff8100122a>] ? hypercall_page+0x22a/0x1000
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]
[<ffffffff81605215>] page_fault+0x25/0x30
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] Code: 84 00 00 00
00 00 55 48 89 e5 48 83 ec 20 48 89 5d f0 4c 89 65 f8 66 66 66 66 90 48 89
fb 49 89 f4 e8 60 ba 02 00 83 f8 01 74 13 <4c> 89 23 48 8b 5d f0 4c 8b 65 f8
c9 c3 66 0f 1f 44 00 00 ff 14 
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] RIP
[<ffffffff81006c25>] xen_set_pte+0x25/0xe0
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005]  RSP
<ffff880023899cb8>
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] CR2:
ffff8800113ace00
Jun 13 16:53:16 ip-10-244-165-112 kernel: [ 7133.236005] ---[ end trace
b231ecb3aa501510 ]---
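
To help read the "Oops: 0003" line in the trace, here is a minimal Python
sketch that decodes the x86 page-fault error code bit by bit. The table and
the function name decode_oops_code are illustrative rather than taken from
any kernel tool; the bit meanings follow the standard x86 page-fault error
code layout.

```python
# (mask, meaning when the bit is set, meaning when it is clear)
PF_FLAGS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in a page table entry", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_hex):
    """Translate the hex code after 'Oops:' into readable flags."""
    code = int(code_hex, 16)
    meanings = []
    for mask, if_set, if_clear in PF_FLAGS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


decoded = decode_oops_code("0003")
for item in decoded:
    print(item)
```

If I am reading it correctly, 0003 means a kernel-mode write to a page that
was present but not writable. That seems consistent with the fault occurring
inside xen_set_pte, but it is only my interpretation of the trace.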

I don't know if the problem is with SEDNA, Linux, or a simple configuration
issue.

Any ideas on the root cause or where I should start?

Thanks,
Malcolm


_______________________________________________
Sedna-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sedna-discussion
