Hello,
I've got 9 domUs, each a RHEL 5.2 instance with 1G RAM, 1 CPU, and a 100G
drive.  They are paravirtualized.  The drives were created like this:
 pfexec zfs create -s -V 100G datastore/virtMachine1

The hardware is a Dell 2900 with 48G RAM and 3.06T of 15k rpm SAS drives.
OpenSolaris seems to be fairly happy on this system.

When I ran everything in zones, it was all fine and fast.  However, the vendor
requires RHEL, and I refuse to give up ZFS, so I had to fire up xVM just so I
could run MySQL inside an x86 container called RHEL 5.2.

Anyway, these domUs boot and generally run fine (about 17% slower than zones,
btw).

Except that they crash pretty regularly, every 6 to 10 days.
I've been searching forums, etc., and I'm not sure what to do.  Here's a log entry:
Jul  1 15:15:08 ecw-mysql1 unix: [ID 836849 kern.notice] 
Jul  1 15:15:08 ecw-mysql1 ^Mpanic[cpu0]/thread=ffffff005b7e1c80: 
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 683410 kern.notice] BAD TRAP: type=e 
(#pf Page fault) rp=ffffff005b7e1120 addr=fffffe0a3e18ec20
Jul  1 15:15:08 ecw-mysql1 unix: [ID 100000 kern.notice] 
Jul  1 15:15:08 ecw-mysql1 unix: [ID 839527 kern.notice] sched: 
Jul  1 15:15:08 ecw-mysql1 unix: [ID 753105 kern.notice] #pf Page fault
Jul  1 15:15:08 ecw-mysql1 unix: [ID 532287 kern.notice] Bad kernel fault at 
addr=0xfffffe0a3e18ec20
Jul  1 15:15:08 ecw-mysql1 unix: [ID 243837 kern.notice] pid=0, 
pc=0xfffffffffb8a0663, sp=0xffffff005b7e1218, eflags=0x10246
Jul  1 15:15:08 ecw-mysql1 unix: [ID 211416 kern.notice] cr0: 
8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 2660<vmxe,xmme,fxsr,mce,pae>
Jul  1 15:15:08 ecw-mysql1 unix: [ID 624947 kern.notice] cr2: fffffe0a3e18ec20
Jul  1 15:15:08 ecw-mysql1 unix: [ID 100000 kern.notice] 
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        rdi: 
fffffe0a3e18ec20 rsi:                0 rdx:         e0508673
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        rcx:            
    3  r8:                0  r9: ffffff0cb9384000
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        rax:            
    0 rbx:         e0508673 rbp: ffffff005b7e12b0
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        r10:            
    0 r11: ffffff0000002000 r12:                0
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        r13:            
    1 r14: fffffe0a3e18ec20 r15:         e0508673
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        fsb:            
    0 gsb: fffffffffbc5ef70  ds:               4b
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]         es:            
   4b  fs:                0  gs:              1c3
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]        trp:            
    e err:                3 rip: fffffffffb8a0663
Jul  1 15:15:08 ecw-mysql1 unix: [ID 592667 kern.notice]         cs:            
 e030 rfl:            10246 rsp: ffffff005b7e1218
Jul  1 15:15:08 ecw-mysql1 unix: [ID 266532 kern.notice]         ss:            
 e02b
Jul  1 15:15:08 ecw-mysql1 unix: [ID 100000 kern.notice] 
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1000 
unix:die+10f ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1110 
unix:trap+1768 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1120 
unix:_cmntrap+12f ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e12b0 
unix:atomic_cas_ptr+3 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1350 
unix:hati_pte_map+160 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e13d0 
unix:hati_load_common+15d ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1490 
unix:hat_devload+15d ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e14f0 
rootnex:rootnex_map_regspec+151 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e15a0 
rootnex:rootnex_map+141 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e15f0 
genunix:ddi_map+51 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e16e0 
npe:npe_bus_map+43d ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1720 
pcie_pci:pepb_bus_map+31 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1760 
pcie_pci:pepb_bus_map+31 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e17b0 
genunix:ddi_map+51 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1870 
genunix:ddi_regs_map_setup+d5 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e18c0 
genunix:pci_config_setup+69 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1900 
pcie:pcie_init_bus+41 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1a30 
pcie_pci:pepb_initchild+bc ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1ab0 
pcie_pci:pepb_ctlops+276 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1af0 
genunix:init_node+78 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1b30 
genunix:i_ndi_config_node+fa ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1b60 
genunix:i_ndi_init_hw_children+48 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1bc0 
genunix:config_immediate_children+83 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1c10 
genunix:devi_config_common+a6 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1c60 
genunix:mt_config_thread+53 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1c70 
unix:thread_start+8 ()
Jul  1 15:15:08 ecw-mysql1 unix: [ID 100000 kern.notice] 
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 672855 kern.notice] syncing file 
systems...
Jul  1 15:15:09 ecw-mysql1 genunix: [ID 904073 kern.notice]  done
Jul  1 15:15:10 ecw-mysql1 genunix: [ID 111219 kern.notice] dumping to 
/dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Jul  1 15:17:51 ecw-mysql1 genunix: [ID 409368 kern.notice] ^M100% done: 
1588175 pages dumped, compression ratio 3.44, 
Jul  1 15:17:51 ecw-mysql1 genunix: [ID 851671 kern.notice] dump succeeded
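In case it helps anyone reading the trace, here is a minimal Python sketch that pulls the stack frames out of a panic log like the one above, even when the mail client has wrapped each frame onto two lines. The regular expression and the function name `parse_frames` are my own assumptions about the layout shown here (a 16-digit hex frame pointer followed by `module:function+offset ()`); it is not an official tool and may miss other log formats.

```python
import re

# A 16-digit hex frame pointer, then module:function+offset, then "()".
# \s+ also matches a newline, so frames wrapped across two lines still match.
FRAME_RE = re.compile(r"\b([0-9a-f]{16})\s+(\w+):(\w+)\+([0-9a-f]+) \(\)")


def parse_frames(log_text):
    """Return (module, function, offset) tuples in the order they were logged."""
    return [
        (m.group(2), m.group(3), int(m.group(4), 16))
        for m in FRAME_RE.finditer(log_text)
    ]


# Two frames copied from the log above, including the mail line wrapping.
sample = """\
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e12b0
unix:atomic_cas_ptr+3 ()
Jul  1 15:15:08 ecw-mysql1 genunix: [ID 655072 kern.notice] ffffff005b7e1350
unix:hati_pte_map+160 ()
"""

frames = parse_frames(sample)
for module, func, offset in frames:
    print(f"{module}:{func}+{offset:#x}")
```

Run over the whole log, this gives one line per frame, which makes it easier to compare traces from different crashes side by side.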



Sometimes the xVM hypervisor gets blamed for the crash, sometimes not.  I've
got twin Dell 2900s and have moved the domUs from one machine to the other,
with the same results.

Name        ID   Mem VCPUs   State  Time(s)
Def               1024     1              0.0
Domain-0     0 34154     8  r-----   3871.7
EDB_Bs       1  1024     1  -b----    803.8
EDB_Faare    8  2048     2  -b----   1099.3
EDB_Gral     7  1024     1  -b----    185.2
EDB_NC       6  1024     1  -b----    290.0
EDB_Tg       2  1024     1  -b----     45.0
EDB_Way      3  1024     1  -b----     62.3
EDB_Wel      5  1024     1  -b----    278.2
EDB_Wnd      9  1024     1  -b----    306.9
EHX_Dbase   10  4096     1  -b----     51.3
Iine         4  1024     1  -b----     76.0
Repair             512     1             13.1

The crashes occur when the Time(s) for any one domU gets up around 25000 or
so.  These are production databases, so they do get a lot of work.
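To watch for that pattern, a minimal Python sketch could parse saved `xm list` output and flag running domains whose Time(s) is getting close to the point where I see crashes. The 25000 figure is only my observation, the 0.8 warning margin is an arbitrary assumption, and the function names and sample rows below are made up for illustration.

```python
import re

# Six-character state field as printed by xm list, e.g. "r-----" or "-b----".
STATE_RE = re.compile(r"^[-rbpscd]{6}$")


def parse_xm_list(text):
    """Parse xm list output into dicts; inactive domains have no ID or State."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 4:
            continue
        rows.append({
            "name": parts[0],
            "running": STATE_RE.match(parts[-2]) is not None,
            "time_s": float(parts[-1]),
        })
    return rows


def near_threshold(rows, threshold_s=25000.0, margin=0.8):
    """Names of running domains at or above margin * threshold_s of CPU time."""
    return [
        r["name"] for r in rows
        if r["running"] and r["time_s"] >= threshold_s * margin
    ]


# Hypothetical sample; the names and numbers are not from my systems.
sample = """\
Name        ID   Mem VCPUs   State  Time(s)
Domain-0     0 34154     8  r-----   3871.7
guestA       1  1024     1  -b----  21000.5
guestB       2  1024     1  -b----    803.8
guestC            1024     1              0.0
"""

flagged = near_threshold(parse_xm_list(sample))
print(flagged)
```

Something like this could be run from cron against `xm list` output so a domU can be restarted on my schedule instead of crashing on its own.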

It's aggravating when the servers die like that, but ZFS is there helping out,
so that's nice.
No idea if any of this makes sense; it's late, and I'm not too concerned about
it anymore.  However, any help would be great!
Thanks,
Jack
-- 
This message posted from opensolaris.org
_______________________________________________
xen-discuss mailing list
[email protected]
