On Friday, 21.12.2012, at 07:46 -0500, EricD wrote:
> I have oVirt 3.1 with two nodes, one of them is running good without
> restarting the other keep restarting, the maximum uptime that the
> server can get is 10 days before it restart, I think that it might be
> something related to the disk.
>
> FYI, the disk are 2 disk of 1TB (RAID-0) to get 2TB.
>
> # /var/log/messages
> Dec 21 05:28:44 hypervisor01a ntpd[945]: 0.0.0.0 c61c 0c clock_step
> +17997.588918 s
> Dec 21 05:28:44 hypervisor01a ntpd[945]: 0.0.0.0 c614 04 freq_mode
> Dec 21 05:28:45 hypervisor01a kdump: No crashkernel parameter
> specified for running kernel
> Dec 21 05:28:45 hypervisor01a kdumpctl[1366]: Starting kdump:
> Dec 21 05:28:45 hypervisor01a kdump: failed to start up
> Dec 21 05:28:45 hypervisor01a systemd[1]: kdump.service: main process
> exited, code=exited, status=1
> Dec 21 05:28:45 hypervisor01a systemd[1]: Unit kdump.service entered
> failed state.
> Dec 21 05:28:45 hypervisor01a systemd[1]: Startup finished in 888ms
> 157us (kernel) + 2s 521ms 289us (initrd) + 15s 577ms 672us (userspace)
> = 18s 987ms 118us.
> Dec 21 05:28:45 hypervisor01a ntpd[945]: 0.0.0.0 c618 08 no_sys_peer
> Dec 21 05:29:04 hypervisor01a vdsm TaskManager.Task ERROR
> Task=`5f51ff52-f9a4-4854-a41d-d5d33c872458`::Unexpected error
> Dec 21 05:29:04 hypervisor01a vdsm Storage.Dispatcher.Protect ERROR
> {'status': {'message': "Unknown pool id, pool not connected:
> ('dbb49db6-9a24-4395-a8bd-c9f222eaecab',)", 'code': 309}}
> Dec 21 05:29:04 hypervisor01a vdsm TaskManager.Task ERROR
> Task=`7b0cf3b0-6d26-4421-a221-29f2ecaaeb1f`::Unexpected error
> Dec 21 05:29:04 hypervisor01a vdsm Storage.Dispatcher.Protect ERROR
> {'status': {'message': "Unknown pool id, pool not connected:
> ('dbb49db6-9a24-4395-a8bd-c9f222eaecab',)", 'code': 309}}
> Dec 21 05:29:04 hypervisor01a kernel: [ 37.944421] ata1: hard
> resetting link
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.247979] ata1: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.248802] ata1.00:
> configured for UDMA/133
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.248807] ata1: EH complete
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.249013] ata2: hard
> resetting link
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.553112] ata2: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.553881] ata2.00:
> configured for UDMA/133
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.553886] ata2: EH complete
> Dec 21 05:29:04 hypervisor01a kernel: [ 38.554064] ata3: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [ 38.858275] ata3: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [ 38.861154] ata3.00:
> configured for UDMA/133
> Dec 21 05:29:05 hypervisor01a kernel: [ 38.861159] ata3: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [ 38.861352] ata4: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.165397] ata4: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.168223] ata4.00:
> configured for UDMA/133
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.168229] ata4: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.168421] ata5: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.472459] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.480040] ata5.00:
> configured for UDMA/100
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.485478] ata5: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.485642] ata6: limiting
> SATA link speed to 1.5 Gbps
> Dec 21 05:29:05 hypervisor01a kernel: [ 39.485647] ata6: hard
> resetting link
> Dec 21 05:29:06 hypervisor01a kernel: [ 39.790610] ata6: SATA link
> down (SStatus 0 SControl 310)

Hey,

to me this looks like a generic SATA error. Did you try running a stock
Fedora on the affected machine to see whether it behaves similarly?

Greetings
fabian

> # RAID-0
> mdadm --detail /dev/md127
> /dev/md127:
> Version : 1.2
> Creation Time : Sun Nov 18 14:47:15 2012
> Raid Level : raid0
> Array Size : 1953524736 (1863.03 GiB 2000.41 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Update Time : Sun Nov 18 14:47:15 2012
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 512K
>
> Name : hypervisor01-a:0 (local to host hypervisor01-a)
> UUID : 9eb1324d:57eed46d:c23ae815:0666e238
> Events : 0
>
> Number   Major   Minor   RaidDevice   State
>    0      253      2         0        active sync   /dev/dm-2
>    1      253      3         1        active sync   /dev/dm-3
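
To see whether the link resets in the /var/log/messages excerpt quoted earlier in this message are limited to a single port, the kernel lines could be tallied per ATA port. This is only a minimal sketch: the function name `summarize_ata_events` is made up for illustration, the regular expression assumes the log format shown in the quote, and the sample lines are the quoted ata5/ata6 entries with the mail line-wrapping undone.

```python
import re
from collections import defaultdict

# Assumed line shape (taken from the quoted log, unwrapped):
#   "... kernel: [ 39.790610] ata6: SATA link down (SStatus 0 SControl 310)"
# Per-device lines such as "ata5.00: configured ..." are deliberately skipped.
LINK_RE = re.compile(
    r"kernel: \[\s*[\d.]+\] (ata\d+): "
    r"(hard resetting link|SATA link (?:up [\d.]+ Gbps|down))"
)


def summarize_ata_events(lines):
    """Return {port: [event, ...]} for link resets and link state changes."""
    events = defaultdict(list)
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            port, event = match.groups()
            events[port].append(event)
    return dict(events)


# Sample input: the ata5/ata6 lines from the quoted log, joined back
# into single lines (the mail client had wrapped them).
sample = [
    "Dec 21 05:29:05 hypervisor01a kernel: [ 39.168421] ata5: hard resetting link",
    "Dec 21 05:29:05 hypervisor01a kernel: [ 39.472459] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)",
    "Dec 21 05:29:05 hypervisor01a kernel: [ 39.480040] ata5.00: configured for UDMA/100",
    "Dec 21 05:29:05 hypervisor01a kernel: [ 39.485647] ata6: hard resetting link",
    "Dec 21 05:29:06 hypervisor01a kernel: [ 39.790610] ata6: SATA link down (SStatus 0 SControl 310)",
]

for port, port_events in sorted(summarize_ata_events(sample).items()):
    print(port, port_events)
```

Run against the full log on both nodes, a summary like this would make it easier to compare the restarting host with the healthy one, which is the same comparison the stock Fedora test is meant to provide.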
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users