On Friday, 21.12.2012, at 07:46 -0500, EricD wrote:
> I have oVirt 3.1 with two nodes. One of them runs fine without
> restarting; the other keeps restarting. The longest uptime that
> server reaches is 10 days before it restarts. I think it might be
> related to the disks.
> 
> FYI, the storage is two 1 TB disks in RAID-0, giving 2 TB.
> 
> # /var/log/messages
> Dec 21 05:28:44 hypervisor01a ntpd[945]: 0.0.0.0 c61c 0c clock_step
> +17997.588918 s
> Dec 21 05:28:44 hypervisor01a ntpd[945]: 0.0.0.0 c614 04 freq_mode
> Dec 21 05:28:45 hypervisor01a kdump: No crashkernel parameter
> specified for running kernel
> Dec 21 05:28:45 hypervisor01a kdumpctl[1366]: Starting kdump:
> Dec 21 05:28:45 hypervisor01a kdump: failed to start up
> Dec 21 05:28:45 hypervisor01a systemd[1]: kdump.service: main process
> exited, code=exited, status=1
> Dec 21 05:28:45 hypervisor01a systemd[1]: Unit kdump.service entered
> failed state.
> Dec 21 05:28:45 hypervisor01a systemd[1]: Startup finished in 888ms
> 157us (kernel) + 2s 521ms 289us (initrd) + 15s 577ms 672us (userspace)
> = 18s 987ms 118us.
> Dec 21 05:28:45 hypervisor01a ntpd[945]: 0.0.0.0 c618 08 no_sys_peer
> Dec 21 05:29:04 hypervisor01a vdsm TaskManager.Task ERROR
> Task=`5f51ff52-f9a4-4854-a41d-d5d33c872458`::Unexpected error
> Dec 21 05:29:04 hypervisor01a vdsm Storage.Dispatcher.Protect ERROR
> {'status': {'message': "Unknown pool id, pool not connected:
> ('dbb49db6-9a24-4395-a8bd-c9f222eaecab',)", 'code': 309}}
> Dec 21 05:29:04 hypervisor01a vdsm TaskManager.Task ERROR
> Task=`7b0cf3b0-6d26-4421-a221-29f2ecaaeb1f`::Unexpected error
> Dec 21 05:29:04 hypervisor01a vdsm Storage.Dispatcher.Protect ERROR
> {'status': {'message': "Unknown pool id, pool not connected:
> ('dbb49db6-9a24-4395-a8bd-c9f222eaecab',)", 'code': 309}}
> Dec 21 05:29:04 hypervisor01a kernel: [   37.944421] ata1: hard
> resetting link
> Dec 21 05:29:04 hypervisor01a kernel: [   38.247979] ata1: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:04 hypervisor01a kernel: [   38.248802] ata1.00:
> configured for UDMA/133
> Dec 21 05:29:04 hypervisor01a kernel: [   38.248807] ata1: EH complete
> Dec 21 05:29:04 hypervisor01a kernel: [   38.249013] ata2: hard
> resetting link
> Dec 21 05:29:04 hypervisor01a kernel: [   38.553112] ata2: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:04 hypervisor01a kernel: [   38.553881] ata2.00:
> configured for UDMA/133
> Dec 21 05:29:04 hypervisor01a kernel: [   38.553886] ata2: EH complete
> Dec 21 05:29:04 hypervisor01a kernel: [   38.554064] ata3: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [   38.858275] ata3: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [   38.861154] ata3.00:
> configured for UDMA/133
> Dec 21 05:29:05 hypervisor01a kernel: [   38.861159] ata3: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [   38.861352] ata4: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [   39.165397] ata4: SATA link
> up 3.0 Gbps (SStatus 123 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [   39.168223] ata4.00:
> configured for UDMA/133
> Dec 21 05:29:05 hypervisor01a kernel: [   39.168229] ata4: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [   39.168421] ata5: hard
> resetting link
> Dec 21 05:29:05 hypervisor01a kernel: [   39.472459] ata5: SATA link
> up 1.5 Gbps (SStatus 113 SControl 300)
> Dec 21 05:29:05 hypervisor01a kernel: [   39.480040] ata5.00:
> configured for UDMA/100
> Dec 21 05:29:05 hypervisor01a kernel: [   39.485478] ata5: EH complete
> Dec 21 05:29:05 hypervisor01a kernel: [   39.485642] ata6: limiting
> SATA link speed to 1.5 Gbps
> Dec 21 05:29:05 hypervisor01a kernel: [   39.485647] ata6: hard
> resetting link
> Dec 21 05:29:06 hypervisor01a kernel: [   39.790610] ata6: SATA link
> down (SStatus 0 SControl 310)

Hey,

To me this looks like a generic SATA error. Have you tried running a
stock Fedora on the affected machine to see whether it behaves similarly?

Greetings
fabian


> # RAID-0
> mdadm --detail /dev/md127
> /dev/md127:
>         Version : 1.2
>   Creation Time : Sun Nov 18 14:47:15 2012
>      Raid Level : raid0
>      Array Size : 1953524736 (1863.03 GiB 2000.41 GB)
>    Raid Devices : 2
>   Total Devices : 2
>     Persistence : Superblock is persistent
> 
>     Update Time : Sun Nov 18 14:47:15 2012
>           State : clean 
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
> 
>      Chunk Size : 512K
> 
>            Name : hypervisor01-a:0  (local to host hypervisor01-a)
>            UUID : 9eb1324d:57eed46d:c23ae815:0666e238
>          Events : 0
> 
>     Number   Major   Minor   RaidDevice State
>        0     253        2        0      active sync   /dev/dm-2
>        1     253        3        1      active sync   /dev/dm-3

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
