----- Original Message -----
> From: "Simone Tiraboschi" <stira...@redhat.com>
> To: devel@ovirt.org
> Cc: "Fabian Deutsch" <fdeut...@redhat.com>
> Sent: Friday, May 29, 2015 1:44:02 PM
> Subject: [ovirt-devel] oVirt node 3.6 and CPU load indefinitely stuck on 100%
> while vdsmd indefinitely tries to
> restart
>
> Hi,
> I tried to have hosted-engine deploy the engine appliance over oVirt node.
> I think it will be quite a common scenario.
> I tried with an oVirt node build from yesterday.
>
> Unfortunately I'm not able to complete the setup because oVirt node got the
> CPU load indefinitely stuck on 100% and so it's almost unresponsive.
>
> The issue seems to be related to the vdsmd daemon, which couldn't really start
> and so it retries indefinitely using all the available CPU power (it also
> runs with niceness -20...).
>
> [root@node36 admin]# grep "Unit vdsmd.service entered failed state."
> /var/log/messages | wc -l
> 368
> It tried 368 times in a row in a few minutes.
>
> With journalctl I can read:
> May 29 10:06:45 node36 systemd[1]: Unit vdsmd.service entered failed state.
> May 29 10:06:45 node36 systemd[1]: vdsmd.service holdoff time over,
> scheduling restart.
> May 29 10:06:45 node36 systemd[1]: Stopping Virtual Desktop Server Manager...
> May 29 10:06:45 node36 systemd[1]: Starting Virtual Desktop Server Manager...
> May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running mkdirs
> May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> configure_coredump
> May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> configure_vdsm_logs
> May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> wait_for_network
> May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> run_init_hooks
> May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> upgraded_version_check
> May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> check_is_configured
> May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> validate_configuration
> May 29 10:06:47 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> prepare_transient_repository
> May 29 10:06:49 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> syslog_available
> May 29 10:06:49 node36 vdsmd_init_common.sh[13697]: vdsm: Running nwfilter
> May 29 10:06:50 node36 vdsmd_init_common.sh[13697]: vdsm: Running dummybr
> May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running
> load_needed_modules
> May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running tune_system
> May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running test_space
> May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running test_lo
> May 29 10:06:51 node36 systemd[1]: Started Virtual Desktop Server Manager.
> May 29 10:06:51 node36 systemd[1]: vdsmd.service: main process exited,
> code=exited, status=1/FAILURE
> May 29 10:06:51 node36 vdsmd_init_common.sh[13821]: vdsm: Running
> run_final_hooks
> May 29 10:06:52 node36 systemd[1]: Unit vdsmd.service entered failed state.
> May 29 10:06:52 node36 systemd[1]: vdsmd.service holdoff time over,
> scheduling restart.
> May 29 10:06:52 node36 systemd[1]: Stopping Virtual Desktop Server Manager...
> May 29 10:06:52 node36 systemd[1]: Starting Virtual Desktop Server Manager...
> repeated a lot of times
>
> /var/log/vdsm/vdsm.log is empty.
>
> while
> [root@node36 admin]# /usr/share/vdsm/daemonAdapter -0 /dev/null -1 /dev/null
> -2 /dev/null /usr/share/vdsm/vdsm; echo $?
> 1

Can you try to run vdsm manually from the shell?

# /usr/share/vdsm/vdsm

Typically you would see a Python traceback explaining the failure.

Nir
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel
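
A rough illustration of why running vdsm in the foreground helps: the daemonAdapter command in the quoted report appears to point all three standard streams at /dev/null, so any traceback is discarded and only the exit status 1 is left. The sketch below is not part of vdsm. The helper names run_and_capture and last_traceback are made up for this example, and the only path used is the /usr/share/vdsm/vdsm path from the thread. It runs a command, keeps stderr, and prints the last traceback found there.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=60):
    """Run cmd in the foreground and return (exit_status, stderr_text).

    Returns (None, "") if the command is still running after `timeout`
    seconds, which for a daemon usually means it started successfully.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return proc.returncode, proc.stderr


def last_traceback(stderr_text):
    """Return stderr from the last traceback header onward, or '' if none."""
    marker = "Traceback (most recent call last):"
    start = stderr_text.rfind(marker)
    return stderr_text[start:] if start != -1 else ""


if __name__ == "__main__":
    # Default command is the path quoted in the thread; pass a different
    # command on the command line to try the helper on another program.
    cmd = sys.argv[1:] or ["/usr/share/vdsm/vdsm"]
    status, err = run_and_capture(cmd)
    print("exit status:", status)
    print(last_traceback(err) or err)
```

Run as root on the node, this should report the same kind of non-zero status seen above together with whatever the process wrote to stderr. If nothing shows up there either, the failure probably happens before Python gets to print a traceback.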