There wasn't any useful info in syslog, only a message that a Python exception had occurred. I had to track down the Python abort file to find that vdsm was getting permission errors on the log file.
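For anyone hitting the same symptom, here is a minimal sketch of the ownership check Joop suggests in the quoted thread. The path /var/log/vdsm/vdsm.log and the expected owner vdsm:kvm come from his reply; the helper names `ownership` and `check_owner` are made up for illustration and are not part of vdsm.

```python
import grp
import os
import pwd


def ownership(path):
    """Return (user, group) names owning path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def check_owner(path, want_user, want_group):
    """Return None if ownership matches, else a description of the mismatch."""
    user, group = ownership(path)
    if (user, group) == (want_user, want_group):
        return None
    return "%s is %s:%s, expected %s:%s" % (
        path, user, group, want_user, want_group)


if __name__ == "__main__":
    # Path and expected owner are taken from Joop's reply in this thread.
    log = "/var/log/vdsm/vdsm.log"
    if os.path.exists(log):
        print(check_owner(log, "vdsm", "kvm") or "%s ownership looks fine" % log)
    else:
        print("%s not found" % log)
```

If this reports root:root, restoring the owner as root with `chown vdsm:kvm /var/log/vdsm/vdsm.log` should be enough, assuming nothing else on the host has been changed.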
On Sun, Apr 14, 2013 at 3:28 AM, Yaniv Bronheim <ybron...@redhat.com> wrote:

> If only you would accept http://gerrit.ovirt.org/#/c/10313, Tony could
> manage to check the syslog for reports and fix it much faster.. :)
> Both patches should be backported IMHO
>
> Thanks,
> Yaniv.
>
> ----- Original Message -----
> > From: "Dan Kenigsberg" <dan...@redhat.com>
> > To: "Tony Feldmann" <trfeldm...@gmail.com>, "Yaniv Bronheim" <ybron...@redhat.com>
> > Cc: "Joop" <jvdw...@xs4all.nl>, us...@ovirt.org, vdsm-de...@fedorahosted.org
> > Sent: Friday, April 12, 2013 12:33:07 AM
> > Subject: Re: [Users] vdsm unresponsive with python exception
> >
> > On Thu, Apr 11, 2013 at 03:51:07PM -0500, Tony Feldmann wrote:
> > > That was the issue. Found out yesterday that vdsm.log was somehow
> > > changed to root:root. Just now got a chance to put it back on the
> > > mailing list. How does the ownership of that file get changed? When
> > > the issue occurred I am certain there was no one on the system.
> >
> > http://gerrit.ovirt.org/#/c/12940/ (Separating supervdsm log to
> > supervdsm.log file) solves the issue. Unfortunately, only on the master
> > branch of vdsm.
> >
> > I think that this is a nasty issue that has to be backported to the
> > ovirt-3.2 branch as well, and merits to be part of ovirt-3.2.2.
> >
> > Regards,
> > Dan.
> >
> > > On Thu, Apr 11, 2013 at 2:15 PM, Joop <jvdw...@xs4all.nl> wrote:
> > >
> > > > Dan Kenigsberg wrote:
> > > >
> > > >> On Wed, Apr 10, 2013 at 08:59:01AM -0500, Tony Feldmann wrote:
> > > >>
> > > >>> I am having a strange issue in my ovirt cluster. I have 2 hosts,
> > > >>> 1 running engine and added as a host and one other system added
> > > >>> as a host. Both systems are running gluster across local disks
> > > >>> for shared storage. Everything was working fine until last night,
> > > >>> where my system that is also running the engine went unresponsive
> > > >>> in the admin page. All vms were still running that were on the
> > > >>> host. I shut down the vms that were on the host from within the
> > > >>> guest os as I was not able to do anything to the vm with the host
> > > >>> in unresponsive state. After getting the vms off and rebooting
> > > >>> the host, the vdsmd service says that it is running, but it
> > > >>> continually restarts the vdsm process and dumps out these
> > > >>> messages: detected unhandled Python exception in
> > > >>> '/usr/share/vdsm/vdsm'. All services say they are up and running
> > > >>> but the host stays in unresponsive state and the vdsm process
> > > >>> keeps respawning. There is also no data in the vdsm.log. Can
> > > >>> anyone shed any light on this for me?
> > > >>
> > > >> vdsm-de...@fedorahosted.org may be a better place to ask
> > > >> vdsm-specific questions.
> > > >>
> > > >> Could you log into the non-operational host as root, and stop the
> > > >> vdsm service.
> > > >>
> > > >> Then become the vdsm user with
> > > >>
> > > >>     su -s /bin/bash - vdsm
> > > >>
> > > >> and run /usr/share/vdsm/vdsm manually. Do you see anything in
> > > >> particular?
> > > >>
> > > > Please have a look at the permissions/owner of
> > > > /var/log/vdsm/vdsm.log. Should be vdsm:kvm and not root:root
> > > >
> > > > Joop
> > > >
> > > _______________________________________________
> > > Users mailing list
> > > us...@ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users