Hi all,

I was working through the installation of ovirt-engine today (after spending more time than I care to admit struggling with networking & DNS issues - VPNs, dnsmasq, "classic" network start-up and iptables/firewall rules can interact with each other in strange and surprising ways).

Anyway - I went through the engine set-up successfully, and got the expected message at the end: "**** Installation completed successfully ******" with a message to visit the engine web application to finish set-up.

Unfortunately, when I connected (after resolving networking issues) to the server in question, I got a "Service temporarily unavailable" error (503) from Apache.

In httpd's error.log, I have:
 [Fri Sep 21 13:37:03 2012] [error] (111)Connection refused: proxy: AJP: attempt to connect to 127.0.0.1:8009 (localhost) failed
 [Fri Sep 21 13:37:03 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
 [Fri Sep 21 13:37:03 2012] [error] proxy: AJP: failed to make connection to backend: localhost
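
The "Connection refused" on 127.0.0.1:8009 suggests that nothing was listening on the AJP port when Apache tried to proxy the request. A minimal sketch for checking that directly, assuming Python is available on the engine host (the function name port_is_listening is only illustrative):

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (errno 111) as well as timeouts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:8009 is the AJP backend address from the Apache error log.
    state = "listening" if port_is_listening("127.0.0.1", 8009) else "not listening"
    print(f"AJP backend 127.0.0.1:8009 is {state}")
```

If this reports "not listening", the 503 is most likely just a symptom of the engine service not running, rather than an Apache configuration problem.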



When I try to restart the ovirt-engine service, I get the following in journalctl:
 Sep 21 13:34:44 clare.neary.home engine-service.py[5172]: The engine PID file "/var/run/ovirt-engine.pid" already exists.
 Sep 21 13:34:44 clare.neary.home systemd[1]: PID 1264 read from file /var/run/ovirt-engine.pid does not exist.
 Sep 21 13:34:44 clare.neary.home systemd[1]: Unit ovirt-engine.service entered failed state.
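
The journal output above describes a PID file that names a process which no longer exists. A rough sketch for confirming that a PID file is stale, assuming a Linux host and Python (the helper names are hypothetical, and whether removing a stale file is the right remedy here is not something this check can tell you):

```python
import os
from typing import Optional


def read_pid(pid_file: str) -> Optional[int]:
    """Return the PID stored in pid_file, or None if unreadable or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_stale(pid_file: str) -> bool:
    """True if the file exists but does not name a running process."""
    if not os.path.exists(pid_file):
        return False
    pid = read_pid(pid_file)
    return pid is None or not pid_is_running(pid)


if __name__ == "__main__":
    # Path taken from the journal messages above.
    path = "/var/run/ovirt-engine.pid"
    print(f"{path} stale: {pid_file_is_stale(path)}")
```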



I tried to clean up and restart, but engine-cleanup failed:
[root@clare ovirt-engine]# engine-cleanup -u

Stopping JBoss service...                                [ DONE ]

Error: Couldn't connect to the database server.Check that connection is working and rerun the cleanup utility
Error: Cleanup failed.
please check log at /var/log/ovirt-engine/engine-cleanup_2012_09_21_14_02_37.log



It turns out, in /var/log/messages, that I have these error messages:
Sep 21 14:00:59 clare pg_ctl[5298]: FATAL:  could not create shared memory segment: Invalid argument
Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL:  Failed system call was shmget(key=5432001, size=36519936, 03600).
Sep 21 14:00:59 clare pg_ctl[5298]: HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter.  You can either reduce the request size or reconfigure the kernel with larger SHMMAX.  To reduce the request size (currently 36519936 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation contains more information about shared memory configuration.
Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
Sep 21 14:01:03 clare systemd[1]: postgresql.service: control process exited, code=exited status=1
Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered failed state.

I increased the kernel's SHMMAX, and engine-cleanup worked correctly.

Has anyone else experienced this issue?
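
For anyone wanting to check whether they are exposed to the same PostgreSQL failure, here is a small sketch that compares the kernel's current SHMMAX with the request size from the log above (36519936 bytes). It assumes a Linux host where the limit is readable at /proc/sys/kernel/shmmax; the function names are illustrative only.

```python
# Request size reported by PostgreSQL in the shmget() failure above.
POSTGRES_REQUEST_BYTES = 36519936


def read_shmmax(path: str = "/proc/sys/kernel/shmmax") -> int:
    """Read the kernel's maximum shared memory segment size, in bytes."""
    with open(path) as handle:
        return int(handle.read().strip())


def shmmax_is_sufficient(request_bytes: int, shmmax_bytes: int) -> bool:
    """True if a segment of request_bytes fits within the SHMMAX limit."""
    return shmmax_bytes >= request_bytes


if __name__ == "__main__":
    try:
        current = read_shmmax()
    except OSError as exc:
        print(f"Could not read SHMMAX: {exc}")
    else:
        if shmmax_is_sufficient(POSTGRES_REQUEST_BYTES, current):
            print(f"SHMMAX ({current}) covers the {POSTGRES_REQUEST_BYTES}-byte request")
        else:
            # kernel.shmmax is the sysctl key behind this limit; it can be
            # raised with "sysctl -w kernel.shmmax=<bytes>" as root.
            print(f"SHMMAX ({current}) is below the {POSTGRES_REQUEST_BYTES}-byte request")
```

A value set with sysctl -w does not survive a reboot unless it is also added to the sysctl configuration.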


When I re-ran engine-setup, I also got stuck when reconfiguring NFS: when engine-setup asked me if I wanted to configure the NFS domain, I said "yes", but then it refused to accept my input of "/mnt/iso" since it was already in /etc/exports. Perhaps engine-cleanup should also remove ISO shares managed by ovirt-engine, or else engine-setup should handle an existing export more gracefully? The only fix I found was to interrupt and restart the engine set-up.
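
To see in advance whether a path would collide with an existing export, something like the following sketch could be run before engine-setup. It is deliberately simplified: it assumes one export per line in /etc/exports, ignores quoted or escaped paths and line continuations, and the function names are hypothetical. It is not how engine-setup itself performs the check.

```python
import os
from typing import List


def exported_paths(exports_text: str) -> List[str]:
    """Return the first field (the exported path) of each non-comment line."""
    paths = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            paths.append(line.split()[0])
    return paths


def is_already_exported(exports_text: str, path: str) -> bool:
    """True if path (ignoring a trailing slash) already appears as an export."""
    wanted = path.rstrip("/") or "/"
    return any((p.rstrip("/") or "/") == wanted for p in exported_paths(exports_text))


if __name__ == "__main__":
    exports_file = "/etc/exports"
    if os.path.exists(exports_file):
        with open(exports_file) as handle:
            text = handle.read()
        # "/mnt/iso" is the path that engine-setup rejected above.
        print("/mnt/iso already exported:", is_already_exported(text, "/mnt/iso"))
    else:
        print(f"{exports_file} not found")
```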

Also, I have no idea whether allowing oVirt to manage iptables will keep the extra rules I have added to the iptables config (specifically for DNS services on port 53 UDP). I didn't take the risk of allowing it to reconfigure iptables the second time.

After all that, I got an error when starting the JBoss service:

Starting JBoss Service...                             [ ERROR ]
Error: Can't start the ovirt-engine service
Please check log file /var/log/ovirt-engine/engine-setup_2012_09_21_14_28_11.log for more information

And when I checked that log file:
2012-09-21 14:30:02::DEBUG::common_utils::790::root:: starting ovirt-engine
2012-09-21 14:30:02::DEBUG::common_utils::835::root:: executing action ovirt-engine on service start
2012-09-21 14:30:02::DEBUG::common_utils::309::root:: Executing command --> '/sbin/service ovirt-engine start'
2012-09-21 14:30:02::DEBUG::common_utils::335::root:: output =
2012-09-21 14:30:02::DEBUG::common_utils::336::root:: stderr = Redirecting to /bin/systemctl start  ovirt-engine.service
Job failed. See system journal and 'systemctl status' for details.

2012-09-21 14:30:02::DEBUG::common_utils::337::root:: retcode = 1
2012-09-21 14:30:02::DEBUG::setup_sequences::62::root:: Traceback (most recent call last):
  File "/usr/share/ovirt-engine/scripts/setup_sequences.py", line 60, in run
    function()
  File "/bin/engine-setup", line 1535, in _startJboss
    srv.start(True)
  File "/usr/share/ovirt-engine/scripts/common_utils.py", line 795, in start
    raise Exception(output_messages.ERR_FAILED_START_SERVICE % self.name)
Exception: Error: Can't start the ovirt-engine service

And when I check the system journal, I see the same thing as before: the service starts, but the PID recorded in the PID file does not exist.

Any pointers on how I might debug this issue? I haven't found anything similar on a troubleshooting page, so perhaps it's not a common error?

Cheers,
Dave.




--
Dave Neary
Community Action and Impact
Open Source and Standards, Red Hat
Ph: +33 9 50 71 55 62 / Cell: +33 6 77 01 92 13
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
