Hi all,
I was working through the installation of ovirt-engine today (after
spending more time than I care to admit struggling with networking & DNS
issues - VPNs, dnsmasq, "classic" network start-up and iptables/firewall
rules can interact with each other in strange and surprising ways).
Anyway - I went through the engine set-up successfully, and got the
expected message at the end: "**** Installation completed successfully
******" with a message to visit the engine web application to finish set-up.
Unfortunately, when I connected (after resolving networking issues) to
the server in question, I got a "Service temporarily unavailable" error
(503) from Apache.
In httpd's error.log, I have:
[Fri Sep 21 13:37:03 2012] [error] (111)Connection refused: proxy: AJP:
attempt to connect to 127.0.0.1:8009 (localhost) failed
[Fri Sep 21 13:37:03 2012] [error] ap_proxy_connect_backend disabling worker
for (localhost)
[Fri Sep 21 13:37:03 2012] [error] proxy: AJP: failed to make connection to
backend: localhost
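The "Connection refused" on 127.0.0.1:8009 suggests that nothing is listening on the AJP port, i.e. the engine's application server is not running. A minimal sketch of how one could confirm that independently of Apache, assuming Python 3 is available on the host (the function name port_is_open is hypothetical, only the address and port come from the log above):

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:8009 is the AJP address from the Apache error log.
    if port_is_open("127.0.0.1", 8009):
        print("Something is accepting connections on 127.0.0.1:8009")
    else:
        print("Nothing is accepting connections on 127.0.0.1:8009")
```

If this reports that nothing is accepting connections, the 503 is a symptom of the backend being down rather than an Apache misconfiguration.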
When I try to restart the ovirt-engine service, I get the following in
journalctl:
Sep 21 13:34:44 clare.neary.home engine-service.py[5172]: The engine PID file
"/var/run/ovirt-engine.pid" already exists.
Sep 21 13:34:44 clare.neary.home systemd[1]: PID 1264 read from file
/var/run/ovirt-engine.pid does not exist.
Sep 21 13:34:44 clare.neary.home systemd[1]: Unit ovirt-engine.service entered
failed state.
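The two journal messages together point at a stale PID file: the file exists, but the process it names is gone. A small sketch of how that could be checked, assuming Python 3 (the helper name pid_file_is_stale is hypothetical; the path is the one from the journal output above):

```python
import os


def pid_file_is_stale(path: str) -> bool:
    """Return True if path is a PID file whose process no longer exists."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or malformed file: nothing to call stale.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, it sends nothing
    except ProcessLookupError:
        return True
    except PermissionError:
        # The process exists but belongs to another user.
        return False
    return False


if __name__ == "__main__":
    pid_file = "/var/run/ovirt-engine.pid"  # path taken from the journal
    print(f"{pid_file} stale: {pid_file_is_stale(pid_file)}")
```

If the file turns out to be stale, removing it before restarting the service is a plausible next step, but that is an assumption on my part and not something the logs confirm.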
I tried to clean up and restart, but engine-cleanup failed:
[root@clare ovirt-engine]# engine-cleanup -u
Stopping JBoss service... [ DONE ]
Error: Couldn't connect to the database server.Check that connection is working
and rerun the cleanup utility
Error: Cleanup failed.
please check log at /var/log/ovirt-engine/engine-cleanup_2012_09_21_14_02_37.log
It turns out, in /var/log/messages, that I have these error messages:
Sep 21 14:00:59 clare pg_ctl[5298]: FATAL: could not create shared memory
segment: Invalid argument
Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL: Failed system call was
shmget(key=5432001, size=36519936, 03600).
Sep 21 14:00:59 clare pg_ctl[5298]: HINT: This error usually means that
PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX
parameter. You can either reduce the request size or reconfigure the kernel
with larger SHMMAX. To reduce the request size (currently 36519936 bytes),
reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or
max_connections.
Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already small, it's
possible that it is less than your kernel's SHMMIN parameter, in which case
raising the request size or reconfiguring SHMMIN is called for.
Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation contains more
information about shared memory configuration.
Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
Sep 21 14:01:03 clare systemd[1]: postgresql.service: control process exited,
code=exited status=1
Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered failed state.
I increased the kernel's SHMMAX, and engine-cleanup worked correctly.
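For anyone who wants to check this on their own machine, here is a rough sketch that compares the kernel's current SHMMAX with the size PostgreSQL asked for. It assumes Linux (where the limit is exposed in /proc/sys/kernel/shmmax) and Python 3; the function names are hypothetical, and the request size is the one from the shmget() call in the log above:

```python
from pathlib import Path

SHMMAX_PATH = Path("/proc/sys/kernel/shmmax")  # Linux-specific location


def read_shmmax(path: Path = SHMMAX_PATH) -> int:
    """Return the kernel's current SHMMAX in bytes."""
    return int(path.read_text().strip())


def shmmax_is_sufficient(requested_bytes: int, shmmax_bytes: int) -> bool:
    """Return True if a segment of requested_bytes fits within SHMMAX."""
    return shmmax_bytes >= requested_bytes


if __name__ == "__main__":
    requested = 36519936  # size from the failed shmget() call
    try:
        current = read_shmmax()
    except (OSError, ValueError):
        print(f"Could not read {SHMMAX_PATH}")
    else:
        if shmmax_is_sufficient(requested, current):
            print(f"SHMMAX ({current}) covers the request ({requested})")
        else:
            print(f"SHMMAX ({current}) is smaller than the request ({requested})")
```

If the limit is too small, it can be raised at runtime with something like "sysctl -w kernel.shmmax=<bytes>" and made persistent in /etc/sysctl.conf; which value is appropriate depends on the PostgreSQL configuration.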
Has anyone else experienced this issue?
When I re-ran engine-setup, I also got stuck when reconfiguring NFS:
when engine-setup asked me if I wanted to configure the NFS domain, I
said "yes", but then it refused to accept my input of "/mnt/iso" since
it was already in /etc/exports. Perhaps engine-cleanup should also
remove ISO shares managed by ovirt-engine, or engine-setup should handle
an existing export more gracefully? The only workaround I found
was to interrupt and restart the engine set-up.
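To see in advance whether a directory is already listed in /etc/exports, something along these lines could be used before re-running the set-up. It is a simplified sketch, assuming Python 3: it only looks at the first field of each line and does not handle quoted paths or line continuations, and the function name is_already_exported is hypothetical:

```python
def is_already_exported(exports_text: str, directory: str) -> bool:
    """Return True if directory appears as an export path in exports_text."""
    target = directory.rstrip("/") or "/"
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        exported = line.split()[0].rstrip("/") or "/"
        if exported == target:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/etc/exports") as handle:
            text = handle.read()
    except OSError:
        text = ""
    # "/mnt/iso" is the path that engine-setup rejected.
    print("/mnt/iso already exported:", is_already_exported(text, "/mnt/iso"))
```

This does not say anything about how engine-setup itself performs the check; it only mirrors the observed behaviour of rejecting a path that is already present.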
Also, I have no idea whether allowing oVirt to manage iptables will keep
the extra rules I have added to the iptables config (specifically for
DNS services on port 53 UDP). I didn't take the risk of
allowing it to reconfigure iptables the second time.
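One low-risk way to find out would be to back up the saved rules, let the set-up reconfigure iptables, and then check whether the custom rules survived. The sketch below does the checking part, assuming Python 3 and rules in iptables-save format; the path /etc/sysconfig/iptables is my assumption about where the saved rules live, and the function name rules_for_port is hypothetical:

```python
def rules_for_port(rules_text: str, protocol: str, port: int) -> list:
    """Return the '-A' rules that match the given protocol and --dport."""
    matches = []
    for line in rules_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] != "-A":
            continue  # skip comments, table headers and chain policies
        pairs = list(zip(tokens, tokens[1:]))
        has_protocol = ("-p", protocol) in pairs
        has_port = ("--dport", str(port)) in pairs
        if has_protocol and has_port:
            matches.append(line.strip())
    return matches


if __name__ == "__main__":
    rules_file = "/etc/sysconfig/iptables"  # assumed location of saved rules
    try:
        with open(rules_file) as handle:
            text = handle.read()
    except OSError:
        text = ""
    for rule in rules_for_port(text, "udp", 53):
        print(rule)
```

Running it before and after the reconfiguration and comparing the output would show whether the port 53 UDP rules were kept.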
After all that, I got an error when starting the JBoss service:
Starting JBoss Service... [ ERROR ]
Error: Can't start the ovirt-engine service
Please check log file
/var/log/ovirt-engine/engine-setup_2012_09_21_14_28_11.log for more information
And when I checked that log file:
2012-09-21 14:30:02::DEBUG::common_utils::790::root:: starting ovirt-engine
2012-09-21 14:30:02::DEBUG::common_utils::835::root:: executing action
ovirt-engine on service start
2012-09-21 14:30:02::DEBUG::common_utils::309::root:: Executing command -->
'/sbin/service ovirt-engine start'
2012-09-21 14:30:02::DEBUG::common_utils::335::root:: output =
2012-09-21 14:30:02::DEBUG::common_utils::336::root:: stderr = Redirecting to
/bin/systemctl start ovirt-engine.service
Job failed. See system journal and 'systemctl status' for details.
2012-09-21 14:30:02::DEBUG::common_utils::337::root:: retcode = 1
2012-09-21 14:30:02::DEBUG::setup_sequences::62::root:: Traceback (most recent
call last):
File "/usr/share/ovirt-engine/scripts/setup_sequences.py", line 60, in run
function()
File "/bin/engine-setup", line 1535, in _startJboss
srv.start(True)
File "/usr/share/ovirt-engine/scripts/common_utils.py", line 795, in start
raise Exception(output_messages.ERR_FAILED_START_SERVICE % self.name)
Exception: Error: Can't start the ovirt-engine service
And when I check the system journal, we're back to the same failure: the
service starts, but the PID recorded in the PID file does not exist.
Any pointers on how I might debug this issue? I haven't found anything
similar on a troubleshooting page, so perhaps it's not a common error?
Cheers,
Dave.
--
Dave Neary
Community Action and Impact
Open Source and Standards, Red Hat
Ph: +33 9 50 71 55 62 / Cell: +33 6 77 01 92 13