Hi everyone,

First of all, I'd like to say that we have been using oVirt successfully for more than a year, building an automated deployment system with the help of Foreman and Puppet.

That being said, we're currently facing our first serious problem, and we'd appreciate some help.

Everything was working fine until we exhausted the default MacPoolRanges. While looking for a solution to the error message, we found this document: http://www.ovirt.org/Engine_config_examples#MacPoolRanges

Following its instructions, we ran the commands below.

First we found out what our current pool was:

# engine-config -g MacPoolRanges
MacPoolRanges: 00:1a:4a:24:26:00-00:1a:4a:24:26:ff version: general

So we proceeded to expand it:

# engine-config -s "MacPoolRanges=00:1a:4a:24:26:00-00:1a:4a:24:27:ff"
# service ovirt-engine restart

After this we were able to create new machines, but none of them seemed to have network connectivity.

After some unsuccessful troubleshooting we restored the original pool and instead added a new one:

# engine-config -s "MacPoolRanges=00:1a:4a:24:26:00-00:1a:4a:24:26:ff"
# service ovirt-engine restart

# engine-config -s "MacPoolRanges=00:1a:4a:24:26:00-00:1a:4a:24:26:ff,00:1a:4a:24:27:00-00:1a:4a:24:27:ff"
# service ovirt-engine restart
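
As a sanity check on the values above, a short script can count the addresses in a MacPoolRanges value and confirm that its ranges do not overlap. This is only a sketch in plain Python; the helper names (mac_to_int, parse_ranges, total_addresses, has_overlap) are made up for illustration and are not part of engine-config or oVirt.

```python
from typing import List, Tuple


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address into an integer."""
    return int(mac.replace(":", ""), 16)


def parse_ranges(value: str) -> List[Tuple[int, int]]:
    """Parse a comma-separated list of 'start-end' MAC ranges."""
    ranges = []
    for part in value.split(","):
        start, end = part.strip().split("-")
        low, high = mac_to_int(start), mac_to_int(end)
        if low > high:
            raise ValueError("range start is above range end: " + part)
        ranges.append((low, high))
    return ranges


def total_addresses(ranges: List[Tuple[int, int]]) -> int:
    """Count the addresses covered, assuming the ranges do not overlap."""
    return sum(high - low + 1 for low, high in ranges)


def has_overlap(ranges: List[Tuple[int, int]]) -> bool:
    """Return True if any two ranges share at least one address."""
    ordered = sorted(ranges)
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    value = ("00:1a:4a:24:26:00-00:1a:4a:24:26:ff,"
             "00:1a:4a:24:27:00-00:1a:4a:24:27:ff")
    ranges = parse_ranges(value)
    print("addresses:", total_addresses(ranges))
    print("overlap:", has_overlap(ranges))
```

By this count, the single expanded range and the two-range value cover the same 512 addresses, so the two configurations should be equivalent in terms of pool size.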

After doing this, we tested creating a new host and everything seemed to work fine.

The problem is that, after some hosts were created successfully, the original problem reappeared: new hosts did not seem to have network connectivity.

Trying to narrow down the problem, this is what we've found so far:

This oVirt environment kickstarts hosts via PXE. When a new host tries to PXE boot, the DHCP process fails with a timeout.

Tracing the network packets, we can see that the virtual host sends the DHCP request, and the DHCP server receives it, acknowledges it and sends the DHCP offer back. The DHCP offer reaches the vnetxx network interface on the hypervisor, BUT it goes no further and never reaches the virtual host. This behavior is consistent across different hypervisors and VLANs, including ones that were created and in use before the problems appeared.

The only pattern we've been able to identify so far comes from the output of the command "brctl showmacs <bridge_name>".

This command lists the MAC addresses known to the bridge, per port. In the cases where everything works fine, the output looks like this:

port no mac addr                is local?       ageing timer
2       00:1a:4a:24:27:e0       no                 0.01
2       fe:1a:4a:24:27:e0       yes                0.00

The virtual host's MAC address begins with "00", and it has a corresponding address beginning with "fe", which is assigned to the "vnetxx" interface on the hypervisor.
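
Going by the output above, the address on the vnetxx interface looks like the guest's MAC with the first octet replaced by "fe". The following Python sketch derives the address we would expect to see on the hypervisor side from a guest MAC. The function name expected_tap_mac is ours, and the rule itself is an assumption drawn from this output, not something we have confirmed in the oVirt documentation.

```python
def expected_tap_mac(guest_mac: str) -> str:
    """Return the guest MAC with its first octet replaced by 'fe'.

    Assumption: the vnetxx interface reuses the guest MAC with the
    first octet set to fe, as the brctl output above suggests.
    """
    octets = guest_mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError("not a 6-octet MAC address: " + guest_mac)
    return ":".join(["fe"] + octets[1:])


if __name__ == "__main__":
    print(expected_tap_mac("00:1a:4a:24:27:e0"))
```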

In the cases where the virtual host doesn't get the DHCP answers, the output of "brctl showmacs <bridge_name>" is:

port no mac addr                is local?       ageing timer
6       fe:1a:4a:24:27:a0       yes                0.00

That is, the virtual host's actual MAC address is missing from the bridge.

We haven't been able to find a detailed explanation of how the network internals of oVirt are supposed to work, but hopefully someone on this list can point us to the right resource.
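
To spot affected ports on a busy bridge, the check can be automated. The sketch below is plain Python that reads the text output of "brctl showmacs" in the format shown above and reports every port that has only local entries, meaning no guest MAC was learned on it. The function name ports_missing_guest_mac is ours, and the parser assumes the four-column layout from our own output.

```python
from typing import Dict, List


def ports_missing_guest_mac(showmacs_output: str) -> List[int]:
    """Return port numbers that have no non-local (learned) MAC entry."""
    has_learned: Dict[int, bool] = {}
    for line in showmacs_output.splitlines():
        fields = line.split()
        # Data rows have four columns: port, MAC, is-local flag, ageing timer.
        # The header row splits into more fields and is skipped here.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port = int(fields[0])
        is_local = fields[2]
        has_learned.setdefault(port, False)
        if is_local == "no":
            has_learned[port] = True
    return sorted(port for port, ok in has_learned.items() if not ok)


if __name__ == "__main__":
    sample = (
        "port no mac addr                is local?       ageing timer\n"
        "2       00:1a:4a:24:27:e0       no                 0.01\n"
        "2       fe:1a:4a:24:27:e0       yes                0.00\n"
        "6       fe:1a:4a:24:27:a0       yes                0.00\n"
    )
    print(ports_missing_guest_mac(sample))
```

The output could be fed in from subprocess or from a saved file; a port showing up in the result matches the failing case described above.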

Thank you.

Xavier.