[
https://issues.apache.org/jira/browse/CLOUDSTACK-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263157#comment-15263157
]
Pierre-Luc Dion commented on CLOUDSTACK-9017:
---------------------------------------------
h3. Potential fix:
Update the VR configuration as follows:
1. Set a distinct hostname for each non-primary NIC, e.g. ```hostname-nic1```, to avoid
name conflicts in /etc/hosts and /etc/dhcphosts.txt.
2. By default the VM's DHCP client uses the gateway of the last interface to
obtain a DHCP lease, so the default gateway has to be forced on the DHCP server
side, on a per-VM basis. By default dnsmasq pushes a gateway in every DHCP
reply, so for a second NIC we would need to add entries like the following:
in /etc/hosts
{code}
10.178.53.11 multihome-nic1
10.178.52.190 multihome
{code}
in /etc/dhcphosts.txt
{code}
02:00:4e:3d:00:46,10.178.52.180,multihome,infinite
02:00:7e:80:00:16,10.178.53.11,multihome-nic1,infinite,set:multihome-nic1
{code}
in /etc/dnsmasq.d/non-default-nics.conf
{code}
dhcp-option=tag:multihome-nic1,3
dhcp-option=tag:multihome-nic1,6
{code}
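A minimal sketch of how the entries above could be generated for a VM with several NICs. The names used here (`Nic`, `nic_hostname`, `render_entries`) are hypothetical and this is not the actual VR code; it only reproduces the manual layout shown above, assuming the default NIC keeps the plain hostname and every other NIC gets a `-nicN` suffix plus a dnsmasq tag. Sending options 3 (router) and 6 (DNS server) with no value for that tag is what keeps dnsmasq from pushing a gateway and DNS server on the non-default NIC.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    mac: str
    ip: str
    is_default: bool


def nic_hostname(vm_name, index, nic):
    """Default NIC keeps the VM name; other NICs get a -nic<index> suffix."""
    return vm_name if nic.is_default else f"{vm_name}-nic{index}"


def render_entries(vm_name, nics):
    """Return (hosts, dhcphosts, dnsmasq_options) lines for one VM."""
    hosts, dhcphosts, options = [], [], []
    for index, nic in enumerate(nics):
        name = nic_hostname(vm_name, index, nic)
        hosts.append(f"{nic.ip} {name}")
        if nic.is_default:
            dhcphosts.append(f"{nic.mac},{nic.ip},{name},infinite")
        else:
            # Tag the non-default NIC so its DHCP options can be overridden.
            dhcphosts.append(f"{nic.mac},{nic.ip},{name},infinite,set:{name}")
            # Empty option 3 (router) and 6 (DNS server): send neither.
            options.append(f"dhcp-option=tag:{name},3")
            options.append(f"dhcp-option=tag:{name},6")
    return hosts, dhcphosts, options


if __name__ == "__main__":
    # MAC and IP values taken from the dhcphosts.txt example above.
    nics = [
        Nic("02:00:4e:3d:00:46", "10.178.52.180", True),
        Nic("02:00:7e:80:00:16", "10.178.53.11", False),
    ]
    for section in render_entries("multihome", nics):
        print("\n".join(section))
        print()
```

If the default NIC is changed in CloudStack, the same function would simply be re-run with the `is_default` flags swapped, which moves the tag and the empty options to the other NIC.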
We should also consider that the default NIC can be changed in CloudStack, which
would require changing the default gateway for the VM.
I've tested this through manual configuration changes on the VR and it works.
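One possible way to check the result after such a change is to scan /var/log/dnsmasq.log on the VR for MAC addresses that are still being refused. The sketch below is only an illustration: the function name `failing_macs` is hypothetical, and it assumes the log line layout shown in the issue description quoted below, with each entry on a single unwrapped line as in the real log file.

```python
import re

# Matches e.g. "dnsmasq-dhcp[8246]: DHCPNAK(eth2) 10.3.1.162 02:00:21:fd:00:08 ..."
LINE_RE = re.compile(r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\((\w+)\) (.*)$")
MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}", re.IGNORECASE)


def failing_macs(log_lines):
    """Return the sorted MACs whose most recent DHCP outcome was a failure."""
    failed = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        message, _interface, rest = match.groups()
        mac = MAC_RE.search(rest)
        if not mac:
            continue
        key = mac.group(0).lower()
        if message == "DHCPACK":
            failed[key] = False  # lease granted
        elif message == "DHCPNAK" or "no address available" in rest:
            failed[key] = True  # request refused or nothing to offer
    return sorted(mac for mac, is_failed in failed.items() if is_failed)


if __name__ == "__main__":
    with open("/var/log/dnsmasq.log") as log:
        for mac in failing_macs(log):
            print(mac)
```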
> VPC VR DHCP broken for multihomed guest VMs
> -------------------------------------------
>
> Key: CLOUDSTACK-9017
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9017
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: SystemVM, Virtual Router
> Affects Versions: 4.4.4, 4.5.2, 4.6.0, 4.6.1, 4.6.2, 4.7.0
> Environment: CloudStack 4.5.2, XenServer back end.
> Reporter: Dag Sonstebo
> Labels: systemvm, virtualrouter, vpc
>
> Bug: VPC VR DHCP broken for multihomed guest VMs
> Affected version: only tested on CloudStack 4.5.2
> Summary: When a guest VM is attached to more than one VPC tier, DHCP only
> works for the last NIC added. According to the end user, this is new
> behaviour since the CS4.5.2 upgrade.
> Workarounds:
> 1) Only use single NICs on VPC connected VMs and configure L3 routing and
> ACLs to handle traffic between tiers.
> 2) Configure additional tier NICs with the static IP addresses reported by
> CloudStack.
>
> ================================================================================================================
> Steps to recreate:
> 1) Create a VPC with two tiers, in this case
> - VPC on 10.3.0.0/16
> - Tier 1 on 10.3.1.0/24
> - Tier 2 on 10.3.2.0/24
> 2) Create a new VM attached to tier 1 only. This will cause a new entry to be
> written to /etc/dhcphosts.txt on the VPC VR:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:21:fd:00:08,set:10_3_1_162,10.3.1.162,BatVM2,infinite
> root@r-20-VM:~#
> When the VM starts up, the following is displayed in /var/log/dnsmasq.log when
> the VM requests its IP address:
> Oct 30 15:50:12 dnsmasq[8246]: read /etc/hosts - 7 addresses
> Oct 30 15:50:12 dnsmasq-dhcp[8246]: read /etc/dhcphosts.txt
> Oct 30 15:50:12 dnsmasq-dhcp[8246]: read /etc/dhcpopts.txt
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPOFFER(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 15:50:44 dnsmasq-dhcp[8246]: DHCPACK(eth2) 10.3.1.162 02:00:21:fd:00:08 BatVM2
> The following is displayed in the dnsmasq leases file:
> root@r-20-VM:~# cat /var/lib/misc/dnsmasq.leases
> 0 02:00:21:fd:00:08 10.3.1.162 BatVM2 *
> And the following in the cloud DHCP configuration file:
> root@r-20-VM:~# cat /etc/dnsmasq.d/cloud.conf
> dhcp-hostsfile=/etc/dhcphosts.txt
> dhcp-range=interface:eth3,set:interface-eth3,10.3.2.1,static
> dhcp-option=tag:interface-eth3,15,batvpc.net
> dhcp-range=interface:eth2,set:interface-eth2,10.3.1.1,static
> dhcp-option=tag:interface-eth2,15,batvpc.net
> root@r-20-VM:~#
> 3) Checking the IP configuration locally on the VM will show a DHCP lease in
> place for eth0.
> 4) Add a new NIC to the VM, attached to Tier 2. This results in the
> following entries in the dnsmasq log:
> Oct 30 16:23:02 dnsmasq[8246]: read /etc/hosts - 7 addresses
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: read /etc/dhcphosts.txt
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: read /etc/dhcpopts.txt
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: not giving name BatVM2.batvpc.net to the DHCP lease of 10.3.1.162 because the name exists in /etc/hosts with address 10.3.2.111
> Oct 30 16:23:02 dnsmasq-dhcp[8246]: not giving name BatVM2 to the DHCP lease of 10.3.1.162 because the name exists in /etc/hosts with address 10.3.2.111
> In other words the Tier 2 address has taken precedence over the initial Tier
> 1 address.
> The /etc/dhcphosts.txt file has now lost the Tier 1 entry and now contains:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:26:94:00:06,set:10_3_2_111,10.3.2.111,BatVM2,infinite
> 5) When restarting the VM it will fail to get a DHCP lease on eth0.
> Note: in some cases it will reuse the old lease which is cached in the local
> leases database - note this IP lease does not come from the VPC VR.
> The dnsmasq log will now display the following:
> Oct 30 16:30:36 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 16:30:36 dnsmasq-dhcp[8246]: DHCPNAK(eth2) 10.3.1.162 02:00:21:fd:00:08 address not available
> Oct 30 16:30:44 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:30:58 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:13 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:22 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:32 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth3) 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPOFFER(eth3) 10.3.2.111 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPREQUEST(eth3) 10.3.2.111 02:00:26:94:00:06
> Oct 30 16:31:37 dnsmasq-dhcp[8246]: DHCPACK(eth3) 10.3.2.111 02:00:26:94:00:06 BatVM2
> I.e. the VM is not receiving a DHCP offer on eth0 as there are no addresses
> configured; eth1, however, completes the handshake successfully.
> 6) Note - restart of the VPC VR / restart of network with cleanup does not
> seem to fix the issue.
> 7) Just removing the last added NIC does not fix the issue:
> The DHCP host file still contains the following, i.e. the host entry from the
> last added NIC:
> root@r-20-VM:~# cat /etc/dhcphosts.txt
> 02:00:26:94:00:06,set:10_3_2_111,10.3.2.111,BatVM2,infinite
> root@r-20-VM:~#
> Restarting the VM after removal will show:
> Oct 30 16:42:00 dnsmasq-dhcp[8246]: DHCPREQUEST(eth2) 10.3.1.162 02:00:21:fd:00:08
> Oct 30 16:42:00 dnsmasq-dhcp[8246]: DHCPNAK(eth2) 10.3.1.162 02:00:21:fd:00:08 address not available
> Oct 30 16:42:08 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:19 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:30 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> Oct 30 16:42:50 dnsmasq-dhcp[8246]: DHCPDISCOVER(eth2) 02:00:21:fd:00:08 no address available
> I.e. still no DHCP lease on Tier 1.
> 8) Getting DHCP to work again on the guest VM eth0 involves juggling NICs:
> - Make the last added NIC (eth1) primary.
> - Remove the first NIC (eth0) as discussed in step 7 above.
> - Re-add a new NIC on Tier 1.
> - At this point DHCP will work on the Tier 1 NIC, but will be broken on
> the Tier 2 NIC.