Re: [ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-26 Thread Nicolas Ecarnot

On 26/01/2017 at 09:03, Gianluca Cecchi wrote:

> On Thu, Jan 26, 2017 at 8:45 AM, Pavel Gashev wrote:
>
> > Gianluca,
> >
> > It looks like the VM doesn't receive broadcasts. It can be a network
> > topology issue.
> > Could you double check /sys/class/net/bond1/bonding/mode and
> > /sys/class/net/bond1/bonding/slaves ?
> >
> > Is it possible you have another VM with the same MAC address in the
> > same network segment?
>
> Pavel, I think you are right! Thanks!
> I didn't take into consideration that there is another oVirt environment
> that has some VMs on this VLAN.
> And I found a VM with the same MAC 00:1a:4a:16:01:51 (and a different IP).
> I have now powered off that other VM and restarted mine, and things seem OK.
>
> What is the best way to manage this when several oVirt environments have
> VMs on the same VLANs?


I encountered the same problem some years ago, as we have multiple oVirt
environments.

We decided to assign a specific MAC pool to each environment to avoid overlaps.
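
A minimal sketch of how such pools could be checked for overlaps before they are assigned. The environment names and MAC ranges below are hypothetical, chosen only for illustration; env-a and env-b deliberately share a range, which is the situation that ends in duplicate addresses.

```python
from itertools import combinations
from typing import Dict, List, Tuple

MacRange = Tuple[str, str]  # (first MAC, last MAC), both inclusive


def mac_to_int(mac: str) -> int:
    """Turn 'aa:bb:cc:dd:ee:ff' into an integer so ranges can be compared."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return int.from_bytes(bytes(int(o, 16) for o in octets), "big")


def ranges_overlap(a: MacRange, b: MacRange) -> bool:
    """Two inclusive ranges overlap when each starts before the other ends."""
    a_lo, a_hi = mac_to_int(a[0]), mac_to_int(a[1])
    b_lo, b_hi = mac_to_int(b[0]), mac_to_int(b[1])
    return a_lo <= b_hi and b_lo <= a_hi


def find_conflicts(pools: Dict[str, MacRange]) -> List[Tuple[str, str]]:
    """Return every pair of environments whose MAC pools overlap."""
    return [
        (name_a, name_b)
        for (name_a, range_a), (name_b, range_b) in combinations(pools.items(), 2)
        if ranges_overlap(range_a, range_b)
    ]


# Hypothetical pools, not the ranges of any real site.
POOLS = {
    "env-a": ("00:1a:4a:16:01:51", "00:1a:4a:16:01:e6"),
    "env-b": ("00:1a:4a:16:01:51", "00:1a:4a:16:01:e6"),
    "env-c": ("00:1a:4a:17:00:00", "00:1a:4a:17:00:ff"),
}

if __name__ == "__main__":
    for first, second in find_conflicts(POOLS):
        print(f"MAC pools of {first} and {second} overlap")
```

The check only validates a plan; the ranges themselves still have to be set in the MAC pool configuration of each engine.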

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-26 Thread Gianluca Cecchi
On Thu, Jan 26, 2017 at 8:45 AM, Pavel Gashev wrote:

> Gianluca,
>
> It looks like the VM doesn't receive broadcasts. It can be a network
> topology issue.
> Could you double check /sys/class/net/bond1/bonding/mode and
> /sys/class/net/bond1/bonding/slaves ?
>
> Is it possible you have another VM with the same MAC address in the same
> network segment?
>
>
Pavel, I think you are right! Thanks!
I didn't take into consideration that there is another oVirt environment
that has some VMs on this VLAN.
And I found a VM with the same MAC 00:1a:4a:16:01:51 (and a different IP).
I have now powered off that other VM and restarted mine, and things seem OK.
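
One plausible reading of the symptom, sketched under the assumption that the two VMs sat behind different ports of the same learning switch: the switch forwards frames for a MAC to whichever port last sent from it, so the guest stays reachable only until the other VM transmits, and a ping from the guest makes the switch relearn the guest's port. The LearningSwitch class and the port names are hypothetical.

```python
from typing import Dict, Optional


class LearningSwitch:
    """Toy model of an Ethernet switch's MAC address table."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}  # MAC address -> port last seen on

    def frame_sent(self, src_mac: str, port: str) -> None:
        """Learn (or relearn) the port from the source MAC of a frame."""
        self.table[src_mac.lower()] = port

    def port_for(self, dst_mac: str) -> Optional[str]:
        """Port a unicast frame to dst_mac would be forwarded to."""
        return self.table.get(dst_mac.lower())


MAC = "00:1a:4a:16:01:51"          # address shared by the two VMs
GUEST_PORT = "port-to-ovmsrv05"    # hypothetical port name
OTHER_PORT = "port-to-other-env"   # hypothetical port name

switch = LearningSwitch()

switch.frame_sent(MAC, GUEST_PORT)   # the guest pings the gateway
reachable_after_ping = switch.port_for(MAC) == GUEST_PORT

switch.frame_sent(MAC, OTHER_PORT)   # the other VM sends any frame
reachable_after_other_vm_talks = switch.port_for(MAC) == GUEST_PORT

switch.frame_sent(MAC, GUEST_PORT)   # ping again: the entry flips back
reachable_after_second_ping = switch.port_for(MAC) == GUEST_PORT

if __name__ == "__main__":
    print(reachable_after_ping,
          reachable_after_other_vm_talks,
          reachable_after_second_ping)
```

This would also fit the observation that a continuously running ping keeps the connection alive, since the guest keeps refreshing its entry in the table.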

What is the best way to manage this when several oVirt environments have VMs
on the same VLANs?

BTW: before finding the MAC problem, I dug out my notes from when I ran
qemu-kvm on CentOS 6.3 on the same hardware (mid 2012). I had similar
problems then, and in that case I had to set tx-checksumming off on the
interfaces carrying the bridges (managed by libvirt).
I created an init script that applied the setting at boot.
I have verified that on CentOS 7.3 my network interfaces start with this
parameter set to on by default. Of course changing the setting didn't solve
my problem this time, because it was MAC-related...
But is there any reference about it, and what would be the best setting and
why?
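
For reference, the setting can be inspected with `ethtool -k <iface>` and changed with `ethtool -K <iface> tx off`. The sketch below only parses `ethtool -k` output, so that the checksum offload state can be compared across hosts; the function name parse_offload_features is made up for this example, the SAMPLE text is a few lines copied from the output in the first message of this thread, and nothing here says which setting is best.

```python
from typing import Dict, NamedTuple


class Feature(NamedTuple):
    enabled: bool  # True when ethtool reports "on"
    fixed: bool    # True when the driver marks the value as "[fixed]"


def parse_offload_features(ethtool_output: str) -> Dict[str, Feature]:
    """Parse the text printed by `ethtool -k <iface>` into a dictionary."""
    features: Dict[str, Feature] = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        fields = value.split()
        # Skip the "Features for <iface>:" header and any blank lines.
        if not sep or not fields or fields[0] not in ("on", "off"):
            continue
        features[name.strip()] = Feature(
            enabled=fields[0] == "on",
            fixed="[fixed]" in fields,
        )
    return features


SAMPLE = """\
Features for enp7s4f0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ipv6: off [fixed]
tx-tcp-segmentation: off [requested on]
"""

if __name__ == "__main__":
    for name, feature in parse_offload_features(SAMPLE).items():
        state = "on" if feature.enabled else "off"
        note = " (cannot be changed)" if feature.fixed else ""
        print(f"{name}: {state}{note}")
```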

Thanks again in the meantime ;-)


Re: [ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-25 Thread Pavel Gashev
Gianluca,

It looks like the VM doesn't receive broadcasts. It can be a network topology issue.
Could you double check /sys/class/net/bond1/bonding/mode and
/sys/class/net/bond1/bonding/slaves ?

Is it possible you have another VM with the same MAC address in the same
network segment?
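
The two sysfs files can simply be printed with cat. As a sketch, the same check in Python, assuming the standard Linux bonding sysfs layout; read_bond_state and its sysfs_root parameter are names made up for this example (the parameter exists only so the function can be pointed at a test directory).

```python
from pathlib import Path
from typing import Dict, List, Union


def read_bond_state(
    bond: str = "bond1", sysfs_root: str = "/sys/class/net"
) -> Dict[str, Union[str, List[str]]]:
    """Read mode, slaves and active slave of a bonding interface from sysfs."""
    bonding = Path(sysfs_root) / bond / "bonding"
    # "mode" holds the mode name and its number, e.g. "active-backup 1".
    mode = (bonding / "mode").read_text().split()[0]
    # "slaves" is a space-separated list of interface names.
    slaves = (bonding / "slaves").read_text().split()
    # "active_slave" is empty for modes without a single active slave.
    active = (bonding / "active_slave").read_text().strip()
    return {"mode": mode, "slaves": slaves, "active_slave": active}


# Usage on the hypervisor (needs a real bond1 device):
#   print(read_bond_state("bond1"))
```

In active-backup mode only the active slave should carry traffic, so a mode other than the expected one, or a missing slave, would point at the topology rather than at the guest.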


Re: [ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-25 Thread Douglas Schilling Landgraf

Hi Gianluca,

On 01/25/2017 06:28 PM, Gianluca Cecchi wrote:

> Hello,
> I'm on 4.0.6 with CentOS 7.3.
> The hypervisor is an old blade BL685c G1 and the network adapters used
> to provide network to vm are
> 07:04.0 Ethernet controller: Broadcom Limited NetXtreme BCM5715S Gigabit
> Ethernet (rev a3)
> 07:04.1 Ethernet controller: Broadcom Limited NetXtreme BCM5715S Gigabit
> Ethernet (re
> managed by tg3 kernel module, as I see in messages:
>
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: Tigon3
> [partno(011276-001) rev 9003] (PCIX:133MHz:64-bit) MAC address
> 00:1c:c4:46:ef:73
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: attached PHY is
> 5714 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: RXcsums[1]
> LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0:
> dma_rwctrl[76148000] dma_mask[40-bit]
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: Tigon3
> [partno(011276-001) rev 9003] (PCIX:133MHz:64-bit) MAC address
> 00:1c:c4:46:ef:74
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: attached PHY is
> 5714 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: RXcsums[1]
> LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1:
> dma_rwctrl[76148000] dma_mask[40-bit]


I am looking into whether I can find something, but meanwhile:

Which kernel version is running in the system?
Have you tried any other kernel version?



> The 2 adapters are in bonding active-backup mode.
> They are on a VLAN, so on the hypervisor I have the bond1.65 device, and in
> the VM the virtual interface is untagged.
>
> [root@ovmsrv05 ~]# ifconfig bond1.65
> bond1.65: flags=4163  mtu 1500
> ether 00:1c:c4:46:ef:73  txqueuelen 1000  (Ethernet)
> RX packets 4368  bytes 257675 (251.6 KiB)
> RX errors 0  dropped 0  overruns 0  frame 0
> TX packets 238  bytes 28146 (27.4 KiB)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> [root@ovmsrv05 ~]#
>
>
> Currently Active Slave: enp7s4f0
>
> After a few minutes I lose connectivity with the guest. When that happens,
> if I go to the guest console and ping the gateway, the connection resumes.
> I can keep it up as long as I leave the ping running; otherwise, after a
> short while I lose connectivity again.
> I suspect it is not important, but the guest is Oracle Linux 6.5 with the
> 3.8.13-16.2.1.el6uek.x86_64 kernel. The adapter for the vNIC is the
> default; the generated qemu-kvm command line contains this:
> -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:51,bus=pci.0,addr=0x3


Any errors on the guest side, like timeouts in messages/journald?
Have you tried a different guest OS, just as a test?

Adding Dan, he might have other ideas.




[ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-25 Thread Gianluca Cecchi
Hello,
I'm on 4.0.6 with CentOS 7.3.
The hypervisor is an old BL685c G1 blade, and the network adapters that
provide networking to the VMs are
07:04.0 Ethernet controller: Broadcom Limited NetXtreme BCM5715S Gigabit
Ethernet (rev a3)
07:04.1 Ethernet controller: Broadcom Limited NetXtreme BCM5715S Gigabit
Ethernet (re
managed by tg3 kernel module, as I see in messages:

Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: Tigon3
[partno(011276-001) rev 9003] (PCIX:133MHz:64-bit) MAC address
00:1c:c4:46:ef:73
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: attached PHY is
5714 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0: RXcsums[1]
LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.0 eth0:
dma_rwctrl[76148000] dma_mask[40-bit]
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: Tigon3
[partno(011276-001) rev 9003] (PCIX:133MHz:64-bit) MAC address
00:1c:c4:46:ef:74
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: attached PHY is
5714 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1: RXcsums[1]
LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
Jan 21 18:53:33 ovmsrv05 kernel: tg3 :07:04.1 eth1:
dma_rwctrl[76148000] dma_mask[40-bit]

The 2 adapters are in bonding active-backup mode.
They are on a VLAN, so on the hypervisor I have the bond1.65 device, and in
the VM the virtual interface is untagged.

[root@ovmsrv05 ~]# ifconfig bond1.65
bond1.65: flags=4163  mtu 1500
ether 00:1c:c4:46:ef:73  txqueuelen 1000  (Ethernet)
RX packets 4368  bytes 257675 (251.6 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 238  bytes 28146 (27.4 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@ovmsrv05 ~]#


Currently Active Slave: enp7s4f0

After a few minutes I lose connectivity with the guest. When that happens, if I
go to the guest console and ping the gateway, the connection resumes. I can
keep it up as long as I leave the ping running; otherwise, after a short while
I lose connectivity again.
I suspect it is not important, but the guest is Oracle Linux 6.5 with the
3.8.13-16.2.1.el6uek.x86_64 kernel. The adapter for the vNIC is the
default; the generated qemu-kvm command line contains this:
-netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:51,bus=pci.0,addr=0x3
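
Since the guest's MAC address only shows up inside the generated command line, here is a small sketch for pulling it out, for instance to compare the addresses of running VMs across hosts. The function name extract_macs is hypothetical; the sample string is the fragment quoted above.

```python
import re
from typing import List

# "mac=" followed by six colon-separated hex octets, as in a -device option.
MAC_OPTION = re.compile(r"\bmac=((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\b")


def extract_macs(qemu_cmdline: str) -> List[str]:
    """Return every MAC address given as mac=... in a qemu-kvm command line."""
    return [mac.lower() for mac in MAC_OPTION.findall(qemu_cmdline)]


CMDLINE = (
    "-netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device "
    "virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:51,"
    "bus=pci.0,addr=0x3"
)

if __name__ == "__main__":
    print(extract_macs(CMDLINE))
```

On a host, the command lines of running VMs could be fed in from the output of ps; collecting the results from every host in every environment would make a shared address stand out.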

I seem to remember that some years ago, when I used the same blades with plain
qemu-kvm/libvirt, I had to apply an ethtool setting for similar problems,
but I don't remember what it was... and possibly I used the bnx2 kernel module
with the other embedded network interfaces, I'm not sure...
Currently, the configuration for the adapter on the blade is:

[root@ovmsrv05 ~]# ethtool -k enp7s4f0
Features for enp7s4f0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [requested on]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
tx-sctp-segmentation: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
[root@ovmsrv05 ~]#


systool shows no particular parameters available for the tg3 kernel module:
[root@ovmsrv05 ~]# systool -v -m tg3
Module = "tg3"

  Attributes:
coresize= "170653"
initsize= "0"
initstate   = "live"
refcnt  = "0"
rhelversion = "7.3"
srcversion  = "D276F97F491ADECC61C8284"
taint   = ""
uevent  = 
version = "3.137"

  Sections:
...

Thanks,
Gianluca