Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-28 Thread Rose, Gregory V
-----Original Message-----
From: Gerd v. Egidy [mailto:li...@egidy.de]
Sent: Thursday, May 27, 2010 3:13 PM
To: Rose, Gregory V
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

> Hi Greg,
>
> Very good. Hope you find something soon.

Alright, we've found something and it seems to be a pretty solid lead.  What
I'd like you to do, if you can find the time, is repeat the tests you've been
running while keeping a window open that watches the packet stats per queue.
Something like this:

watch -n 2 'ethtool -S ethx | grep packets'

Then watch the tx queue packet stats.  We think that when the problem occurs,
that is, when the VF is not able to communicate with the PF, you will see
packets from the PF to the VF going out on tx queue 1.  When it works, you'll
see packets going out on tx queue 0.
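
If the raw counter list is hard to follow, the per-queue tx numbers can be pulled out with a small filter. This is only a sketch: the function name tx_queue_packets is made up for illustration, and it assumes the driver reports counters named tx_queue_<N>_packets, which may differ between drivers and driver versions.

```shell
#!/bin/sh
# Sketch: print per-queue TX packet counters from `ethtool -S` output.
# Assumption: counters are named tx_queue_<N>_packets; adjust the
# pattern if your driver names them differently.
tx_queue_packets() {
    # Reads `ethtool -S <iface>` output on stdin, prints "<queue> <packets>".
    awk -F': *' '/tx_queue_[0-9]+_packets/ {
        name = $1
        gsub(/[^0-9]/, "", name)   # keep only the queue number
        print name, $2
    }'
}
```

It would be used as `ethtool -S ethx | tx_queue_packets`, once before and once after sending traffic from the PF to the VF, to see which queue's counter moved.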

If you could confirm that for us it would help out a great deal.

Thanks for your help and your patience.

- Greg



--

___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired


Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-27 Thread Gerd v. Egidy
Hi Greg,

> So I guess it is some kind of race condition. I just rebooted the vm
> several times but the behavior didn't change. So I guess it has to do
> with the initialization of the host driver or setting the macs for the
> vms.
>
> Unfortunately the host is already in production use so I can't just
> reboot the host several times now to verify that the behavior changes
> on host reboots. But I think I'll be able to do that within the next
> days.

I just rebooted the host 10 times. The hardware, kernel, setup and test 
procedure were identical over these 10 reboots. The results:

5 times: everything working
4 times: arp broadcast ok, arp unicast ok, regular (non-arp) packets not 
working
1 time: arp (broadcast and unicast) not working, regular (non-arp) packets ok

Rebooting the vms within one host boot didn't change the behavior: once the 
host was started, the behavior stayed the same.

So to me this seems to be a race condition in the host driver initialization. 
The outcome is either the expected behavior or one of several erratic ones.
 
> I'm still able to reliably reproduce this bug with the upstream 2.6.34
> kernel drivers.  If I use the drivers from sourceforge then the problem
> doesn't occur so there is definitely some driver issue that I have yet to
> find.

This looks like a race condition. So if you can't reproduce it with the 
sourceforge driver, that doesn't mean the bug isn't there. It could just be 
timing differences that prevent the bug from appearing on your hardware.

> I'll keep working the problem.

Very good. Hope you find something soon.

Kind regards,

Gerd



Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-21 Thread Gerd v. Egidy
Hi Greg,

Thanks for your quick reply.

> But the host can't send packets to the guests (or these packets are
> never received on the guests). When using tcpdump, I can see these
> packets leaving the device on the host, but they never arrive on the
> guest.
>
> So you're saying that the guests don't even receive packets from the host
> with broadcast as the destination address?

That is correct.

I just tried it again to verify:

eth1 on the host is connected to an otherwise unused switch.
The host is at 192.168.100.1, the vm at 192.168.100.2. The vm is not connected 
to any other network and has only the vf as nic (eth0).

[r...@host]# arping -I eth1 192.168.100.2

- no response

While this was running, I ran on the vm:

[r...@vm]# tcpdump -i eth0 -n

- no packets

When looking at the packet counter on the vm, the RX-counter is still 0. So no 
broadcasts from the host can reach the vm.

When doing it the other way round:

[r...@vm]# arping -I eth0 192.168.100.1

I still get no response, but only because the host's replies never reach the 
vm. The host sees the requests and replies:

[r...@host]# tcpdump -i eth1 -n
14:05:48.473303 ARP, Request who-has 192.168.100.1 (Broadcast) tell 
192.168.100.2, length 28
14:05:48.473329 ARP, Reply 192.168.100.1 is-at 00:1b:21:60:xx:xx, length 28
14:05:49.473478 ARP, Request who-has 192.168.100.1 (Broadcast) tell 
192.168.100.2, length 28
14:05:49.473492 ARP, Reply 192.168.100.1 is-at 00:1b:21:60:xx:xx, length 28
[...]

This evening I will upgrade the vm to 2.6.34 too. But aside from that I don't 
know what else to try.

Any ideas what I could do to further trace this down? 

Kind regards,

Gerd

-- 
Address (better: trap) for people I really don't want to get mail from:
jo...@cactusamerica.com



Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-21 Thread Rose, Gregory V
-----Original Message-----
From: Gerd v. Egidy [mailto:li...@egidy.de]
Sent: Friday, May 21, 2010 5:24 AM
To: e1000-devel@lists.sourceforge.net
Cc: Rose, Gregory V
Subject: Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

> This evening I will upgrade the vm to 2.6.34 too. But aside from that I
> don't know what else to try.
>
> Any ideas what I could do to further trace this down?

Pretty strange.  One thing I can think of is to make sure that the source MAC
address in the ARP packets the host receives is the same one you assigned via
the ip link set... commands.
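
One way to do that comparison, as a rough sketch: capture with link-level headers (tcpdump -e) and list the distinct source MACs. The helper name arp_source_macs is invented for illustration, and it assumes the usual -e line layout of "<time> <src MAC> > <dst MAC>, ...".

```shell
#!/bin/sh
# Sketch: list the distinct Ethernet source MACs seen in `tcpdump -e` output.
# Assumption: each packet line looks like "<time> <src MAC> > <dst MAC>, ...".
arp_source_macs() {
    # Field 2 is the source MAC when field 3 is the ">" separator.
    awk '$3 == ">" { print $2 }' | sort -u
}
```

For example, `tcpdump -e -n -c 10 -i eth1 arp | arp_source_macs` on the host while the vm runs arping, then compare the result against the addresses given to ip link set.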

Other than that I'm in the process of setting something up to reproduce the 
problem.

- Greg



Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-21 Thread Rose, Gregory V
-----Original Message-----
From: Rose, Gregory V
Sent: Friday, May 21, 2010 12:29 PM
To: 'Gerd v. Egidy'; e1000-devel@lists.sourceforge.net
Subject: RE: [E1000-devel] guest with igbvf on 82576 can't talk to host

-----Original Message-----
From: Gerd v. Egidy [mailto:li...@egidy.de]
Sent: Friday, May 21, 2010 5:24 AM
To: e1000-devel@lists.sourceforge.net
Cc: Rose, Gregory V
Subject: Re: [E1000-devel] guest with igbvf on 82576 can't talk to host


[snip]

>> This evening I will upgrade the vm to 2.6.34 too. But aside from that I
>> don't know what else to try.
>>
>> Any ideas what I could do to further trace this down?
>
> Pretty strange but one thing I can think of to do is make sure that the
> source MAC address in the ARP packets the host is receiving is the same
> one you assigned via the ip link set... commands.
>
> Other than that I'm in the process of setting something up to reproduce
> the problem.

Good news of a sort: I can reproduce the problem, which means I can debug it
and hopefully fix it.

I'll get back to you when I have some results.

- Greg




Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-21 Thread Gerd v. Egidy
>> But the host can't send packets to the guests (or these packets are
>> never received on the guests). When using tcpdump, I can see these
>> packets leaving the device on the host, but they never arrive on the
>> guest.
>
> This evening I will upgrade the vm to 2.6.34 too. But aside from that I
> don't know what else to try.

Host and vm are both on 2.6.34 now, but that didn't change anything; the 
problem persists.

I don't know if this helps:
I took a look at the rx interrupts for the virtual function on the vm. The 
number of interrupts increases by about one every two seconds (estimated). 
It does not depend in any way on the number of packets sent from the host: 
even when sending a hundred packets, the count only increases once every 
two seconds.
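
For anyone wanting to repeat this measurement, a minimal sketch follows. It sums the per-CPU counts for one interrupt source from /proc/interrupts-style text. The function name irq_total is made up, and the interrupt name eth0-rx-0 in the usage note is an assumption; check /proc/interrupts on the vm for the name the VF driver actually registers.

```shell
#!/bin/sh
# Sketch: sum the per-CPU interrupt counts for one named interrupt source.
# Reads /proc/interrupts-style text on stdin; $1 is the name in the last column.
irq_total() {
    awk -v name="$1" '$NF == name {
        total = 0
        # Field 1 is the IRQ number; purely numeric fields after it are per-CPU counts.
        for (i = 2; i < NF; i++)
            if ($i ~ /^[0-9]+$/) total += $i
        print total
    }'
}
```

Running `irq_total eth0-rx-0 < /proc/interrupts` before and after sending a burst of packets from the host shows whether the count follows the traffic or only the two-second tick.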

Kind regards,

Gerd



Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-21 Thread Rose, Gregory V
-----Original Message-----
From: Gerd v. Egidy [mailto:li...@egidy.de]
Sent: Friday, May 21, 2010 1:08 PM
To: e1000-devel@lists.sourceforge.net
Cc: Rose, Gregory V
Subject: Re: [E1000-devel] guest with igbvf on 82576 can't talk to host

>>> But the host can't send packets to the guests (or these packets are
>>> never received on the guests). When using tcpdump, I can see these
>>> packets leaving the device on the host, but they never arrive on the
>>> guest.
>>
>> This evening I will upgrade the vm to 2.6.34 too. But aside from that
>> I don't know what else to try.
>
> host and vm are on 2.6.34 now. But it didn't change anything, the
> problem still persists.

Yes, I reproduced the problem on 2.6.34.  I'm debugging it now.

I'll let you know the results ASAP.

- Greg




[E1000-devel] guest with igbvf on 82576 can't talk to host

2010-05-20 Thread Gerd v. Egidy
Hi,

I'm using a dualport ET (82576) running with SR-IOV on kernel 2.6.34. The 
guests are managed with kvm (qemu-kvm, current git, but I tried 0.12.4 release 
too). After some work the whole vt-d and sr-iov stuff is working.

The host can use the main card without problems to communicate with other 
hosts on the lan. The guests (using igbvf with 2.6.33.4) can communicate with 
other hosts on the lan. Two guests can talk to each other on the same virtual 
link. The guests can send packets to the host.

But the host can't send packets to the guests (or these packets are never 
received on the guests). When using tcpdump, I can see these packets leaving 
the device on the host, but they never arrive on the guest.

This does not seem to be an arp-related problem as I have tried to manually 
insert the macs into the arp table. Now it's my ping-packets that are lost, 
not the arp ones.

I have set the macs of all vfs using iproute2 2.6.24 and 
ip link set eth0 vf 0 mac 02:00:00:00:00:00
ip link set eth0 vf 1 mac 02:00:00:00:00:01
...

I have not set any vlan-related options, and ip link show doesn't report 
anything about vlans. So I don't think it is a vlan-related problem.
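
To double-check the per-VF settings, the VF lines can be filtered out of the ip link show output. This is a sketch only: vf_summary is a hypothetical helper name, and because the wording of VF lines differs between iproute2 versions it assumes nothing more than that each such line starts with "vf <N>".

```shell
#!/bin/sh
# Sketch: print the per-VF lines from `ip link show <pf>` output and flag
# any that mention a vlan.
# Assumption: VF lines start with "vf <N>"; the rest of the line is passed through.
vf_summary() {
    awk '$1 == "vf" {
        flag = ($0 ~ /vlan/) ? "vlan set" : "no vlan"
        sub(/^[ \t]+/, "")          # drop the leading indentation
        print $0 " [" flag "]"
    }'
}
```

Usage would be `ip link show eth0 | vf_summary` on the host, checking that every VF shows the intended mac and no vlan.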

Any ideas what could cause this?

Any ideas what I could try to fix it?

Thanks a lot.

Kind regards,

Gerd
