On 11/18/2013 10:02 AM, Michael S. Tsirkin wrote: > On Tue, Nov 05, 2013 at 07:17:18PM +0800, Amos Kong wrote: >> We currently just update the HMP NIC info when the last bit of macaddr >> is written. This assumes that guest driver will write all the macaddr >> from bit 0 to bit 5 when it changes the macaddr, this is the current >> behavior of linux driver (e1000/rtl8139cp), but we can't do this >> assumption. >> >> The macaddr that is used for rx-filter will be updated when every bit >> is changed. This patch updates the e1000/rtl8139 nic to update HMP NIC >> info when every bit is changed. It will be same as virtio-net. >> >> Signed-off-by: Amos Kong <ak...@redhat.com> > > Vlad here told me he did some research and this > does not actually match hardware behaviour > for either e1000 or rtl8139. > > Vlad, would you like to elaborate on-list?
Sure. I decided to dig into the hardware data-sheets and the OS drivers that use the HW to see what's really going on and how the hw expects this data to be programmed. Here is what I've found so far: E1000: e1000 stores each mac address in 2 registers: RAL - receive register low RAH - receive register high The highest order bit of RAH (bit 31) is called the address available bit. When this bit is set the HW will attempt to use the address for packet matching. So, when the mac address is initially programmed into HW, that AV bit is not set until RAH is written. Thus drivers really should do writes in RAL+RAH order, otherwise AV bit would be set on a partially set address. There is a slight issue when the receive address registers already have a value. Since the address is written in 2 separate writes, there is a very small window of time when the RAL is set to the new value and RAH is set to the old value with AV still set. I am talking to Intel guys about this now. So from the POV of notifying libvirt/management that address is changed, it should be done when RAH is set. RTL8139: Realtek devices have a 9346CR Command Register that gates write access to certain configuration regions on the HW. It is placed into "Configuration register write enabled" mode before driver can write to one of a set of configuration spaces. Even though the data sheet doesn't mention this, it appears that this must also must be used to guard write access to receive address register of the card. All variants of BSD and linux drivers that I've found use this along with comment that say that this is an undocumented requirement. I am not sure what the HW does to incoming frames when the command register is to this mode. I see 2 things that we might be able to do here: 1) A low-impact change might be to only notify the management when we've detected an address change and currently exiting "config write" mode. 2) A more invasive change might be to disable rx_handling while in "config wirte" mode. This would prevent attempting to match packets to a partially written mac address. I have a patch that does (1) above. Thoughts? -vlad > > I think we should revert this for 1.8 and > look at emulating actual hardware behaviour. > >> --- >> hw/net/e1000.c | 2 +- >> hw/net/rtl8139.c | 5 +---- >> 2 files changed, 2 insertions(+), 5 deletions(-) >> >> diff --git a/hw/net/e1000.c b/hw/net/e1000.c >> index ec8ecd7..2d60639 100644 >> --- a/hw/net/e1000.c >> +++ b/hw/net/e1000.c >> @@ -1110,7 +1110,7 @@ mac_writereg(E1000State *s, int index, uint32_t val) >> >> s->mac_reg[index] = val; >> >> - if (index == RA + 1) { >> + if (index == RA || index == RA + 1) { >> macaddr[0] = cpu_to_le32(s->mac_reg[RA]); >> macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]); >> qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t *)macaddr); >> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c >> index 5329f44..7f2b4db 100644 >> --- a/hw/net/rtl8139.c >> +++ b/hw/net/rtl8139.c >> @@ -2741,10 +2741,7 @@ static void rtl8139_io_writeb(void *opaque, uint8_t addr, uint32_t val) >> >> switch (addr) >> { >> - case MAC0 ... MAC0+4: >> - s->phys[addr - MAC0] = val; >> - break; >> - case MAC0+5: >> + case MAC0 ... MAC0+5: >> s->phys[addr - MAC0] = val; >> qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys); >> break; >> -- >> 1.8.3.1 >>