On Thu, Aug 30, 2012 at 9:38 AM, Bo Yang <boy...@suse.com> wrote: > On 08/30/2012 04:04 PM, Stefan Hajnoczi wrote: >> On Wed, Aug 29, 2012 at 09:17:43PM +0200, Stefan Weil wrote: >>> Am 29.08.2012 13:26, schrieb Bo Yang: >>>> This is reported by QA. When installing os with pxe, after the initial >>>> kernel and initrd are loaded, the procedure tries to copy files from >>>> install >>>> server to local harddisk, the network becomes stall because of running out >>>> of >>>> receive descriptor. >>>> >>>> Signed-off-by: Bo Yang<boy...@suse.com> >>>> --- >>>> hw/eepro100.c | 5 ++++- >>>> 1 files changed, 4 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/hw/eepro100.c b/hw/eepro100.c >>>> index 50d117e..52a18ad 100644 >>>> --- a/hw/eepro100.c >>>> +++ b/hw/eepro100.c >>>> @@ -1036,6 +1036,8 @@ static void eepro100_ru_command(EEPRO100State * s, >>>> uint8_t val) >>>> } >>>> set_ru_state(s, ru_ready); >>>> s->ru_offset = e100_read_reg4(s, SCBPointer); >>>> + qemu_flush_queued_packets(&s->nic->nc); >>>> + qemu_notify_event(); >>> >>> What would happen if the above changes were omitted? >> >> In the worst case the guest code would be unable to make progress since >> packet reception is disabled. >> >> The QEMU net subsystem needs to be kicked when rx buffers become >> available again so that any queued packets can be delivered and we can >> restart the event loop. >> >> The event loop needs to be restarted because net clients (like tap) use >> qemu_set_fd_handler2() with a read_poll() handler that returns false >> when the NIC is unable to receive. Imagine this scenario: >> >> 1. NIC runs out of rx buffers. >> 2. Event loop iteration starts and calls tap's read_poll() handler, >> which sees the NIC cannot receive and therefore does not add the tap >> file descriptor to select(2). >> 3. NIC gets new rx buffers but does not kick net subsystem/event loop. >> 4. Event loop still sitting in select(2) without the tap file >> descriptor. Therefore incoming packets are not picked up by QEMU! >> >> In practice the event loop tends to iterate due to timers, etc. But in >> the worst case we can go completely starved here. > > Yes. The fd will be added to read set in the next iteration. The delay > depends on the select timeout. it is possible to go starved here. > >> >>> Would the network show less performance? How much >>> would the test scenario (Linux installation) take longer? >> >> Yes, the lack of kicks causes reduced network performance. This is >> especially true with -netdev tap and a guest driver that runs out of rx >> buffers. If you're lucky you might not hit this depending on the >> -netdev and availability of rx buffers. >> >>> What about the other nic emulations in QEMU? >>> I observe hanging network rather often with the >>> ARM versatilepb emulation. >> >> virtio-net has been correct for some time. >> >> e1000, xen, usb, and eepro100 are now fixed in the net tree: >> http://github.com/stefanha/qemu/commits/net >> >> Other NICs may or may not be okay. Really all of them need to be >> audited. >> >>>> TRACE(OTHER, logout("val=0x%02x (rx start)\n", val)); >>>> break; >>>> case RX_RESUME: >>>> @@ -1770,7 +1772,8 @@ static ssize_t nic_receive(NetClientState *nc, const >>>> uint8_t * buf, size_t size) >>>> if (rfd_command& COMMAND_EL) { >>>> /* EL bit is set, so this was the last frame. */ >>>> logout("receive: Running out of frames\n"); >>>> - set_ru_state(s, ru_suspended); >>>> + set_ru_state(s, ru_no_resources); >>>> + eepro100_rnr_interrupt(s); >>> >>> Adding the interrupt here is correct (I have similar code in >>> http://repo.or.cz/w/qemu/ar7.git/blob/HEAD:/hw/eepro100.c >>> which is an improved version of hw/eepro100.c). >>> >>> Setting ru_no_resources looks also good, but I am not >>> sure whether removing ru_suspended is ok. Maybe it should >>> be ru_no_resources | ru_suspended. >> >> I think the datasheet talks about setting the RU to no resources and the >> CU to suspended. So there are two state machines and we only track one >> here. > > I don't think I understand this. If we run out of rx descriptor, why do > we suspend tx unit too? maybe there are reasons I am unaware of.. I > don't know.
I was wrong. The datasheet "Table 55. CU Activities Performed at the End of Execution" shows that the EL and S bit cause the CU to enter the Idle State. In terms of hw/eepro100.c I don't think we care about the CU state. RU state No Resources is correct. Stefan