Hi Eugene, Thanks for reporting this. We saw your email yesterday, though we need some time to investigate the issue. We will reply as soon as we have a conclusion.
Thanks again, Ting -----Original Message----- From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Cohen, Eugene Sent: Monday, May 23, 2016 10:07 PM To: edk2-devel@lists.01.org; Fu, Siyuan <siyuan...@intel.com>; Wu, Jiaxin <jiaxin...@intel.com>; Zhang, Lubo <lubo.zh...@intel.com> Cc: Vaughn, Gregory (IPG LJ Printer Lab) <greg.vau...@hp.com> Subject: Re: [edk2] TCP4: Failure to Acknowledge due to DPC Dispatch Nesting Adding some maintainers... We are looking forward to a response. Thanks, Eugene > -----Original Message----- > From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of > Cohen, Eugene > Sent: Friday, May 20, 2016 2:11 PM > To: edk2-devel@lists.01.org > Cc: Vaughn, Gregory (IPG LJ Printer Lab) <greg.vau...@hp.com> > Subject: [edk2] TCP4: Failure to Acknowledge due to DPC Dispatch > Nesting > > > We have isolated a problem where the TCP4 driver fails to acknowledge > received data under certain circumstances. > > Background on DPCs: The DPC mechanism allows functions to be called at > a later point in time at a requested TPL level. Functions are queued > through the DPC Protocol's QueueDpc interface. DPCs are dispatched > anytime a module calls the DispatchDpc function in the DPC protocol. > The dispatch process will execute all queued DPCs that were registered > to execute at or above the caller's TPL level (e.g. if caller is at > TPL_CALLBACK the DPC dispatch will execute anything between CALLBACK and > HIGH_LEVEL). > > The stack depends on DispatchDpc being called at appropriate > preemption points to advance packet processing. The dispatch function > is called in multiple layers as you can see by searching for > DispatchDpc with calls originating from ArpDxe, Ip4Dxe, MnpDxe, > Udp4Dxe, UefiPxeBcDxe, DnsDxe, HttpDxe, Ip6Dxe and Udp6Dxe. > > Currently it's possible for DPC dispatching to occur in a nested manner. > Imagine a case where an upper network stack layer queues a DPC (for > example the TCP layer's TcpTickingDpc) and in the DPC handler it makes > use of lower layers (like sending a packet through IP+MNP). As part > of packet processing these lower layers will call DispatchDpc > resulting in nested DPCs execution. > > > Here's an example of the DPC nesting, the indent level indicates the > level of nesting for the DPC > > TcpTicking > DpcDispatchDpc TPL=CALLBACK > TcpTickingDpc > TcpTickingDpc Tcb: 0x49a42dd0, DelayedAck=1, CtrlFlag: 0x1019 > TcpTickingDpc call TcpSendAck delayed is: 1 > TcpSendAck Seq=158464588 Ack=4152002304 > TcpTransmitSegment DelayedAck = 0 > Nbuf SendIpPacket: DestPort=62462 Seq=158464588 Ack=4152002304 > Window=19840 > MnpSyncSendPacket call DispatchDpc > DpcDispatchDpc TPL=CALLBACK > Ip4AccpetFrame call DispatchDpc > DpcDispatchDpc TPL=CALLBACK > Ip4FreeTxToken call DispatchDpc > DpcDispatchDpc TPL=CALLBACK > TcpInput: DestPort=8816 Seq=4152002304 Ack=158464588 Len=1460 > TcpClearTimer Tcb: 0x49a42dd0 > > Notice how the process of sending the TCP ACK via IP then MNP causes > another DispatchDpc to occur before the TCP segment transmit call returns. > This nesting continues on and a whole bunch of code has executed (all > at CALLBACK TPL). You can see near the end that we even begin > processing another TCP packet. > > > If we look in detail what the TcpSendAck does there are two key steps: > > if (TcpTransmitSegment (Tcb, Nbuf) == 0) { > TCP_CLEAR_FLG (Tcb->CtrlFlag, TCP_CTRL_ACK_NOW); > Tcb->DelayedAck = 0; > } > > It transmits the segment and after the transmit returns it clears the > DelayedAck counter since the presumption is that we sent an ACK for > the most recent segment and we are caught up. But because DPCs are > dispatched within TcpTransmitSegment the assumption that this transmit > was the last one is incorrect. > > > Here is a portion of a trace illustrating the problem: > > TcpTicking > DpcDispatchDpc TPL=CALLBACK > TcpTickingDpc > TcpTickingDpc Tcb: 0x49a42dd0, DelayedAck=1, CtrlFlag: 0x1019 > TcpTickingDpc call TcpSendAck delayed is: 1 > TcpSndAck Seq=158464588 Ack=4152002304 > TcpTransmitSegment DelayedAck = 0 > Nbuf SendIpPacket: DestPort=62462 Seq=158464588 Ack=4152002304 > Window=19840 > MnpSyncSendPacket call DispatchDpc > DpcDispatchDpc TPL=CALLBACK > > [snip - a bunch of nested DPC processing removed] > > DpcDispatchDpc Tpl= TPL=CALLBACK > TcpInput: DestPort=8816 Seq=4152019824 Ack=158464588 Len=1460 > TcpClearTimer Tcb: 0x49a42dd0 > TcpInput Seq=4152019824 Tcb: 0x49a42dd0, Tcb->DupAck = 0 > TcpToSendAck add to delayedack Seq=158464588 Ack=4152021284 > TcpToSendAck add to delayedack Tcb: 0x49a42dd0, > Seq=158464588 Ack=4152021284, DelayedAck=1 > ^^ NOTE: the DelayedAck flag has been set to one indicating > that we haven't acknowledged yet and need to soon > > [we return from 14 nested DPC calls !!] > > TcpSndAck Tcb->DelayedAck = 0 > ^^ But when the Dispatch returns from the TcpTransmitSegment we > clear DelayedAck back to zero such that we never acknowledge the last > packet we received. > > TcpTickingDpc No timer active or expired > TcpTickingDpc Tcb: 0x49918bd0, DelayedAck=0, CtrlFlag: 0x1019 > TcpTickingDpc No timer active or expired > > In cases where TCP is waiting for more data from the remote endpoint > and the endpoint is waiting for acknowledgement (like a situation > where the sliding window is small or closed) the remote endpoint will > retransmit the last packet but for reasons I don't yet understand > duplicate acks aren't sent (I think there's another bug around retransmitting > acks but I haven't isolated > it). The result is that the connection is reset by the remote endpoint > after a > timeout period elapses. > > > This seems kind of like a critical section / locking problem around > the DelayedAck member. I'm not sure if the issue stems from the fact > that the DPC model allows a form of preemption at the same TPL level > (in this case we have TCP4 at TPL_CALLBACK effectively preempting TCP4 > at TPL_CALLBACK) or if the issue is that the TCP4 driver is not > managing the DelayedAck variable correctly given the rules for DPC routines > (btw: what are the rules?). > > > I'd like an explanation for what the expected DPC preemption model is > - are we really supposed to be nesting dispatch calls and are all > network components expected to handle this level or reentrancy? > > We are testing a change to remove DPC nesting which we believe should > resolve this issue. We would be happy to provide the patch if there's > agreement that this is the right way to resolve this. > > I'd be open to a phone call / Lync meeting with the maintainers if you > feel it would be helpful. > > Thanks, > > Eugene > > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel