Hi, Eugene

The FIN_SENT flag seems OK because an incoming TCP message won't override this 
flag, while the other control flag like the TIMER_REXMIT and RTT are more 
complex which need case by case analysis. There is no DPC mechanism when 
writing the first revision TCP code, so I think we need to review the TCP code 
for the similar cases to make it more robust.

For your question, why not update DPC "such that DPCs at the same TPL didn't 
preempt each other", the answer is that's the reason why DPC is introduced. The 
design intention of DPC is to interrupt current code when the preempted code is 
in the same TPL level with the code being interrupted.

For example, consider the case that the SCSI bus and the iSCSI driver is built 
on top of TCP, which is one of the reason why we design the DPC in history.

*        The SCSI must be work as synchronous API in some situation according 
to UEFI spec, while the under layer TCP, IP and MNP are asynchronous API.

*        A block reading request comes from the FAT on top of the SCSI device, 
may run oat TPL_CALLBACK, then iSCSI driver transmit a read request message to 
the network.

*        The driver's above iSCSI is synchronous and locked on TPL_CALLBACK, 
while the network drivers can only run at TPL_CALLBACK, which comes out the 
problem: there is no more data could receive from network any more.
You can see that we must allow some event at same TPL level with the current 
running code to preempt, otherwise there is a deadlock.

Best Regards
Siyuan

From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Cohen, 
Eugene
Sent: Wednesday, May 25, 2016 1:16 AM
To: Fu, Siyuan <siyuan...@intel.com>; Ye, Ting <ting...@intel.com>; 
edk2-devel@lists.01.org; Wu, Jiaxin <jiaxin...@intel.com>; Zhang, Lubo 
<lubo.zh...@intel.com>
Cc: Vaughn, Gregory (IPG LJ Printer Lab) <greg.vau...@hp.com>
Subject: Re: [edk2] TCP4: Failure to Acknowledge due to DPC Dispatch Nesting

Siyuan,

Thanks for clarifying the intended design.

Given your response I understand that it's considered acceptable for DPC 
functions to preempt each other, even at the same registered TPL level and even 
sometimes with the same exact function (and in the case of TCP operating on the 
exact same connection).  This would seem to present a substantial challenge for 
DPCs to be written correctly because you have to assume that any protocol or 
library you call may result in a DPC dispatch and cause other routines at your 
TPL level (or even another instance of the same function) to be called again.  
Are there restrictions to when DispatchDPC is allowed to be called or can 
anyone do it anytime?  What if, for some evil reason, someone decided to use 
DPCs as part of a specialized DebugLib causing the stack to be preempted 
constantly - do you expect all DPC handlers to behave correctly under these 
circumstances?  Wouldn't it be safer to remove this form of preemption (or at 
least the same-TPL preemption)?

To change the TCP code as you mention, what do you recommend?  Right now we 
have a routine:

  if (TcpTransmitSegment (Tcb, Nbuf) == 0) {
    TCP_CLEAR_FLG (Tcb->CtrlFlag, TCP_CTRL_ACK_NOW);
    Tcb->DelayedAck = 0;
  }

That attempts the transmit and clears the DelayedAck flag as a function of the 
return status.  Are you recommending that we clear CTRL_ACK_NOW and DelayedAck 
before attempting the transmit?  In the case of an error we would need a way to 
restore the ACK_NOW and DelayedAck these flags which presents the same problem 
again.

But generally I don't think this methodology can be applied to other areas with 
the same exposure.  Consider the TcpToSendData function.  It does this:

if (TcpTransmitSegment (Tcb, Nbuf) != 0) {
    NetbufTrim (Nbuf, (Nbuf->Tcp->HeadLen << 2), NET_BUF_HEAD);
    Nbuf->Tcp = NULL;

    if ((Flag & TCP_FLG_FIN) != 0)  {
      TCP_SET_FLG (Tcb->CtrlFlag, TCP_CTRL_FIN_SENT);
    }

    goto OnError;
  }

which as you can see sets a flag (FIN_SENT) after a DPC preemption point 
(TcpTransmitSegment) and even later it sets other TCB data after the DPC 
preemption point:

  //
  // update status in TCB
  //
  Tcb->DelayedAck = 0;

  if ((Flag & TCP_FLG_FIN) != 0) {
    TCP_SET_FLG (Tcb->CtrlFlag, TCP_CTRL_FIN_SENT);
  }

So it seems like the TCB contents may be at risk in a number of places because 
of DPC preemption.

If we the DPC rules were changed such that DPCs at the same TPL didn't preempt 
each other it would seem to prevent this (to the extent that DPC callbacks use 
a consistent TPL level to provide protection).

Thanks,

Eugene

From: Fu, Siyuan [mailto:siyuan...@intel.com]
Sent: Monday, May 23, 2016 8:45 PM
To: Ye, Ting <ting...@intel.com<mailto:ting...@intel.com>>; Cohen, Eugene 
<eug...@hp.com<mailto:eug...@hp.com>>; 
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Wu, Jiaxin 
<jiaxin...@intel.com<mailto:jiaxin...@intel.com>>; Zhang, Lubo 
<lubo.zh...@intel.com<mailto:lubo.zh...@intel.com>>
Cc: Vaughn, Gregory (IPG LJ Printer Lab) 
<greg.vau...@hp.com<mailto:greg.vau...@hp.com>>
Subject: RE: TCP4: Failure to Acknowledge due to DPC Dispatch Nesting

Hi, Eugene

Thanks for your detail analysis to help understanding the problem. We think 
it's a bug about the DelayedAck flag update in TCP driver, not in DPC, and here 
is the detail.

At the time the TcpTicking() is running, the under layer (MnpSystemPoll) also 
has a timer to receive network packets and queue them to the DPC. In you 
described case, when TcpTickingDpc() calls TcpSendAck() to send a delayed ACK 
to the remote, it first calls TcpTransmitSegment to transmit the ACK (point 1), 
then clear the TCP_CTRL_ACK_NOW and set DelayedAck to zero (point 2) as below.

               TcpTicking
                              TcpSendAck
                                             TcpTransmitSegment (#1)
                                             Clear Tcb's flag (#2)

But there are some DispatchDpc between #1 and #2, it not only transmited the 
ACK message to the under layer driver, but also invoked these MNP queued DPC 
and delivered these packets to the upper layer driver (TCP), then set the Tcb's 
flag. So things come to:

               TcpTicking
                              TcpSendAck
                                             TcpTransmitSegment (#1)
                                                            DispatchDpc
                                                                           New 
TCP packet arrived, and set the Tcb's flag (#3)
                                             Clear Tcb's flag (#2)

In another word, the TcpSendAck() function transmit the packet first, then 
clear the flag later, while the flag may be set by the incoming data during the 
transmit, so the clear operation is incorrect. The correct to operate the flag 
is to set the flag first, then make transmit. In this way, if the transmit 
operation invoke the DPC and deliver some new data to upper layer, the new set 
flag won't be override.

Best Regards
Siyuan

From: Ye, Ting
Sent: Tuesday, May 24, 2016 9:23 AM
To: Cohen, Eugene 
<eug...@hp.com<mailto:eug...@hp.com<mailto:eug...@hp.com%3cmailto:eug...@hp.com>>>;
 
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>;
 Fu, Siyuan 
<siyuan...@intel.com<mailto:siyuan...@intel.com<mailto:siyuan...@intel.com%3cmailto:siyuan...@intel.com>>>;
 Wu, Jiaxin 
<jiaxin...@intel.com<mailto:jiaxin...@intel.com<mailto:jiaxin...@intel.com%3cmailto:jiaxin...@intel.com>>>;
 Zhang, Lubo 
<lubo.zh...@intel.com<mailto:lubo.zh...@intel.com<mailto:lubo.zh...@intel.com%3cmailto:lubo.zh...@intel.com>>>
Cc: Vaughn, Gregory (IPG LJ Printer Lab) 
<greg.vau...@hp.com<mailto:greg.vau...@hp.com<mailto:greg.vau...@hp.com%3cmailto:greg.vau...@hp.com>>>
Subject: RE: TCP4: Failure to Acknowledge due to DPC Dispatch Nesting

Hi Eugene,

Thanks for reporting this. We saw your email yesterday, though we need some 
time to investigate the issue. We will reply as soon as we have a conclusion.

Thanks again,
Ting

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Cohen, 
Eugene
Sent: Monday, May 23, 2016 10:07 PM
To: 
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>;
 Fu, Siyuan 
<siyuan...@intel.com<mailto:siyuan...@intel.com<mailto:siyuan...@intel.com%3cmailto:siyuan...@intel.com>>>;
 Wu, Jiaxin 
<jiaxin...@intel.com<mailto:jiaxin...@intel.com<mailto:jiaxin...@intel.com%3cmailto:jiaxin...@intel.com>>>;
 Zhang, Lubo 
<lubo.zh...@intel.com<mailto:lubo.zh...@intel.com<mailto:lubo.zh...@intel.com%3cmailto:lubo.zh...@intel.com>>>
Cc: Vaughn, Gregory (IPG LJ Printer Lab) 
<greg.vau...@hp.com<mailto:greg.vau...@hp.com<mailto:greg.vau...@hp.com%3cmailto:greg.vau...@hp.com>>>
Subject: Re: [edk2] TCP4: Failure to Acknowledge due to DPC Dispatch Nesting

Adding some maintainers...

We are looking forward to a response.

Thanks,

Eugene

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
> Cohen, Eugene
> Sent: Friday, May 20, 2016 2:11 PM
> To: 
> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>
> Cc: Vaughn, Gregory (IPG LJ Printer Lab) 
> <greg.vau...@hp.com<mailto:greg.vau...@hp.com<mailto:greg.vau...@hp.com%3cmailto:greg.vau...@hp.com>>>
> Subject: [edk2] TCP4: Failure to Acknowledge due to DPC Dispatch
> Nesting
>
>
> We have isolated a problem where the TCP4 driver fails to acknowledge
> received data under certain circumstances.
>
> Background on DPCs: The DPC mechanism allows functions to be called at
> a later point in time at a requested TPL level.  Functions are queued
> through the DPC Protocol's QueueDpc interface.  DPCs are dispatched
> anytime a module calls the DispatchDpc function in the DPC protocol.
> The dispatch process will execute all queued DPCs that were registered
> to execute at or above the caller's TPL level (e.g. if caller is at
> TPL_CALLBACK the DPC dispatch will execute anything between CALLBACK and 
> HIGH_LEVEL).
>
> The stack depends on DispatchDpc being called at appropriate
> preemption points to advance packet processing.  The dispatch function
> is called in multiple layers as you can see by searching for
> DispatchDpc with calls originating from ArpDxe, Ip4Dxe, MnpDxe,
> Udp4Dxe, UefiPxeBcDxe, DnsDxe, HttpDxe, Ip6Dxe and Udp6Dxe.
>
> Currently it's possible for DPC dispatching to occur in a nested manner.
> Imagine a case where an upper network stack layer queues a DPC (for
> example the TCP layer's TcpTickingDpc) and in the DPC handler it makes
> use of lower layers (like sending a packet through IP+MNP).  As part
> of packet processing these lower layers will call DispatchDpc
> resulting in nested DPCs execution.
>
>
> Here's an example of the DPC nesting, the indent level indicates the
> level of nesting for the DPC
>
> TcpTicking
> DpcDispatchDpc TPL=CALLBACK
>     TcpTickingDpc
>     TcpTickingDpc Tcb: 0x49a42dd0, DelayedAck=1, CtrlFlag: 0x1019
>     TcpTickingDpc call TcpSendAck delayed is: 1
>     TcpSendAck  Seq=158464588 Ack=4152002304
>     TcpTransmitSegment DelayedAck = 0
>     Nbuf SendIpPacket: DestPort=62462 Seq=158464588 Ack=4152002304
> Window=19840
>     MnpSyncSendPacket call DispatchDpc
>     DpcDispatchDpc TPL=CALLBACK
>         Ip4AccpetFrame call DispatchDpc
>         DpcDispatchDpc TPL=CALLBACK
>             Ip4FreeTxToken call DispatchDpc
>             DpcDispatchDpc TPL=CALLBACK
>                 TcpInput: DestPort=8816 Seq=4152002304 Ack=158464588 Len=1460
>                 TcpClearTimer Tcb: 0x49a42dd0
>
> Notice how the process of sending the TCP ACK via IP then MNP causes
> another DispatchDpc to occur before the TCP segment transmit call returns.
> This nesting continues on and a whole bunch of code has executed (all
> at CALLBACK TPL).  You can see near the end that we even begin
> processing another TCP packet.
>
>
> If we look in detail what the TcpSendAck does there are two key steps:
>
>   if (TcpTransmitSegment (Tcb, Nbuf) == 0) {
>     TCP_CLEAR_FLG (Tcb->CtrlFlag, TCP_CTRL_ACK_NOW);
>     Tcb->DelayedAck = 0;
>   }
>
> It transmits the segment and after the transmit returns it clears the
> DelayedAck counter since the presumption is that we sent an ACK for
> the most recent segment and we are caught up.  But because DPCs are
> dispatched within TcpTransmitSegment the assumption that this transmit
> was the last one is incorrect.
>
>
> Here is a portion of a trace illustrating the problem:
>
> TcpTicking
> DpcDispatchDpc TPL=CALLBACK
>     TcpTickingDpc
>     TcpTickingDpc Tcb: 0x49a42dd0, DelayedAck=1, CtrlFlag: 0x1019
>     TcpTickingDpc call TcpSendAck delayed is: 1
>     TcpSndAck  Seq=158464588 Ack=4152002304
>     TcpTransmitSegment DelayedAck = 0
>     Nbuf SendIpPacket: DestPort=62462 Seq=158464588 Ack=4152002304
> Window=19840
>     MnpSyncSendPacket call DispatchDpc
>     DpcDispatchDpc TPL=CALLBACK
>
>         [snip - a bunch of nested DPC processing removed]
>
>          DpcDispatchDpc Tpl= TPL=CALLBACK
>           TcpInput: DestPort=8816 Seq=4152019824 Ack=158464588 Len=1460
>           TcpClearTimer Tcb: 0x49a42dd0
>           TcpInput Seq=4152019824 Tcb: 0x49a42dd0, Tcb->DupAck = 0
>           TcpToSendAck add to delayedack Seq=158464588 Ack=4152021284
>           TcpToSendAck add to delayedack Tcb: 0x49a42dd0,
> Seq=158464588 Ack=4152021284, DelayedAck=1
>           ^^ NOTE: the DelayedAck flag has been set to one indicating
> that we haven't acknowledged yet and need to soon
>
>       [we return from 14 nested DPC calls !!]
>
>     TcpSndAck  Tcb->DelayedAck = 0
>     ^^ But when the Dispatch returns from the TcpTransmitSegment we
> clear DelayedAck back to zero such that we never acknowledge the last
> packet we received.
>
>     TcpTickingDpc No timer active or expired
>     TcpTickingDpc Tcb: 0x49918bd0, DelayedAck=0, CtrlFlag: 0x1019
>     TcpTickingDpc No timer active or expired
>
> In cases where TCP is waiting for more data from the remote endpoint
> and the endpoint is waiting for acknowledgement (like a situation
> where the sliding window is small or closed) the remote endpoint will
> retransmit the last packet but for reasons I don't yet understand
> duplicate acks aren't sent (I think there's another bug around retransmitting 
> acks but I haven't isolated
> it).   The result is that the connection is reset by the remote endpoint 
> after a
> timeout period elapses.
>
>
> This seems kind of like a critical section / locking problem around
> the DelayedAck member.  I'm not sure if the issue stems from the fact
> that the DPC model allows a form of preemption at the same TPL level
> (in this case we have TCP4 at TPL_CALLBACK effectively preempting TCP4
> at TPL_CALLBACK) or if the issue is that the TCP4 driver is not
> managing the DelayedAck variable correctly given the rules for DPC routines 
> (btw: what are the rules?).
>
>
> I'd like an explanation for what the expected DPC preemption model is
> - are we really supposed to be nesting dispatch calls and are all
> network components expected to handle this level or reentrancy?
>
> We are testing a change to remove DPC nesting which we believe should
> resolve this issue.  We would be happy to provide the patch if there's
> agreement that this is the right way to resolve this.
>
> I'd be open to a phone call / Lync meeting with the maintainers if you
> feel it would be helpful.
>
> Thanks,
>
> Eugene
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>
> https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>
https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to