Thanks Siyuan, I reviewed the series, looks good. Sriram.
-----Original Message----- From: Fu, Siyuan [mailto:siyuan...@intel.com] Sent: Wednesday, February 03, 2016 9:19 AM To: Laszlo Ersek; Subramanian, Sriram (EG Servers Platform SW) Cc: edk2-de...@ml01.01.org; Ye, Ting; Wu, Jiaxin; El-Haj-Mahmoud, Samer; Li, Ruth; Hsiung, Harry L Subject: RE: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP TX buffer pool. Hi, Laszlo and Sriram I have sent out the v2 patch, only the commit message is updated, please help to review. Thanks Siyuan -----Original Message----- From: Laszlo Ersek [mailto:ler...@redhat.com] Sent: Tuesday, February 2, 2016 7:36 PM To: Subramanian, Sriram (EG Servers Platform SW) <srira...@hpe.com>; Fu, Siyuan <siyuan...@intel.com> Cc: edk2-de...@ml01.01.org; Ye, Ting <ting...@intel.com>; Wu, Jiaxin <jiaxin...@intel.com>; El-Haj-Mahmoud, Samer <samer.el-haj-mahm...@hpe.com>; Li, Ruth <ruth...@intel.com>; Hsiung, Harry L <harry.l.hsi...@intel.com> Subject: Re: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP TX buffer pool. On 02/02/16 11:40, Subramanian, Sriram (EG Servers Platform SW) wrote: > Hi Laszlo, > > Thanks for your feedback. > > With [Patch 3/3], that fixes an issue where we could return > EFI_DEVICE_ERROR (on UNDI returning PXE_STATCODE_BUFFER_FULL), [Patch > 1/3] is no longer a pressing need. > > Otherwise, with the bug described in [Patch 3/3], even if the UNDI > returned PXE_STATCODE_BUFFER_FULL, we do *not* execute the second call > tree you mentioned, because the status returned from SNP is > *not* EFI_NOT_READY, but is EFI_DEVICE_ERROR. > >> However, there is *another* call path leading to Snp->GetStatus(). It is >> right after the above Snp->Transmit(): >> >> MnpTransmit() [MnpMain.c] >> MnpBuildTxPacket() [MnpIo.c] >> MnpAllocTxBuf() [MnpConfig.c] >> IF no room in FreeTxBufList THEN: >> MnpRecycleTxBuf() [MnpConfig.c] >> LOOP: >> Snp->GetStatus() >> MnpFreeTxBuf() [MnpConfig.c] >> MnpSyncSendPacket() [MnpIo.c] >> Snp->Transmit() >> IF (Status == EFI_NOT_READY) THEN: >> MnpRecycleTxBuf() [MnpConfig.c] >> LOOP: >> Snp->GetStatus() <--------------------------------- here >> MnpFreeTxBuf() [MnpConfig.c] >> Snp->Transmit() > > That said, my other concern at that time for pushing to a smaller > value (so GetStatus could be called sooner) was that we could expose > potentially untested code in various UNDIs (if we take longer to call > GetStatus), and so the we may end up seeing hangs, corruption, or > other weird behavior (UNDIs may not return the correct statuses for > the SNP and MNP to correctly retry/call GetStatus to clear the > buffers). We actually have NICs that have started failing after SVN > r19623 and r19624, and continue to fail even after applying the > patches in this series, something we're working with the vendors to > get them addressed. In this case, I agree that patch #1 should be included as well, but the commit message should be *very* clear that the change is being done only as a workaround for incorrect UNDI drivers. The commit message should also include your above summary. Thanks! Laszlo > > Thanks, > Sriram. > > -----Original Message----- > From: Fu, Siyuan [mailto:siyuan...@intel.com] > Sent: Tuesday, February 02, 2016 1:56 PM > To: Laszlo Ersek; Subramanian, Sriram (EG Servers Platform SW) > Cc: edk2-de...@ml01.01.org; Ye, Ting; Wu, Jiaxin; El-Haj-Mahmoud, > Samer; Li, Ruth; Hsiung, Harry L > Subject: RE: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP > TX buffer pool. > > Hi, Laszlo > > Thanks for your detailed analysis, I will update the commit log for the patch > 2/3, and let's wait Sriram to see if he has any concern about 1/3. > > Best Regards > Siyuan > > -----Original Message----- > From: Laszlo Ersek [mailto:ler...@redhat.com] > Sent: Tuesday, February 2, 2016 4:12 PM > To: Fu, Siyuan <siyuan...@intel.com>; Subramanian Sriram > <srira...@hpe.com> > Cc: edk2-de...@ml01.01.org; Ye, Ting <ting...@intel.com>; Wu, Jiaxin > <jiaxin...@intel.com>; El-Haj-Mahmoud, Samer > <samer.el-haj-mahm...@hpe.com>; Li, Ruth <ruth...@intel.com>; Hsiung, > Harry L <harry.l.hsi...@intel.com> > Subject: Re: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP > TX buffer pool. > > Hello Siyuan, > > On 02/02/16 02:03, Fu, Siyuan wrote: >> Hi, Laszlo >> >> Yes it's to make it same as the MAX_XMIT_BUFFERS in PxeGetStatus(), >> and I agree that the MNP and SNP driver are independent with each >> other. I make this change just because some people in edk2 community >> have concern about the different length in MNP buffer pool and PXE >> recycle array, you could see attached email thread for details. IMHO, >> either 32 or 64 or any other different value doesn't make much >> different, MNP_TX_BUFFER_INCREASEMENT is just the initial length of >> the buffer pool, MNP will increase the pool size if run out of it, >> and recycle buffer from SNP if the transmitted buffer queue if full. > > Thanks for the info. I think this discussion should have happened on-list to > begin with. If that's okay, I'd like to continue it on-list; I've now CC'd > the participants of the attached thread. > > So as far as I can see, there are three concerns (I'll list them first, then > react to them separately): > > (1) A potential issue where MNP will reject all caller requests to transmit, > because the UNDI underlying the SNP underlying the MNP has all outgoing > buffers full, and MNP will not try to recycle UNDI buffers (via recycling SNP > buffers). > > In the words of Sriram: > >> This is exactly my concern. We only issue the Snp->GetStatus() from >> MnpDxe after we exhaust the FreeTxBufList, so it may be a while >> before we call GetStatus again. > > (2) In the implementation of Snp->Transmit() -- PxeTransmit() in > "MdeModulePkg/Universal/Network/SnpDxe/Transmit.c" -- we handle BUFFER_FULL > as a permanent error, under the "default:" label. According to UEFI v2.6, > E.4.18.7, we should cause our SNP implementation to "Call Get Status command > to empty buffer". > > (3) In the implementation of Snp->GetStatus() -- PxeGetStatus() in > "MdeModulePkg/Universal/Network/SnpDxe/Get_status.c" --, we only provide UNDI > with 1 slot to report transmitted buffers in, even though we have room for 32 > (MAX_XMIT_BUFFERS) in the PXE_DB_GET_STATUS.TxBuffer array. > > First, this pessimizes the code -- it makes the loop at the end of > PxeGetStatus() useless --, second, some non-conformant UNDI implementation > that Siyuan tested does not care about Snp->Cdb.DBsize, and *always* fills in > 32 slots. > > --*-- > > Assuming I managed to state the problems correctly, here's what I think, in > reverse order: > > ** I agree that (3) should be modified, if for nothing else then to avoid the > buffer overflow. However, the commit message of: > > MdeModulePkg: Update DBsize in SNP GetStatus command > > should be changed to state both reasons explicitly: > - make the loop at the end of PxeGetStatus() useful > - prevent buffer overflow with the non-conformant UNDI driver Siyuan > tested > > > ** I agree that (2) should be fixed. I think the currently proposed > commit message for patch > > MdeModulePkg: Correct one return status code in SNP Transmit > function > > is correct. > > > ** I think that (3) is a non-issue. The situation described by Sriram > in > (1) -- which would amount to a "livelock", i.e., repeated > Mnp->Transmit() calls without making progress -- is *not* possible. > > I think Siyuan explained it already in the attached thread why it is not > possible, but let me try to explain it as well. > > Namely, Sriram correctly identified the following call chain (all referenced > files are under "MdeModulePkg/Universal/Network/MnpDxe/"): > > MnpTransmit() [MnpMain.c] > MnpBuildTxPacket() [MnpIo.c] > MnpAllocTxBuf() [MnpConfig.c] > IF no room in FreeTxBufList THEN: > MnpRecycleTxBuf() [MnpConfig.c] > LOOP: > Snp->GetStatus() > MnpFreeTxBuf() [MnpConfig.c] > MnpSyncSendPacket() [MnpIo.c] > Snp->Transmit() > > Sriram is correct that, if there *is* room in FreeTxBufList, then we > will *not* call MnpRecycleTxBuf() -- hence, we will not call > Snp->GetStatus(). And then Snp->Transmit() could fail, if the > Snp->underlying > UNDI has no room. > > However, there is *another* call path leading to Snp->GetStatus(). It is > right after the above Snp->Transmit(): > > MnpTransmit() [MnpMain.c] > MnpBuildTxPacket() [MnpIo.c] > MnpAllocTxBuf() [MnpConfig.c] > IF no room in FreeTxBufList THEN: > MnpRecycleTxBuf() [MnpConfig.c] > LOOP: > Snp->GetStatus() > MnpFreeTxBuf() [MnpConfig.c] > MnpSyncSendPacket() [MnpIo.c] > Snp->Transmit() > IF (Status == EFI_NOT_READY) THEN: > MnpRecycleTxBuf() [MnpConfig.c] > LOOP: > Snp->GetStatus() <--------------------------------- here > MnpFreeTxBuf() [MnpConfig.c] > Snp->Transmit() > > If the underlying UNDI fails due to lack of buffer space, then the > first > Snp->Transmit() call will return EFI_NOT_READY. Then > Snp->MnpSyncSendPacket() > will immediately call MnpRecycleTxBuf() -- *regardless* of whether on entry > to MnpTransmit(), "FreeTxBufList" was empty or not --, and UNDI gets a chance > to return transmitted buffers. > > And then Snp->Transmit() is retried. > > Again, I think Siyuan explained it well in his email already: > >> We do have method to know whether the UNDI queue is full. The UNDI >> Transmit command should return an error CDB.StatCode (BUFFER_FULL >> according to UEFI spec in E.4.18.7, while some UNDI return QUEUE_FULL >> instead of BUFFER_FULL per my test result), then the SNP->Transmit() >> should return EFI_NOT_READY to indicate the caller that the network >> interface is too busy to accept this transmit request (see >> Sn->Transmit API description and return status table in UEFI spec). >> That's why the MNP driver also call Snp->GetStatus() to recycle the >> buffer if a Snp->Transmit() return EFI_NOT_READY. > > But I hope that the above call trees make it easier to understand. > Therefore I propose to drop patch > > MdeModulePkg: Update the default size of MNP TX buffer pool > > from the series. > > Sriram, do you agree with this analysis? > > > ** In summary: > > - "[Patch 1/3] MdeModulePkg: Update the default size of MNP TX buffer pool." > --> I propose to drop this patch. Sriram, do you agree? > > - [edk2] [Patch 2/3] MdeModulePkg: Update DBsize in SNP GetStatus command. > --> please explain in the commit message *both* reasons why this patch > is needed (i.e., to make the loop in PxeGetStatus() useful, and to prevent > buffer overflow from non-conformant UNDI drivers). > > With that spelled out, for patch 2: > > Reviewed-by: Laszlo Ersek <ler...@redhat.com> > > - [edk2] [Patch 3/3] MdeModulePkg: Correct one return status code in SNP > Transmit function. > --> patch is good as-is. > > Reviewed-by: Laszlo Ersek <ler...@redhat.com> > > Thanks! > Laszlo > >> >> Best Regards >> Siyuan >> >> -----Original Message----- >> From: Laszlo Ersek [mailto:ler...@redhat.com] >> Sent: Tuesday, February 2, 2016 12:13 AM >> To: Fu, Siyuan <siyuan...@intel.com>; edk2-de...@ml01.01.org >> Cc: Ye, Ting <ting...@intel.com>; Wu, Jiaxin <jiaxin...@intel.com> >> Subject: Re: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP >> TX buffer pool. >> >> On 02/01/16 03:51, Fu Siyuan wrote: >>> This patch update the default MNP TX buffer increasement to 32, so >>> the default TX pool length is same as the maximum recycled buffer >>> numbers in one UNDI GetStatus command. >> >> Is that MAX_XMIT_BUFFERS in PxeGetStatus()? >> >>> Contributed-under: TianoCore Contribution Agreement 1.0 >>> Signed-off-by: Fu Siyuan <siyuan...@intel.com> >>> CC: Ye Ting <ting...@intel.com> >>> CC: Wu Jiaxin <jiaxin...@intel.com> >>> --- >>> MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h >>> b/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h >>> index c66be64..51391af 100644 >>> --- a/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h >>> +++ b/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h >>> @@ -27,7 +27,7 @@ WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER >>> EXPRESS OR IMPLIED. >>> #define MNP_INIT_NET_BUFFER_NUM 512 >>> #define MNP_NET_BUFFER_INCREASEMENT 64 >>> #define MNP_MAX_NET_BUFFER_NUM 65536 >>> -#define MNP_TX_BUFFER_INCREASEMENT 64 >>> +#define MNP_TX_BUFFER_INCREASEMENT 32 // Same as the recycling Q >>> length for xmit_done in UNDI command. >>> #define MNP_MAX_TX_BUFFER_NUM 65536 >>> >>> #define MNP_MAX_RCVD_PACKET_QUE_SIZE 256 >>> >> >> Out of curiosity: is this a bugfix or an optimization? >> >> If a bugfix, then can you please explain why those two things (i.e., >> MNP_TX_BUFFER_INCREASEMENT in MnpAllocTxBuf(), and MAX_XMIT_BUFFERS >> in >> PxeGetStatus) should match? >> >> As a perf optimization I guess this makes sense, but otherwise, SNP >> internals and MNP should be independent, shouldn't they? >> >> Thanks >> Laszlo >> > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel