On 11/08/18 03:58, Eric Dong wrote:
> In current implementation, core level semaphore use same container
> with package level semaphore. This design will let the core level
> semaphore not works as expected in below case:
> 1. Feature A has CPU_FEATURE_CORE_BEFORE dependence with Feature B.
> 2. Feature C has CPU_FEATURE_PACKAGE_AFTER dependence with Feature B.
> in this case an core level semaphore will be add between A and B, and
> an package level semaphore will be add between B and C.
> 
> For a CPU has one package, two cores and 4 threads. Execute like below:
> 
>   Thread 1          Thread 2    .....     Thread 4
> ReleaseSemaph(1,2)  -|
> WaitForSemaph(1(2)) -|<-----------------------These two are Core Semaph
>                   ReleaseSemaph(1,2) -|
>                   WaitForSemaph(2)   -| <---  Core Semaph
> 
> ReleaseSemaph (1,2,3,4) -|
> WaitForSemaph (1(4))    -| <----------------  Package Semaph
> 
>                                       ReleaseSemaph(3,4)
>                                       WaitForSemaph(4(2)) <- Core Semaph
> 
> In above case, for thread 4, when it executes a core semaphore, i will
> found WaitForSemaph(4(2)) is met because Thread 1 has execute a package
> semaphore and ReleaseSemaph(4) for it before. This is not an expect
> behavior. Thread 4 should wait for thread 3 to do this.
> 
> Fix this issue by separate the semaphore container for core level and
> package level.
> 
> Cc: Laszlo Ersek <ler...@redhat.com>
> Cc: Ruiyu Ni <ruiyu...@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Eric Dong <eric.d...@intel.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c 
> b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> index a45e2dd3d7..65461485a4 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> @@ -41,9 +41,10 @@ typedef struct {
>  // Flags used when program the register.
>  //
>  typedef struct {
> -  volatile UINTN           ConsoleLogLock;       // Spinlock used to control 
> console.
> -  volatile UINTN           MemoryMappedLock;     // Spinlock used to program 
> mmio
> -  volatile UINT32          *SemaphoreCount;      // Semaphore used to 
> program semaphore.
> +  volatile UINTN           ConsoleLogLock;          // Spinlock used to 
> control console.
> +  volatile UINTN           MemoryMappedLock;        // Spinlock used to 
> program mmio
> +  volatile UINT32          *CoreSemaphoreCount;     // Semaphore used to 
> program semaphore.
> +  volatile UINT32          *PackageSemaphoreCount;  // Semaphore used to 
> program semaphore.
>  } PROGRAM_CPU_REGISTER_FLAGS;
>  
>  //
> @@ -348,11 +349,12 @@ ProgramProcessorRegister (
>        ASSERT (
>          (ApLocation != NULL) &&
>          (CpuStatus->ValidCoreCountPerPackage != 0) &&
> -        (CpuFlags->SemaphoreCount) != NULL
> +        (CpuFlags->CoreSemaphoreCount != NULL) &&
> +        (CpuFlags->PackageSemaphoreCount != NULL)
>          );
> -      SemaphorePtr = CpuFlags->SemaphoreCount;
>        switch (RegisterTableEntry->Value) {
>        case CoreDepType:
> +        SemaphorePtr = CpuFlags->CoreSemaphoreCount;
>          //
>          // Get Offset info for the first thread in the core which current 
> thread belongs to.
>          //
> @@ -373,6 +375,7 @@ ProgramProcessorRegister (
>          break;
>  
>        case PackageDepType:
> +        SemaphorePtr = CpuFlags->PackageSemaphoreCount;
>          ValidCoreCountPerPackage = (UINT32 
> *)(UINTN)CpuStatus->ValidCoreCountPerPackage;
>          //
>          // Get Offset info for the first thread in the package which current 
> thread belongs to.
> @@ -1037,10 +1040,14 @@ GetAcpiCpuData (
>      ASSERT (mAcpiCpuData.ApLocation != 0);
>    }
>    if (CpuStatus->PackageCount != 0) {
> -    mCpuFlags.SemaphoreCount = AllocateZeroPool (
> +    mCpuFlags.CoreSemaphoreCount = AllocateZeroPool (
>                                   sizeof (UINT32) * CpuStatus->PackageCount *
>                                   CpuStatus->MaxCoreCount * 
> CpuStatus->MaxThreadCount);
> -    ASSERT (mCpuFlags.SemaphoreCount != NULL);
> +    ASSERT (mCpuFlags.CoreSemaphoreCount != NULL);
> +    mCpuFlags.PackageSemaphoreCount = AllocateZeroPool (
> +                                 sizeof (UINT32) * CpuStatus->PackageCount *
> +                                 CpuStatus->MaxCoreCount * 
> CpuStatus->MaxThreadCount);
> +    ASSERT (mCpuFlags.PackageSemaphoreCount != NULL);
>    }
>    InitializeSpinLock((SPIN_LOCK*) &mCpuFlags.MemoryMappedLock);
>    InitializeSpinLock((SPIN_LOCK*) &mCpuFlags.ConsoleLogLock);
> 

The patch looks OK, superficially speaking.

Also this looks like a bugfix to a new feature already committed in this
development cycle, so I think it may go in during the hard feature freeze.

I have some requests (no need to repost):

(1) Please make sure there is a TianoCore BZ for this issue.

(2) Please reference said BZ in the commit message.

(For example, commit c60d36b4d1, for
<https://bugzilla.tianocore.org/show_bug.cgi?id=1307> missed the
reference to BZ#1307.)

(3) Before pushing, please fix up the indentation of the
AllocateZeroPool() arguments (both calls).

(4) Can you please file the bugzilla now about unifying the
implementation between RegisterCpuFeaturesLib and PiSmmCpuDxeSmm?

(That's for the next development cycle, but we should report the BZ for
it at this point, I believe.)

Thanks!
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to