I will separate the Quark platform code cleanup work from this serial patches. 
Thanks.


Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522


> -----Original Message-----
> From: Kinney, Michael D
> Sent: Thursday, June 30, 2016 2:10 AM
> To: af...@apple.com; Shi, Steven <steven....@intel.com>; Kinney, Michael D
> <michael.d.kin...@intel.com>
> Cc: edk2-devel <edk2-devel@lists.01.org>; Gao, Liming
> <liming....@intel.com>; Zimmer, Vincent <vincent.zim...@intel.com>;
> Doran, Mark <mark.do...@intel.com>; Mudusuru, Giri P
> <giri.p.mudus...@intel.com>
> Subject: RE: [PATCH 5/7] QuarkSocPkg-MemoryInitPei: Enable compiler
> builtin and disable CLANG LTO
> 
> Andrew,
> 
> I agree.  I will work with Steven on this module to improve its compiler
> compatibility.
> 
> Mike
> 
> > -----Original Message-----
> > From: af...@apple.com [mailto:af...@apple.com]
> > Sent: Wednesday, June 29, 2016 10:57 AM
> > To: Shi, Steven <steven....@intel.com>
> > Cc: edk2-devel <edk2-devel@lists.01.org>; Gao, Liming
> <liming....@intel.com>; Kinney,
> > Michael D <michael.d.kin...@intel.com>; Zimmer, Vincent
> <vincent.zim...@intel.com>;
> > Doran, Mark <mark.do...@intel.com>; Mudusuru, Giri P
> <giri.p.mudus...@intel.com>
> > Subject: Re: [PATCH 5/7] QuarkSocPkg-MemoryInitPei: Enable compiler
> builtin and disable
> > CLANG LTO
> >
> >
> > > On Jun 28, 2016, at 8:18 AM, Shi, Steven <steven....@intel.com> wrote:
> > >
> > > The MRC use the memset() in the code and assume the compiler to offer
> > > such builtin fuctions. Add work around to permit compiler add builtin
> > > functions in this module. Also MRC mix some inline assembly,
> > > e.g. asm("rdtsc":"=a"(tscL),"=d"(tscH)),in the C code, and CLANG38
> > > could wrongly remove them if enable LTO with high optimization level.
> > > Add work around to simply disable the LTO in this module.
> > >
> >
> > Steven,
> >
> > If this code is in the edk2 it should follow the edk2 coding rules and we
> should clean
> > it up and not propagate the workarounds to new compilers. Not following
> the rules in
> > the 1st place is what made this code harder to port. Actually if you look at
> the code
> > it started off using VC++ compiler intrinsics and inline assembler syntax,
> and then it
> > had to add some hacky #ifdefs to compile with GCC. If the code just used
> BaseLib and
> > BaseMemoryLib it would not require compiler specific hacks.
> >
> > Why not fix the code to NOT depend on compiler intrinsics and use
> ZeroMem()?
> > /work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git
> grep memcpy
> > Pei/meminit_utils.h:94:void *memcpy(void *d, const void *s, size_t n);
> >
> ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git
> grep memset
> > Pei/MemoryInitPei.inf:79:  # /Oi option to use the intrinsic memset
> function in source
> > code.
> > Pei/meminit.c:1111:  memset((void *) (final_delay), 0x00, (size_t)
> > sizeof(final_delay));
> > Pei/meminit.c:1396:  memset((void *) (final_delay), 0x00, (size_t)
> > sizeof(final_delay));
> > Pei/meminit.c:1650:  memset((void *) (final_delay), 0x00, (size_t)
> > sizeof(final_delay));
> > Pei/meminit.c:1972:  memset((void *) (final_delay), 0x00, (size_t)
> > sizeof(final_delay));
> > Pei/meminit_utils.h:93:void *memset(void *d, int c, size_t n);
> >
> > All the inline assembly code is duplicating BaseLib functionality so why not
> fix it?
> >
> ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git
> grep -W  asm
> > Pei/meminit.c:2240:static void asm_wbinvd(void)
> > Pei/meminit.c-2241-{
> > Pei/meminit.c-2242-#if defined (SIM) || defined (GCC)
> > Pei/meminit.c:2243:  asm(
> > Pei/meminit.c-2244-    "wbinvd;"
> > Pei/meminit.c-2245-  );
> > Pei/meminit.c-2246-#else
> > Pei/meminit.c:2247:  __asm wbinvd;
> > Pei/meminit.c-2248-#endif
> > Pei/meminit.c-2249-}
> > Pei/meminit.c-2250-
> > Pei/meminit.c-2251-// cache invalidate
> > Pei/meminit.c:2252:static void asm_invd(void)
> > Pei/meminit.c-2253-{
> > Pei/meminit.c-2254-#if defined (SIM) || defined (GCC)
> > Pei/meminit.c:2255:  asm(
> > Pei/meminit.c-2256-      "invd;"
> > Pei/meminit.c-2257-  );
> > Pei/meminit.c-2258-#else
> > Pei/meminit.c:2259:  __asm invd;
> > Pei/meminit.c-2260-#endif
> > Pei/meminit.c-2261-}
> > Pei/meminit.c-2262-
> > Pei/meminit.c-2263-
> > Pei/meminit.c=2264=static void cpu_read(void)
> > Pei/meminit.c-2265-{
> > Pei/meminit.c-2266-  uint32_t adr, dat, limit;
> > Pei/meminit.c-2267-
> > Pei/meminit.c:2268:  asm_invd();
> > Pei/meminit.c-2269-
> > Pei/meminit.c-2270-  limit = 8 * 1024;
> > Pei/meminit.c-2271-  for (adr = 0; adr < limit; adr += 4)
> > Pei/meminit.c-2272-  {
> > Pei/meminit.c-2273-    dat = *(uint32_t*) adr;
> > Pei/meminit.c-2274-    if ((adr & 0x0F) == 0)
> > Pei/meminit.c-2275-    {
> > Pei/meminit.c-2276-      DPF(D_INFO, "\n%x : ", adr);
> > Pei/meminit.c-2277-    }
> > Pei/meminit.c-2278-    DPF(D_INFO, "%x ", dat);
> > Pei/meminit.c-2279-  }
> > Pei/meminit.c-2280-  DPF(D_INFO, "\n");
> > Pei/meminit.c-2281-
> > Pei/meminit.c-2282-  DPF(D_INFO, "CPU read done\n");
> > Pei/meminit.c-2283-}
> > Pei/meminit.c-2284-
> > Pei/meminit.c-2285-
> > Pei/meminit.c=2286=static void cpu_write(void)
> > Pei/meminit.c-2287-{
> > Pei/meminit.c-2288-  uint32_t adr, limit;
> > Pei/meminit.c-2289-
> > Pei/meminit.c-2290-  limit = 8 * 1024;
> > Pei/meminit.c-2291-  for (adr = 0; adr < limit; adr += 4)
> > Pei/meminit.c-2292-  {
> > Pei/meminit.c-2293-    *(uint32_t*) adr = 0xDEAD0000 + adr;
> > Pei/meminit.c-2294-  }
> > Pei/meminit.c-2295-
> > Pei/meminit.c:2296:  asm_wbinvd();
> > Pei/meminit.c-2297-
> > Pei/meminit.c-2298-  DPF(D_INFO, "CPU write done\n");
> > Pei/meminit.c-2299-}
> > Pei/meminit.c-2300-
> > Pei/meminit.c-2301-
> > Pei/meminit.c=2302=static void cpu_memory_test(
> > Pei/meminit.c-2303-    MRCParams_t *mrc_params)
> > Pei/meminit.c-2304-{
> > Pei/meminit.c-2305-  uint32_t result = 0;
> > Pei/meminit.c-2306-  uint32_t val, dat, adr, adr0, step, limit;
> > Pei/meminit.c-2307-  uint64_t my_tsc;
> > Pei/meminit.c-2308-
> > Pei/meminit.c-2309-  ENTERFN();
> > Pei/meminit.c-2310-
> > Pei/meminit.c:2311:  asm_invd();
> > Pei/meminit.c-2312-
> > Pei/meminit.c-2313-  adr0 = 1 * 1024 * 1024;
> > Pei/meminit.c-2314-  limit = 256 * 1024 * 1024;
> > Pei/meminit.c-2315-
> > Pei/meminit.c-2316-  for (step = 0; step <= 4; step++)
> > Pei/meminit.c-2317-  {
> > Pei/meminit.c-2318-    DPF(D_INFO, "Mem test step %d starting
> from %xh\n", step, adr0);
> > Pei/meminit.c-2319-
> > Pei/meminit.c-2320-    my_tsc = read_tsc();
> > Pei/meminit.c-2321-    for (adr = adr0; adr < limit; adr += 
> > sizeof(uint32_t))
> > Pei/meminit.c-2322-    {
> > Pei/meminit.c-2323-      if (step == 0)      dat = adr;
> > Pei/meminit.c-2324-      else if (step == 1) dat = (1 << ((adr >> 2) & 
> > 0x1f));
> > Pei/meminit.c-2325-      else if (step == 2) dat = ~(1 << ((adr >> 2) & 
> > 0x1f));
> > Pei/meminit.c-2326-      else if (step == 3) dat = 0x5555AAAA;
> > Pei/meminit.c-2327-      else if (step == 4) dat = 0xAAAA5555;
> > Pei/meminit.c-2328-
> > Pei/meminit.c-2329-      *(uint32_t*) adr = dat;
> > Pei/meminit.c-2330-    }
> > Pei/meminit.c-2331-    DPF(D_INFO, "Write time %llXh\n", read_tsc() -
> my_tsc);
> > Pei/meminit.c-2332-
> > Pei/meminit.c-2333-    my_tsc = read_tsc();
> > Pei/meminit.c-2334-    for (adr = adr0; adr < limit; adr += 
> > sizeof(uint32_t))
> > Pei/meminit.c-2335-    {
> > Pei/meminit.c-2336-      if (step == 0)      dat = adr;
> > Pei/meminit.c-2337-      else if (step == 1) dat = (1 << ((adr >> 2) & 
> > 0x1f));
> > Pei/meminit.c-2338-      else if (step == 2) dat = ~(1 << ((adr >> 2) & 
> > 0x1f));
> > Pei/meminit.c-2339-      else if (step == 3) dat = 0x5555AAAA;
> > Pei/meminit.c-2340-      else if (step == 4) dat = 0xAAAA5555;
> > Pei/meminit.c-2341-
> > Pei/meminit.c-2342-      val = *(uint32_t*) adr;
> > Pei/meminit.c-2343-
> > Pei/meminit.c-2344-      if (val != dat)
> > Pei/meminit.c-2345-      {
> > Pei/meminit.c-2346-        DPF(D_INFO, "%x vs. %x@%x\n", dat, val, adr);
> > Pei/meminit.c-2347-        result = adr|BIT31;
> > Pei/meminit.c-2348-      }
> > Pei/meminit.c-2349-    }
> > Pei/meminit.c-2350-    DPF(D_INFO, "Read time %llXh\n", read_tsc() -
> my_tsc);
> > Pei/meminit.c-2351-  }
> > Pei/meminit.c-2352-
> > Pei/meminit.c-2353-  DPF( D_INFO, "Memory test result %x\n", result);
> > Pei/meminit.c-2354-  LEAVEFN();
> > Pei/meminit.c-2355-}
> > Pei/meminit.c-2356-#endif // MRC_SV
> > Pei/meminit.c-2357-
> > Pei/meminit.c-2358-
> > Pei/meminit.c-2359-// Execute memory test, if error dtected it is
> > Pei/meminit.c-2360-// indicated in mrc_params->status.
> > --
> > Pei/meminit_utils.c=1184=uint64_t read_tsc(
> > Pei/meminit_utils.c-1185-    void)
> > Pei/meminit_utils.c-1186-{
> > Pei/meminit_utils.c-1187-  volatile uint64_t tsc;  // EDX:EAX
> > Pei/meminit_utils.c-1188-
> > Pei/meminit_utils.c-1189-#if defined (SIM) || defined (GCC)
> > Pei/meminit_utils.c-1190-  volatile uint32_t tscH; // EDX
> > Pei/meminit_utils.c-1191-  volatile uint32_t tscL;// EAX
> > Pei/meminit_utils.c-1192-
> > Pei/meminit_utils.c:1193:  asm("rdtsc":"=a"(tscL),"=d"(tscH));
> > Pei/meminit_utils.c-1194-  tsc = tscH;
> > Pei/meminit_utils.c-1195-  tsc = (tsc<<32)|tscL;
> > Pei/meminit_utils.c-1196-#else
> > Pei/meminit_utils.c-1197-  tsc = __rdtsc();
> > Pei/meminit_utils.c-1198-#endif
> > Pei/meminit_utils.c-1199-
> > Pei/meminit_utils.c-1200-  return tsc;
> > Pei/meminit_utils.c-1201-}
> > Pei/meminit_utils.c-1202-
> > Pei/meminit_utils.c-1203-// get_tsc_freq:
> > Pei/meminit_utils.c-1204-//
> > Pei/meminit_utils.c-1205-// This function returns the TSC frequency in MHz
> > ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>
> >
> >
> >
> > Also clang should not remove inline assemble code as asm blocks are
> treated as volatile
> > by the compiler. Clang will place the inline assemble code in the bitcode
> stream and
> > the optimizer will drop it in the correct locations when code is generated.
> >
> > ~/work/Compiler>cat asm.c
> > int
> > main ()
> > {
> >   volatile unsigned int tscH;  // EDX
> >   volatile unsigned int tscL;  // EAX
> >
> >   asm("rdtsc":"=a"(tscL),"=d"(tscH));
> >   return 0;
> > }
> >
> > ~/work/Compiler>clang -Os -S -emit-llvm asm.c
> > ~/work/Compiler>cat asm.ll
> > ; ModuleID = 'asm.c'
> > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
> > target triple = "x86_64-apple-macosx10.11.0"
> >
> > ; Function Attrs: nounwind optsize ssp uwtable
> > define i32 @main() #0 {
> >   %tscH = alloca i32, align 4
> >   %tscL = alloca i32, align 4
> >   %1 = tail call { i32, i32 } asm "rdtsc",
> "={ax},={dx},~{dirflag},~{fpsr},~{flags}"()
> > #1, !srcloc !1
> >   %2 = extractvalue { i32, i32 } %1, 0
> >   %3 = extractvalue { i32, i32 } %1, 1
> >   store volatile i32 %2, i32* %tscL, align 4
> >   store volatile i32 %3, i32* %tscH, align 4
> >   ret i32 0
> > }
> >
> > attributes #0 = { nounwind optsize ssp uwtable "less-precise-
> fpmad"="false" "no-frame-
> > pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-
> math"="false" "no-
> > nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-
> math"="false" "use-
> > soft-float"="false" }
> > attributes #1 = { nounwind }
> >
> > !llvm.ident = !{!0}
> >
> > !0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53)
> (based on LLVM
> > 3.6.0svn)"}
> > !1 = metadata !{i32 99}
> > ~/work/Compiler>
> >
> > If code is optimized out it is usually due to undefined behavior, and the
> most common
> > error is writing to zero.
> >
> > ~/work/Compiler>cat asm.c
> > int
> > main ()
> > {
> >   int *ptr = 0;
> >   volatile unsigned int tscH;  // EDX
> >   volatile unsigned int tscL;  // EAX
> >
> >   *ptr = 1;
> >   asm("rdtsc":"=a"(tscL),"=d"(tscH));
> >   return 0;
> > }
> >
> > ~/work/Compiler>clang -Os -S -emit-llvm asm.c
> > ~/work/Compiler>cat asm.ll
> > ; ModuleID = 'asm.c'
> > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
> > target triple = "x86_64-apple-macosx10.11.0"
> >
> > ; Function Attrs: noreturn nounwind optsize ssp uwtable
> > define i32 @main() #0 {
> >   tail call void @llvm.trap()
> >   unreachable
> > }
> >
> > ; Function Attrs: noreturn nounwind
> > declare void @llvm.trap() #1
> >
> > attributes #0 = { noreturn nounwind optsize ssp uwtable "less-precise-
> fpmad"="false"
> > "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-
> infs-fp-
> > math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-
> size"="8" "unsafe-fp-
> > math"="false" "use-soft-float"="false" }
> > attributes #1 = { noreturn nounwind }
> >
> > !llvm.ident = !{!0}
> >
> > !0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53)
> (based on LLVM
> > 3.6.0svn)"}
> > ~/work/Compiler>
> >
> >
> > Thanks,
> >
> > Andrew Fish
> >
> > > Contributed-under: TianoCore Contribution Agreement 1.0
> > > Signed-off-by: Steven Shi <steven....@intel.com>
> > > ---
> > > QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf | 3
> +++
> > > 1 file changed, 3 insertions(+)
> > > mode change 100644 => 100755
> > QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf
> > >
> > > diff --git
> a/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf
> > b/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf
> > > old mode 100644
> > > new mode 100755
> > > index 78821f5..f593ef1
> > > ---
> a/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf
> > > +++
> b/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf
> > > @@ -38,6 +38,9 @@
> > >   GCC:RELEASE_*_*_CC_FLAGS             = -DNDEBUG -DGCC -Wno-unused-
> function
> > >   INTEL:RELEASE_*_*_CC_FLAGS           = /D NDEBUG
> > >   MSFT:RELEASE_*_*_CC_FLAGS            = /D NDEBUG
> > > +  GCC:*_CLANG38_IA32_CC_FLAGS          = -fno-lto -fbuiltin
> > > +  GCC:*_CLANGSCAN38_IA32_CC_FLAGS      = -fno-lto -fbuiltin
> > > +  GCC:*_GCC53_IA32_CC_FLAGS            = -fbuiltin
> > >
> > > [Sources]
> > >   memory_options.h
> > > --
> > > 2.7.4
> > >

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to