I will separate the Quark platform code cleanup work from this serial patches. Thanks.
Steven Shi Intel\SSG\STO\UEFI Firmware Tel: +86 021-61166522 iNet: 821-6522 > -----Original Message----- > From: Kinney, Michael D > Sent: Thursday, June 30, 2016 2:10 AM > To: af...@apple.com; Shi, Steven <steven....@intel.com>; Kinney, Michael D > <michael.d.kin...@intel.com> > Cc: edk2-devel <edk2-devel@lists.01.org>; Gao, Liming > <liming....@intel.com>; Zimmer, Vincent <vincent.zim...@intel.com>; > Doran, Mark <mark.do...@intel.com>; Mudusuru, Giri P > <giri.p.mudus...@intel.com> > Subject: RE: [PATCH 5/7] QuarkSocPkg-MemoryInitPei: Enable compiler > builtin and disable CLANG LTO > > Andrew, > > I agree. I will work with Steven on this module to improve its compiler > compatibility. > > Mike > > > -----Original Message----- > > From: af...@apple.com [mailto:af...@apple.com] > > Sent: Wednesday, June 29, 2016 10:57 AM > > To: Shi, Steven <steven....@intel.com> > > Cc: edk2-devel <edk2-devel@lists.01.org>; Gao, Liming > <liming....@intel.com>; Kinney, > > Michael D <michael.d.kin...@intel.com>; Zimmer, Vincent > <vincent.zim...@intel.com>; > > Doran, Mark <mark.do...@intel.com>; Mudusuru, Giri P > <giri.p.mudus...@intel.com> > > Subject: Re: [PATCH 5/7] QuarkSocPkg-MemoryInitPei: Enable compiler > builtin and disable > > CLANG LTO > > > > > > > On Jun 28, 2016, at 8:18 AM, Shi, Steven <steven....@intel.com> wrote: > > > > > > The MRC use the memset() in the code and assume the compiler to offer > > > such builtin fuctions. Add work around to permit compiler add builtin > > > functions in this module. Also MRC mix some inline assembly, > > > e.g. asm("rdtsc":"=a"(tscL),"=d"(tscH)),in the C code, and CLANG38 > > > could wrongly remove them if enable LTO with high optimization level. > > > Add work around to simply disable the LTO in this module. > > > > > > > Steven, > > > > If this code is in the edk2 it should follow the edk2 coding rules and we > should clean > > it up and not propagate the workarounds to new compilers. Not following > the rules in > > the 1st place is what made this code harder to port. Actually if you look at > the code > > it started off using VC++ compiler intrinsics and inline assembler syntax, > and then it > > had to add some hacky #ifdefs to compile with GCC. If the code just used > BaseLib and > > BaseMemoryLib it would not require compiler specific hacks. > > > > Why not fix the code to NOT depend on compiler intrinsics and use > ZeroMem()? > > /work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git > grep memcpy > > Pei/meminit_utils.h:94:void *memcpy(void *d, const void *s, size_t n); > > > ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git > grep memset > > Pei/MemoryInitPei.inf:79: # /Oi option to use the intrinsic memset > function in source > > code. > > Pei/meminit.c:1111: memset((void *) (final_delay), 0x00, (size_t) > > sizeof(final_delay)); > > Pei/meminit.c:1396: memset((void *) (final_delay), 0x00, (size_t) > > sizeof(final_delay)); > > Pei/meminit.c:1650: memset((void *) (final_delay), 0x00, (size_t) > > sizeof(final_delay)); > > Pei/meminit.c:1972: memset((void *) (final_delay), 0x00, (size_t) > > sizeof(final_delay)); > > Pei/meminit_utils.h:93:void *memset(void *d, int c, size_t n); > > > > All the inline assembly code is duplicating BaseLib functionality so why not > fix it? > > > ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)>git > grep -W asm > > Pei/meminit.c:2240:static void asm_wbinvd(void) > > Pei/meminit.c-2241-{ > > Pei/meminit.c-2242-#if defined (SIM) || defined (GCC) > > Pei/meminit.c:2243: asm( > > Pei/meminit.c-2244- "wbinvd;" > > Pei/meminit.c-2245- ); > > Pei/meminit.c-2246-#else > > Pei/meminit.c:2247: __asm wbinvd; > > Pei/meminit.c-2248-#endif > > Pei/meminit.c-2249-} > > Pei/meminit.c-2250- > > Pei/meminit.c-2251-// cache invalidate > > Pei/meminit.c:2252:static void asm_invd(void) > > Pei/meminit.c-2253-{ > > Pei/meminit.c-2254-#if defined (SIM) || defined (GCC) > > Pei/meminit.c:2255: asm( > > Pei/meminit.c-2256- "invd;" > > Pei/meminit.c-2257- ); > > Pei/meminit.c-2258-#else > > Pei/meminit.c:2259: __asm invd; > > Pei/meminit.c-2260-#endif > > Pei/meminit.c-2261-} > > Pei/meminit.c-2262- > > Pei/meminit.c-2263- > > Pei/meminit.c=2264=static void cpu_read(void) > > Pei/meminit.c-2265-{ > > Pei/meminit.c-2266- uint32_t adr, dat, limit; > > Pei/meminit.c-2267- > > Pei/meminit.c:2268: asm_invd(); > > Pei/meminit.c-2269- > > Pei/meminit.c-2270- limit = 8 * 1024; > > Pei/meminit.c-2271- for (adr = 0; adr < limit; adr += 4) > > Pei/meminit.c-2272- { > > Pei/meminit.c-2273- dat = *(uint32_t*) adr; > > Pei/meminit.c-2274- if ((adr & 0x0F) == 0) > > Pei/meminit.c-2275- { > > Pei/meminit.c-2276- DPF(D_INFO, "\n%x : ", adr); > > Pei/meminit.c-2277- } > > Pei/meminit.c-2278- DPF(D_INFO, "%x ", dat); > > Pei/meminit.c-2279- } > > Pei/meminit.c-2280- DPF(D_INFO, "\n"); > > Pei/meminit.c-2281- > > Pei/meminit.c-2282- DPF(D_INFO, "CPU read done\n"); > > Pei/meminit.c-2283-} > > Pei/meminit.c-2284- > > Pei/meminit.c-2285- > > Pei/meminit.c=2286=static void cpu_write(void) > > Pei/meminit.c-2287-{ > > Pei/meminit.c-2288- uint32_t adr, limit; > > Pei/meminit.c-2289- > > Pei/meminit.c-2290- limit = 8 * 1024; > > Pei/meminit.c-2291- for (adr = 0; adr < limit; adr += 4) > > Pei/meminit.c-2292- { > > Pei/meminit.c-2293- *(uint32_t*) adr = 0xDEAD0000 + adr; > > Pei/meminit.c-2294- } > > Pei/meminit.c-2295- > > Pei/meminit.c:2296: asm_wbinvd(); > > Pei/meminit.c-2297- > > Pei/meminit.c-2298- DPF(D_INFO, "CPU write done\n"); > > Pei/meminit.c-2299-} > > Pei/meminit.c-2300- > > Pei/meminit.c-2301- > > Pei/meminit.c=2302=static void cpu_memory_test( > > Pei/meminit.c-2303- MRCParams_t *mrc_params) > > Pei/meminit.c-2304-{ > > Pei/meminit.c-2305- uint32_t result = 0; > > Pei/meminit.c-2306- uint32_t val, dat, adr, adr0, step, limit; > > Pei/meminit.c-2307- uint64_t my_tsc; > > Pei/meminit.c-2308- > > Pei/meminit.c-2309- ENTERFN(); > > Pei/meminit.c-2310- > > Pei/meminit.c:2311: asm_invd(); > > Pei/meminit.c-2312- > > Pei/meminit.c-2313- adr0 = 1 * 1024 * 1024; > > Pei/meminit.c-2314- limit = 256 * 1024 * 1024; > > Pei/meminit.c-2315- > > Pei/meminit.c-2316- for (step = 0; step <= 4; step++) > > Pei/meminit.c-2317- { > > Pei/meminit.c-2318- DPF(D_INFO, "Mem test step %d starting > from %xh\n", step, adr0); > > Pei/meminit.c-2319- > > Pei/meminit.c-2320- my_tsc = read_tsc(); > > Pei/meminit.c-2321- for (adr = adr0; adr < limit; adr += > > sizeof(uint32_t)) > > Pei/meminit.c-2322- { > > Pei/meminit.c-2323- if (step == 0) dat = adr; > > Pei/meminit.c-2324- else if (step == 1) dat = (1 << ((adr >> 2) & > > 0x1f)); > > Pei/meminit.c-2325- else if (step == 2) dat = ~(1 << ((adr >> 2) & > > 0x1f)); > > Pei/meminit.c-2326- else if (step == 3) dat = 0x5555AAAA; > > Pei/meminit.c-2327- else if (step == 4) dat = 0xAAAA5555; > > Pei/meminit.c-2328- > > Pei/meminit.c-2329- *(uint32_t*) adr = dat; > > Pei/meminit.c-2330- } > > Pei/meminit.c-2331- DPF(D_INFO, "Write time %llXh\n", read_tsc() - > my_tsc); > > Pei/meminit.c-2332- > > Pei/meminit.c-2333- my_tsc = read_tsc(); > > Pei/meminit.c-2334- for (adr = adr0; adr < limit; adr += > > sizeof(uint32_t)) > > Pei/meminit.c-2335- { > > Pei/meminit.c-2336- if (step == 0) dat = adr; > > Pei/meminit.c-2337- else if (step == 1) dat = (1 << ((adr >> 2) & > > 0x1f)); > > Pei/meminit.c-2338- else if (step == 2) dat = ~(1 << ((adr >> 2) & > > 0x1f)); > > Pei/meminit.c-2339- else if (step == 3) dat = 0x5555AAAA; > > Pei/meminit.c-2340- else if (step == 4) dat = 0xAAAA5555; > > Pei/meminit.c-2341- > > Pei/meminit.c-2342- val = *(uint32_t*) adr; > > Pei/meminit.c-2343- > > Pei/meminit.c-2344- if (val != dat) > > Pei/meminit.c-2345- { > > Pei/meminit.c-2346- DPF(D_INFO, "%x vs. %x@%x\n", dat, val, adr); > > Pei/meminit.c-2347- result = adr|BIT31; > > Pei/meminit.c-2348- } > > Pei/meminit.c-2349- } > > Pei/meminit.c-2350- DPF(D_INFO, "Read time %llXh\n", read_tsc() - > my_tsc); > > Pei/meminit.c-2351- } > > Pei/meminit.c-2352- > > Pei/meminit.c-2353- DPF( D_INFO, "Memory test result %x\n", result); > > Pei/meminit.c-2354- LEAVEFN(); > > Pei/meminit.c-2355-} > > Pei/meminit.c-2356-#endif // MRC_SV > > Pei/meminit.c-2357- > > Pei/meminit.c-2358- > > Pei/meminit.c-2359-// Execute memory test, if error dtected it is > > Pei/meminit.c-2360-// indicated in mrc_params->status. > > -- > > Pei/meminit_utils.c=1184=uint64_t read_tsc( > > Pei/meminit_utils.c-1185- void) > > Pei/meminit_utils.c-1186-{ > > Pei/meminit_utils.c-1187- volatile uint64_t tsc; // EDX:EAX > > Pei/meminit_utils.c-1188- > > Pei/meminit_utils.c-1189-#if defined (SIM) || defined (GCC) > > Pei/meminit_utils.c-1190- volatile uint32_t tscH; // EDX > > Pei/meminit_utils.c-1191- volatile uint32_t tscL;// EAX > > Pei/meminit_utils.c-1192- > > Pei/meminit_utils.c:1193: asm("rdtsc":"=a"(tscL),"=d"(tscH)); > > Pei/meminit_utils.c-1194- tsc = tscH; > > Pei/meminit_utils.c-1195- tsc = (tsc<<32)|tscL; > > Pei/meminit_utils.c-1196-#else > > Pei/meminit_utils.c-1197- tsc = __rdtsc(); > > Pei/meminit_utils.c-1198-#endif > > Pei/meminit_utils.c-1199- > > Pei/meminit_utils.c-1200- return tsc; > > Pei/meminit_utils.c-1201-} > > Pei/meminit_utils.c-1202- > > Pei/meminit_utils.c-1203-// get_tsc_freq: > > Pei/meminit_utils.c-1204-// > > Pei/meminit_utils.c-1205-// This function returns the TSC frequency in MHz > > ~/work/src/edk2/QuarkSocPkg/QuarkNorthCluster/MemoryInit(master)> > > > > > > > > Also clang should not remove inline assemble code as asm blocks are > treated as volatile > > by the compiler. Clang will place the inline assemble code in the bitcode > stream and > > the optimizer will drop it in the correct locations when code is generated. > > > > ~/work/Compiler>cat asm.c > > int > > main () > > { > > volatile unsigned int tscH; // EDX > > volatile unsigned int tscL; // EAX > > > > asm("rdtsc":"=a"(tscL),"=d"(tscH)); > > return 0; > > } > > > > ~/work/Compiler>clang -Os -S -emit-llvm asm.c > > ~/work/Compiler>cat asm.ll > > ; ModuleID = 'asm.c' > > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" > > target triple = "x86_64-apple-macosx10.11.0" > > > > ; Function Attrs: nounwind optsize ssp uwtable > > define i32 @main() #0 { > > %tscH = alloca i32, align 4 > > %tscL = alloca i32, align 4 > > %1 = tail call { i32, i32 } asm "rdtsc", > "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() > > #1, !srcloc !1 > > %2 = extractvalue { i32, i32 } %1, 0 > > %3 = extractvalue { i32, i32 } %1, 1 > > store volatile i32 %2, i32* %tscL, align 4 > > store volatile i32 %3, i32* %tscH, align 4 > > ret i32 0 > > } > > > > attributes #0 = { nounwind optsize ssp uwtable "less-precise- > fpmad"="false" "no-frame- > > pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp- > math"="false" "no- > > nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp- > math"="false" "use- > > soft-float"="false" } > > attributes #1 = { nounwind } > > > > !llvm.ident = !{!0} > > > > !0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53) > (based on LLVM > > 3.6.0svn)"} > > !1 = metadata !{i32 99} > > ~/work/Compiler> > > > > If code is optimized out it is usually due to undefined behavior, and the > most common > > error is writing to zero. > > > > ~/work/Compiler>cat asm.c > > int > > main () > > { > > int *ptr = 0; > > volatile unsigned int tscH; // EDX > > volatile unsigned int tscL; // EAX > > > > *ptr = 1; > > asm("rdtsc":"=a"(tscL),"=d"(tscH)); > > return 0; > > } > > > > ~/work/Compiler>clang -Os -S -emit-llvm asm.c > > ~/work/Compiler>cat asm.ll > > ; ModuleID = 'asm.c' > > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" > > target triple = "x86_64-apple-macosx10.11.0" > > > > ; Function Attrs: noreturn nounwind optsize ssp uwtable > > define i32 @main() #0 { > > tail call void @llvm.trap() > > unreachable > > } > > > > ; Function Attrs: noreturn nounwind > > declare void @llvm.trap() #1 > > > > attributes #0 = { noreturn nounwind optsize ssp uwtable "less-precise- > fpmad"="false" > > "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no- > infs-fp- > > math"="false" "no-nans-fp-math"="false" "stack-protector-buffer- > size"="8" "unsafe-fp- > > math"="false" "use-soft-float"="false" } > > attributes #1 = { noreturn nounwind } > > > > !llvm.ident = !{!0} > > > > !0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53) > (based on LLVM > > 3.6.0svn)"} > > ~/work/Compiler> > > > > > > Thanks, > > > > Andrew Fish > > > > > Contributed-under: TianoCore Contribution Agreement 1.0 > > > Signed-off-by: Steven Shi <steven....@intel.com> > > > --- > > > QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf | 3 > +++ > > > 1 file changed, 3 insertions(+) > > > mode change 100644 => 100755 > > QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf > > > > > > diff --git > a/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf > > b/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf > > > old mode 100644 > > > new mode 100755 > > > index 78821f5..f593ef1 > > > --- > a/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf > > > +++ > b/QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/MemoryInitPei.inf > > > @@ -38,6 +38,9 @@ > > > GCC:RELEASE_*_*_CC_FLAGS = -DNDEBUG -DGCC -Wno-unused- > function > > > INTEL:RELEASE_*_*_CC_FLAGS = /D NDEBUG > > > MSFT:RELEASE_*_*_CC_FLAGS = /D NDEBUG > > > + GCC:*_CLANG38_IA32_CC_FLAGS = -fno-lto -fbuiltin > > > + GCC:*_CLANGSCAN38_IA32_CC_FLAGS = -fno-lto -fbuiltin > > > + GCC:*_GCC53_IA32_CC_FLAGS = -fbuiltin > > > > > > [Sources] > > > memory_options.h > > > -- > > > 2.7.4 > > > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel