Re: [LLVMdev] clang .code16 with -Os producing larger code that it needs to
Your task, should you choose to accept it, is to make it cope with other forms of relaxation where necessary. And if not, please open a bug :-) There are a few other missing cases that cause MC to produce code that is more relaxed than it needs to be. Cheers, Rafael ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: clang .code16 with -Os producing larger code that it needs to
On 20.02.2015 15:58, Vladimir 'φ-coder/phcoder' Serbinenko wrote: When experimenting with compiling GRUB2 with clang using integrated as, I found out that it generates a 16-bit code bigger than gas counterpart and result gets too big for size constraints of bootsector. This was traced mainly to 2 problems. 32-bit access to 16-bit addresses. source: movlLOCAL(kernel_sector), %ebx movl%ebx, 8(%si) clang: 7cbc: 67 66 8b 1d 5c 7c 00addr32 mov 0x7c5c,%ebx 7cc3: 00 7cc4: 66 89 5c 08 mov%ebx,0x8(%si) gas: 7cbc: 66 8b 1e 5c 7c mov0x7c5c,%ebx 7cc1: 66 89 5c 08 mov%ebx,0x8(%si) 32-bit jump. source: jnb LOCAL(floppy_probe) clang: +7cb5:66 0f 83 07 01 00 00jae7dc3 L_floppy_probe gas: -7cb5:0f 83 0a 01 jae7dc3 L_floppy_probe Minimal example would be: .code16 jmp 1f .space 256 1: nop clang: 0: 66 e9 00 01 00 00 jmpl 0x106 ... 106: 90 nop gcc: 0: e9 00 01jmp0x103 ... 103: 90 nop The last one is particularly problematic as it never makes sense to issue 32-bit jump if %ip is only 16 bits and it eats 3 extra bytes per jump. Is it possible to force clang to generate 16-bit jumps? On bright side if I remove error strings the code is functional. signature.asc Description: OpenPGP digital signature ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
clang .code16 with -Os producing larger code that it needs to
When experimenting with compiling GRUB2 with clang using integrated as, I found out that it generates a 16-bit code bigger than gas counterpart and result gets too big for size constraints of bootsector. This was traced mainly to 2 problems. 32-bit access to 16-bit addresses. source: movlLOCAL(kernel_sector), %ebx movl%ebx, 8(%si) clang: 7cbc: 67 66 8b 1d 5c 7c 00addr32 mov 0x7c5c,%ebx 7cc3: 00 7cc4: 66 89 5c 08 mov%ebx,0x8(%si) gas: 7cbc: 66 8b 1e 5c 7c mov0x7c5c,%ebx 7cc1: 66 89 5c 08 mov%ebx,0x8(%si) 32-bit jump. source: jnb LOCAL(floppy_probe) clang: +7cb5: 66 0f 83 07 01 00 00jae7dc3 L_floppy_probe gas: -7cb5: 0f 83 0a 01 jae7dc3 L_floppy_probe The last one is particularly problematic as it never makes sense to issue 32-bit jump if %ip is only 16 bits and it eats 3 extra bytes per jump. Is it possible to force clang to generate 16-bit jumps? On bright side if I remove error strings the code is functional. signature.asc Description: OpenPGP digital signature ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH 04/18] xen/x86: add multiboot2 protocol support
On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote: --- a/xen/arch/x86/boot/Makefile +++ b/xen/arch/x86/boot/Makefile @@ -1,6 +1,7 @@ obj-bin-y += head.o -RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h $(BASEDIR)/include/xen/multiboot.h +RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h $(BASEDIR)/include/xen/compiler.h \ + $(BASEDIR)/include/xen/multiboot.h $(BASEDIR)/include/xen/multiboot2.h Perhaps we should rather have the compiler generate the dependencies for us when compiling reloc.o? @@ -32,6 +33,59 @@ ENTRY(start) .long MULTIBOOT_HEADER_FLAGS /* Checksum: must be the negated sum of the first two fields. */ .long -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS) + +/*** MULTIBOOT2 HEADER / +/* Some ideas are taken from grub-2.00/grub-core/tests/boot/kernel-i386.S file. */ +.align MULTIBOOT2_HEADER_ALIGN + +.Lmultiboot2_header: +/* Magic number indicating a Multiboot2 header. */ +.long MULTIBOOT2_HEADER_MAGIC +/* Architecture: i386. */ +.long MULTIBOOT2_ARCHITECTURE_I386 +/* Multiboot2 header length. */ +.long .Lmultiboot2_header_end - .Lmultiboot2_header +/* Multiboot2 header checksum. */ +.long -(MULTIBOOT2_HEADER_MAGIC + MULTIBOOT2_ARCHITECTURE_I386 + \ +(.Lmultiboot2_header_end - .Lmultiboot2_header)) So here ... +/* Multiboot2 tags... */ +.Lmultiboot2_info_req: +/* Multiboot2 information request tag. */ +.short MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST +.short MULTIBOOT2_HEADER_TAG_REQUIRED +.long .Lmultiboot2_info_req_end - .Lmultiboot2_info_req .. and here you properly calculate the length, while ... +.long MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO +.long MULTIBOOT2_TAG_TYPE_MMAP +.Lmultiboot2_info_req_end: + +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_MODULE_ALIGN +.short MULTIBOOT2_HEADER_TAG_REQUIRED +.long 8 /* Tag size. */ ... here and further down you hard-code it. Please be consistent (and if you go the calculation route, perhaps introduce some redundancy reducing macro). + +/* Console flags tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_CONSOLE_FLAGS +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 12 /* Tag size. */ +.long MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED + +/* Framebuffer tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_FRAMEBUFFER +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 20 /* Tag size. */ +.long 0 /* Number of the columns - no preference. */ +.long 0 /* Number of the lines - no preference. */ +.long 0 /* Number of bits per pixel - no preference. */ + +/* Multiboot2 header end tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_END +.short 0 +.long 8 /* Tag size. */ +.Lmultiboot2_header_end: .section .init.rodata, a, @progbits .align 4 @@ -81,10 +135,51 @@ __start: mov %ecx,%es mov %ecx,%ss +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */ Above you meet coding style requirements, but here you stop to do so (ongoing below)? Even if pre-existing neighboring comments aren't well formed, please don't make the situation worse. Also please don't say 12 here - I first even mistook this as an array index, but you seem to mean 1 or 2. Please use {1,2} instead. +xor %edx,%edx + /* Check for Multiboot bootloader */ cmp $MULTIBOOT_BOOTLOADER_MAGIC,%eax -jne not_multiboot +je multiboot1_proto +/* Check for Multiboot2 bootloader */ +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax +je multiboot2_proto + +jmp not_multiboot + +multiboot1_proto: +/* Get mem_lower from Multiboot information */ +testb $MBI_MEMLIMITS,(%ebx) MB_flags(%ebx) +jz trampoline_setup/* not available? BDA value will be fine */ + +mov MB_mem_lower(%ebx),%edx Please use cmovnz or or instead of jz and mov. +jmp trampoline_setup + +multiboot2_proto: +/* Skip Multiboot2 information fixed part */ +lea MB2_fixed_sizeof(%ebx),%ecx + +0: +/* Get mem_lower from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx) +jne 1f + +mov MB2_mem_lower(%ecx),%edx +jmp trampoline_setup + +1: +/* Is it the end of Multiboot2 information? */ +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx) This lacks the use a suitable MB2_* definition (even if that ends up being zero). ---
Re: [PATCH 05/18] efi: split efi_enabled to efi_platform and efi_loader
On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote: We need more fine grained knowledge about EFI environment and check for EFI platform and EFI loader separately to properly support multiboot2 protocol. ... because of ... (i.e. I can't see from the description what the separation is good for). Looking at the comments you placed aside the variables doesn't help me either. In general Xen loaded by this protocol uses memory mappings and loaded modules in simliar way to Xen loaded by multiboot (v1) protocol. Hence, split efi_enabled to efi_platform and efi_loader. And if I'm guessing things right, then introducing efi_loader but leaving efi_enabled alone (only converting where needed) would seem the right approach. --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -431,7 +431,7 @@ static void __init parse_video_info(void) struct boot_video_info *bvi = bootsym(boot_vid_info); /* The EFI loader fills vga_console_info directly. */ -if ( efi_enabled ) +if ( efi_platform ) This looks wrong considering the comment. @@ -663,7 +663,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) if ( ((unsigned long)cpu0_stack (STACK_SIZE-1)) != 0 ) panic(Misaligned CPU0 stack.); -if ( efi_enabled ) +if ( efi_loader ) { set_pdx_range(xen_phys_start PAGE_SHIFT, (xen_phys_start + BOOTSTRAP_MAP_BASE) PAGE_SHIFT); @@ -774,7 +774,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) * we can relocate the dom0 kernel and other multiboot modules. Also, on * x86/64, we relocate Xen to higher memory. */ -for ( i = 0; !efi_enabled i mbi-mods_count; i++ ) +for ( i = 0; !efi_loader i mbi-mods_count; i++ ) { if ( mod[i].mod_start (PAGE_SIZE - 1) ) panic(Bootloader didn't honor module alignment request.); @@ -962,7 +962,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) if ( !xen_phys_start ) panic(Not enough memory to relocate Xen.); -reserve_e820_ram(boot_e820, efi_enabled ? mbi-mem_upper : __pa(_start), +reserve_e820_ram(boot_e820, efi_loader ? mbi-mem_upper : __pa(_start), For all of these it's really hard to tell whether they're right without knowing which parts of the current EFI loading code you intend to re-use. Perhaps switching these would better be done along with adding the re-using code (or splitting out the not re-used parts)? --- a/xen/include/xen/efi.h +++ b/xen/include/xen/efi.h @@ -5,7 +5,11 @@ #include xen/types.h #endif -extern const bool_t efi_enabled; +/* If true then Xen runs on EFI platform. */ +extern bool_t efi_platform; + +/* If true then Xen was loaded by native EFI loader as PE application. */ +extern bool_t efi_loader; I don't see why you're losing the constness. Jan ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH 07/18] efi: run EFI specific code on EFI platform only
On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote: --- a/xen/arch/x86/shutdown.c +++ b/xen/arch/x86/shutdown.c @@ -504,7 +504,8 @@ void machine_restart(unsigned int delay_millisecs) tboot_shutdown(TB_SHUTDOWN_REBOOT); } -efi_reset_system(reboot_mode != 0); +if ( efi_platform ) +efi_reset_system(reboot_mode != 0); Please sync with Konrad on this one - it's supposed to get moved and conditionalized anyway. --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -1154,6 +1154,11 @@ void __init efi_init_memory(void) } *extra, *extra_head = NULL; #endif +#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */ +if ( !efi_platform ) +return; +#endif At least for this one I'd rather see the call site to have the conditional. --- a/xen/common/efi/runtime.c +++ b/xen/common/efi/runtime.c @@ -164,6 +164,9 @@ int efi_get_info(uint32_t idx, union xenpf_efi_info *info) { unsigned int i, n; +if ( !efi_platform ) +return -ENOSYS; + switch ( idx ) { case XEN_FW_EFI_VERSION: @@ -298,6 +301,9 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op) EFI_STATUS status = EFI_NOT_STARTED; int rc = 0; +if ( !efi_platform ) +return -ENOSYS; EOPNOTSUPP in both cases please, consistent with when runtime service use is disabled. Jan ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: clang .code16 with -Os producing larger code that it needs to
On Fri, 2015-02-20 at 16:05 +, David Woodhouse wrote: It's been a while since I looked at this... but I think for the short jumps we just emit the 8-bit version and there's a fixup which can go back and re-emit the instruction in 32-bit mode if it finds it doesn't fit? Do we just need to support a similar fixup for promoting 16-bit to 32-bit relocations? OK, the term I was looking for was 'relaxation'. Look in lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp for X86AsmBackend::relaxInstruction() and related methods. Observe that it will cope with 'relaxing' 8-bit PC-relative relocations to 32-bit PC-relative, but it doesn't cope with anything else. Your task, should you choose to accept it, is to make it cope with other forms of relaxation where necessary. Note that the existing cases end up emitting a new instruction with a *new* opcode. In your case it won't be doing that. It's the *same* opcode, but you'll have to set a flag to tell the emitter to use the 32-bit addressing mode (for data and/or addr as appropriate) this time. And while you're doing that, you should note that that's the *same* flag that'll be needed to support explicit addr32/data32 prefixes in the asm source. So you might as well support those too. I might suggest doing them *first*, in fact. -- David WoodhouseOpen Source Technology Centre david.woodho...@intel.com Intel Corporation smime.p7s Description: S/MIME cryptographic signature ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: clang .code16 with -Os producing larger code that it needs to
On Fri, 2015-02-20 at 15:58 +0100, Vladimir 'φ-coder/phcoder' Serbinenko wrote: When experimenting with compiling GRUB2 with clang using integrated as, I found out that it generates a 16-bit code bigger than gas counterpart and result gets too big for size constraints of bootsector. This was traced mainly to 2 problems. ... 32-bit access to 16-bit addresses. clang: 7cbc: 67 66 8b 1d 5c 7c 00 00 addr32 mov 0x7c5c,%ebx gas: 7cbc: 66 8b 1e 5c 7c mov0x7c5c,%ebx 32-bit jump. clang: +7cb5:66 0f 83 07 01 00 00jae7dc3 L_floppy_probe gas: -7cb5:0f 83 0a 01 jae7dc3 L_floppy_probe To a large extent, those are the *same* problem. We don't know that it's eventually going to fit into a 16-bit offset, so we emit it with a fixup record which can cope with 32 bits. Arguably, the jump is *particularly* gratuitous in many cases... but in 'big real' mode is the IP *really* limited to 16 bits? We could make it default to 16-bit, as gas does. But then we'd be screwed in the cases where we really *do* need 32-bit. What we actually need to do is implement handling for the explicit addr32 prefix. Then we can do what gas does and default to 16-bit but *also* have a way to do 32-bit when it's needed. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel
Re: clang .code16 with -Os producing larger code that it needs to
On Fri, 2015-02-20 at 16:46 +0100, Vladimir 'φ-coder/phcoder' Serbinenko wrote: All labels are local to the source file. If I use %eax instead of %ebx in first example I get the short code. For the second example how does clang detect that offset fits into one byte for issuing EB XX sequence which is issued in resulting file in several places. Can we use the same mechanism to detect when issuing 16-bit reference and keep 32-bit one for external references? It's been a while since I looked at this... but I think for the short jumps we just emit the 8-bit version and there's a fixup which can go back and re-emit the instruction in 32-bit mode if it finds it doesn't fit? Do we just need to support a similar fixup for promoting 16-bit to 32-bit relocations? -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel