Re: [LLVMdev] clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread Rafael Espíndola
 Your task, should you choose to accept it, is to make it cope with other
 forms of relaxation where necessary.

And if not, please open a bug :-)

There are a few other missing cases that cause MC to produce code that
is more relaxed than it needs to be.

Cheers,
Rafael

___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 20.02.2015 15:58, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
 When experimenting with compiling GRUB2 with clang using integrated as,
 I found out that it generates a 16-bit code bigger than gas counterpart
 and result gets too big for size constraints of bootsector. This was
 traced mainly to 2 problems.
 32-bit access to 16-bit addresses.
 source:
   movlLOCAL(kernel_sector), %ebx
   movl%ebx, 8(%si)
 clang:
 7cbc: 67 66 8b 1d 5c 7c 00addr32 mov 0x7c5c,%ebx
 7cc3: 00
 7cc4: 66 89 5c 08 mov%ebx,0x8(%si)
 
 gas:
 7cbc: 66 8b 1e 5c 7c  mov0x7c5c,%ebx
 7cc1: 66 89 5c 08 mov%ebx,0x8(%si)
 32-bit jump.
 source:
   jnb LOCAL(floppy_probe)
 clang:
 +7cb5:66 0f 83 07 01 00 00jae7dc3 L_floppy_probe
 gas:
 -7cb5:0f 83 0a 01 jae7dc3 L_floppy_probe
Minimal example would be:
.code16
jmp 1f
.space 256
1:  nop
clang:
   0:   66 e9 00 01 00 00   jmpl   0x106
...
 106:   90  nop
gcc:
   0:   e9 00 01jmp0x103
...
 103:   90  nop

 The last one is particularly problematic as it never makes sense to
 issue 32-bit jump if %ip is only 16 bits and it eats 3 extra bytes per
 jump. Is it possible to force clang to generate 16-bit jumps?
 On bright side if I remove error strings the code is functional.
 




signature.asc
Description: OpenPGP digital signature
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread Vladimir 'φ-coder/phcoder' Serbinenko
When experimenting with compiling GRUB2 with clang using integrated as,
I found out that it generates a 16-bit code bigger than gas counterpart
and result gets too big for size constraints of bootsector. This was
traced mainly to 2 problems.
32-bit access to 16-bit addresses.
source:
movlLOCAL(kernel_sector), %ebx
movl%ebx, 8(%si)
clang:
7cbc:   67 66 8b 1d 5c 7c 00addr32 mov 0x7c5c,%ebx
7cc3:   00
7cc4:   66 89 5c 08 mov%ebx,0x8(%si)

gas:
7cbc:   66 8b 1e 5c 7c  mov0x7c5c,%ebx
7cc1:   66 89 5c 08 mov%ebx,0x8(%si)
32-bit jump.
source:
jnb LOCAL(floppy_probe)
clang:
+7cb5:  66 0f 83 07 01 00 00jae7dc3 L_floppy_probe
gas:
-7cb5:  0f 83 0a 01 jae7dc3 L_floppy_probe
The last one is particularly problematic as it never makes sense to
issue 32-bit jump if %ip is only 16 bits and it eats 3 extra bytes per
jump. Is it possible to force clang to generate 16-bit jumps?
On bright side if I remove error strings the code is functional.



signature.asc
Description: OpenPGP digital signature
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH 04/18] xen/x86: add multiboot2 protocol support

2015-02-20 Thread Jan Beulich
 On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote:
 --- a/xen/arch/x86/boot/Makefile
 +++ b/xen/arch/x86/boot/Makefile
 @@ -1,6 +1,7 @@
  obj-bin-y += head.o
  
 -RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h 
 $(BASEDIR)/include/xen/multiboot.h
 +RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h 
 $(BASEDIR)/include/xen/compiler.h \
 +  $(BASEDIR)/include/xen/multiboot.h 
 $(BASEDIR)/include/xen/multiboot2.h

Perhaps we should rather have the compiler generate the
dependencies for us when compiling reloc.o?

 @@ -32,6 +33,59 @@ ENTRY(start)
  .long   MULTIBOOT_HEADER_FLAGS
  /* Checksum: must be the negated sum of the first two fields. */
  .long   -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS)
 + 
 +/*** MULTIBOOT2 HEADER /
 +/* Some ideas are taken from grub-2.00/grub-core/tests/boot/kernel-i386.S 
 file. */
 +.align  MULTIBOOT2_HEADER_ALIGN
 +
 +.Lmultiboot2_header:
 +/* Magic number indicating a Multiboot2 header. */
 +.long   MULTIBOOT2_HEADER_MAGIC
 +/* Architecture: i386. */
 +.long   MULTIBOOT2_ARCHITECTURE_I386
 +/* Multiboot2 header length. */
 +.long   .Lmultiboot2_header_end - .Lmultiboot2_header
 +/* Multiboot2 header checksum. */
 +.long   -(MULTIBOOT2_HEADER_MAGIC + MULTIBOOT2_ARCHITECTURE_I386 + \
 +(.Lmultiboot2_header_end - .Lmultiboot2_header))

So here ...

 +/* Multiboot2 tags... */
 +.Lmultiboot2_info_req:
 +/* Multiboot2 information request tag. */
 +.short  MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST
 +.short  MULTIBOOT2_HEADER_TAG_REQUIRED
 +.long   .Lmultiboot2_info_req_end - .Lmultiboot2_info_req

.. and here you properly calculate the length, while ...

 +.long   MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
 +.long   MULTIBOOT2_TAG_TYPE_MMAP
 +.Lmultiboot2_info_req_end:
 +
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_MODULE_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_REQUIRED
 +.long   8 /* Tag size. */

... here and further down you hard-code it. Please be consistent
(and if you go the calculation route, perhaps introduce some
redundancy reducing macro).

 +
 +/* Console flags tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_CONSOLE_FLAGS
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   12 /* Tag size. */
 +.long   MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED
 +
 +/* Framebuffer tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_FRAMEBUFFER
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   20 /* Tag size. */
 +.long   0 /* Number of the columns - no preference. */
 +.long   0 /* Number of the lines - no preference. */
 +.long   0 /* Number of bits per pixel - no preference. */
 +
 +/* Multiboot2 header end tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_END
 +.short  0
 +.long   8 /* Tag size. */
 +.Lmultiboot2_header_end:
  
  .section .init.rodata, a, @progbits
  .align 4
 @@ -81,10 +135,51 @@ __start:
  mov %ecx,%es
  mov %ecx,%ss
  
 +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */

Above you meet coding style requirements, but here you stop to do
so (ongoing below)? Even if pre-existing neighboring comments aren't
well formed, please don't make the situation worse.

Also please don't say 12 here - I first even mistook this as an array
index, but you seem to mean 1 or 2. Please use {1,2} instead.

 +xor %edx,%edx
 +
  /* Check for Multiboot bootloader */
  cmp $MULTIBOOT_BOOTLOADER_MAGIC,%eax
 -jne not_multiboot
 +je  multiboot1_proto
  
 +/* Check for Multiboot2 bootloader */
 +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
 +je  multiboot2_proto
 +
 +jmp not_multiboot
 +
 +multiboot1_proto:
 +/* Get mem_lower from Multiboot information */
 +testb   $MBI_MEMLIMITS,(%ebx)

MB_flags(%ebx)

 +jz  trampoline_setup/* not available? BDA value will be fine 
 */
 +
 +mov MB_mem_lower(%ebx),%edx

Please use cmovnz or or instead of jz and mov.

 +jmp trampoline_setup
 +
 +multiboot2_proto:
 +/* Skip Multiboot2 information fixed part */
 +lea MB2_fixed_sizeof(%ebx),%ecx
 +
 +0:
 +/* Get mem_lower from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx)
 +jne 1f
 +
 +mov MB2_mem_lower(%ecx),%edx
 +jmp trampoline_setup
 +
 +1:
 +/* Is it the end of Multiboot2 information? */
 +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx)

This lacks the use a suitable MB2_* definition (even if that ends up
being zero).

 --- 

Re: [PATCH 05/18] efi: split efi_enabled to efi_platform and efi_loader

2015-02-20 Thread Jan Beulich
 On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote:
 We need more fine grained knowledge about EFI environment and check
 for EFI platform and EFI loader separately to properly support
 multiboot2 protocol.

... because of ... (i.e. I can't see from the description what the
separation is good for). Looking at the comments you placed
aside the variables doesn't help me either.

 In general Xen loaded by this protocol uses
 memory mappings and loaded modules in simliar way to Xen loaded
 by multiboot (v1) protocol. Hence, split efi_enabled to efi_platform
 and efi_loader.

And if I'm guessing things right, then introducing efi_loader but
leaving efi_enabled alone (only converting where needed) would
seem the right approach.

 --- a/xen/arch/x86/setup.c
 +++ b/xen/arch/x86/setup.c
 @@ -431,7 +431,7 @@ static void __init parse_video_info(void)
  struct boot_video_info *bvi = bootsym(boot_vid_info);
  
  /* The EFI loader fills vga_console_info directly. */
 -if ( efi_enabled )
 +if ( efi_platform )

This looks wrong considering the comment.

 @@ -663,7 +663,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  if ( ((unsigned long)cpu0_stack  (STACK_SIZE-1)) != 0 )
  panic(Misaligned CPU0 stack.);
  
 -if ( efi_enabled )
 +if ( efi_loader )
  {
  set_pdx_range(xen_phys_start  PAGE_SHIFT,
(xen_phys_start + BOOTSTRAP_MAP_BASE)  PAGE_SHIFT);
 @@ -774,7 +774,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
   * we can relocate the dom0 kernel and other multiboot modules. Also, on
   * x86/64, we relocate Xen to higher memory.
   */
 -for ( i = 0; !efi_enabled  i  mbi-mods_count; i++ )
 +for ( i = 0; !efi_loader  i  mbi-mods_count; i++ )
  {
  if ( mod[i].mod_start  (PAGE_SIZE - 1) )
  panic(Bootloader didn't honor module alignment request.);
 @@ -962,7 +962,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  
  if ( !xen_phys_start )
  panic(Not enough memory to relocate Xen.);
 -reserve_e820_ram(boot_e820, efi_enabled ? mbi-mem_upper : 
 __pa(_start),
 +reserve_e820_ram(boot_e820, efi_loader ? mbi-mem_upper : __pa(_start),

For all of these it's really hard to tell whether they're right without
knowing which parts of the current EFI loading code you intend to
re-use. Perhaps switching these would better be done along with
adding the re-using code (or splitting out the not re-used parts)?

 --- a/xen/include/xen/efi.h
 +++ b/xen/include/xen/efi.h
 @@ -5,7 +5,11 @@
  #include xen/types.h
  #endif
  
 -extern const bool_t efi_enabled;
 +/* If true then Xen runs on EFI platform. */
 +extern bool_t efi_platform;
 +
 +/* If true then Xen was loaded by native EFI loader as PE application. */
 +extern bool_t efi_loader;

I don't see why you're losing the constness.

Jan


___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH 07/18] efi: run EFI specific code on EFI platform only

2015-02-20 Thread Jan Beulich
 On 30.01.15 at 18:54, daniel.ki...@oracle.com wrote:
 --- a/xen/arch/x86/shutdown.c
 +++ b/xen/arch/x86/shutdown.c
 @@ -504,7 +504,8 @@ void machine_restart(unsigned int delay_millisecs)
  tboot_shutdown(TB_SHUTDOWN_REBOOT);
  }
  
 -efi_reset_system(reboot_mode != 0);
 +if ( efi_platform )
 +efi_reset_system(reboot_mode != 0);

Please sync with Konrad on this one - it's supposed to get moved
and conditionalized anyway.

 --- a/xen/common/efi/boot.c
 +++ b/xen/common/efi/boot.c
 @@ -1154,6 +1154,11 @@ void __init efi_init_memory(void)
  } *extra, *extra_head = NULL;
  #endif
  
 +#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
 +if ( !efi_platform )
 +return;
 +#endif

At least for this one I'd rather see the call site to have the
conditional.

 --- a/xen/common/efi/runtime.c
 +++ b/xen/common/efi/runtime.c
 @@ -164,6 +164,9 @@ int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
  {
  unsigned int i, n;
  
 +if ( !efi_platform )
 +return -ENOSYS;
 +
  switch ( idx )
  {
  case XEN_FW_EFI_VERSION:
 @@ -298,6 +301,9 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op)
  EFI_STATUS status = EFI_NOT_STARTED;
  int rc = 0;
  
 +if ( !efi_platform )
 +return -ENOSYS;

EOPNOTSUPP in both cases please, consistent with when runtime
service use is disabled.

Jan


___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread David Woodhouse
On Fri, 2015-02-20 at 16:05 +, David Woodhouse wrote:
 
 It's been a while since I looked at this... but I think for the short
 jumps we just emit the 8-bit version and there's a fixup which can go
 back and re-emit the instruction in 32-bit mode if it finds it doesn't
 fit?
 
 Do we just need to support a similar fixup for promoting 16-bit to
 32-bit relocations?

OK, the term I was looking for was 'relaxation'. Look in
lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp for
X86AsmBackend::relaxInstruction() and related methods.

Observe that it will cope with 'relaxing' 8-bit PC-relative relocations
to 32-bit PC-relative, but it doesn't cope with anything else.

Your task, should you choose to accept it, is to make it cope with other
forms of relaxation where necessary.

Note that the existing cases end up emitting a new instruction with a
*new* opcode. In your case it won't be doing that. It's the *same*
opcode, but you'll have to set a flag to tell the emitter to use the
32-bit addressing mode (for data and/or addr as appropriate) this time.

And while you're doing that, you should note that that's the *same* flag
that'll be needed to support explicit addr32/data32 prefixes in the asm
source. So you might as well support those too. I might suggest doing
them *first*, in fact.

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread David Woodhouse
On Fri, 2015-02-20 at 15:58 +0100, Vladimir 'φ-coder/phcoder' Serbinenko
wrote:
 When experimenting with compiling GRUB2 with clang using integrated as,
 I found out that it generates a 16-bit code bigger than gas counterpart
 and result gets too big for size constraints of bootsector. This was
 traced mainly to 2 problems.

...

 32-bit access to 16-bit addresses.
 clang:
 7cbc: 67 66 8b 1d 5c 7c 00 00 addr32 mov 0x7c5c,%ebx
 gas:
 7cbc: 66 8b 1e 5c 7c  mov0x7c5c,%ebx

 32-bit jump.
 clang:
 +7cb5:66 0f 83 07 01 00 00jae7dc3 L_floppy_probe
 gas:
 -7cb5:0f 83 0a 01 jae7dc3 L_floppy_probe

To a large extent, those are the *same* problem. We don't know that it's
eventually going to fit into a 16-bit offset, so we emit it with a fixup
record which can cope with 32 bits.

Arguably, the jump is *particularly* gratuitous in many cases... but in
'big real' mode is the IP *really* limited to 16 bits?

We could make it default to 16-bit, as gas does. But then we'd be
screwed in the cases where we really *do* need 32-bit.

What we actually need to do is implement handling for the explicit
addr32 prefix. Then we can do what gas does and default to 16-bit but
*also* have a way to do 32-bit when it's needed.

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: clang .code16 with -Os producing larger code that it needs to

2015-02-20 Thread David Woodhouse
On Fri, 2015-02-20 at 16:46 +0100, Vladimir 'φ-coder/phcoder' Serbinenko
wrote:
 
 All labels are local to the source file. If I use %eax instead of %ebx
 in first example I get the short code. For the second example how does
 clang detect that offset fits into one byte for issuing EB XX sequence
 which is issued in resulting file in several places. Can we use the
 same mechanism to detect when issuing 16-bit reference and keep 32-bit
 one for external references?

It's been a while since I looked at this... but I think for the short
jumps we just emit the 8-bit version and there's a fixup which can go
back and re-emit the instruction in 32-bit mode if it finds it doesn't
fit?

Do we just need to support a similar fixup for promoting 16-bit to
32-bit relocations?

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel