RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Ping?

> -----Original Message-----
> From: Tony Wang [mailto:tony.w...@arm.com]
> Sent: Tuesday, September 16, 2014 11:01 AM
> To: 'gcc-patches@gcc.gnu.org'
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv
> and dmul/ddiv function in libgcc
>
> Ping?
RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Ping?

> -----Original Message-----
> From: Tony Wang [mailto:tony.w...@arm.com]
> Sent: Thursday, September 04, 2014 10:15 AM
> To: 'gcc-patches@gcc.gnu.org'
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv
> and dmul/ddiv function in libgcc
>
> Ping 2?
RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Ping 2?

> -----Original Message-----
> From: Tony Wang [mailto:tony.w...@arm.com]
> Sent: Thursday, August 28, 2014 2:02 PM
> To: 'gcc-patches@gcc.gnu.org'
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv
> and dmul/ddiv function in libgcc
>
> Ping?
RE: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Ping?

> -----Original Message-----
> From: Tony Wang [mailto:tony.w...@arm.com]
> Sent: Thursday, August 21, 2014 2:15 PM
> To: 'gcc-patches@gcc.gnu.org'
> Subject: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and
> dmul/ddiv function in libgcc
[PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Hi there,

In libgcc, the files ieee754-sf.S and ieee754-df.S contain function pairs that are built into a single .o file and share the same .text section. For example, the libgcc makefile builds fmul and fdiv into one .o file, which is then archived into libgcc.a. So when a user calls only the single-precision floating-point multiply function, fdiv is linked in as well, and because fmul and fdiv share the same .text section, neither the linker option --gc-sections nor -flto can remove the dead code.

This optimization separates each function pair (fmul/fdiv and dmul/ddiv) into different sections, following the naming pattern of -ffunction-sections (.text.__functionname), so that the unused fdiv/ddiv sections can be eliminated by --gc-sections when users only use fmul/dmul. The solution is to add a conditional statement to the macro FUNC_START that changes the section of a function from .text to .text.__\name when compiling with the L_arm_muldivsf3 or L_arm_muldivdf3 macro.

The GCC regression tests have been run on QEMU for Cortex-M3. There are no new regressions with this patch applied.

The code size reduction for Thumb-2 on Cortex-M3 is:
1. When the user only uses single-precision floating-point multiply:
   fmul+fdiv => fmul gives a code size reduction of 318 bytes.

2. When the user only uses double-precision floating-point multiply:
   dmul+ddiv => dmul gives a code size reduction of 474 bytes.

Ok for trunk?

BR,
Tony

Step 1: Provide another option, sp_section, to control whether to split the section of a function pair into two parts.

gcc/libgcc/ChangeLog:
2014-08-21  Tony Wang

	* config/arm/lib1funcs.S (FUNC_START): Add conditional section
	redefine for macro L_arm_muldivsf3 and L_arm_muldivdf3
	(SYM_END, ARM_SYM_START): Add macros used to expose function
	Symbols

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index b617137..0f87111 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -418,8 +418,12 @@ SYM (\name):
 #define THUMB_SYNTAX
 #endif
 
-.macro FUNC_START name
+.macro FUNC_START name sp_section=
+	.ifc \sp_section, function_section
+	.section	.text.__\name,"ax",%progbits
+	.else
 	.text
+	.endif
 	.globl SYM (__\name)
 	TYPE (__\name)
 	.align 0
@@ -429,14 +433,24 @@ SYM (\name):
 SYM (__\name):
 .endm
 
+.macro ARM_SYM_START name
+	TYPE (\name)
+	.align 0
+SYM (\name):
+.endm
+
+.macro SYM_END name
+	SIZE (\name)
+.endm
+
 /* Special function that will always be coded in ARM assembly, even if
    in Thumb-only compilation.  */
 
 #if defined(__thumb2__)
 
 /* For Thumb-2 we build everything in thumb mode.  */
-.macro ARM_FUNC_START name
-	FUNC_START \name
+.macro ARM_FUNC_START name sp_section=
+	FUNC_START \name \sp_section
 	.syntax unified
 .endm
 #define EQUIV .thumb_set
@@ -467,8 +481,12 @@ _L__\name:
 #ifdef __ARM_ARCH_6M__
 #define EQUIV .thumb_set
 #else
-.macro ARM_FUNC_START name
+.macro ARM_FUNC_START name sp_section=
+	.ifc \sp_section, function_section
+	.section	.text.__\name,"ax",%progbits
+	.else
 	.text
+	.endif
 	.globl SYM (__\name)
 	TYPE (__\name)
 	.align 0

libgcc_mul_div_code_size_reduction_1.diff
Description: Binary data
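
As a rough illustration of the mechanism (not part of the patch), the C sketch below places two functions in their own .text.__<name> sections, which is what the sp_section argument does for the assembly macros. The names demo_fmul and demo_fdiv are hypothetical stand-ins for the libgcc routines, and the section attribute assumes GCC or Clang targeting ELF.

```c
/* Hypothetical stand-ins for the libgcc multiply/divide pair.
   Each function is placed in its own section, following the
   -ffunction-sections style naming (.text.__<name>), so the
   linker can treat them as independent units.  */

/* Lives in .text.__demo_fmul.  */
__attribute__((section(".text.__demo_fmul"), noinline))
float demo_fmul(float a, float b)
{
    return a * b;
}

/* Lives in .text.__demo_fdiv; nothing in demo_fmul refers to it.  */
__attribute__((section(".text.__demo_fdiv"), noinline))
float demo_fdiv(float a, float b)
{
    return a / b;
}
```

Compiling this with `gcc -c` and inspecting the object with `objdump -h` should list both sections. When a program calls only demo_fmul and is linked with `-Wl,--gc-sections`, the linker is then free to discard .text.__demo_fdiv, which it could not do if both functions shared one .text section.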