Ping [ARM back-end and middle-end patch] stack check for threads

2012-01-09 Thread Thomas Klein

Hello

At the risk of boring everyone:
is it possible to get an answer from the ARM maintainers?
This is a serious question, and I hope there will be a serious answer.

Regards
 Thomas Klein

reference:
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01774.html



Ping [ARM back-end and middle-end patch] stack check for threads

2011-12-28 Thread Thomas Klein

ping

I would like to introduce two new -fstack-check options named direct and
indirect.
Targets that do not support the new stack checking options will work
as before.

On the ARM platform the old generic option works as before.
(In addition, it is now possible to have a checking code sequence even if
optimization is switched on.)
The check against a given limit value during dynamic stack allocation
now works, too.
This was not the case before because the trap function was missing.
For this case I've added a code sequence that lets generic act like the
dynamic part, doing a compare against a given limit value.
I'm treating this as keeping old stuff alive.

Back to the new options I would like to have.
Maybe you are happy with the above, but I'm not.
Sometimes there is no single limit value that is valid for all code.
For example, in an environment with threads, each thread uses
its own stack at a different location.
In that case all functions should have common knowledge of a global
limit variable that holds the limit value.
This limit value can be used to check whether a stack overflow has occurred.


There are two ways to inform the compiler about this limit variable.
 If it is an ordinary variable (located somewhere in data space) you
 should use the option combination
 -fstack-check=indirect and -fstack-limit-symbol=global_stack_limit

 If it is a global register variable you should use the option combination
 -fstack-check=direct and -fstack-limit-register=r6
 In this case you have to make sure that this register isn't used by
 others.
 For example you can add the option -ffixed-r6 to all files that are
 not going to do stack checking.
The OS is responsible for inserting the correct limit value,
for example at the end of a context switch.
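
A minimal sketch of the OS side for the indirect variant may make this clearer. It assumes a hypothetical scheduler; struct thread, its fields and the function names are invented for illustration and are not part of the patch. Only global_stack_limit is the symbol that would be named by -fstack-limit-symbol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread bookkeeping; not part of the patch.  */
struct thread {
    uint8_t *stack_base;   /* lowest address of this thread's stack */
    size_t   stack_size;   /* size of the stack in bytes */
    size_t   guard;        /* safety margin kept free above the base */
};

/* The symbol named by -fstack-limit-symbol=global_stack_limit.
   With -fstack-check=indirect the checking code would read its value.  */
void *global_stack_limit;

/* Lowest stack pointer value the thread may reach (stack grows down).  */
void *thread_stack_limit(const struct thread *t)
{
    return t->stack_base + t->guard;
}

/* Called by the (hypothetical) scheduler at the end of a context switch.  */
void publish_stack_limit(const struct thread *next)
{
    global_stack_limit = thread_stack_limit(next);
}
```

The guard field is only one possible design choice: it leaves some room for the failure handler itself to run once the check has fired.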

I've added a little bit of documentation, too.
It may not be as good as you expect, but it is the best I can do.
Sorry for that.

I have added some tests that run on the ARM simulator and on a Linux ARM target
machine.
I'm using cross compilers configured with ../src/configure --target=arm-elf and
--target=arm-elf-eabi and running the tests with:
gmake check-gcc RUNTESTFLAGS="--target_board=arm-sim arm_stack_check.exp"
I'm also using a native Linux compiler (on an armv7-a machine) and running the
tests with:
gmake check-gcc RUNTESTFLAGS=arm_stack_check.exp

Each test case is done with:
- stack checking variants
  generic using a limit-symbol, generic using a limit register,
  direct using a limit-symbol, direct using a limit register and
  indirect using a limit-symbol
- various modes ARM, Thumb, (and if possible with Thumb-2)
- With and without optimization.
- Without -fpic, with -fpic and with -fpic -msingle-pic-base

I have also found a minor bug when using the combination:
-fpic -mpic-register=r9 -march=armv4t -mthumb.
(A move of the hi register to a lo register is missing here.)
So I've added a few lines of code for that here, too.
Maybe you think this is a nasty hack; if so, please insert a better one instead.

All tests succeed.

I still think that my idea isn't that bad.
However, any feedback from the ARM maintainers would be good.
Even if it is something like: We hate this bullshit altogether.
Any feedback is better than no feedback.

Regards
  Thomas Klein

references
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01261.html
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00310.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00216.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00281.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00149.html
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01872.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html


ChangeLog.check.bz2
Description: Binary data


gcc.diff_chk.bz2
Description: Binary data


gcc.diff_dep.bz2
Description: Binary data


ChangeLog.test.bz2
Description: Binary data


gcc.diff_test.bz2
Description: Binary data


Ping [ARM back-end and middle-end patch] stack check for threads

2011-09-21 Thread Thomas Klein

ping

Subject line renamed due to
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01189.html

references
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00310.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00216.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00281.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00149.html
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01872.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html

gcc/ChangeLog

2011-09-21  Thomas Klein th.r.kl...@web.de
* opts.c (common_handle_option): introduce new parameters direct and
indirect
* flag-types.h (enum stack_check_type): Likewise

* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter direct, indirect or if a
stack-limit is given
- do additional read of limit value if parameter indirect and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack

* config/arm/arm.c
(stack_check_work_registers): new function to find possible working
registers [only used by stack check]
(emit_push_regs): add push RTL instruction without keeping regnumber
and frame memory in mind.
(emit_pop_regs): add pop RTL instruction to revert the above push
(emit_stack_check_insns): new function to write RTL instructions for
stack check at prologue stage.
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_output_function_prologue): stack check integration for Thumb-1

* config/arm/arm.md
(cbranchsi4_insn): allow compare and branch using stack pointer
register [at thumb mode]
(arm_cmpsi_insn): allow comparing using stack pointer register [at arm]
(probe_stack): do not emit code when parameters direct or indirect
is given, emit move code way same as in gcc/explow.c [function
emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter generic is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_failure): failure call used in stack check functions
emit_stack_check_insns, generic_limit_check_stack or
allocate_dynamic_stack_space [similar to a trap but avoid conflict with
builtin_trap]

Index: gcc/opts.c
===
--- gcc/opts.c  (revision 179052)
+++ gcc/opts.c  (working copy)
@@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts,
   : STACK_CHECK_STATIC_BUILTIN
 ? STATIC_BUILTIN_STACK_CHECK
 : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 179052)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,15 @@ enum stack_check_type
 
   /* Check the stack and entirely rely on the target configuration
  files, i.e. do not use the generic mechanism at all.  */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is directly given e.g. by address
+ of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };
 
 /* Names for the different levels of -Wstrict-overflow=N.  The numeric
Index: gcc/explow.c
===
--- gcc/explow.c(revision 179052)
+++ gcc/explow.c(working copy)
@@ -1386,7 +1386,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s
 
   /* If needed, check that we have the required amount of stack.  Take into
  account what has already been checked.  */
-  if (STACK_CHECK_MOVING_SP)
+  if (  STACK_CHECK_MOVING_SP 
+#ifdef HAVE_generic_limit_check_stack
+ || crtl->limit_stack
+#endif
+ || flag_stack_check == DIRECT_STACK_CHECK
+ || flag_stack_check == INDIRECT_STACK_CHECK)
 ;
   else if (flag_stack_check == GENERIC_STACK_CHECK)
 probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
@@ -1423,19 +1428,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s
   /* Check stack bounds if necessary.  */
   if (crtl->limit_stack

Ping: C-family stack check for threads

2011-09-20 Thread Thomas Klein

ping

references
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00310.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00216.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00281.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00149.html
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01872.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html

gcc/ChangeLog

2011-09-20  Thomas Klein th.r.kl...@web.de
* opts.c (common_handle_option): introduce new parameters direct and
indirect
* flag-types.h (enum stack_check_type): Likewise

* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter direct, indirect or if a
stack-limit is given
- do additional read of limit value if parameter indirect and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack

* config/arm/arm.c
(stack_check_work_registers): new function to find possible working
registers [only used by stack check]
(emit_push_regs): add push RTL instruction without keeping regnumber
and frame memory in mind.
(emit_pop_regs): add pop RTL instruction to revert the above push
(emit_stack_check_insns): new function to write RTL instructions for
stack check at prologue stage.
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_output_function_prologue): stack check integration for Thumb-1

* config/arm/arm.md
(cbranchsi4_insn): allow compare and branch using stack pointer
register [at thumb mode]
(arm_cmpsi_insn): allow comparing using stack pointer register [at arm]
(probe_stack): do not emit code when parameters direct or indirect
is given, emit move code way same as in gcc/explow.c [function
emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter generic is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_failure): failure call used in stack check functions
emit_stack_check_insns, generic_limit_check_stack or
allocate_dynamic_stack_space [similar to a trap but avoid conflict with
builtin_trap]


Index: gcc/opts.c
===
--- gcc/opts.c  (revision 179007)
+++ gcc/opts.c  (working copy)
@@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts,
   : STACK_CHECK_STATIC_BUILTIN
 ? STATIC_BUILTIN_STACK_CHECK
 : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 179007)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,15 @@ enum stack_check_type
 
   /* Check the stack and entirely rely on the target configuration
  files, i.e. do not use the generic mechanism at all.  */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is directly given e.g. by address
+ of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };
 
 /* Names for the different levels of -Wstrict-overflow=N.  The numeric
Index: gcc/explow.c
===
--- gcc/explow.c(revision 179007)
+++ gcc/explow.c(working copy)
@@ -1386,7 +1386,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s
 
   /* If needed, check that we have the required amount of stack.  Take into
  account what has already been checked.  */
-  if (STACK_CHECK_MOVING_SP)
+  if (  STACK_CHECK_MOVING_SP 
+#ifdef HAVE_generic_limit_check_stack
+ || crtl->limit_stack
+#endif
+ || flag_stack_check == DIRECT_STACK_CHECK
+ || flag_stack_check == INDIRECT_STACK_CHECK)
 ;
   else if (flag_stack_check == GENERIC_STACK_CHECK)
 probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
@@ -1423,19 +1428,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s
   /* Check stack bounds if necessary.  */
   if (crtl->limit_stack)
{
+  rtx limit_rtx;
  rtx available;
  rtx

Re: Ping: C-family stack check for threads

2011-09-05 Thread Thomas Klein

On 09/05/11 09:45, Ye Joey wrote:

+  /* check if we can use one of the argument registers r0..r3 as long as they
+   * not holding data*/
+  for (reg = 0; reg <= LAST_ARG_REGNUM && i < 2; reg++)
...

+  n = (reg + 1) % 4;

Avoid immediate register number.
use ARG_REGISTER (1) to replace reg 0
use NUM_ARG_REGS to replace 4



The 4 is the number of argument registers so you are right to use 
NUM_ARG_REGS here.

The calculation should give the next possible argument register.
E.g. if the current register is r0, the next register is r1.
Except if the current register is r3, then the next register is r0.

I think the ARG_REGISTER macro will not reduce confusion.
  n = (ARG_REGISTER (reg + 1) + 1) % NUM_ARG_REGS;
is identical to
  n = (reg + 1) % NUM_ARG_REGS;
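
A small stand-alone sketch of the wrap-around. The two macros are local stand-ins that I assume mirror the ARM back-end definitions (four argument registers, ARG_REGISTER (N) yielding N - 1); the function names are invented for this illustration only.

```c
/* Local stand-ins; assumed to mirror the ARM back-end definitions.  */
#define NUM_ARG_REGS 4
#define ARG_REGISTER(N) ((N) - 1)

/* Next argument register after reg, wrapping from r3 back to r0.  */
int next_arg_reg(int reg)
{
    return (reg + 1) % NUM_ARG_REGS;
}

/* The same computation spelled with ARG_REGISTER.  */
int next_arg_reg_macro(int reg)
{
    return (ARG_REGISTER(reg + 1) + 1) % NUM_ARG_REGS;
}
```

Under these assumptions both functions agree for r0..r3, so the macro adds nothing except an extra step for the reader.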

regards
  Thomas Klein

gcc/ChangeLog

2011-09-05  Thomas Klein th.r.kl...@web.de
* opts.c (common_handle_option): introduce new parameters direct and
indirect
* flag-types.h (enum stack_check_type): Likewise

* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter direct, indirect or if a
stack-limit is given
- do additional read of limit value if parameter indirect and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack

* config/arm/arm.c
(stack_check_work_registers): new function to find possible working
registers [only used by stack check]
(emit_push_regs): add push RTL instruction without keeping regnumber
and frame memory in mind.
(emit_pop_regs): add pop RTL instruction to revert the above push
(emit_stack_check_insns): new function to write RTL instructions for
stack check at prologue stage.
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_output_function_prologue): stack check integration for Thumb-1

* config/arm/arm.md
(cbranchsi4_insn): allow compare and branch using stack pointer
register [at thumb mode]
(arm_cmpsi_insn): allow comparing using stack pointer register [at arm]
(probe_stack): do not emit code when parameters direct or indirect
is given, emit move code way same as in gcc/explow.c [function
emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter generic is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_failure): failure call used in stack check functions
emit_stack_check_insns, generic_limit_check_stack or
allocate_dynamic_stack_space [similar to a trap but avoid conflict with
builtin_trap]

Index: gcc/opts.c
===
--- gcc/opts.c  (revision 178554)
+++ gcc/opts.c  (working copy)
@@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts,
   : STACK_CHECK_STATIC_BUILTIN
 ? STATIC_BUILTIN_STACK_CHECK
 : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+   /* This is an other stack checking method.  */
+   opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 178554)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,15 @@ enum stack_check_type
 
   /* Check the stack and entirely rely on the target configuration
  files, i.e. do not use the generic mechanism at all.  */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is directly given e.g. by address
+ of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };
 
 /* Names for the different levels of -Wstrict-overflow=N.  The numeric
Index: gcc/explow.c
===
--- gcc/explow.c(revision 178554)
+++ gcc/explow.c(working copy)
@@ -1372,7 +1372,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s
 
   /* If needed, check that we have the required amount of stack.  Take into
  account what has already been checked.  */
-  if (STACK_CHECK_MOVING_SP)
+  if (  STACK_CHECK_MOVING_SP 
+#ifdef HAVE_generic_limit_check_stack
+ || crtl->limit_stack
+#endif
+ || flag_stack_check

Re: Ping: C-family stack check for threads

2011-08-02 Thread Thomas Klein

Hello

Here is my next attempt at putting the stack check into RTL at the prologue stage.
For me, it was not as easy as I had hoped.
I had a few problems getting push/pop and the compare/jump working.
I hope the way I chose is acceptable.
With RTL, no extra pool to hold pointer or size values is required any more.
That's fine.
So this move to RTL does make sense.

Regards
  Thomas Klein


Index: gcc/opts.c
===
--- gcc/opts.c(revision 176974)
+++ gcc/opts.c(working copy)
@@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts,
: STACK_CHECK_STATIC_BUILTIN
  ? STATIC_BUILTIN_STACK_CHECK
  : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+/* This is an other stack checking method.  */
+opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+/* This is an other stack checking method.  */
+opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
 warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 176974)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,15 @@ enum stack_check_type

   /* Check the stack and entirely rely on the target configuration
  files, i.e. do not use the generic mechanism at all.  */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is directly given e.g. by address
+ of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };

 /* Names for the different levels of -Wstrict-overflow=N.  The numeric
Index: gcc/explow.c
===
--- gcc/explow.c(revision 176974)
+++ gcc/explow.c(working copy)
@@ -1358,7 +1358,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s

   /* If needed, check that we have the required amount of stack.  Take into
  account what has already been checked.  */
-  if (STACK_CHECK_MOVING_SP)
+  if (  STACK_CHECK_MOVING_SP
+#ifdef HAVE_generic_limit_check_stack
+ || crtl->limit_stack
+#endif
+ || flag_stack_check == DIRECT_STACK_CHECK
+ || flag_stack_check == INDIRECT_STACK_CHECK)
 ;
   else if (flag_stack_check == GENERIC_STACK_CHECK)
 probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
@@ -1392,19 +1397,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s
   /* Check stack bounds if necessary.  */
   if (crtl->limit_stack)
 {
+  rtx limit_rtx;
   rtx available;
   rtx space_available = gen_label_rtx ();
+  if (  GET_CODE (stack_limit_rtx) == SYMBOL_REF
+  && flag_stack_check == INDIRECT_STACK_CHECK)
+limit_rtx = expand_unop (Pmode, mov_optab,
+gen_rtx_MEM (Pmode, stack_limit_rtx),
+NULL_RTX, 1);
+  else
+limit_rtx = stack_limit_rtx;
 #ifdef STACK_GROWS_DOWNWARD
   available = expand_binop (Pmode, sub_optab,
-stack_pointer_rtx, stack_limit_rtx,
+stack_pointer_rtx, limit_rtx,
 NULL_RTX, 1, OPTAB_WIDEN);
 #else
   available = expand_binop (Pmode, sub_optab,
-stack_limit_rtx, stack_pointer_rtx,
+limit_rtx, stack_pointer_rtx,
 NULL_RTX, 1, OPTAB_WIDEN);
 #endif
   emit_cmp_and_jump_insns (available, size, GEU, NULL_RTX, Pmode, 1,
space_available);
+#ifdef HAVE_stack_failure
+  if (HAVE_stack_failure)
+emit_insn (gen_stack_failure ());
+  else
+#endif
 #ifdef HAVE_trap
   if (HAVE_trap)
 emit_insn (gen_trap ());
@@ -1547,6 +1565,13 @@ probe_stack_range (HOST_WIDE_INT first, rtx size)
 return;
 }
 #endif
+#ifdef HAVE_generic_limit_check_stack
+  else if (HAVE_generic_limit_check_stack)
+{
+  rtx addr = memory_address (Pmode,stack_pointer_rtx);
+  emit_insn (gen_generic_limit_check_stack (addr));
+}
+#endif

   /* Otherwise we have to generate explicit probes.  If we have a constant
  small number of them to generate, that's the easy case.  */
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 176974)
+++ gcc/config/arm/arm.c(working copy)
@@ -15809,6 +15809,299 @@ thumb_set_frame_pointer (arm_stack_offsets *offset
   RTX_FRAME_RELATED_P (insn) = 1;
 }

+/*search for possible work registers for stack-check operation at prologue
+ return the number of register that can be used without extra push/pop */
+
+static int

Re: Ping: C-family stack check for threads

2011-07-04 Thread Thomas Klein

Richard Henderson wrote:

 On 07/03/2011 08:06 AM, Thomas Klein wrote:
   +/*
   + * Write prolouge part of stack check into asm file.
   + * For Thumb this may look like this:
   + *   push {rsym,ramn}
   + *   ldr rsym, .LSPCHK0
   + *   ldr rsym, [rsym]
   + *   ldr ramn, .LSPCHK0 + 4
   + *   add rsym, rsym, ramn
   + *   cmp sp, rsym
   + *   bhs .LSPCHK1
   + *   push {lr}
   + *   bl __thumb_stack_failure
   + * .align 2
   + * .LSPCHK0:
   + *   .word symbol_addr_of(stack_limit_rtx)
   + *   .word lenght_of(amount)
   + * .LSPCHK1:
   + *   pop {rsym,ramn}
   + */
   +void
   +stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount,
   + unsigned numregs)
   +{

 Is there an exceedingly good reason you're emitting this much code
 as text, rather than as rtl?


To me, the stack check is one coherent operation.
It is placed after an initial push, which can't be eliminated, but before a
major stack adjustment.

I have had some problems with RTL at the prologue stage.
Is there a way to encapsulate an RTL sequence within the prologue?
There is an emit_multi_reg_push, but is there something like emit_multi_reg_pop,
too?
Are the other operations (compare, branch, ...) still allowed?


 In particular, you adjust the stack but not the unwind info.  So
 if one puts a breakpoint at your __thumb_stack_failure function,
 the unwind information will be incorrect.


Yes, if the failure function is called, the unwind info will be wrong.
If this is a major problem, do I have to add this info after every push and pop
operation?
Will the RTL push/pop already do this for me?

Regards
 Thomas Klein




Re: Ping: C-family stack check for threads

2011-07-03 Thread Thomas Klein

Ye Joey wrote:

 Thomas,

 I think your are working on a very useful feature. I have ARM MCU
 applications running of out stack space and resulting strange
 behaviors silently. I'd like to try your patch and probably give
 further comments

 - Joey


Hi
Due to the conversion of the Thumb prologue to RTL, this patch needs to be
modified, too.

Regards
  Thomas Klein

gcc/ChangeLog
2011-07-03  Thomas Klein th.r.kl...@web.de

* opts.c (common_handle_option): introduce additional stack checking
parameters direct and indirect
* flag-types.h (enum stack_check_type): Likewise
* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter direct, indirect or if a
stack-limit is given
- do additional read of limit value if parameter indirect and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack
* config/arm/arm.c (stack_check_output_function): new function to write
the stack check code sequence to the assembler file (inside prologue)
(stack_check_work_registers): new function to find possible working
registers [only used by stack check]
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_expand_prologue): stack check integration for Thumb-1
* config/arm/arm.md (probe_stack): do not emit code when parameters
direct or indirect given, emit move code as in gcc/explow.c
[function emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter generic is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_check): ARM/Thumb-2/Thumb-1 insn to output function
stack_check_output_function
(stack_failure): failure call used in function
allocate_dynamic_stack_space [similar to a trap but avoid conflict with
builtin_trap]

Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 175786)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,15 @@ enum stack_check_type

   /* Check the stack and entirely rely on the target configuration
  files, i.e. do not use the generic mechanism at all.  */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is directly given e.g. by address
+ of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+ each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };

 /* Names for the different levels of -Wstrict-overflow=N.  The numeric
Index: gcc/explow.c
===
--- gcc/explow.c(revision 175786)
+++ gcc/explow.c(working copy)
@@ -1358,7 +1358,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s

   /* If needed, check that we have the required amount of stack.  Take into
  account what has already been checked.  */
-  if (STACK_CHECK_MOVING_SP)
+  if (  STACK_CHECK_MOVING_SP
+#ifdef HAVE_generic_limit_check_stack
+ || crtl->limit_stack
+#endif
+ || flag_stack_check == DIRECT_STACK_CHECK
+ || flag_stack_check == INDIRECT_STACK_CHECK)
 ;
   else if (flag_stack_check == GENERIC_STACK_CHECK)
 probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
@@ -1392,19 +1397,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s
   /* Check stack bounds if necessary.  */
   if (crtl->limit_stack)
{
+  rtx limit_rtx;
  rtx available;
  rtx space_available = gen_label_rtx ();
+  if (  GET_CODE (stack_limit_rtx) == SYMBOL_REF
+  && flag_stack_check == INDIRECT_STACK_CHECK)
+limit_rtx = expand_unop (Pmode, mov_optab,
+   gen_rtx_MEM (Pmode, stack_limit_rtx),
+   NULL_RTX, 1);
+  else
+limit_rtx = stack_limit_rtx;
 #ifdef STACK_GROWS_DOWNWARD
  available = expand_binop (Pmode, sub_optab,
-   stack_pointer_rtx, stack_limit_rtx,
+   stack_pointer_rtx, limit_rtx,
NULL_RTX, 1, OPTAB_WIDEN);
 #else
  available = expand_binop (Pmode, sub_optab,
-   stack_limit_rtx, stack_pointer_rtx,
+   limit_rtx, stack_pointer_rtx,
NULL_RTX, 1, OPTAB_WIDEN);
 #endif
  emit_cmp_and_jump_insns (available, size, GEU, NULL_RTX, Pmode, 1,
   space_available);
+#ifdef

C-family stack check for threads

2011-03-20 Thread Thomas Klein

Hi

I would like to have a stack check for threads that each have only a small
stack.
(I'm using an ARM Cortex-M3 microcontroller with a stack size of 1
KByte per thread.)

Each thread has its own limit address.
The thread scheduler can calculate the limit and store this value
in a global variable.
The compiler can then generate code that checks the stack for overflow at
function entry.

In principle this can be done this way:
  - push registers as usual
  - figure out whether one or two work registers can be used directly
without an extra push
  - if not enough registers are found, push the required work registers to the stack
  - load the limit address into the first work register
  - load the value at the limit address (into the same register)
  - if the stack pointer is going to be moved to extend the stack (e.g. for local variables),
load this size value too (here the second work register can be used)
  - compare for overflow
  - if an overflow occurs, call the stack_failure function
  - pop the work registers that were pushed before
  - continue the function prologue as usual, e.g. extend the stack pointer
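
The compare step in the list above can be modeled in plain C. This is only a sketch of the arithmetic for a downward-growing stack, assuming an unsigned comparison; the function name and parameters are invented for illustration and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the overflow compare for a downward-growing stack.
   sp:     stack pointer after the initial register push
   limit:  lowest address the current thread may use
   amount: bytes the prologue is about to allocate (0 if none)  */
bool stack_has_room(uintptr_t sp, uintptr_t limit, uintptr_t amount)
{
    /* Already below the limit: overflow has happened.  */
    if (sp < limit)
        return false;
    /* Written as a subtraction so that limit + amount cannot wrap.  */
    return sp - limit >= amount;
}
```

When this returns false, the generated code would branch to the stack_failure call instead of continuing with the prologue.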

The ARM target has an option -mapcs-stack-check, but this is more or
less not working (the implementation is missing).

There are also architecture-independent options that can be used, like
-fstack-check=generic, -fstack-limit-symbol=current_stack_limit or
-fstack-limit-register=r6.

The generic stack check does a probe at the end of the function prologue
(e.g. by writing 12K ahead of the current stack pointer position).
If this stack space is not available the probe may generate a fault.
This requires that the CPU has an MPU or an MMU.
For machines with a small memory space an additional mechanism should be
available.

The option -fstack-check can be extended by the switches direct and
indirect to emit compare code in the function prologue.
If the switch direct is given, the address of -fstack-limit-symbol
represents the limit itself.
If the switch indirect is given, -fstack-limit-symbol is a kind of global
variable that needs to be read before the compare.
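
The difference between the two switches, expressed in C terms. This is a sketch only: stack_limit_var stands in for whatever symbol is passed via -fstack-limit-symbol, and the two helper functions are hypothetical, written just to show which value ends up in the compare.

```c
#include <stdint.h>

/* Stand-in for the symbol passed via -fstack-limit-symbol.  */
uintptr_t stack_limit_var;

/* direct: the address of the symbol is itself the limit.  */
uintptr_t limit_direct(void)
{
    return (uintptr_t)&stack_limit_var;
}

/* indirect: the symbol is a variable; its current value is the limit.  */
uintptr_t limit_indirect(void)
{
    return stack_limit_var;
}
```

With direct the limit is fixed at link time, so it fits a single stack; with indirect the scheduler can change the limit on every context switch.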

I have added a proposal to show how this behavior can be integrated
on the ARM architecture.

The generated code looks like this,
e.g. when using -fstack-check=indirect -fstack-limit-symbol=stack_limit_var
    push {r0}
    ldr  r0, =stack_limit_var
    ldr  r0, [r0]
    cmp  sp, r0
    bhs  1f
    push {lr}
    bl   __thumb_stack_failure   @ stack check
    .align
    .ltorg
1:
    pop  {r0}

Regards
  Thomas Klein

gcc/ChangeLog

2011-03-20  Thomas Klein th.r.kl...@web.de
* opts.c (common_handle_option): introduce new parameters direct and
indirect
* flag-types.h (enum stack_check_type): Likewise

* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter direct, indirect or if a
stack-limit is given
- do additional read of limit value if parameter indirect and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack

* config/arm/arm.c (stack_check_output_function): new function to write
the stack check code sequence to the assembler file
(stack_check_work_registers): new function to find possible working
registers [only used by stack check]
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_output_function_prologue): stack check integration for Thumb-1

* config/arm/arm.md (probe_stack): do not emit code when parameters
direct or indirect given, emit move code as in gcc/explow.c
[function emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter generic is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_check): ARM/Thumb-2 insn to output function
stack_check_output_function
(stack_failure): failure call used in function
allocate_dynamic_stack_space [similar to a trap but avoid conflict with
builtin_trap]

Index: gcc/opts.c
===
--- gcc/opts.c(revision 171194)
+++ gcc/opts.c(working copy)
@@ -1618,6 +1618,12 @@ common_handle_option (struct gcc_options *opts,
: STACK_CHECK_STATIC_BUILTIN
  ? STATIC_BUILTIN_STACK_CHECK
  : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+/* This is another stack checking method.  */
+opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+/* This is another stack checking method.  */
+opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
 warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h

[PATCH] PR 47836, Some Cross Compiler can't build target-libiberty or target-zlib

2011-03-08 Thread Thomas Klein

Hello

This is a more general way to give a user the ability to override
the generation of target libraries that are enabled by default.
For example, with the configure switch --disable-target-zlib,
target-zlib is added to the script variable noconfigdirs and this
target library will not be built.

regards
  Thomas

PS.
If it helps: I have already done a copyright assignment for future changes,
but not explicitly for the configure scripts.
Also, I don't have write permission (nor am I requesting it).

2011-03-08  Thomas Klein th.r.kl...@web.de

PR 47836
* configure.ac: accept --disable-target-.. from user
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 170774)
+++ configure.ac(working copy)
@@ -2081,6 +2081,28 @@ case ,${enable_languages},:${enable_objc_gc} in
 ;;
 esac

+# a user forced --disable-target-.. was given
+# add this to the ignore list if not already present
+for target_lib_var in $target_libraries
+do
+  var=`$as_echo $target_lib_var | sed 's/[[-+.]]/_/g'`
+  eval is_enabled=\$enable_$var
+  if test x$is_enabled = xno ; then
+    append_var=yes
+    for var in $noconfigdirs $skipdirs
+    do
+      if test x$var = x$target_lib_var ; then
+        append_var=no
+        break
+      fi
+    done
+    if test x$append_var = xyes ; then
+      noconfigdirs="$noconfigdirs $target_lib_var"
+      echo "add $target_lib_var to noconfigdirs"
+    fi
+  fi
+done
+
 # Remove the entries in $skipdirs and $noconfigdirs from $configdirs,
 # $build_configdirs and $target_configdirs.
 # If we have the source for $noconfigdirs entries, add them to $notsupp.
Index: configure
===
--- configure   (revision 170774)
+++ configure   (working copy)
@@ -6546,6 +6546,28 @@ case ,${enable_languages},:${enable_objc_gc} in
 ;;
 esac

+# a user forced --disable-target-.. was given
+# add this to the ignore list if not already present
+for target_lib_var in $target_libraries
+do
+  var=`$as_echo $target_lib_var | sed 's/[-+.]/_/g'`
+  eval is_enabled=\$enable_$var
+  if test x$is_enabled = xno ; then
+    append_var=yes
+    for var in $noconfigdirs $skipdirs
+    do
+      if test x$var = x$target_lib_var ; then
+        append_var=no
+        break
+      fi
+    done
+    if test x$append_var = xyes ; then
+      noconfigdirs="$noconfigdirs $target_lib_var"
+      echo "add $target_lib_var to noconfigdirs"
+    fi
+  fi
+done
+
 # Remove the entries in $skipdirs and $noconfigdirs from $configdirs,
 # $build_configdirs and $target_configdirs.
 # If we have the source for $noconfigdirs entries, add them to $notsupp.



C-family stack check for threads

2011-01-13 Thread Thomas Klein

Hi

I would like to have a stack check for threads with a small stack space
for each thread.
(I'm using an ARM Cortex-M3 microcontroller with a stack size of
1 KByte per thread.)

Each thread has its own limit address.
The thread scheduler can then calculate the limit and store this value
inside of a global variable.
The compiler may generate code to check the stack for overflow at
function entry.
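
The scheduler side of this could look roughly like the following sketch.
Everything here is an assumption made for illustration: the names
struct thread, scheduler_switch_to and STACK_GUARD are hypothetical, and
stack_limit_var is simply the symbol name used later in this mail.

```c
#include <stdint.h>

#define THREAD_STACK_SIZE 1024u  /* 1 KByte per thread, as described above */
#define STACK_GUARD         32u  /* hypothetical safety margin, in bytes */

/* Hypothetical thread descriptor; a real scheduler keeps more state.  */
struct thread
{
  uint8_t stack[THREAD_STACK_SIZE];
};

/* The global limit variable read by the check in each function prologue.  */
volatile uintptr_t stack_limit_var;

/* Called on every context switch.  The stack grows downwards, so the
   limit is the lowest usable address of the next thread's stack.  */
void
scheduler_switch_to (struct thread *next)
{
  stack_limit_var = (uintptr_t) next->stack + STACK_GUARD;
}
```

The guard margin is a design choice of this sketch only; it leaves a few
bytes for the failure handler itself.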

In principle this can be done this way:
 - push registers as usual
 - figure out whether one or two work registers can be used directly
   without an extra push
 - if not enough registers are found, push the required work registers
   to the stack
 - load the limit address into the first work register
 - load the value at the limit address (into the same register)
 - if the stack pointer is going to be extended (e.g. for local
   variables), load this size value too
   (here the second work register can be used)
 - compare for overflow
 - if an overflow occurs, call the stack_failure function
 - pop the work registers that were pushed before
 - continue the function prologue as usual, e.g. extend the stack pointer
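
The compare step of the list above can be modelled in plain C. This is
only a sketch of the logic, not the code the patch emits; the function
name stack_check_passes is hypothetical, and stack_limit_var stands for
whatever symbol is passed to -fstack-limit-symbol.

```c
#include <stdbool.h>
#include <stdint.h>

/* The global limit variable (hypothetical name for this sketch).  */
volatile uintptr_t stack_limit_var;

/* Model of the check at function entry.  SP is the stack pointer after
   the register pushes, FRAME_SIZE is the amount by which the prologue
   is going to extend the stack.  Returns false on overflow.  */
bool
stack_check_passes (uintptr_t sp, uintptr_t frame_size)
{
  uintptr_t limit = stack_limit_var;  /* load value of the limit variable */

  if (sp < frame_size)                /* the subtraction would wrap around */
    return false;

  /* The stack grows downwards: the new stack pointer must stay at or
     above the limit.  This is an unsigned compare, like "bhs".  */
  return sp - frame_size >= limit;
}
```

When the function returns false, the real prologue would call the
stack_failure function instead of continuing.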

The ARM target has an option -mapcs-stack-check, but this is more or
less not working (the implementation is missing).

There are also architecture-independent options that can be used, like
-fstack-check=generic, -fstack-limit-symbol=current_stack_limit or
-fstack-limit-register=r6.

The generic stack check does a probe at the end of the function prologue
(e.g. by writing 12K ahead of the current stack pointer position).
If this stack space is not available the probe may generate a fault.
This requires that the CPU has an MPU or an MMU.
For machines with a small memory space an additional mechanism should be
available.


The option -fstack-check can be extended by the switches direct and
indirect to emit compare code in the function prologue.
If the switch direct is given, the address of -fstack-limit-symbol
represents the limit itself.
If the switch indirect is given, -fstack-limit-symbol names a global
variable that needs to be read before the compare.
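
The difference between the two switches can be illustrated in C. This is
a sketch under assumptions: the symbol name stack_limit_marker and the
helper names limit_direct and limit_indirect are made up for the example
and do not exist in the patch.

```c
#include <stdint.h>

/* direct: the symbol only marks a location (e.g. placed by the linker
   script); its ADDRESS is the limit.  */
char stack_limit_marker;

/* indirect: the symbol is an ordinary variable; its VALUE is the limit
   and has to be loaded before the compare.  */
uintptr_t stack_limit_var;

/* Limit as seen with -fstack-check=direct: no memory read is needed.  */
uintptr_t
limit_direct (void)
{
  return (uintptr_t) &stack_limit_marker;
}

/* Limit as seen with -fstack-check=indirect: one additional load.  */
uintptr_t
limit_indirect (void)
{
  return stack_limit_var;
}
```

The indirect form costs one extra load per check, but it allows the
limit to change at run time, which is what a thread scheduler needs.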


I have added a proposal to show how this behavior can be integrated
on the ARM architecture.


Is there any interest in having such a feature in GCC?
Is there someone with write permission who is willing to volunteer
for this task?
Is the code still small enough to be acceptable, or is additional
paperwork required first?

The generated code itself will be small,
e.g. when using -fstack-check=indirect -fstack-limit-symbol=stack_limit_var
-push {r0}
-ldr  r0, =stack_limit_var
-ldr  r0, [r0]
-cmp  sp, r0
-bhs  1f
-push {lr}
-bl   __thumb_stack_failure @ stack check
-.align
-.ltorg
-1:
-pop  {r0}
The rest of the implementation overhead is only GCC specific.

Regards
 Thomas Klein

PS
Here are some more implementation hints.
Introduce new parameters direct and indirect in gcc/opts.c and
gcc/flag-types.h.

gcc/explow.c, function allocate_dynamic_stack_space:
 - suppress stack probing if parameter direct or indirect is given, or
   if a stack-limit is given
 - do an additional read of the limit value if parameter indirect and a
   stack-limit symbol are given

gcc/config/arm/arm.c:
 - new function stack_check_output_function to write the stack check
   to the assembler file
 - new function stack_check_work_registers to find possible working
   registers (only used by stack check)
 - integration for ARM and Thumb-2 in function arm_expand_prologue
 - integration for Thumb-1 in function thumb1_output_function_prologue

gcc/config/arm/arm.md:
 - probe_stack: do not emit code when parameters direct or indirect are
   given, emit code as in gcc/explow.c
 - probe_stack_done: dummy to make sure probe_stack insns are not
   optimized away
 - check_stack: if stack-limit and parameter generic are given, use the
   limit the same way as in function allocate_dynamic_stack_space
 - stack_check: ARM/Thumb-2 insn to output function
   stack_check_output_function
 - trap: failure call used in function allocate_dynamic_stack_space


Index: gcc/opts.c
===
--- gcc/opts.c(revision 168762)
+++ gcc/opts.c(working copy)
@@ -1616,6 +1616,12 @@ common_handle_option (struct gcc_options *opts,
: STACK_CHECK_STATIC_BUILTIN
  ? STATIC_BUILTIN_STACK_CHECK
  : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+/* This is another stack checking method.  */
+opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+/* This is another stack checking method.  */
+opts->x_flag_stack_check = DIRECT_STACK_CHECK;
   else
 warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
   break;
Index: gcc/flag-types.h
===
--- gcc/flag-types.h(revision 168762)
+++ gcc/flag-types.h(working copy)
@@ -153,7 +153,11 @@ enum stack_check_type

   /* Check

Request for clarification on how a contribution to gcc can be made

2010-12-13 Thread Thomas Klein

Hello

To me it looks like what is described in the online document
http://gcc.gnu.org/contribute.html is either not correct or is being
misinterpreted, at least by me.
It's not clear to me at which point the FSF trusts an individual
(or organization or company) and why it mistrusts an individual by
default.

Is there a way to suggest code changes?
What kind of paperwork is required for small code changes, and what for
huge code changes?
If a potential change is reviewed and accepted by a maintainer, who has
to commit the change, and when is it committed?
(Assuming the person who is asking for a change usually does not
have svn write permission.)

A clarification on the GCC side would reduce frustration for people like me.

Regards
  Thomas


Add static size report when using -fstack-usage for ARM targets

2010-11-21 Thread Thomas Klein

Hello

With GCC 4.6 a new switch -fstack-usage has been added.
Some target architectures have support for this.
To give ARM targets support for this feature only a few lines of code 
are missing.

Is it possible to add this or something similar?

regards
  Thomas


2010-11-21  Thomas Klein th.r.kl...@web.de

* config/arm/arm.c (arm_expand_prologue): Report the static  ..
* config/arm/arm.c (thumb1_expand_prologue): .. stack size if 
-fstack-usage is used.


Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 167002)
+++ gcc/config/arm/arm.c(working copy)
@@ -15722,6 +15722,13 @@ arm_expand_prologue (void)
}
 }

+  if (flag_stack_usage)
+{
+  HOST_WIDE_INT stack_size = saved_regs;
+  current_function_static_stack_size = stack_size;
+}
+
+
   if (offsets->outgoing_args != offsets->saved_args + saved_regs)
 {
   /* This add can produce multiple insns for a large constant, so we
@@ -15733,6 +15740,12 @@ arm_expand_prologue (void)

   insn = emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
amount));
+  if (flag_stack_usage)
+{
+   HOST_WIDE_INT stack_size = offsets->outgoing_args - (offsets->saved_args + saved_regs);
+   current_function_static_stack_size += stack_size;
+}
+
   do
{
  last = last ? NEXT_INSN (last) : get_insns ();
@@ -20535,6 +20548,10 @@ thumb1_expand_prologue (void)
stack_pointer_rtx);

   amount = offsets->outgoing_args - offsets->saved_regs;
+  if (flag_stack_usage)
+{
+   current_function_static_stack_size = amount;
+}
   amount -= 4 * thumb1_extra_regs_pushed (offsets, true);
   if (amount)
 {



TLS support on ARM

2009-12-03 Thread Thomas Klein

Hello

To me it looks like support for Thread Local Storage exists on ARM
CPUs.
When needed, the compiler fetches the base pointer by an
internal __builtin_thread_pointer() call.
This is either a call to __aeabi_read_tp() or a coprocessor fetch
instruction.


If I implement __aeabi_read_tp() as a standard C function, I
get into trouble since the registers r1 to r3 are not saved beforehand.

This behaviour is commented in file arm.md:
..
;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
(define_insn "load_tp_soft"
..

Does anyone know the reason why they are not clobbered?
Is there a way to save r1-r3 at function entry? (e.g.
__attribute__((save_noreturn_args)) )


The next point is that the __builtin_thread_pointer() call isn't
ARM/Thumb interwork safe.
To use the hard coprocessor fetch instruction the calling function
must run in ARM mode.
To use the soft implementation, the caller and __aeabi_read_tp() must
run in the same mode.


Is the implementation still incomplete?

regards
Thomas



RE: TLS support on ARM

2009-12-03 Thread Thomas Klein

Hello

>> Does anyone know the reason why they are not clobbered?
>
> So that they don't have to be saved.  This function is supposed to be
> very fast.  If you want to use a slow implementation, write an
> assembly wrapper which saves additional registers.

This might be the initial plan.
But is this true?

Without clobbering the registers r1-r3
the compiler generates something like this:
ldr r3,[pc, #48]
bl __aeabi_read_tp
adds r7, r0, r3
..
Additionally, a push and pop of r1-r3 in function __aeabi_read_tp () might
be required.


With clobbering I can see:
bl __aeabi_read_tp
ldr r3,[pc, #48]
adds r7, r0, r3
..
Here the clobbered version is faster.
Maybe there is another reason not to clobber.

>> The next point is that the __builtin_thread_pointer() call isn't
>> ARM/Thumb interwork safe.
>> To use the hard coprocessor fetch instruction the calling function
>> must run in ARM mode.
>
> True (or Thumb-2, I think).
>
>> To use the soft implementation, the caller and __aeabi_read_tp() must
>> run in the same mode.
>
> I don't believe that this is true.  In what way is it not safe?

A bl __aeabi_read_tp call does not switch the mode,
so the program simply crashes.
Using a blx instruction does switch the mode,
but this instruction has only existed since ARMv5, so this won't help for
ARMv4T (aka ARM7TDMI).


Long calls also seem not to be handled here.
(There might be a reason not to handle this.)

That's why I'm asking.
Is the implementation still incomplete?

regards
Thomas



RE: TLS support on ARM

2009-12-03 Thread Thomas Klein

Hello

>> But is this true?
>
> It is true because a typical implementation of this function has no
> need to clobber registers.  For instance, glibc's calls a kernel
> helper this way:

Ah, now I understand: you require a virtual memory system (or
similar) that translates the call into a system call.

Without VM I still can use e.g.
svc #17
bx lr
as an __aeabi_read_tp() implementation.
The real work has to be done inside of the exception handler.

You are right, in this case no clobbering is needed.

> The linker is responsible for converting bl to blx, or for inserting
> mode changing stubs.  It is also responsible for long calls.  Unless
> you're using a really old linker, I can't see why you would have any
> problems.
It has been a while since I had problems with long calls.
I did not realise that ld does that for me, great.

> Do you have a concrete problem?
I had a problem with my __aeabi_read_tp() implementation.
This was solved by clobbering registers r1-r3.

Thank you for this information.
Now that I know that this is not a bug, I can decide what to do:
change the compiler or, better, change my implementation.

regards
Thomas