suoyuanG opened a new pull request, #17815:
URL: https://github.com/apache/nuttx/pull/17815

   ## Summary
   
   Fix #17040
   
   According to https://github.com/ARM-software/abi-aa/blob/main/rtabi32/rtabi32.rst#thread-local-storage-new-in-v2-01, `__aeabi_read_tp` must not modify registers `r1`-`r3`, so it has to be implemented in assembly rather than as a C function.
   
   As #17040 describes, our current implementation modifies `r1`-`r3`, but the compiler generates code on the assumption that the call leaves `r1`-`r3` intact. If your `thread_local` variables misbehave, this is very likely the cause. An example that reproduces the problem is given below.
   
   Secondly, `_stdata` and `_stbss` were changed to `uint8_t`, but the TLS initialization code was not updated to match and still assumes they are `uint32_t`. It needs to be fixed, and it must also account for any padding between the two sections so that TLS is initialized accurately.
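   
   To make the padding point concrete, here is a minimal host-side sketch of byte-based TLS initialization. It is not the code from this PR: `struct tls_template`, `tls_block_size`, `tls_block_init` and `tls_selfcheck` are hypothetical names, and the linker symbols are modelled as plain `uint8_t` pointers so the example can run outside NuttX. With `uint8_t` symbols, pointer differences are already byte counts, so no scaling by `sizeof(uint32_t)` is needed.

   ```c
   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* Hypothetical stand-in for the linker symbols _stdata, _etdata,
    * _stbss and _etbss, modelled as byte pointers.
    */

   struct tls_template
   {
     const uint8_t *stdata;  /* start of .tdata (initialized TLS data) */
     const uint8_t *etdata;  /* end of .tdata */
     const uint8_t *stbss;   /* start of .tbss (zero-initialized TLS data) */
     const uint8_t *etbss;   /* end of .tbss */
   };

   /* Bytes needed per thread, including any padding the linker placed
    * between .tdata and .tbss.
    */

   static size_t tls_block_size(const struct tls_template *t)
   {
     return (size_t)(t->etbss - t->stdata);
   }

   static void tls_block_init(uint8_t *block, const struct tls_template *t)
   {
     size_t tdata_size = (size_t)(t->etdata - t->stdata);
     size_t total_size = tls_block_size(t);

     /* Copy the initialized image, then zero everything after it. Zeroing
      * up to total_size covers both the padding and .tbss, so every .tbss
      * variable keeps the offset from the block start that the linker
      * assigned to it.
      */

     memcpy(block, t->stdata, tdata_size);
     memset(block + tdata_size, 0, total_size - tdata_size);
   }

   /* Self-check on a fake image: 5 bytes of .tdata, 3 bytes of padding,
    * 8 bytes of .tbss. Returns the computed block size.
    */

   static size_t tls_selfcheck(void)
   {
     static const uint8_t image[16] = { 1, 2, 3, 4, 5 };
     struct tls_template t = { image, image + 5, image + 8, image + 16 };
     uint8_t block[16];

     memset(block, 0xff, sizeof(block));  /* poison to catch missed bytes */
     tls_block_init(block, &t);

     assert(block[0] == 1 && block[4] == 5);   /* .tdata copied as is */
     assert(block[5] == 0 && block[7] == 0);   /* padding zeroed */
     assert(block[8] == 0 && block[15] == 0);  /* .tbss zeroed in place */
     return tls_block_size(&t);
   }
   ```

   Sizing the block from `_stdata` to `_etbss`, rather than adding the two section sizes, is one way to keep the padding inside the block; other layouts are possible.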
   
   ## Testing
   
   In qemu-armv7a:nsh:
   
   First, apply the following patch to the linker script:
   
   ```patch
   diff --git a/boards/arm/qemu/qemu-armv7a/scripts/dramboot.ld 
b/boards/arm/qemu/qemu-armv7a/scripts/dramboot.ld
   index 19e481ecd6..de0104fad9 100644
   --- a/boards/arm/qemu/qemu-armv7a/scripts/dramboot.ld
   +++ b/boards/arm/qemu/qemu-armv7a/scripts/dramboot.ld
   @@ -93,6 +93,18 @@ SECTIONS
          _erodata = .;
      } > ROM
    
   +  .tdata : {
   +      _stdata = ABSOLUTE(.);
   +      *(.tdata .tdata.* .gnu.linkonce.td.*);
   +      _etdata = ABSOLUTE(.);
   +  } > ROM
   +
   +  .tbss : {
   +      _stbss = ABSOLUTE(.);
   +      *(.tbss .tbss.* .gnu.linkonce.tb.* .tcommon);
   +      _etbss = ABSOLUTE(.);
   +  } > ROM
   +
      _eronly = LOADADDR(.data);
      .data : {                    /* Data */
          _sdata = .;
   ```
   
   Then, we can use the following code:
   
   ```c
   #include <nuttx/config.h>
   #include <stdio.h>
   #include <pthread.h>
   #include <unistd.h>
   #include <threads.h>
   
   thread_local int hello_counter = 0;
   
   static void *thread_function(void *arg)
   {
     int thread_id = *(int *)arg;
   
     for (int i = 0; i < 5; i++)
       {
         hello_counter++;
         printf("Thread %d: hello_counter = %d, where is located at: %p\n",
                thread_id, hello_counter, (void *)&hello_counter);
         usleep(100000);
       }
   
     printf("Thread %d final counter: %d\n", thread_id, hello_counter);
     return NULL;
   }
   
   int main(int argc, FAR char *argv[])
   {
     pthread_t thread1, thread2, thread3;
     int id1 = 1, id2 = 2, id3 = 3;
   
     pthread_create(&thread1, NULL, thread_function, &id1);
     pthread_create(&thread2, NULL, thread_function, &id2);
     pthread_create(&thread3, NULL, thread_function, &id3);
   
     pthread_join(thread1, NULL);
     pthread_join(thread2, NULL);
     pthread_join(thread3, NULL);
   
     // printf("hello_counter in main thread is located at: %p\n",
     //        (void *)&hello_counter);
     printf("Main thread: hello_counter = %d (unmodified)\n", hello_counter);
     printf("Thread-local demo completed!\n");
   
     return 0;
   }
   ```
   
   Finally, enable `CONFIG_SCHED_THREAD_LOCAL`. Your GCC must also have been configured with `--enable-tls`. Disassembling `hello_main` in the resulting build shows the problematic sequence:
   
   ```bash
   $ arm-none-eabi-objdump --disassemble=hello_main build/nuttx
   ...
      226b4:    eb0028ec        bl      2ca6c <pthread_join>
      226b8:    e59f3024        ldr     r3, [pc, #36]   @ 226e4 <hello_main+0xa4>
      226bc:    ebffa8fa        bl      caac <__aeabi_read_tp>
      226c0:    e7931000        ldr     r1, [r3, r0]
      226c4:    e59f001c        ldr     r0, [pc, #28]   @ 226e8 <hello_main+0xa8>
   ...
   ```
   
   Running this build faults on a load from an inaccessible address: `r3` holds the variable's offset from the thread pointer, but the current `__aeabi_read_tp` overwrites `r3`, so the following `ldr r1, [r3, r0]` uses a wrong address.
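   
   The failure can be mimicked with a toy register model in C. This is only an illustration, not real ARM code and not the implementation in this PR: `struct regs`, `FAKE_TP`, the two `read_tp_*` helpers and `tls_load_address` are hypothetical names, and the clobber value is arbitrary.

   ```c
   #include <assert.h>
   #include <stdint.h>

   /* Toy model of registers r0-r3 */

   struct regs
   {
     uint32_t r[4];
   };

   #define FAKE_TP 0x40001000u  /* hypothetical thread pointer value */

   /* Models a C implementation: under the normal calling convention its
    * compiled body is free to use r1-r3 as scratch registers.
    */

   static void read_tp_clobbering(struct regs *cpu)
   {
     cpu->r[3] = 0xdeadbeefu;  /* scratch use overwrites the caller's r3 */
     cpu->r[0] = FAKE_TP;
   }

   /* Models an assembly implementation that writes only r0 */

   static void read_tp_preserving(struct regs *cpu)
   {
     cpu->r[0] = FAKE_TP;
   }

   /* Mirrors the disassembly: load the offset into r3, call the helper,
    * then form the address used by "ldr r1, [r3, r0]".
    */

   static uint32_t tls_load_address(void (*read_tp)(struct regs *),
                                    uint32_t offset)
   {
     struct regs cpu = { { 0, 0, 0, 0 } };

     cpu.r[3] = offset;
     read_tp(&cpu);
     return cpu.r[3] + cpu.r[0];
   }

   /* Returns 1 when only the preserving helper yields tp + offset */

   static int model_selfcheck(void)
   {
     uint32_t good = tls_load_address(read_tp_preserving, 8);
     uint32_t bad = tls_load_address(read_tp_clobbering, 8);

     assert(good == FAKE_TP + 8);
     return bad != good;
   }
   ```

   In the model, only the helper that leaves `r3` alone produces the address `tp + offset`, which matches the assumption the compiler makes at the call site.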
   
   I have run ostest in qemu-armv7a:nsh and there are no errors.
   
   > You might be unable to reproduce this issue, and `__aeabi_read_tp` may not even appear in the nuttx binary. This is because the compiler may not, by default, generate code for Arm-A that calls `__aeabi_read_tp` to obtain the thread pointer. Add the `-mtp=soft` compiler option to make it use `__aeabi_read_tp` for that purpose. I plan to submit a separate patch to fix this.
   >
   > 
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-mtp
   
   

