On Wed, Mar 24, 2021 at 09:41:03AM +0100, Willy Tarreau wrote:
> This is particularly strange. Could you please disassemble hlua_alloc ?
> (dis hlua_alloc) ?
> 
> You should find something like this:
> 
>    0x00000000004476c3 <+147>:   add    DWORD PTR fs:0xfffffffffffdd678,0x1
>    0x00000000004476cc <+156>:   lock add QWORD PTR [rip+0x25df8b],0x1        
> # 0x6a5660 <_.44806>
>    0x00000000004476d5 <+165>:   lock add QWORD PTR [rip+0x25df8b],rbx        
> # 0x6a5668 <_.44806+8>
>    0x00000000004476dd <+173>:   mov    rdi,rbx
>    0x00000000004476e0 <+176>:   call   0x4248b0 <malloc@plt>
>    0x00000000004476e5 <+181>:   sub    DWORD PTR fs:0xfffffffffffdd678,0x1
>    0x00000000004476ee <+190>:   test   rax,rax
>    0x00000000004476f1 <+193>:   jne    0x44769d <hlua_alloc+109>
> 
> The "add DWORD..." is the ++, and the "sub DWORD..." is the --, both of
> them around the function call.
> 
> If the variable had been declared static I would have agreed with the
> possibility of reordering. But here it's global and it's not legal to
> move it when you don't know if the called function will make use of it.
> Otherwise global variables cannot be used anymore!
> 
> By the way, in your case above it should even equal 2. That's rather
> strange.
> 
> One thing you can test is simply to declare it volatile. If it changes
> anything, it will mean that gcc has some specific assumptions on this
> malloc function (i.e. it decides it will never read your global variables) :-/

And this seems to be the case :-(

I was wondering where the two lock add instructions were coming from
and figure that I was building with DEBUG_MEM_STATS. Removing it
resulted in the add/sub being remove, but only around malloc, not
around realloc nor free.

Malloc is declared with __attribute__((malloc)) on my system so I
wondered if it could cause this. I tried adding this in the function:

        extern void *malloc (size_t __size);

And it didn't change the output code. I just *renamed* the function
to blablalloc() and now the code is correct again:

    22cb:       64 83 04 25 00 00 00    addl   $0x1,%fs:0x0
    22d2:       00 01 
                        22cf: R_X86_64_TPOFF32  hlua_not_dumpable
    22d4:       48 89 df                mov    %rbx,%rdi
    22d7:       e8 00 00 00 00          callq  22dc <hlua_alloc+0x8c>
                        22d8: R_X86_64_PLT32    blablalloc-0x4
    22dc:       64 83 2c 25 00 00 00    subl   $0x1,%fs:0x0
    22e3:       00 01 
                        22e0: R_X86_64_TPOFF32  hlua_not_dumpable
    22e5:       48 85 c0                test   %rax,%rax
    22e8:       75 be                   jne    22a8 <hlua_alloc+0x58>

So yes, it looks like gcc decides that a function called "malloc" will
never use your program's global variables but that "blablalloc" may. I
have no explanation to this except "optimization craziness" resulting
in breaking valid code, since it means that if I provide my own malloc
function it might not work. Pfff....

At least the volatile solves it and is cleaner since it indicates it
should assume it doesn't know it (e.g. for use from a signal handler
which is what we're doing by the way):

diff --git a/include/haproxy/hlua.h b/include/haproxy/hlua.h
index 29656258e..46fbba6ec 100644
--- a/include/haproxy/hlua.h
+++ b/include/haproxy/hlua.h
@@ -50,7 +50,7 @@ void hlua_applet_tcp_fct(struct appctx *ctx);
 void hlua_applet_http_fct(struct appctx *ctx);
 struct task *hlua_process_task(struct task *task, void *context, unsigned 
short state);
 
-extern THREAD_LOCAL unsigned int hlua_not_dumpable;
+extern THREAD_LOCAL volatile unsigned int hlua_not_dumpable;
 #else /* USE_LUA */
 
 /************************ For use when Lua is disabled ********************/
diff --git a/src/hlua.c b/src/hlua.c
index 5bef67a48..6ccce72dd 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -237,7 +237,7 @@ struct hlua_mem_allocator {
 static struct hlua_mem_allocator hlua_global_allocator;
 
  /* > 0 if lua is in a non-rentrant part, thus with a non-dumpable stack */
-THREAD_LOCAL unsigned int hlua_not_dumpable = 0;
+THREAD_LOCAL unsigned int volatile hlua_not_dumpable = 0;
 
 /* These functions converts types between HAProxy internal args or
  * sample and LUA types. Another function permits to check if the

Willy

Reply via email to