Re: druntime thread (from foreach parallel?) cleanup bug
On Tuesday, 1 November 2022 at 19:49:47 UTC, mw wrote:
>> On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
>> [...]
>
> Maybe the hunt library author doesn't know. (My code does not directly use this library; it got pulled in by some other dependencies.)
>
> [...]

Please, if you see anything in the docs that needs to be updated, make a PR right away <3

Documentation saves lives! The times I have thought "I'll do it later" have been too many.
Re: druntime thread (from foreach parallel?) cleanup bug
On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
> Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

As discussed on discord, this isn't actually true. All it does is prevent the thread from being joined before exiting the runtime.

What is *likely* happening is, the runtime shuts down. That thread is still running, but the D runtime is gone. So it eventually starts trying to do something (like let's say, access thread local storage), and it's gone. Hence the segfault.

-Steve
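One way to sidestep the situation described above is to leave the thread non-daemon and stop it cooperatively, so that it has already exited by the time the runtime is torn down. The following is only a minimal sketch under that assumption, not the hunt library's code; `stopRequested`, `workerLoop`, `startWorker`, and `stopWorker` are hypothetical names chosen for illustration.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import core.time : msecs;

// Hypothetical stop flag shared between the main thread and the worker.
shared bool stopRequested;

// Background loop: polls the flag instead of running forever.
void workerLoop()
{
    while (!atomicLoad(stopRequested))
        Thread.sleep(10.msecs);
}

// Starts the worker. isDaemon is left at its default (false).
Thread startWorker()
{
    atomicStore(stopRequested, false);
    auto t = new Thread(&workerLoop);
    t.start();
    return t;
}

// Asks the worker to stop and waits until it has really finished.
void stopWorker(Thread t)
{
    atomicStore(stopRequested, true);
    t.join();
}

void main()
{
    auto t = startWorker();
    Thread.sleep(50.msecs); // stand-in for the program's real work
    stopWorker(t);          // the worker is gone before main returns
    assert(!t.isRunning);
}
```

The explicit stop matters here: a non-daemon thread that never returns would keep the process from exiting, because the runtime waits for normal threads to complete.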
Re: druntime thread (from foreach parallel?) cleanup bug
On Tuesday, 1 November 2022 at 18:18:45 UTC, Steven Schveighoffer wrote:
>> And I just noticed, one of the thread traces points to here:
>>
>> https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430
>>
>> ```
>> class DateTime {
>>   shared static this() {
>>     ...
>>     dateThread.isDaemon = true;  // not sure if this is related
>>   }
>> }
>> ```
>>
>> In the comments, it said: "BUG: ... crashed". Looks like someone ran into this (cleanup) issue already, but was unable to fix it.
>>
>> Anyway I logged an issue there:
>>
>> https://github.com/huntlabs/hunt/issues/96
>
> Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

Maybe the hunt library author doesn't know. (My code does not directly use this library; it got pulled in by some other dependencies.)

Currently, the `isDaemon` doc does not mention this:

https://dlang.org/library/core/thread/threadbase/thread_base.is_daemon.html

"Sets the daemon status for this thread. While the runtime will wait for all normal threads to complete before tearing down the process, daemon threads are effectively ignored and thus will not prevent the process from terminating. In effect, daemon threads will be terminated automatically by the OS when the process exits."

Maybe we should add to the doc?

BTW, what exactly is going wrong with their code?

I saw that the tick() call inside the anonymous `dateThread` accesses these two stack variables of shared static this():

https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L409

    Appender!(char[])[2] bufs;
    const(char)[][2] targets;

Why does this tick() call work after the static this() has finished in a normal run? Why does the problem only show up when the program finishes?
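On the question of why locals remain usable after the enclosing function has returned: assuming the thread body is a delegate literal that captures those locals, D moves captured variables into a heap-allocated closure context when the delegate escapes, so they are no longer tied to the stack frame. The sketch below shows the same mechanism in isolation; `makeCounter` is a hypothetical helper and is not taken from hunt.

```d
import std.stdio : writeln;

// Hypothetical helper: returns a delegate that keeps using a local
// variable of the function that created it.
int delegate() makeCounter()
{
    int count; // looks like a stack variable, default-initialized to 0
    // Because the delegate below escapes the function, the compiler
    // places `count` in a heap-allocated closure context instead.
    return () { return ++count; };
}

void main()
{
    auto next = makeCounter(); // makeCounter has already returned here
    writeln(next()); // 1
    writeln(next()); // 2
}
```

Each call to `makeCounter` gets its own context, which is why the captured state stays valid for as long as the delegate is reachable.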
Re: druntime thread (from foreach parallel?) cleanup bug
On 11/1/22 1:47 PM, mw wrote:
>> Can you show a code snippet that includes the parallel foreach?
>
> (It's just a very straightforward foreach on an array; as I said it may not be relevant.)
>
> And I just noticed, one of the thread traces points to here:
>
> https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430
>
> ```
> class DateTime {
>   shared static this() {
>     ...
>     dateThread.isDaemon = true;  // not sure if this is related
>   }
> }
> ```
>
> In the comments, it said: "BUG: ... crashed". Looks like someone ran into this (cleanup) issue already, but was unable to fix it.
>
> Anyway I logged an issue there:
>
> https://github.com/huntlabs/hunt/issues/96

Oh yeah, isDaemon detaches the thread from the GC. Don't do that unless you know what you are doing.

-Steve
Re: druntime thread (from foreach parallel?) cleanup bug
> Can you show a code snippet that includes the parallel foreach?

(It's just a very straightforward foreach on an array; as I said it may not be relevant.)

And I just noticed, one of the thread traces points to here:

https://github.com/huntlabs/hunt/blob/master/source/hunt/util/DateTime.d#L430

```
class DateTime {
  shared static this() {
    ...
    dateThread.isDaemon = true;  // not sure if this is related
  }
}
```

In the comments, it said: "BUG: ... crashed". Looks like someone ran into this (cleanup) issue already, but was unable to fix it.

Anyway I logged an issue there:

https://github.com/huntlabs/hunt/issues/96
Re: druntime thread (from foreach parallel?) cleanup bug
On Tue, Nov 01, 2022 at 10:37:57AM -0700, Ali Çehreli via Digitalmars-d-learn wrote:
> On 11/1/22 10:27, H. S. Teoh wrote:
>
> > Maybe try running Digger to reduce the code for you?
>
> Did you mean dustmite, which is accessible as 'dub dustmite
> ' but I haven't used it.

Oh yes, sorry, I meant dustmite, not digger. :-P

> My guess for the segmentation fault is that the OP is executing
> destructor code that assumes some members are alive. If so, the code
> should be moved from destructors to functions to be called like
> obj.close(). But it's just a guess...
[...]

Yes, that's a common gotcha.

T

--
We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Re: druntime thread (from foreach parallel?) cleanup bug
On 11/1/22 10:27, H. S. Teoh wrote:

> Maybe try running Digger to reduce the code for you?

Did you mean dustmite, which is accessible as 'dub dustmite ' but I haven't used it.

My guess for the segmentation fault is that the OP is executing destructor code that assumes some members are alive. If so, the code should be moved from destructors to functions to be called like obj.close(). But it's just a guess...

Ali
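The suggestion to move cleanup out of destructors and into an explicit method can be sketched as follows. This is an illustration only, with a hypothetical `Connection` class; the background is that when the GC finalizes a class object, references it holds to other GC-managed memory may no longer be valid, so a destructor should not rely on them.

```d
import std.stdio : writeln;

// Hypothetical resource wrapper, for illustration only.
final class Connection
{
    private int[] buffer; // GC-managed member
    private bool open;

    this()
    {
        buffer = new int[](16);
        open = true;
    }

    // Explicit, deterministic cleanup: called by the owner while
    // every member is still known to be alive.
    void close()
    {
        if (!open)
            return;    // calling close() twice is harmless
        buffer[] = 0;  // safe here; not safe to assume in a GC-run destructor
        open = false;
    }

    bool isOpen() const { return open; }

    // Deliberately no ~this() that touches `buffer`.
}

void main()
{
    auto c = new Connection;
    scope (exit) c.close(); // runs when main's scope ends, before runtime shutdown
    writeln(c.isOpen);      // true
}
```

Using `scope (exit)` keeps the call next to the allocation, which makes it harder to forget than a `close()` at the end of a long function.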
Re: druntime thread (from foreach parallel?) cleanup bug
On Tue, Nov 01, 2022 at 05:19:56PM +, mw via Digitalmars-d-learn wrote:
> My program received signal SIGSEGV, Segmentation fault.
>
> Its simplified structure looks like this:
>
> ```
> void foo() {
>   ...
>   writeln("done");  // saw this got printed!
> }
>
> int main() {
>   foo();
>   return 0;
> }
>
> ```

Can you show a code snippet that includes the parallel foreach? Because the above code snippet is over-simplified to the point it's impossible to tell what the original problem might be, since obviously calling a function that calls writeln would not crash the program.

Maybe try running Digger to reduce the code for you?

T

--
Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
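For readers unfamiliar with the construct being asked about, a parallel foreach over an array generally has the shape below. This is a hypothetical stand-in, not the OP's code, which was never posted; `squares` is an invented function, and the only library call used is `parallel` from `std.parallelism`.

```d
import std.parallelism : parallel;

// Hypothetical example: squares every element using the default task pool.
int[] squares(int[] input)
{
    auto result = new int[](input.length);
    // Iterations may run on different worker threads; each one writes
    // only to its own slot, so no synchronization is needed.
    foreach (i, x; parallel(input))
        result[i] = x * x;
    return result;
}

void main()
{
    import std.stdio : writeln;
    writeln(squares([1, 2, 3, 4])); // [1, 4, 9, 16]
}
```

A reduced reproduction along these lines, with the real loop body, is the kind of snippet that would make the original report easier to diagnose.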
druntime thread (from foreach parallel?) cleanup bug
My program received signal SIGSEGV, Segmentation fault.

Its simplified structure looks like this:

```
void foo() {
  ...
  writeln("done");  // saw this got printed!
}

int main() {
  foo();
  return 0;
}
```

So, just before the program exit, it failed. I suspect druntime has a thread (maybe due to foreach parallel) cleanup bug somewhere, which is unrelated to my own code. This kind of bug is hard to re-produce, not sure if I should file an issue.

I'm using: LDC - the LLVM D compiler (1.30.0) on x86_64.

Under gdb, here is the threads info (for the record):

Thread 11 "xxx" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x153df700 (LWP 36258)]
__GI___res_iclose (free_addr=true, statp=0x153dfdb8) at res-close.c:103
103     res-close.c: No such file or directory.

(gdb) info threads
  Id   Target Id                               Frame
  1    Thread 0x15515000 (LWP 36244) "lt"      0x10af1d2d in __GI___pthread_timedjoin_ex (threadid=23456246527744, thread_return=0x0, abstime=0x0, block=) at pthread_join_common.c:89
* 11   Thread 0x153df700 (LWP 36258) "lt"      __GI___res_iclose (free_addr=true, statp=0x153dfdb8) at res-close.c:103
  17   Thread 0x155544817700 (LWP 36264) "lt"  0x10afac70 in __GI___nanosleep (requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28

(gdb) thread 1
[Switching to thread 1 (Thread 0x15515000 (LWP 36244))]
#0  0x10af1d2d in __GI___pthread_timedjoin_ex (threadid=23456246527744, thread_return=0x0, abstime=0x0, block=) at pthread_join_common.c:89
89      pthread_join_common.c: No such file or directory.

(gdb) where
#0  0x10af1d2d in __GI___pthread_timedjoin_ex (threadid=23456246527744, thread_return=0x0, abstime=0x0, block=) at pthread_join_common.c:89
#1  0x55fb94f8 in core.thread.osthread.joinLowLevelThread(ulong) ()
#2  0x55fd7210 in _D4core8internal2gc4impl12conservativeQw3Gcx15stopScanThreadsMFNbZv ()
#3  0x55fd3ae7 in _D4core8internal2gc4impl12conservativeQw3Gcx4DtorMFZv ()
#4  0x55fd3962 in _D4core8internal2gc4impl12conservativeQw14ConservativeGC6__dtorMFZv ()
#5  0x55fc2ce7 in rt_finalize2 ()
#6  0x55fc0056 in rt_term ()
#7  0x55fc0471 in _D2rt6dmain212_d_run_main2UAAamPUQgZiZ6runAllMFZv ()
#8  0x55fc0356 in _d_run_main2 ()
#9  0x55fc01ae in _d_run_main ()
#10 0x55840c02 in main (argc=2, argv=0x7fffe188) at //home/zhou/project/ldc2-1.30.0-linux-x86_64/bin/../import/core/internal/entrypoint.d:42
#11 0x10163b97 in __libc_start_main (main=0x55840be0 , argc=2, argv=0x7fffe188, init=, fini=, rtld_fini=, stack_end=0x7fffe178) at ../csu/libc-start.c:310
#12 0x556dccca in _start ()

(gdb) thread 11
[Switching to thread 11 (Thread 0x153df700 (LWP 36258))]
#0  __GI___res_iclose (free_addr=true, statp=0x153dfdb8) at res-close.c:103
103     res-close.c: No such file or directory.

(gdb) where
#0  __GI___res_iclose (free_addr=true, statp=0x153dfdb8) at res-close.c:103
#1  res_thread_freeres () at res-close.c:138
#2  0x102de8c2 in __libc_thread_freeres () at thread-freeres.c:29
#3  0x10af0700 in start_thread (arg=0x153df700) at pthread_create.c:476
#4  0x10263a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) thread 17
[Switching to thread 17 (Thread 0x155544817700 (LWP 36264))]
#0  0x10afac70 in __GI___nanosleep (requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28      ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.

(gdb) where
#0  0x10afac70 in __GI___nanosleep (requested_time=0x155544810e90, remaining=0x155544810ea8) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1  0x55fb8c3b in _D4core6thread8osthread6Thread5sleepFNbNiSQBo4time8DurationZv ()
#2  0x55d9a0c2 in _D4hunt4util8DateTimeQj25_sharedStaticCtor_L406_C5FZ9__lambda4MFZv () at home/zhou/.dub/packages/hunt-1.7.16/hunt/source/hunt/util/DateTime.d:430
#3  0x55fb89f4 in thread_entryPoint ()
#4  0x10af06db in start_thread (arg=0x155544817700) at pthread_create.c:463
#5  0x10263a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95