tl;dr I know enough to sense there is important stuff that I don't know.

Even though I sometimes act[1] like someone who knows stuff, many areas are fuzzy for me, especially in the runtime.

Things work great when D code is inside a D program. The runtime and module states are magically initialized and everything works. Things are less clear when writing a D library, especially one that may be used by other language runtimes, necessarily on foreign threads.

Here are the essential points that I do and don't understand.

- Initialize the runtime: This is automatically done for a D program as described on the wiki[2]. This must be done by calling rt_init[3] for a D shared library. I handle this by calling rt_init from a pragma(crt_constructor) function[4]. Luckily, this is easy and works for all cases that I have.
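A minimal sketch of the rt_init arrangement described above, assuming the code is compiled into a D shared library (not an executable). The function name initDRuntimeOnLoad is made up for illustration. How this call is ordered relative to the runtime's own load-time registration of the library is an assumption here, not something verified.

```d
import core.runtime : rt_init;

// Hypothetical name. Runs when the C runtime loads this library,
// before any other D code in it is called.
pragma(crt_constructor)
extern (C) void initDRuntimeOnLoad()
{
    // rt_init keeps an internal count, so an extra call from a host
    // that has already initialized the runtime only increments it.
    rt_init();
}
```

A matching rt_term call at unload time is left out; whether it is needed depends on how the host process shuts down.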

- Execute module constructors ("ctor" for short, i.e. 'shared static this' blocks). This is done automatically for a D program and when the D library is loaded by other language code like C++ and Python. However, I've encountered a case[5] where module ctors were not being called. This could be due to runtime bugs or something that I don't understand with loading shared libraries. (My workaround is very involved: I grep the output of 'nm' to determine the symbol for the module ctor, call it after dlsym'ing, and because 'nm | grep' is a slow process, I cache this information in a file along with the ~2K libraries that I may load conditionally.)

- Loading D libraries from D code: I call loadLibrary[6] to load a D library so that "[its] D runtime [...] will be integrated with the current runtime". Sounds promising; assuming that rt_init is already called for the calling library, I assume loadLibrary will handle everything, and all code will use a single runtime and things will work fine. This works flawlessly for my D and C++ programs that load my D library that loads the other D libraries.

- Attaching foreign threads: The D runtime needs to know about all threads that are running D code so that it knows which threads make up "the world" when it has to "stop the world" to perform garbage collection. The function to do this is thread_attachThis[7].

One question I have is, does rt_init already do thread_attachThis? I ask because I have a library that is loaded by Python and things work even *without* calling thread_attachThis.

- Execute thread local storage (TLS) ctors: Again, this happens automatically for most cases. However, thread_attachThis says "[if] full functionality as a D thread is desired, [rt_moduleTlsCtor] must be called after thread_attachThis". Ok. When would I not want "full functionality" anyway?

Another question: Are TLS ctors executed when I do loadLibrary?

And when they are executed, which modules are involved? The module that is calling rt_moduleTlsCtor or all modules? What are "all modules"?
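For reference, here is a small sketch of the two kinds of module constructors in a plain D program, where the runtime runs both automatically. It only shows the baseline behavior ('shared static this' once per process, 'static this' once per D thread) and says nothing about foreign threads or loadLibrary. All names are made up for illustration.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

shared int sharedCtorRuns; // one counter for the whole process
shared int tlsCtorRuns;    // counts every per-thread ctor run
int perThreadValue;        // thread-local: each thread has its own copy

// Module ctor: runs once per process.
shared static this()
{
    atomicOp!"+="(sharedCtorRuns, 1);
}

// TLS ctor: runs once for every thread started by the D runtime.
static this()
{
    perThreadValue = 42;
    atomicOp!"+="(tlsCtorRuns, 1);
}

// Starts a D thread and reports the value its TLS ctor set up.
int valueSeenByNewThread()
{
    int seen;
    auto t = new Thread({ seen = perThreadValue; });
    t.start();
    t.join();
    return seen;
}
```

A thread created by some other runtime gets none of this for free, which is where rt_moduleTlsCtor comes in.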

- Detaching foreign threads: Probably even more important than thread_attachThis is thread_detachThis[8]. As its documentation says, one should call rt_moduleTlsDtor as well for "full functionality".

This is very important because when a GC collection kicks in, it will stop all threads that make up its world. If one of those threads has already terminated, we will crash. (Related: I have an abandoned PR[9] that tried to fix issues with thread_detachThis. It stalled due to failing unit tests for the 32-bit Apple operating system, which D has since stopped supporting. I also stopped working on that issue mostly because the company I used to work for stopped using D and rewrote their library in C++.)

I have questions regarding thread_attachThis and thread_detachThis: When should they be called? Should the library expose a function that the users must call from *each thread* that they will be using? This may not be easy because a user may not know what thread they are running on. For example, the user of our library may be on a framework where threads may come and go, where the user may not have an opportunity to call thread_detachThis when a thread goes away. For example, the user may provide callback functions (which call us) to a framework that is running on a thread pool.

For that reason, my approach has been to call thread_attachThis upon entering an API function and thread_detachThis upon leaving it, because I may not know whether this thread will survive or die soon. (thread_detachThis is so important because the next GC cycle will try to stop this thread and may crash if it is gone.)
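A sketch of that attach-on-entry, detach-on-exit pattern, under assumptions: mylib_double is a hypothetical API function, and rt_moduleTlsCtor and rt_moduleTlsDtor are declared by hand as extern(C) symbols that the runtime is expected to provide. Whether this is the right policy is exactly the open question here.

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Declared locally; assumed to be provided by druntime.
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

// Hypothetical exported API function.
extern (C) int mylib_double(int x)
{
    // Thread.getThis() is null when the runtime does not know this thread.
    immutable bool attachedHere = Thread.getThis() is null;
    if (attachedHere)
    {
        thread_attachThis();
        rt_moduleTlsCtor(); // re-runs TLS ctors on every such call
    }
    scope (exit)
    {
        // Undo only what this call did, in reverse order.
        if (attachedHere)
        {
            rt_moduleTlsDtor();
            thread_detachThis();
        }
    }

    return x * 2; // stand-in for the real work
}
```

Checking Thread.getThis() first means a thread that is already attached (for example, the one that called rt_init) is left alone.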

More questions: Can I thread_detachThis the thread that called rt_init? Can I call rt_moduleTlsCtor more than once? I guess it depends on each module. It would be troubling if a TLS ctor reinitialized a module's state. :/

While trying to sort all of this out, I am facing a bug[10], which will force me to move away from std.parallelism and perhaps use std.concurrency. Even though that bug is reported for OS X, I think both that case and my "called from Python" case are related to undefined behavior in the runtime's thread management, which is exposed by std.parallelism. (?)

As you can see, even though I can list many references to act like I know stuff, I really don't and have many questions. :) The trouble is, when there are so many dimensions to test to be sure, it is extremely difficult to learn when a seg-fault bug is intermixed with all this, which hits sporadically. :(

I want to learn.

Thank you,
Ali

[1] https://www.youtube.com/watch?v=FNL-CPX4EuM

[2] https://wiki.dlang.org/Runtime_internals

[3] https://dlang.org/library/core/runtime/rt_init.html

[4] https://dlang.org/spec/pragma.html#crtctor

[5] https://forum.dlang.org/thread/rucm30$1lgk$1...@digitalmars.com

[6] https://dlang.org/library/core/runtime/runtime.load_library.html

[7] https://dlang.org/library/core/thread/osthread/thread_attach_this.html

[8] https://dlang.org/library/core/thread/threadbase/thread_detach_this.html

[9] https://github.com/dlang/druntime/pull/1989

[10] https://issues.dlang.org/show_bug.cgi?id=11736
