On Friday, 19 September 2025 at 17:37:36 UTC, Sönke Ludwig wrote:
On 19.09.25 at 18:29, Dmitry Olshansky wrote:
Shouldn't it still be possible to set an "interrupted" flag
somewhere and let only the vibe-core-lite APIs throw? Low-level
C functions should of course stay unaffected.
Since vibe-core-light depends on syscalls, this would mean
creating a separate set of APIs for vibe-core-light, which is
not something I'd like to do.
It's more of a timeout pattern that I've seen multiple times;
there are certainly multiple (better) alternatives, but if
compatibility with existing code is the goal, then this would
still be important.
I guess, again, most likely I'd need to create an API specifically
for vibe. Also, that would mean interrupt becomes part of
photon but only works when certain APIs are used. This is bad.
So you don't support timeouts when waiting for an event at all?
Otherwise I don't see why a separate API would be required;
this should be implementable with plain POSIX APIs within
vibe-core-lite itself.
Photon's API is the syscall interface. So to wait on an event,
you just call poll.
Behind the scenes it will simply wait on the right fd to change
state.
Now vibe-core-light wants something like read(buffer, timeout),
which is not syscall API but could be added. But since I'm going to
add new API, I'd rather have something consistent and sane, not
just a bunch of ad-hoc functions to satisfy the vibe.d interface.
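For what it's worth, such a read-with-timeout can be composed from the plain POSIX calls photon already intercepts; a minimal sketch (the function name and the timed-out sentinel are made up for illustration):

```
import core.sys.posix.poll : pollfd, poll, POLLIN;
import core.sys.posix.unistd : read;

// Wait for readability with a timeout, then read. Under photon the
// poll() call suspends only the current fiber, not the whole thread.
ptrdiff_t readWithTimeout(int fd, void[] buffer, int timeoutMs)
{
    auto pfd = pollfd(fd, POLLIN, 0);
    auto rc = poll(&pfd, 1, timeoutMs);
    if (rc == 0) return -2; // timed out (hypothetical sentinel)
    if (rc < 0) return -1;  // poll error, errno is set
    return read(fd, buffer.ptr, buffer.length);
}
```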
I, on the other hand, imagine that it's not. In the year 2025, not
utilizing all available cores is shameful. The fact that
I had to dig around to find out how vibe.d is supposed to run on
multiple cores is telling.
Telling in what way?
That running single threaded is the intended model.
Obviously this is wrong, though.
All the examples, plus your last statement on process per core
being better,
make me conclude that. I don't see how I'm wrong here.
It's really quite simple: you can use plain D threads as
normal, or you can use task pools, either explicitly or
through the default worker task pool using `runWorkerTask` or
`runWorkerTaskDist`. (Then there are also higher-level
concepts, such as async, performInWorker or
parallel(Unordered)Map.)
This does little for the most important case - handling
requests in parallel. Yeah, there are pools and such for cases
where going parallel inside a single request makes sense.
```
import vibe.core.core : runWorkerTaskDist;
import vibe.http.server;

runWorkerTaskDist({
    // one listener per worker thread; reusePort lets the kernel
    // distribute incoming connections across all of them
    auto settings = new HTTPServerSettings;
    settings.options |= HTTPServerOption.reusePort;
    listenHTTP(settings, (req, res) { res.writeBody("Hello"); });
});
```
Yet this is not the default, and the default is basically
single-threaded.
We obviously have different opinions on what the default
should be.
Not everything is CPU-bound, and using threads "just because"
doesn't make sense either. This is especially true because
of low-level race conditions that require special care. D's
shared/immutable helps with that, but it also means that
your whole application suddenly needs to use shared/immutable
when passing data between tasks.
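As a concrete illustration of that enforcement, a minimal sketch against vibe-core's runWorkerTask (exact attribute requirements vary between vibe-core versions):

```
import vibe.core.core : runWorkerTask;

void publish(string[] lines)
{
    // runWorkerTask((string[] l) { ... }, lines); // rejected at compile
    // time: mutable references are not isolated and can't cross threads

    runWorkerTask((immutable(string)[] l) nothrow {
        // fine: immutable data may be freely read from any thread
    }, lines.idup);
}
```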
I'm dying to know which application that is not CPU-bound still
needs to pass data between tasks that are all running on a
single thread.
Anything client-side involving a user interface has plenty of
opportunities for employing secondary tasks or long-running,
sparsely updated state logic that is not CPU-bound. Most of
the time is spent idle there. Specific computations, on the
other hand, can of course still be handed off to other threads.
Latency is still going to be better if multiple cores are utilized.
And I'm still not sure what the example is.
But TLS variables are always "globals" in the sense that they
outlive the scope that accesses them. A modification in one
thread would obviously not be visible in another thread,
meaning that you may or may not have a semantic connection
when you access such a library sequentially from multiple
tasks.
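(For readers less familiar with D: module-level variables are thread-local by default, which is exactly the visibility situation described above.)

```
int counter;        // thread-local by default: one copy per thread
shared int total;   // a single copy, visible to all threads

void bump()
{
    import core.atomic : atomicOp;
    counter++;               // updates only this thread's copy
    atomicOp!"+="(total, 1); // visible to (and contended by) all threads
}
```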
And then there are said libraries that are not thread-safe at
all, or are bound to the thread where you initialize them. Or
handles returned from a library may be bound to the thread
that created them. Dealing with all of this just becomes
needlessly complicated and error-prone, especially if CPU
cycles are not a concern.
TLS is fine for using a non-thread-safe library - just make sure
you initialize it in all threads. I do not switch or
otherwise play dirty tricks with TLS.
The problem is that, for example, you might have a handle that
was created in thread A and is not valid in thread B, or you
set state in thread A and thread B doesn't see that state.
This would mean that you are limited to a single task for the
complete library interaction.
Or just initialize it lazily in all threads that happen to use it.
Otherwise, this basically amounts to sticking to one thread.
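A sketch of that lazy per-thread initialization, with a hypothetical non-thread-safe C library standing in:

```
extern(C) void* lib_init(); // hypothetical library entry point

private void* handle; // module-level = thread-local by default in D

void* getHandle()
{
    if (handle is null)
        handle = lib_init(); // first use in each thread initializes lazily
    return handle;
}
```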
By robbing the user of control over where a task spawns, you
are also forcing synchronization everywhere, which can
quickly become more expensive than any benefit you would
gain from using multiple threads.
Either default kind of robs the user of control over where the
task spawns. Which is sensible - a user shouldn't really care.
This doesn't make sense; in the original vibe-core, you can
simply choose between spawning in the same thread or in "any"
thread. `shared`/`immutable` is correctly enforced in the
latter case to avoid unintended data sharing.
I have go and goOnSameThread. Guess which is the encouraged
option.
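In use, the pair looks roughly like this (signatures assumed from the names given here, not checked against photon's documentation):

```
import photon : go, goOnSameThread;

go({
    // encouraged default: the scheduler may place this fiber on any thread
});

goOnSameThread({
    // opt-out: the fiber stays pinned to the spawning thread
});
```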
Finally, in the case of web applications, in my opinion the
better approach for using multiple CPU cores is *usually* by
running multiple *processes* in parallel, as opposed to
multiple threads within a single process. Of course, every
application is different and there is no one-size-fits-all
approach.
There we differ: not only is load balancing simpler within a
single application, but processes are also more expensive.
The current D GC situation kind of sucks on multithreaded
workloads, but that is the only reason to go multiprocess IMHO.
The GC/malloc is the main reason why this is mostly false in
practice, but it extends to any central contention source
within the process - yes, often you can avoid that, but often
that takes a lot of extra work and processes sidestep that
issue in the first place.
As is observable from looking at other languages and runtimes,
malloc is not the bottleneck it used to be. Our particular
version of the GC, which doesn't have thread caches, is a bottleneck.
Also, in the usual case where the threads don't have to
communicate with each other (apart from memory allocation
synchronization), a separate process per core isn't any slower
- except maybe when hyper-threading is in play, but whether
that helps or hurts performance always depends on the concrete
workload.
The fact that a context switch between processes has to switch
virtual address spaces (flushing the TLB) does add a bit of
overhead. Though to be certain of anything, there had better be
a benchmark.
Separate processes also have the advantage of being more robust
and enabling seamless restarts and updates of the executable.
And they facilitate an application design that lends itself to
scaling across multiple machines.
Then give me the example code to run multiple vibe.d processes
in parallel (it should be similar to runDist) and we can compare
approaches. For all I know, it could be faster than multi-threaded
vibe.d-light. Also, honestly, if vibe.d's target is multiple
processes, it should probably start like this by default.
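For comparison purposes, here is a hedged sketch of what such a multi-process setup could look like - plain POSIX fork plus reusePort rather than any official vibe.d facility, and it assumes forking before the event loop starts is safe:

```
import core.sys.posix.unistd : fork;
import vibe.core.core : runApplication;
import vibe.http.server;

void main()
{
    // fork three children before any event-loop state exists;
    // parent and children all end up serving on the same port
    foreach (i; 0 .. 3)
        if (fork() == 0)
            break; // child: stop forking

    auto settings = new HTTPServerSettings;
    settings.options |= HTTPServerOption.reusePort;
    listenHTTP(settings, (req, res) { res.writeBody("Hello"); });
    runApplication();
}
```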