On 11-05-25 11:39 AM, Rafael Avila de Espindola wrote:
> My main concern is that we have to keep costs and priorities in mind.
> Yesterday you described C/C++ as crazy/insane. It so happens that that
> insanity is used to implement all the kernels that we are targeting. It
> is used by every program I use (sometimes with an extra interpreter on
> top) and is used by the compiler framework we selected. If all those are
> crazy, I would say that there is not sufficient market for sanity.

I'm sorry to have used such language; clearly "crazy" is too vague to be
useful. I'll be more specific in this email.

C, C++ and Java (most things exposing "thread") don't use threads as
isolation primitives. That is, threads share memory. These languages
don't support private memory terribly well; even TLS is "up to you" to
use safely; there's no enforcement.

This means that users never use threads for isolation (or if they do, it
doesn't work). They use them for other things, and tend to use only
modest numbers of them. These uses have two major sub-cases:
- One thread per core, more or less, to saturate cores. Small enough
N that nearly any scheduler will be fine. Definitely works for 1:1.
This is "parallelism".

- One thread per IO connection, using "parallelism" to handle
concurrent IO problems. Weak, but widely done. This can get bad
depending on the IO abstraction. If we wire in 1:1, on *some* OSs this
will be high performance; on others it will not. You can't wave this
away by pointing at C/C++ programs; they hit this wall too and
regularly have to write custom event loops with IO+epoll pumps.
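
A minimal sketch of both patterns, using today's std::thread API purely
for illustration (the port, the core-count fallback, and the handler
functions are invented for the example, not part of anything proposed
here):

use std::net::{TcpListener, TcpStream};
use std::thread;

fn heavy_compute(_worker: usize) {
    // CPU-bound work; roughly one such thread per core saturates the machine.
}

fn handle_connection(_conn: TcpStream) {
    // Blocking reads/writes for a single client.
}

fn main() -> std::io::Result<()> {
    // Pattern 1: one thread per core, more or less, purely for parallelism.
    // N is small, so almost any scheduler (certainly 1:1) is fine.
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
    let workers: Vec<_> = (0..cores)
        .map(|i| thread::spawn(move || heavy_compute(i)))
        .collect();
    for w in workers {
        w.join().unwrap();
    }

    // Pattern 2: one thread per IO connection. Simple, but once the
    // connection count dwarfs the core count this is exactly the point
    // where people end up hand-rolling an event loop instead.
    let listener = TcpListener::bind("127.0.0.1:8080")?;
    for conn in listener.incoming() {
        let conn = conn?;
        thread::spawn(move || handle_connection(conn));
    }
    Ok(())
}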

> Programmers like to complain about the hard-to-reason-about semantics
> of C++, but they do get the job done.

No. Programmers complain to a point, then they abandon the language.
Which happens all the time. Know what the stock answer to "how do I
implement this in C++?" is at Mozilla? "Use JS".

> Look at the cost we are proposing to get something a bit better: every
> loop will have a check to know if it should yield or not. Is this
> really worth it? What it is basically doing is trying to give tasks
> semantics similar to what threads already have.

Yes. And every function call has a check to see if it has to grow the
stack. We can plausibly collapse these two costs into one in some cases.
In cases where we can't, and it's a performance bottleneck, we can offer
some unsafe variant; but you're assuming a particular data point without
evidence, just as you accuse me of doing. We don't know it'll be
prohibitively slow any more than we know bounds-checking vector accesses
is prohibitively slow. I'm not OK with this argument.

"Do what threads do" is not an answer; threads don't cancel anywhere
they're not checking a flag. They support abrupt killing, but that's
even worse: almost guaranteed to corrupt memory. How would we cancel a
runaway thread in your proposal? The issue recurs.
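
For concreteness, here is a minimal sketch of the cooperative model I'm
arguing for, with an OS thread and an atomic flag standing in for a task
and its runtime (the flag, the polling interval, and the use of today's
std are illustrative assumptions, not the actual runtime mechanism):

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn main() {
    let cancel = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&cancel);

    let worker = thread::spawn(move || {
        loop {
            // ... one unit of work ...

            // The "yield point": a single load and a rarely-taken branch,
            // much like the existing stack-growth check at function entry.
            // In a pure 1:1 build the flag is simply never set. It is also
            // the only place cancellation can happen without risking
            // half-updated state; abrupt killing has no such point.
            if flag.load(Ordering::Relaxed) {
                return; // exit cleanly; destructors run
            }
            thread::sleep(Duration::from_millis(10));
        }
    });

    thread::sleep(Duration::from_millis(50));
    cancel.store(true, Ordering::Relaxed); // request cancellation
    worker.join().unwrap();                // the runaway loop really stops
}
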
When I criticized C and C++ on this point (and the memory model)
yesterday, it was because they sweep these issues under the rug. They're
safety issues, but the languages tell users "you're on your own". I'm not
interested in offering a memory model or threading model as weak as
C++'s. It makes too few safety promises.

> About data points. No, I don't have them. It is too early for that. We
> don't need tasks right now. C, C++ and Java have come a *long* way
> without them and they have a significant market share.

They are very, very difficult to program safely. Show me a C or C++
program that isn't littered with memory-corrupting failure modes. Show
me a highly concurrent one that doesn't have subtle data races (not just
at I/O points). Show me a Java program that doesn't have the next-worst
thing: random take-down-the-whole-process failures from uncaught NPEs,
due to lack of isolation.

These are problems I'm trying to address. You keep proposing stances
along the lines of "we must sacrifice safety for speed to compete with
C++". I'm not willing to do that. Keep your proposals within the space
of "safe" (isolation-preserving, including bounds on resources and
runaway tasks; portable; diagnostic-friendly) and we can talk; otherwise
this is not fruitful.

> With the changes to how channels work, tasks are not a fundamentally
> new concept. You even acknowledge that they could be implemented as 1:1.

With yield points, yes. On OSs that support lots of threads cheaply,
that could be true. Without yield points it's unsafe, and we do not have
evidence that it'll perform well on every OS. And it doesn't cover other
useful cases; see below:

> Given that we must also provide access to raw threads, let's start with
> that. My proposal is:
>
> * Implement OS threads.
> * Have a notion of a task, which is a language abstraction that provides
> a subset of the features that are common to all operating systems. You
> could not, for example, send a signal to a task.
> * Implement them now with a 1:1 mapping.

No. I am not comfortable with that.
- It needs yield points anyway to be safe (see above), and if those
are never-taken branches when running 1:1 it will perform the same
if it's "M:N running 1:1" as if it's "only 1:1 supported"; so there
is IMO no performance argument here.
- It's very likely to wind up depending on some aspect of 1:1 if
built as you suggest.
- It's not clear that it'll be efficient everywhere. Not everything
has cheap threads.
- It doesn't handle running "task per connection" on IO-heavy loads
with a single IO-multiplexing thread (or a small number of them), which
is the appropriate IO model on some OSs.
- It doesn't handle debugging by running M:1 and single stepping
without debugger heroism (and gets worse if we are trying to tap
the message queues for tracing inter-task communication).
- It doesn't handle bounding the number of threads a task creates
while permitting it to spawn extra tasks; in other words, every task
turns into a real commitment of CPU resources, scheduled alongside
every other. I want to be able to run a rust program and say "use no
more than 5 threads, I don't care what you're trying to do". Threads
are a system-wide resource.

For these reasons I would like to be M:N by default, with 1:1 as a
special case. It may be that 1:1 is *such* an important special case
that it becomes the default mode on some, maybe even all, OSs. I'd be
perfectly pleased if we never have to reason deeply about "scheduling
algorithms" in rust's runtime; the rust scheduler is currently
PRNG-driven for a reason: it's the dumbest scheduling algorithm around.
I know I'll never be able to write a competitive scheduler. That's not
the point.
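
To make the "bounded threads, arbitrary tasks" shape concrete, here is a
rough sketch of M tasks multiplexed over N OS threads, with a plain
channel and boxed closures standing in for the runtime's scheduler and
tasks (the names, the bound of 5, and the channel-based queue are
illustrative assumptions, not a proposed design):

use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

type Task = Box<dyn FnOnce() + Send>;

fn main() {
    const MAX_THREADS: usize = 5; // the system-wide bound

    let (tx, rx) = mpsc::channel::<Task>();
    let rx = Arc::new(Mutex::new(rx));

    // N worker threads, fixed up front; tasks never consume more than this.
    let workers: Vec<_> = (0..MAX_THREADS)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Each worker pulls tasks until the queue is closed.
                let task = match rx.lock().unwrap().recv() {
                    Ok(t) => t,
                    Err(_) => return,
                };
                task();
            })
        })
        .collect();

    // M tasks, with M much larger than N: spawning a task commits no new
    // OS thread, only a queue entry.
    for i in 0..10_000 {
        let task: Task = Box::new(move || {
            let _ = i * i; // stand-in for real task work
        });
        tx.send(task).unwrap();
    }
    drop(tx); // close the queue so the workers exit

    for w in workers {
        w.join().unwrap();
    }
}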

> Once the language is more mature, we can consider what is next. It
> might be another "insanity" that is known to work, like thread pools
> and continuation-style code, or it might be user-space tasks.

Continuation-style code is a non-starter for me. You've never explained
how it could be made safe, and users complain bitterly anyway. It's
strictly worse language UI than having a stack. Users want to write
mostly-sequential logic.

> The important point is that once we have web servers, GUI toolkits,
> image decoders, audio APIs, etc. available in rust, we will have data
> points. It might be that, as happened with Java, using 1:1 is the best.
> If not, we can then implement M:N as an optimization.

They weren't approaching the same problem we are. They were just trying
to make "conventional threads go fast"; I agree they probably wound up
at the right place with 1:1 always (though weirdly, their IO model was
part of what forced many OS vendors to *make* 1:1 go fast). If we
weren't trying to make tasks a ubiquitous isolation (and IO
multiplexing) concept, I'd probably agree with you here. But I think we
are pursuing different goals.
-Graydon