On 11-05-25 11:39 AM, Rafael Avila de Espindola wrote:

> My main concern is that we have to keep costs and priorities in mind.
>
> Yesterday you described C/C++ as crazy/insane. It so happens that that
> insanity is used to implement all the kernels we are targeting. It is
> used by every program I use (sometimes with an extra interpreter on
> top) and by the compiler framework we selected. If all those are crazy,
> I would say there is not a sufficient market for sanity.

I'm sorry to have used such language; clearly "crazy" is too vague to be useful. I'll be more specific in this email.

C, C++ and Java (most things exposing "thread") don't use threads as isolation primitives. That is, threads share memory. These languages don't support private memory terribly well; even TLS is "up to you" to use safely; there's no enforcement.

This means that users never use threads for isolation (or if they do, it doesn't work). They use them for other things, and tend to use only modest numbers of them. These uses have two major sub-cases:

  - One thread per core, more or less, to saturate cores. Small enough
    N that nearly any scheduler will be fine. Definitely works for 1:1.
    This is "parallelism" (sketched in the code after this list).

  - One thread per IO connection, using "parallelism" to handle
    concurrent IO problems. Weak, but widely done. This can get bad
    depending on IO abstraction. If we wire in 1:1, on *some* OSs this
    will be high performance, on some it will not. You can't wave this
    away by pointing at C/C++ programs; they hit this wall too and
    regularly have to write custom event loops with IO+epoll pumps.
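
To make the first of those concrete, a minimal sketch of the thread-per-core pattern, written in current Rust std (scoped threads, available_parallelism) purely for illustration; none of this is the runtime interface under discussion:

    use std::thread;

    // Split a slice across roughly one OS thread per core and sum the pieces.
    fn parallel_sum(data: &[u64]) -> u64 {
        let n_threads = thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1);
        // Ceiling division so we create at most n_threads chunks.
        let chunk = data.len().div_ceil(n_threads).max(1);

        thread::scope(|s| {
            let handles: Vec<_> = data
                .chunks(chunk)
                .map(|part| s.spawn(move || part.iter().sum::<u64>()))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).sum()
        })
    }

    fn main() {
        let data: Vec<u64> = (1..=1_000_000).collect();
        assert_eq!(parallel_sum(&data), 500_000_500_000);
    }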

> Programmers like to complain about the hard-to-reason-about semantics
> of C++, but they do get the job done.

No. Programmers complain up to a point, then they abandon the language. Which happens all the time. Know what the stock answer to "how do I implement this in C++?" is at Mozilla? "Use JS".

> Look at the cost we are proposing to get something a bit better: every
> loop will have a check to know whether it should yield or not. Is this
> really worth it? What it is basically doing is trying to give tasks
> semantics similar to what threads already have.

Yes. And every function call has a check to see if it has to grow the stack. We can plausibly collapse these two costs into one in some cases. In cases where we can't, and it's a performance bottleneck, we can offer some unsafe variant; but you're assuming a particular datapoint without evidence, just as you accuse me of doing. We don't know it'll be prohibitively slow any more than we know bounds-checking vector accesses is prohibitively slow. I'm not ok with this argument.
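
For scale, here is roughly what such a check amounts to, sketched in current Rust rather than as compiler output: a thread-local flag plus a branch at the loop back-edge that is almost never taken. The names (YIELD_REQUESTED, yield_check) are invented for illustration.

    use std::cell::Cell;
    use std::thread;

    thread_local! {
        // Hypothetical per-task flag the scheduler would set when it wants
        // this task to yield; the name and mechanism are illustrative only.
        static YIELD_REQUESTED: Cell<bool> = Cell::new(false);
    }

    // Roughly what the compiler would insert at loop back-edges (and could
    // potentially share with the stack-growth check): one load plus a
    // branch that is almost never taken.
    #[inline(always)]
    fn yield_check() {
        if YIELD_REQUESTED.with(|f| f.replace(false)) {
            // Hand control back to the scheduler; modeled here as an OS yield.
            thread::yield_now();
        }
    }

    fn sum(xs: &[u64]) -> u64 {
        let mut total = 0u64;
        for &x in xs {
            yield_check(); // the inserted per-iteration check under discussion
            total = total.wrapping_add(x);
        }
        total
    }

    fn main() {
        let xs: Vec<u64> = (0..1_000).collect();
        println!("{}", sum(&xs));
    }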

"Do what threads do" is not an answer; threads don't cancel anywhere they're not checking a flag. They support abrupt killing, but that's even worse: almost guaranteed to corrupt memory. How would we cancel a runaway thread in your proposal? The issue re-occurs.

When I criticized C and C++ on this point (and the memory model) yesterday, it was because they sweep these issues under the rug. They're safety issues, but the languages tell users "you're on your own". I'm not interested in offering a memory model or threading model as weak as that in C++. It makes too few safety promises.

> About datapoints. No, I don't have them. It is too early for that. We
> don't need tasks right now. C, C++ and Java have come a *long* way
> without them, and they have a significant market share.

They are very, very difficult to program safely. Show me a C or C++ program that isn't littered with memory-corrupting failure modes. Show me a highly concurrent one that doesn't have subtle data races (not just at I/O points). Show me a Java program that doesn't have the next-worst thing: random take-down-the-whole-process failures from uncaught NPEs, due to lack of isolation.

These are problems I'm trying to address. You keep proposing stances along the lines of "we must sacrifice safety for speed to compete with C++". I'm not willing to do that. Keep your proposals to the space of "safe" (isolation-preserving, including resource bounds and runaway tasks; portable; diagnostic-friendly) and we can talk; otherwise this is not fruitful.

> With the changes to how channels work, tasks are not a fundamentally new
> concept. You even acknowledge that they could be implemented as 1:1.

With yield points, yes. On OSs that support lots of threads cheaply, that could be true. Without yield points it's unsafe, and we do not have evidence that it'll perform well on every OS. And it doesn't cover other useful cases; see below.

> Given that we must also provide access to raw threads, let's start with
> that. My proposal is:
>
> * Implement OS threads.
> * Have a notion of a task, which is a language abstraction that provides
>   a subset of the features that are common to all operating systems. You
>   could not, for example, send a signal to a task.
> * Implement them now with a 1:1 mapping.

No. I am not comfortable with that.

  - It needs yield points anyway to be safe (see above), and if those
    are never-taken branches when running 1:1 it will perform the same
    if it's "M:N running 1:1" as if it's "only 1:1 supported"; so there
    is IMO no performance argument here.

  - It's very likely to wind up depending on some aspect of 1:1 if
    built as you suggest.

  - It's not clear that it'll be efficient everywhere. Not everything
    has cheap threads.

  - It doesn't handle running "task per connection" on IO-heavy loads
    with a single IO-multiplexing thread (or a small number of them),
    which is the appropriate IO model on some OSs.

  - It doesn't handle debugging by running M:1 and single stepping
    without debugger heroism (and gets worse if we are trying to tap
    the message queues for tracing inter-task communication).

  - It doesn't handle bounding the number of threads a task creates
    while permitting it to spawn extra tasks; IOW every task turns
    into a real commitment of CPU resources, scheduled with every other.
    I want to be able to run a Rust program and say "use no more than 5
    threads, I don't care what you're trying to do". Threads are a
    system-wide resource.
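
As a toy illustration of that last point: a bounded M:N shape in current Rust std, with many queued tasks multiplexed onto a fixed, user-chosen number of OS threads. The numbers and names are made up; the point is only the resource bound.

    use std::sync::{mpsc, Arc, Mutex};
    use std::thread;

    type Task = Box<dyn FnOnce() + Send>;

    fn main() {
        const MAX_THREADS: usize = 5; // "use no more than 5 threads"

        let (tx, rx) = mpsc::channel::<Task>();
        // Share one receiver among the workers. Pulling the next task is
        // serialized through the mutex for simplicity; running it is not.
        let rx = Arc::new(Mutex::new(rx));

        let workers: Vec<_> = (0..MAX_THREADS)
            .map(|_| {
                let rx = Arc::clone(&rx);
                thread::spawn(move || loop {
                    // Hold the lock while waiting for the next task; it is
                    // released before the task runs.
                    let next = rx.lock().unwrap().recv();
                    match next {
                        Ok(task) => task(),
                        Err(_) => break, // channel closed: shut down
                    }
                })
            })
            .collect();

        // Queue far more tasks than threads; the OS thread count stays at 5.
        for i in 0..10_000u64 {
            tx.send(Box::new(move || {
                let _ = i.wrapping_mul(i);
            }))
            .unwrap();
        }
        drop(tx); // no more tasks: let the workers exit

        for w in workers {
            w.join().unwrap();
        }
    }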

For these reasons I would like us to be M:N by default, with 1:1 as a special case. It may be that it's *such* a special case that it's the default mode on some, maybe even all, OSs. I'd be perfectly pleased if we never have to reason deeply about "scheduling algorithms" in Rust's runtime; the Rust scheduler is currently PRNG-driven for a reason: it's the dumbest scheduling algorithm around. I know I'll never be able to write a competitive scheduler. That's not the point.
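
To be concrete about what "PRNG-driven" means here: something like the sketch below, which just pulls a runnable task out of the queue at a random index. This is an invented stand-in (xorshift, toy types), not the actual runtime code.

    // A made-up stand-in for whatever a real task structure holds.
    type TaskHandle = usize;

    struct RandomScheduler {
        run_queue: Vec<TaskHandle>,
        seed: u64, // must be nonzero for xorshift
    }

    impl RandomScheduler {
        // Pick the next task to run at a random index. No priorities, no
        // fairness bookkeeping: the dumbest thing that still makes progress.
        fn next_task(&mut self) -> Option<TaskHandle> {
            if self.run_queue.is_empty() {
                return None;
            }
            // xorshift64: a tiny PRNG, good enough to scramble the pick order.
            self.seed ^= self.seed << 13;
            self.seed ^= self.seed >> 7;
            self.seed ^= self.seed << 17;
            let idx = (self.seed as usize) % self.run_queue.len();
            Some(self.run_queue.swap_remove(idx))
        }
    }

    fn main() {
        let mut sched = RandomScheduler {
            run_queue: (0..8).collect(),
            seed: 0x9E37_79B9_7F4A_7C15,
        };
        while let Some(task) = sched.next_task() {
            println!("running task {task}");
        }
    }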

> Once the language is more mature, we can consider what comes next. It
> might be another "insanity" that is known to work, like thread pools and
> continuation-style code, or it might be user-space tasks.

Continuation-style code is a non-starter for me. You've never explained how it could be made safe, and users complain bitterly anyway. It's strictly worse language UI than having a stack. Users want to write mostly-sequential logic.

> The important point is that once we have web servers, GUI toolkits,
> image decoders, audio APIs, etc. available in Rust, we will have data
> points. It might be that, as happened with Java, using 1:1 is best. If
> not, we can then implement M:N as an optimization.

They weren't approaching the same problem we are. They were just trying to make "conventional threads go fast"; I agree they probably wound up at the right place with 1:1 always (though weirdly, their IO model was part of what forced many OS vendors to *make* 1:1 go fast). If we weren't trying to make tasks a ubiquitous isolation (and IO multiplexing) concept, I'd probably agree with you here. But I think we are pursuing different goals.

-Graydon