The situation may not be as dire as you think. The runtime is still in a state of flux, and don't forget that in a single summer the entire runtime was rewritten and redesigned in Rust. I personally still think that M:N is a viable model for various applications, and it seems especially unfortunate to remove everything just because it isn't tailored to all use cases.
Rust made an explicit design decision early on to pursue lightweight/green tasks, with the understanding that the strategy had drawbacks. Using libuv as a backend for driving I/O was also an explicit decision with known drawbacks. That being said, I do not believe that all is lost. I don't believe that the Rust standard library as it stands today can support *every* use case, but it's getting to a point where it can get pretty close.

In the recent redesign of the I/O implementation, all I/O was abstracted behind trait objects with a synchronous interface. This I/O interface is implemented in librustuv by talking to the Rust scheduler under the hood. Additionally, in pull request #10457, I'm starting to add support for a native implementation of the same interface. The great boon of this strategy is that the std::io primitives have no idea whether their underlying implementation is native and blocking or libuv and asynchronous: the exact same Rust code works for one as it does for the other.

I personally don't see why the same strategy shouldn't work for the task model as well. When you link a program against the librustuv crate, you're choosing a runtime with M:N scheduling and asynchronous I/O. Perhaps, though, if you didn't link against librustuv, you would get 1:1 scheduling with blocking I/O. You would still have all the benefits of the standard library's communication primitives, spawning primitives, I/O, task-local storage, etc.; the only difference is that everything would be powered by OS-level threads instead of Rust-level green tasks. I would very much like to see a standard library which supports this abstraction, and I believe it is realistically achievable. Right now we have an EventLoop interface, which defines interaction with I/O and serves as the abstraction between asynchronous and blocking I/O. It sounds like we need a similarly formalized Scheduler interface which abstracts M:N scheduling from 1:1 scheduling.
The main goal of all of this would be to allow the exact same Rust code to work in both M:N and 1:1 environments, letting authors specialize their code for the task at hand. Those writing web servers would be sure to link against librustuv, while those writing command-line utilities would simply omit it. Additionally, as a library author, I don't really care which implementation you're using: I can write a MySQL database driver, and you, as a consumer of my library, decide whether my network calls block. This is a fairly new concept to me (I haven't thought much about it before), but it sounds like it may be the right way forward to address your concerns without compromising too much existing functionality.

There would certainly be plenty of work to do in this realm, and I'm not sure whether this goal would block the 1.0 milestone. Ideally this would be a completely backwards-compatible change, but there could be unintended consequences. As always, this would need plenty of discussion to see whether it's even a reasonable strategy to take.

On Wed, Nov 13, 2013 at 2:45 AM, Daniel Micay <danielmi...@gmail.com> wrote:
> Before getting right into the gritty details about why I think we should think
> about a path away from M:N scheduling, I'll go over the details of the
> concurrency model we currently use.
>
> Rust uses a user-mode scheduler to cooperatively schedule many tasks onto OS
> threads. Due to the lack of preemption, tasks need to manually yield control
> back to the scheduler. Performing I/O with the standard library will block the
> *task*, but yield control back to the scheduler until the I/O is completed.
>
> The scheduler manages a thread pool where the unit of work is a task rather
> than a queue of closures to be executed or data to be passed to a function. A
> task consists of a stack, register context and task-local storage, much like
> an OS thread.
>
> In the world of high-performance computing, this is a proven model for
> maximizing throughput for CPU-bound tasks. By abandoning preemption, there's
> zero overhead from context switches. For socket servers with only negligible
> server-side computations, the avoidance of context switching is a boon for
> scalability and predictable performance.
>
> # Lightweight?
>
> Rust's tasks are often called *lightweight*, but at least on Linux the only
> optimization is the lack of preemption. Since segmented stacks have been
> dropped, the resident/virtual memory usage will be identical.
>
> # Spawning performance
>
> An OS thread can actually spawn nearly as fast as a Rust task on a system with
> one CPU. On a multi-core system, there's a high chance of the new thread being
> spawned on a different CPU, resulting in a performance loss.
>
> Sample C program, if you need to see it to believe it:
>
> ```
> #include <pthread.h>
> #include <stddef.h>
>
> static const size_t n_thread = 100000;
>
> static void *foo(void *arg) {
>     return arg;
> }
>
> int main(void) {
>     for (size_t i = 0; i < n_thread; i++) {
>         pthread_attr_t attr;
>         /* pthread functions return a positive error number on failure,
>          * so check for non-zero rather than negative values */
>         if (pthread_attr_init(&attr) != 0) {
>             return 1;
>         }
>         if (pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0) {
>             return 1;
>         }
>         pthread_t thread;
>         if (pthread_create(&thread, &attr, foo, NULL) != 0) {
>             return 1;
>         }
>     }
>     pthread_exit(NULL);
> }
> ```
>
> Sample Rust program:
>
> ```
> fn main() {
>     for _ in range(0, 100000) {
>         do spawn {
>         }
>     }
> }
> ```
>
> For both programs, I get around 0.9s consistently when pinned to a core. The
> Rust version drops to 1.1s when not pinned and the OS thread one to about 2s.
> It drops further when asked to allocate 8MiB stacks like C is doing, and will
> drop more when it has to do `mmap` and `mprotect` calls like the pthread API.
>
> # Asynchronous I/O
>
> Rust's requirements for asynchronous I/O would be filled well by direct usage
> of IOCP on Windows.
> However, Linux only has solid support for non-blocking sockets, because file
> operations usually just retrieve a result from cache and do not truly have to
> block. This results in libuv being significantly slower than blocking I/O for
> most common cases, for the sake of scalable socket servers.
>
> On modern systems with flash memory, including mobile, there is a *consistent*
> and relatively small worst-case latency for accessing data on the disk, so
> blocking is essentially a non-issue. Memory-mapped I/O is also an incredibly
> important feature for I/O performance, and there's almost no reason to use
> traditional I/O on 64-bit. However, it's a no-go with M:N scheduling because
> the page faults block the thread.
>
> # Overview
>
> Advantages:
>
> * lack of preemptive/fair scheduling, leading to higher throughput
> * very fast context switches to other tasks on the same scheduler thread
>
> Disadvantages:
>
> * lack of preemptive/fair scheduling (lower-level model)
> * poor profiler/debugger support
> * async I/O stack is much slower for the common case; for example, stat is 35x
>   slower when run in a loop for an mlocate-like utility
> * true blocking code will still block a scheduler thread
> * most existing libraries use blocking I/O and OS threads
> * cannot directly use fast and easy-to-use linker-supported thread-local data
> * many existing libraries rely on thread-local storage, so there's a need to
>   be wary of hidden yields in Rust function calls, and it's very difficult to
>   expose a safe interface to these libraries
> * every level of a CPU architecture adding registers needs explicit support
>   from Rust, and it must be selected at runtime when not targeting a specific
>   CPU (this is currently not done correctly)
>
> # User-mode scheduling
>
> Windows 7 introduced user-mode scheduling[1] to replace fibers on 64-bit.
> Google implemented the same thing for Linux (perhaps even before Windows 7
> was released), and plans on pushing for it upstream.[2] The linked video does
> a better job of covering this than I can.
>
> User-mode scheduling provides a 1:1 threading model, including full support
> for normal thread-local data and existing debuggers/profilers. It can yield to
> the scheduler on system calls and page faults. The operating system is
> responsible for details like context switching, so a large
> maintenance/portability burden is dealt with. It narrows down the above
> disadvantage list to just the point about not having preemptive/fair
> scheduling, and doesn't introduce any new ones.
>
> I hope this is where concurrency is headed, and I hope Rust doesn't miss this
> boat by concentrating too much on libuv. I think it would allow us to simply
> drop support for pseudo-blocking I/O in the Go style and ignore asynchronous
> I/O and non-blocking sockets in the standard library. It may be useful to have
> the scheduler use them, but it wouldn't be essential.
>
> [1] http://msdn.microsoft.com/en-us/library/windows/desktop/dd627187(v=vs.85).aspx
> [2] http://www.youtube.com/watch?v=KXuZi9aeGTw
>
> _______________________________________________
> Rust-dev mailing list
> Rust-dev@mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev