Hi Bill,

If I understand this right, what you are proposing is the approach of Python's greenlet library, which I already tried to pitch here<http://mail.mozilla.org/pipermail/rust-dev/2013-November/006544.html>.
I think that the biggest problem with this approach, in its raw form, is that greenlets are bound to a specific OS thread, which would cause all sorts of headaches. For example:

- It would be difficult to balance greenlet tasks between several OS threads.
- If async I/O requests for two greenlets bound to the same OS thread were to complete simultaneously, you would not be able to resume them in parallel.
- If a thread were to become blocked for a significant amount of time, all the greenlets bound to it would become un-resumable.

...and so on. Greenlets would be much more palatable if their stacks were position-independent. This might be possible to achieve with a suitable LLVM transform<http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-January/069662.html>, though I am not yet sure how portable this would be.

cheers,
Vadim

On Sat, Jan 25, 2014 at 8:58 AM, Bill Myers <bill_my...@outlook.com> wrote:

> Stack management for green tasks has been based in the past first on
> segmented stacks and then on standard large stacks.
>
> However, I just realized that there is a third alternative which might
> well be better than both of those.
>
> The idea is very simple: a green task would run on a large stack like
> now, but when it is descheduled, a memory buffer of the size of its used
> stack space is allocated, and its stack data is copied there; when it is
> rescheduled, the stack data is copied back to the original address.
>
> The copying can be performed either by memcpy (perhaps with a
> specialized memcpy that assumes alignment and always uses SIMD
> instructions to copy with no checks), or perhaps by using a compression
> algorithm designed for maximum speed, such as Snappy or maybe a new
> ad-hoc design.
> This allows having the best possible memory efficiency, and thus the
> maximum possible number of tasks (even better than segmented stacks, due
> to precise sizing and optional compression), while not adding any
> overhead to code that does not block, either in terms of code generation
> complexity or runtime cost.
>
> There is of course an additional runtime cost when blocking, but
> blocking already often involves both system calls and context switching,
> and workloads with lots of blocking green tasks are probably I/O bound
> anyway; furthermore, stack data is probably in the CPU cache, so there
> shouldn't be much penalty.
>
> The only significant drawback is that two green tasks running from the
> same "large" stack (note that stacks cannot be easily switched, as that
> would invalidate borrowed pointers on the stack to the stack) cannot
> ever run concurrently; however, by making the number of large stacks a
> large enough multiple of the number of CPU cores, the probability of
> reduced parallelism due to this issue can be made as small as desired.
>
> In general, one would use an adaptive approach, where this kind of
> copying would start to kick in only once the number of green tasks
> becomes large; when not copying, one would just optionally use
> madvise(MADV_DONTNEED) to trim unused whole pages of the stack.
>
> Also, if the data to be copied is large (several OS pages), one may use
> mremap() or similar facilities to move the address space mappings
> instead of the data itself.
>
> Some unsafe code that assumes that tasks can access each other's stacks
> will break (probably only the new Mutex implementation), but that can be
> fixed by putting the data in the task structure instead of the stack.
>
> Overall, this should allow writing something like a parallel network
> server or web crawler in Rust in a natural blocking style, while being
> fully confident in its scalability, which is something that is not
> really possible at the moment.
> Once this is in place, the best practices for Rust programs would
> become:
>
> 1. Use native tasks for tasks whose number is bounded at compile time
>    and small.
> 2. Use green tasks for tasks whose number is unbounded or large.
>
> One could even automate this to some extent by spawning by default a
> native task the first C times a given proc() code address is used to
> start a task (where C = number of CPU cores), and green tasks
> afterwards.
>
> What do you think?
>
> _______________________________________________
> Rust-dev mailing list
> Rust-dev@mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev