Re: [rust-dev] Separating heaps from tasks

Niko Matsakis Mon, 04 Nov 2013 03:23:40 -0800

This is not a complete answer to your question, but I have toyed with
the idea of an (unsafely implemented but with safe interface) message
allocation library that would partially address your use
case. Unfortunately I think that implementing it would require
higher-kinded types.


The basic idea would be an interface where you have an opaque "message"
that contains an arena and a root pointer (of some type `T`):

    struct Message<T<'a>> {
        priv arena: ~Arena,
        priv root: *mut T<'static>
    }

You can create a message like so:

    Message::new(|arena| {
        let obj1 = arena.alloc(|| Leaf(...));
        let obj2 = arena.alloc(|| Leaf(...));
        arena.alloc(|| Root(obj1, obj2)) // return the root
    });

You could "open" an existing message and revise its contents:

    Message::edit(msg, |arena, root| {
        ...
        root // return new root
    })

These messages could be sent to other tasks, provided of course
that they do not access managed data and so forth.
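
For readers following along today, the interface described so far can be
approximated in safe, present-day Rust by swapping the technique: instead
of lifetime-checked pointers into an arena, nodes refer to one another by
index into a pool owned by the message. This is a rough sketch only, not
the design proposed above. `Node`, `alloc`, `sum`, and `build_and_send`
are hypothetical names chosen for illustration, and nothing stops an index
from one message being used with another (that would panic or read the
wrong node, but it cannot violate memory safety).

```rust
use std::thread;

// Hypothetical node type: children are indices into the owning arena.
#[derive(Debug)]
enum Node {
    Leaf(i32),
    Root(usize, usize),
}

// Stand-in for the arena: a pool that hands out indices, not pointers.
#[derive(Debug, Default)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    // Store a node and return its index.
    fn alloc(&mut self, node: Node) -> usize {
        self.nodes.push(node);
        self.nodes.len() - 1
    }
}

// The opaque message: an arena plus the index of the root node.
#[derive(Debug)]
struct Message {
    arena: Arena,
    root: usize,
}

impl Message {
    // Build a message; the closure returns the index of the root.
    fn new(f: impl FnOnce(&mut Arena) -> usize) -> Message {
        let mut arena = Arena::default();
        let root = f(&mut arena);
        Message { arena, root }
    }

    // "Open" the message; the closure returns the new root.
    fn edit(&mut self, f: impl FnOnce(&mut Arena, usize) -> usize) {
        self.root = f(&mut self.arena, self.root);
    }

    // Sum every leaf reachable from the root (assumes an acyclic graph).
    fn sum(&self) -> i32 {
        self.sum_from(self.root)
    }

    fn sum_from(&self, idx: usize) -> i32 {
        match &self.arena.nodes[idx] {
            Node::Leaf(v) => *v,
            Node::Root(l, r) => self.sum_from(*l) + self.sum_from(*r),
        }
    }
}

// Build a message, revise it, then move it to another thread.
fn build_and_send() -> i32 {
    let mut msg = Message::new(|arena| {
        let obj1 = arena.alloc(Node::Leaf(1));
        let obj2 = arena.alloc(Node::Leaf(2));
        arena.alloc(Node::Root(obj1, obj2)) // return the root
    });
    msg.edit(|arena, root| {
        let extra = arena.alloc(Node::Leaf(10));
        arena.alloc(Node::Root(root, extra)) // return new root
    });
    // The message owns all of its data, so it can be moved wholesale.
    thread::spawn(move || msg.sum()).join().unwrap()
}
```

Calling `build_and_send()` constructs the two-leaf message, wraps the old
root in a new one during the edit, and sums the leaves on a second thread.
The trade-off against the pointer-based design is that every access goes
through the pool and is bounds-checked at run time.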

The idea is to leverage the lifetime system to guarantee that pointers
allocated from the arena do not escape. I haven't thought too hard
about this, so there might be a hole, but I think it would work like
so:

    impl<T<'a>:Isolate> Message<T> {
        fn new(f: <'a> |&'a Arena| -> &'a mut T<'a>) -> Message<T> {
            let arena = ~Arena::new();
            let root: *mut T<'static> = unsafe {
                transmute(f(&arena))
            };
            Message { arena: arena, root: root }
        }

        fn edit(m: &mut Message<T>,
                f: <'b> |&'b Arena, &'b mut T<'b>| -> &'b mut T<'b>) {
            m.root = unsafe {
                transmute(f(&m.arena, transmute(m.root)))
            };
        }
    }
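
The part of this that already works in present-day Rust is the quantified
closure: a callback that must accept the arena for *any* lifetime cannot
smuggle a borrow out through a type its caller picked in advance. The
sketch below shows only that rule, in safe code and without the
higher-kinded `T<'a>`; the `Arena` here is a hypothetical stand-in that
just owns a few integers, and `with_arena`, `get`, and `total` are
illustrative names.

```rust
// Hypothetical stand-in for the arena: it simply owns its values.
struct Arena {
    values: Vec<i32>,
}

impl Arena {
    // Borrow a value; the borrow is tied to the borrow of the arena.
    fn get(&self, i: usize) -> &i32 {
        &self.values[i]
    }
}

// `f` must work for every lifetime `'a`, while `R` is fixed by the caller
// before any particular `'a` exists, so `R` cannot contain an `&'a` borrow.
fn with_arena<R>(f: impl for<'a> FnOnce(&'a Arena) -> R) -> R {
    let arena = Arena { values: vec![1, 2, 3] };
    f(&arena)
}

// Accepted: the closure returns an owned `i32` computed from borrowed data.
fn total() -> i32 {
    with_arena(|arena| arena.get(0) + arena.get(2))
}

// Rejected by the compiler if uncommented, because the returned reference
// would outlive the arena created inside `with_arena`:
//
// fn leak() -> &'static i32 {
//     with_arena(|arena| arena.get(0))
// }
```

In the proposal above the closure is additionally allowed to return a
pointer carrying the quantified lifetime, which the library then hides
behind the opaque message; that step is where the unsafe code and the
need for higher-kinded types come in, and it is not reproduced here.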

To address your use case, of course, we'd want to extend `arena` with
the ability to track the types of the memory it has allocated and
perform GC. Doing this while a message is being created or edited
would be challenging and would require hooks from the runtime to
obtain stack roots and so forth; interestingly, it'd be pretty trivial
to run the GC at deterministic times or (say) at the end of an editing
session. This might be enough for some use cases.
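
To make the "collect at the end of an editing session" idea concrete,
here is a small sketch in present-day Rust. It assumes an index-based
pool rather than the typed arena discussed above, so the collector needs
no runtime hooks or stack roots: once the editing closure has returned,
the only root is the one it handed back. `Node`, `collect`, and `demo`
are hypothetical names, and the mark-and-compact pass is one simple
choice among many.

```rust
// Hypothetical node type: children are indices into the message's pool.
#[derive(Debug)]
enum Node {
    Leaf(i32),
    Pair(usize, usize),
}

struct Message {
    nodes: Vec<Node>,
    root: usize,
}

impl Message {
    // Run an editing session, then collect garbage once it has finished.
    fn edit(&mut self, f: impl FnOnce(&mut Vec<Node>, usize) -> usize) {
        self.root = f(&mut self.nodes, self.root);
        self.collect();
    }

    // Mark everything reachable from the root, then compact the pool.
    fn collect(&mut self) {
        // Mark phase: iterative traversal, so cycles are handled.
        let mut marked = vec![false; self.nodes.len()];
        let mut stack = vec![self.root];
        while let Some(i) = stack.pop() {
            if marked[i] {
                continue;
            }
            marked[i] = true;
            if let Node::Pair(l, r) = &self.nodes[i] {
                stack.push(*l);
                stack.push(*r);
            }
        }

        // Assign each surviving node its index in the compacted pool.
        let mut remap = vec![usize::MAX; self.nodes.len()];
        let mut next = 0;
        for (i, &live) in marked.iter().enumerate() {
            if live {
                remap[i] = next;
                next += 1;
            }
        }

        // Compact phase: keep live nodes and rewrite their child indices.
        let old = std::mem::take(&mut self.nodes);
        for (i, node) in old.into_iter().enumerate() {
            if !marked[i] {
                continue;
            }
            self.nodes.push(match node {
                Node::Leaf(v) => Node::Leaf(v),
                Node::Pair(l, r) => Node::Pair(remap[l], remap[r]),
            });
        }
        self.root = remap[self.root];
    }
}

// Replace a three-node message with a single fresh leaf and report the
// pool size before and after the editing session.
fn demo() -> (usize, usize) {
    let mut msg = Message {
        nodes: vec![Node::Leaf(1), Node::Leaf(2), Node::Pair(0, 1)],
        root: 2,
    };
    let before = msg.nodes.len();
    msg.edit(|nodes, _old_root| {
        nodes.push(Node::Leaf(7));
        nodes.len() - 1 // the new root no longer references the old nodes
    });
    (before, msg.nodes.len())
}
```

Because collection only happens between sessions, indices handed out
during an edit stay valid for the whole closure; they may be renumbered
afterwards, which is harmless as long as the root is the only way in.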


Niko

[1]: http://smallcultfollowing.com/babysteps/blog/2013/06/11/data-parallelism-in-rust/


On Mon, Nov 04, 2013 at 08:11:29AM +0200, Oren Ben-Kiki wrote:
> I am toying with a non-trivial Rust project to get a feel for the language.
> There's a pattern I keep seeing in my code which isn't easy to express in
> Rust. I wonder what the "right thing to do" is here.
> 
> The pattern is as follows. I have some container, which contains some
> components of different types. The container as a whole is send-able. The
> components form a complex graph (with cycles).
> 
> What I'd like to do is something like this:
> - Declare a pool of objects of some types, which is held by the container.
> - Declare pointer-to-object whose scope is the container; that is, the
> lifetime of the pointer is the lifetime of the container. The pointer can
> be freely passed around, cloned, etc. but (for mutable objects), only one
> mutable access is allowed at a time.
> 
> This calls for something between GC pointers and RC pointers. GC is out,
> because it isn't send-able. RC is out, because it doesn't allow for loops.
> So right now I use explicit pools and a semi-safe (that is, unsafe...)
> "smart pointer" type. And I don't support dropping objects until the whole
> container is done (which is OK in my specific app but isn't really a good
> solution).
> 
> Ideally what I'd like to see is separating heaps from tasks. That is,
> suppose that GC pointers had a heap attribute (like borrowed pointers have
> a lifetime attribute). By default, each task has a heap, but it is also
> possible to define additional heaps (like we have the static lifetime and
> can also define additional lifetimes).
> 
> So, the container could hold a heap and then many components with
> heap-scoped pointers. The whole thing is send-able and GC is done in the
> scope of each heap on its own (like today).
> 
> There are implications on the type system (no mixing pointers between
> different heaps, unless the heaps are nested) - this seems very similar to
> the lifetimes type checking...
> 
> Overall this seems very symmetrical with lifetimes. Basically, lifetimes ==
> static (compile-time computable) free/malloc; heaps == dynamic (run-time
> computable) free/malloc.
> 
> One interesting pattern allowed by this is ad-hoc actors (there are others
> of course). Currently, if one wants to write actor-style code, one ties in
> the GC pointers to one heap of one actor, which means one forces the
> parallelization policy to one task per actor. One could argue that the
> run-time should be good enough that any aggregation of actor threads to OS
> threads would be done optimally (which is a good goal); but in some apps,
> to get good performance one would like to control this. If we could
> separate heaps from tasks, we could spawn fewer tasks (roughly the number
> of OS threads) and use application code to decide which actor runs when and
> where (e.g., in combination with thread affinity to ensure better cache
> locality).
> 
> At any rate - is this something that makes sense in the Rust view?
> If so, is there a chance of something like that being added (a completely
> separate question :-)?

> _______________________________________________
> Rust-dev mailing list
> Rust-dev@mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev
