Hi Hartmut,

Firstly, thanks for your previous advice on this. I wanted to try
implementing something like what you suggested before you replied.

A very basic example (with a single shared workqueue component, though you
could easily extend this to distributed workqueues) is here:
https://github.com/BlairArchibald/hpx-SumEuler . The actual application
isn't too interesting and is just a test (chunking up some irregular work).
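
For anyone who does not want to open the repository, the shape of the test
is roughly the following. This is only a single-process sketch in standard
C++17, not the HPX code from the repository: it assumes SumEuler means the
usual sum of Euler's totient over a range, and work_queue, steal and
sum_euler_chunked are hypothetical names chosen for illustration. An empty
result from steal() stands in for the "no work, retry later" reply.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <numeric>   // std::gcd (C++17)
#include <optional>
#include <utility>

// Naive Euler totient: counts k in [1, n] that are coprime to n.
// The cost grows with n, so equal-sized chunks do unequal amounts of work.
std::uint64_t totient(std::uint64_t n) {
    std::uint64_t count = 0;
    for (std::uint64_t k = 1; k <= n; ++k)
        if (std::gcd(n, k) == 1) ++count;
    return count;
}

// Sum of totient(n) for n in [lo, hi], both ends inclusive.
std::uint64_t sum_totient(std::uint64_t lo, std::uint64_t hi) {
    std::uint64_t total = 0;
    for (std::uint64_t n = lo; n <= hi; ++n) total += totient(n);
    return total;
}

// Type-erased task: any callable that returns a partial sum.
using task = std::function<std::uint64_t()>;

// Hypothetical shared queue (a stand-in for the workqueue component).
class work_queue {
public:
    void push(task t) {
        std::lock_guard<std::mutex> lock(m_);
        tasks_.push_back(std::move(t));
    }

    // Returns the next task, or an empty optional if there is no work.
    std::optional<task> steal() {
        std::lock_guard<std::mutex> lock(m_);
        if (tasks_.empty()) return std::nullopt;
        task t = std::move(tasks_.front());
        tasks_.pop_front();
        return t;
    }

private:
    std::mutex m_;
    std::deque<task> tasks_;
};

// Splits [1, n] into chunks, queues one task per chunk, then drains the
// queue by repeatedly stealing and running tasks on the calling thread.
std::uint64_t sum_euler_chunked(std::uint64_t n, std::uint64_t chunk) {
    assert(chunk > 0);
    work_queue q;
    for (std::uint64_t lo = 1; lo <= n; lo += chunk) {
        std::uint64_t hi = std::min(lo + chunk - 1, n);
        q.push([lo, hi] { return sum_totient(lo, hi); });
    }
    std::uint64_t total = 0;
    while (auto t = q.steal()) total += (*t)();
    return total;
}
```

In the real setting the thief would be another locality and the task would
be launched there rather than run inline; the sketch only shows the queue
and chunking logic.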

The interesting part is:
https://github.com/BlairArchibald/hpx-SumEuler/blob/master/src/main.cpp#L60
which defines the main scheduling loop. Here I've used an executor to
essentially pass off stolen tasks into (OS) thread pools. This works fairly
well, but it's still slower than I'd like.

One of the biggest issues I'm having is determining when to steal more
work. Currently it's an infinite loop which checks
executor.num_pending_closures(), and when there are fewer pending closures
than threads we go and get more work (we could also add thresholds etc., by
stealing before we fully run out of pending closures). I think this spin
loop is probably slowing us down and causing a lot of overhead. Is there a
way of getting a callback when threads block/suspend (i.e. on a "get" call
or similar)? This way I could avoid the spinning and steal work when a
thread goes idle and there are no tasks available. I've seen callbacks for
thread_start and thread_stop, but none for suspend.
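
To make the question concrete, the behaviour I am after looks roughly like
the sketch below. It uses plain standard C++ threads and condition
variables, not HPX, and makes no claim about what HPX offers. refill_queue,
wait_until_low and run_demo are hypothetical names. The stealer sleeps
until the number of queued tasks drops below a low-water mark instead of
polling a counter in a loop.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical queue that wakes a "stealer" when work runs low.
class refill_queue {
public:
    explicit refill_queue(std::size_t low_water) : low_water_(low_water) {}

    // Adds a task and wakes one waiting worker. Ignored after close().
    void push(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            if (closed_) return;
            tasks_.push_back(std::move(task));
        }
        work_cv_.notify_one();
    }

    // Worker side: blocks until a task is available. Returns false only
    // once the queue is closed and fully drained.
    bool pop(std::function<void()>& out) {
        std::unique_lock<std::mutex> lock(m_);
        work_cv_.wait(lock, [this] { return closed_ || !tasks_.empty(); });
        if (tasks_.empty()) return false;
        out = std::move(tasks_.front());
        tasks_.pop_front();
        bool low = tasks_.size() < low_water_;
        lock.unlock();
        if (low) low_cv_.notify_one();  // tell the stealer we are running dry
        return true;
    }

    // Stealer side: sleeps (no spinning) until fewer than low_water tasks
    // are queued. Returns false if the queue has been closed.
    bool wait_until_low() {
        std::unique_lock<std::mutex> lock(m_);
        low_cv_.wait(lock, [this] {
            return closed_ || tasks_.size() < low_water_;
        });
        return !closed_;
    }

    // Stops accepting work and wakes everyone; queued tasks still drain.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        work_cv_.notify_all();
        low_cv_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable work_cv_;  // signalled when tasks arrive
    std::condition_variable low_cv_;   // signalled when tasks run low
    std::deque<std::function<void()>> tasks_;
    std::size_t low_water_;
    bool closed_ = false;
};

// Demo: one worker thread; the calling thread plays the stealer and
// fetches three batches of four tasks, each time only when work runs low.
int run_demo() {
    refill_queue q(2);
    std::atomic<int> executed{0};
    std::thread worker([&] {
        std::function<void()> t;
        while (q.pop(t)) t();
    });
    for (int batch = 0; batch < 3; ++batch) {
        if (!q.wait_until_low()) break;
        for (int i = 0; i < 4; ++i) q.push([&] { ++executed; });
    }
    q.close();
    worker.join();
    assert(!q.wait_until_low());  // after close() the wait returns false
    return executed.load();
}
```

The low-water mark plays the role of the threshold mentioned above: the
stealer is woken before the worker fully runs out of tasks. Whether HPX
exposes a hook that could drive such a notification is exactly the open
question.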

Many thanks,
Blair

On 16 January 2017 at 16:50, Hartmut Kaiser <hartmut.kai...@gmail.com>
wrote:

> Hey Blair,
>
> Welcome to the community!
>
> > I'm looking to build an application which makes use of distributed work
> > stealing to propagate dynamic work between nodes as and when it is
> > required.
> >
> > Firstly I just want to check that as far as I can tell HPX only does
> > thread local work stealing and doesn't steal async tasks across nodes
> > automatically? i.e. hpx::async takes a locality_id as a parameter.
> >
> > If this is the case then I'd like to add distributed task scheduling on
> > top of HPX. My current plan would be something along the lines of:
>
> Yes, HPX does not perform remote work stealing at this point. You're not
> limited to passing a locality id to async, though. Async can also be called
> with a distribution policy which is then used to determine the target of
> the operation (where things are executed). One example for a
> distribution_policy is collocated():
>
>     hpx::id_type id = hpx::new_<SomeComponent>(here);
>     auto f = hpx::async(some_action(), hpx::collocated(id), ...);
>
> Here, the action 'some_action' will be executed on the same locality as
> the component is located. Other distribution policies perform round-robin
> execution using a list of localities, etc.
>
> However, I'm not sure if this would do what you want (i.e. whether a
> distribution_policy can expose the functionality you need).
>
> > 1. Define and create a workqueue component (let's assume one for now)
> > 2. Thieves can call a "steal" action on this component, which takes the
> > thief locality as an argument.
> > 3. The workqueue responds to steals by launching the next action via
> > async on the locality which did the steal (or doing nothing/responding
> > with a retry later task if there is no work).
> >
> > Thieves would have a thread periodically doing steals when no work is
> > left.
>
> Sounds doable. This workqueue could be implemented as a component (i.e. a
> C++ object which can be created and accessed remotely, using actions).
>
> At the same time, stealing work across localities might involve copying
> the data the operation should be performed on. In HPX we always let the
> 'work follow the data', i.e. the action should be executed close to the
> data it refers to. So we normally place the data on some locality and let
> the system execute the work next to it.
>
> However in use cases where you don't have any data, or where the data
> needed is minimal, doing it as you suggest seems to be viable.
>
> > The biggest issue I have is that I can't figure out what type the
> > workqueue should hold to enable it to store generic actions. I have
> > tried:
> >
> > std::vector<hpx::actions::base_action> tasks;
> >
> > But getting them into the workqueue is an issue (since it won't compile
> > with a function which takes a base_action or ref/ptr type). Ideally the
> > workqueue should be able to store tasks of many different types.
> >
> > Is something like this possible?
>
> This sounds like an interesting approach.
>
> If you'd like to store the actions alone, passing around (and storing)
> std::shared_ptr<hpx::actions::base_action> should work (std::unique_ptr<>
> should work as well, but this might not be what you need). This also gives
> you the benefit of passing real pointers as long as operations are local,
> forcing memory allocation only if things are remote.
>
> However, as alluded to above, I think that storing the plain actions might
> not be sufficient, at least not if the actions are not stateless (that
> would be equivalent to storing function pointers). You might want to bind
> additional parameters to the action. HPX can bind() actions in the same way
> as you can bind plain functions (with std::bind or similar).
>
>     auto a = hpx::util::bind(some_action(), args...);
>     // ...
>     a(); // invoke action
>
> where 'args' are additional parameters (including placeholders) you want
> to associate an action invocation with. Here I understand however that
> storing arbitrary bound actions requires some kind of type erasure as well.
> In the simplest case you could look into using hpx::util::function, even if
> that limited you to storing (bound) actions exposing similar argument
> requirements...
>
> Both the bound action and hpx::util::function can be serialized, i.e.
> passed as arguments to an action.
>
> As you can see, there are many options and things mostly depend on your
> concrete use case.
>
> HTH
> Regards Hartmut
> ---------------
> http://boost-spirit.com
> http://stellar.cct.lsu.edu