Re: [hpx-users] parallel_executor::post dominating APEX trace
Kor,

I think you've just found a bug (all my fault...) at a very good time. There are indeed configuration options that affect how the scheduler steals tasks, and it looks like I've set them to very inappropriate values. Stay tuned for a PR.

On the second point, you're probably seeing your hpx_main/main function as run_helper. Because of some changes in APEX, each task in the OTF file keeps the first name it was given. We don't have a good solution for this yet. As a hack, you could try doing this as the first thing in your main function (note: that's main if you include hpx_main.hpp, or hpx_main if you include hpx_init.hpp or hpx_start.hpp):

    hpx::util::annotate_function annotation("my_main");
    hpx::this_thread::yield();

Once the task is rescheduled it should have the label "my_main".

Mikael

From: hpx-users-boun...@stellar.cct.lsu.edu on behalf of Jong, K. de (Kor)
Sent: Wednesday, December 18, 2019 3:20:59 PM
To: hpx-users@stellar.cct.lsu.edu
Subject: Re: [hpx-users] parallel_executor::post dominating APEX trace

Hi Mikael,

Thank you for your detailed and helpful answer! It starts to make sense to me now. My program almost behaves as I expect it should. I still have two questions though. Maybe you or someone else can point me in the right direction?

1. I run my program on a single node and expect that all threads receive about the same number of same-sized tasks. The APEX trace in Vampir shows that all threads start busy, but after a while gaps appear on some OS threads during which nothing seems to happen, while other threads are still performing tasks. I would expect tasks to be more evenly distributed and/or to be stolen from the task queues of other OS threads. Is this assumption correct? Can I increase the tendency of the scheduler to steal tasks to keep OS threads busy?

2. I perform scaling tests, and each time my tasks run in parallel there is a serial 'run_helper' task that runs on a single OS thread. What is this, and can I somehow keep it out of my timings? Based on a quick look at the HPX code I concluded that run_helper has to do with initializing the HPX runtime. But even if I run my tasks multiple times (from the same running process), run_helper spends time before my tasks do.

I focus on a single node now (1-96 OS threads) and am not doing anything too clever, I think. I don't tweak the bindings and scheduler yet.

Thanks!
Kor

___
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
Re: [hpx-users] parallel_executor::post dominating APEX trace
Hi Kor,

hpx::parallel::execution::parallel_executor::post is what eventually gets called if you use hpx::apply (or future::then) to create tasks. post itself should not take up much time, even in extreme cases. It's hard to say for sure without more information about your application, but there are at least a few possibilities:

1. You haven't annotated your tasks, and your tasks end up with the default name set by the executor. Have you annotated your tasks, and do you see them at all in your traces?

2. You have annotated your tasks, and we haven't fixed the annotations for all use cases. Are you actually using actions or just plain local functions? hpx::apply or hpx::async?

3. Your task size is too small for our overheads. What is a typical task size in your case? We typically recommend a task size of at least 1 ms to be on the safe side, but you can most likely go a bit smaller than that, especially if you don't have too many cores on your machine. I think if this were the case the symptoms would be a bit different though, so it's most likely not this.

Mikael

From: hpx-users-boun...@stellar.cct.lsu.edu on behalf of Jong, K. de (Kor)
Sent: Tuesday, December 17, 2019 4:14:51 PM
To: hpx-users@stellar.cct.lsu.edu
Subject: [hpx-users] parallel_executor::post dominating APEX trace

Hi list,

I am using HPX commit bbc3ad7 (1.4.0-rc2 + APEX linking fix) and APEX to gain insights into the runtime behavior of my program (and hopefully improve it, based on that). The trace I am looking at shows that by far most of the time is spent in hpx::parallel::execution::parallel_executor::post instead of in my actions. Maybe this makes complete sense. Can someone maybe explain in what situations parallel_executor::post would take up a lot of time?

Thanks!
Kor