On 4/3/23 6:02 PM, Paul wrote:
On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote:


It's important to note that `parallel` doesn't iterate the range in parallel; it just runs the body in parallel, limited by your CPU count.
**?!?**

So for example, if you have:

```d
foreach(i; iota(0, 2_000_000).parallel)
{
   runExpensiveTask(i);
}
```

The `foreach` is run on the main thread, gets a `0`, then hands `runExpensiveTask(0)` off to a task thread. Then it gets a `1` and hands `runExpensiveTask(1)` off to a task thread, etc. The iteration itself is not expensive, and is not done in parallel.

On the other hand, what you *shouldn't* do is:

```d
foreach(i; iota(0, 2_000_000).map!(x => runExpensiveTask(x)).parallel)
{
}
```

as this will run the expensive task during the iteration itself, *before* anything is handed off to a task thread.
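
If the goal is to collect the result of every call, one alternative to the `map` version is `taskPool.amap`, which evaluates the function on the pool's worker threads and returns an array of results. A minimal sketch only: `runExpensiveTask` below is a made-up stand-in that returns an `int`, and `computeAll` is a hypothetical helper; neither comes from your code.

```d
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical stand-in for the expensive work (assumption, not the real task).
int runExpensiveTask(int x)
{
    int acc = 0;
    foreach (k; 0 .. 1_000)
        acc = (acc + x + k) % 1_000_003;
    return acc;
}

// Hypothetical helper: computes runExpensiveTask for 0 .. n in parallel.
int[] computeAll(int n)
{
    // amap is an eager parallel map: the calls run on the task pool
    // and the results come back in input order.
    return taskPool.amap!runExpensiveTask(iota(0, n));
}

void main()
{
    auto results = computeAll(1_000);
    assert(results.length == 1_000);
    // Each slot matches what a plain serial call would produce.
    assert(results[3] == runExpensiveTask(3));
}
```

If you don't need the results, the plain `foreach` over `.parallel` with the call in the body is simpler.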


If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (it will probably be slower, actually).
**Ok I did have some debug writelns I commented out.**

And did it help? Another thing that takes a global lock is memory allocation.
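
One way to keep allocation out of the loop body, sketched under the assumption that each iteration produces a single value: allocate the output array once up front and have every iteration write only to its own slot. `work` and `fillResults` are hypothetical names for illustration.

```d
import std.parallelism : parallel;

// Hypothetical stand-in for the per-item work (assumption).
double work(size_t i)
{
    double acc = 0;
    foreach (k; 0 .. 1_000)
        acc += (i + k) % 7;
    return acc;
}

// Hypothetical helper: fills an array of n results in parallel.
double[] fillResults(size_t n)
{
    // One allocation, done before the parallel loop starts.
    auto results = new double[](n);

    // Each iteration writes only to its own element, so the body
    // needs no lock and does no allocation.
    foreach (i, ref r; parallel(results))
    {
        r = work(i);
    }
    return results;
}

void main()
{
    auto r = fillResults(10_000);
    assert(r.length == 10_000);
    assert(r[42] == work(42));
}
```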

Also make sure you have more than one logical CPU.
**I have 8.**

It depends on the work being done, but you should see roughly an 8x speedup, as long as the overhead of distributing tasks is not significant compared to the work itself.
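
To check what you are actually getting, something like the following rough sketch prints the CPU count that `std.parallelism` sees and times a serial loop against a parallel one. `busy`, `runSerial` and `runParallel` are hypothetical names, the workload is arbitrary, and the timings will vary by machine.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : parallel, taskPool, totalCPUs;
import std.stdio : writefln;

// Hypothetical CPU-bound stand-in; unsigned overflow simply wraps.
ulong busy(ulong seed)
{
    ulong acc = seed;
    foreach (k; 0 .. 200_000)
        acc = acc * 6364136223846793005UL + 1442695040888963407UL;
    return acc;
}

void runSerial(ulong[] results)
{
    foreach (i, ref r; results)
        r = busy(i);
}

void runParallel(ulong[] results)
{
    foreach (i, ref r; parallel(results))
        r = busy(i);
}

void main()
{
    // taskPool.size is the number of worker threads in the default pool.
    writefln("logical CPUs: %s, pool workers: %s", totalCPUs, taskPool.size);

    auto a = new ulong[](2_000);
    auto b = new ulong[](2_000);

    auto sw = StopWatch(AutoStart.yes);
    runSerial(a);
    immutable serial = sw.peek;

    sw.reset();
    runParallel(b);
    immutable par = sw.peek;

    assert(a == b); // both versions must compute the same values
    writefln("serial: %s, parallel: %s", serial, par);
}
```

If the two times come out close to each other, the body is probably too cheap or is contending on something shared.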

-Steve
