Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Mon, 2015-09-28 at 12:46 +0000, John Colvin via Digitalmars-d-learn wrote:
> […]
>
> Pretty much as expected. Locks are slow, shared accumulators
> suck, much better to write to thread local and then merge.

Quite. Dataflow is where the parallel action is. (Except for those writing
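
A minimal sketch of that write-to-thread-local-then-merge pattern, using std.parallelism's workerLocalStorage (the range bound mirrors the example elsewhere in the thread):

[CODE]
import std.parallelism, std.range, std.stdio;

void main()
{
    // One private accumulator per worker thread; no locks, no sharing.
    auto partial = taskPool.workerLocalStorage(0UL);

    foreach (f; parallel(iota(1, 100 + 1)))
        partial.get += f;   // each thread updates only its own slot

    // Merge the per-thread partial sums after the parallel loop.
    ulong total = 0;
    foreach (p; partial.toRange)
        total += p;

    total.writeln;  // 5050
}
[/CODE]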

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Sat, 2015-09-26 at 15:56 +0000, Jay Norwood via Digitalmars-d-learn wrote:
> std.parallelism.reduce documentation provides an example of a
> parallel sum.
>
> This works:
> auto sum3 = taskPool.reduce!"a + b"(iota(1.0,101.0));
>
> This results in a compile error:
> auto sum3 =

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Mon, 2015-09-28 at 11:38 +0000, John Colvin via Digitalmars-d-learn wrote:
> […]
>
> It would be really great if someone knowledgeable did a full
> review of std.parallelism to find out the answer, hint, hint...
> :)

Indeed, I would love to be able to do this. However, I don't have time in

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Sat, 2015-09-26 at 14:33 +0200, anonymous via Digitalmars-d-learn wrote:
> […]
> I'm pretty sure atomicOp is faster, though.

Rough-and-ready anecdotal evidence would indicate that this is a
reasonable statement, by quite a long way. However, a proper benchmark
is needed for statistical
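
A rough harness for such a comparison might look like the following sketch. It assumes the modern std.datetime.stopwatch module; N is an illustrative size, chosen larger than the thread's 1..100 range so the timing difference is visible:

[CODE]
import core.atomic;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism, std.range, std.stdio;

enum N = 1_000_000;  // illustrative; the thread's examples use 1..100

ulong viaAtomic()
{
    shared ulong total = 0;
    foreach (f; parallel(iota(1, N + 1)))
        total.atomicOp!"+="(f);    // lock-free read-modify-write
    return atomicLoad(total);
}

ulong viaSynchronized()
{
    ulong total = 0;
    foreach (f; parallel(iota(1, N + 1)))
        synchronized total += f;   // global mutex around each update
    return total;
}

void main()
{
    auto sw = StopWatch(AutoStart.yes);
    immutable a = viaAtomic();
    writefln("atomicOp:     %s ms (sum = %s)", sw.peek.total!"msecs", a);

    sw.reset();
    immutable s = viaSynchronized();
    writefln("synchronized: %s ms (sum = %s)", sw.peek.total!"msecs", s);
}
[/CODE]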

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
As a single data point:

== anonymous_fix.d ==
5050

real    0m0.168s
user    0m0.200s
sys     0m0.380s

== colvin_fix.d ==
5050

real    0m0.036s
user    0m0.124s
sys     0m0.000s

== norwood_reduce.d

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Sat, 2015-09-26 at 12:32 +0000, Zoidberg via Digitalmars-d-learn wrote:
> > Here's a correct version:
> >
> > import std.parallelism, std.range, std.stdio, core.atomic;
> > void main()
> > {
> >     shared ulong i = 0;
> >     foreach (f; parallel(iota(1, 100+1)))
> >     {
> >

Re: Parallel processing and further use of output

2015-09-28 Thread Russel Winder via Digitalmars-d-learn
On Sat, 2015-09-26 at 17:20 +0000, Jay Norwood via Digitalmars-d-learn wrote:
> This is a work-around to get a ulong result without having the
> ulong as the range variable.
>
> ulong getTerm(int i)
> {
>     return i;
> }
> auto sum4 = taskPool.reduce!"a +
>

Re: Parallel processing and further use of output

2015-09-28 Thread John Colvin via Digitalmars-d-learn
On Monday, 28 September 2015 at 11:31:33 UTC, Russel Winder wrote:
> On Sat, 2015-09-26 at 14:33 +0200, anonymous via Digitalmars-d-learn wrote:
> > […]
> > I'm pretty sure atomicOp is faster, though.
>
> Rough-and-ready anecdotal evidence would indicate that this is a
> reasonable statement, by quite a

Re: Parallel processing and further use of output

2015-09-28 Thread Jay Norwood via Digitalmars-d-learn
On Saturday, 26 September 2015 at 15:56:54 UTC, Jay Norwood wrote:
> This results in a compile error:
> auto sum3 = taskPool.reduce!"a + b"(iota(1UL,101UL));

I believe there was discussion of this problem recently ...
https://issues.dlang.org/show_bug.cgi?id=14832

Re: Parallel processing and further use of output

2015-09-28 Thread John Colvin via Digitalmars-d-learn
On Monday, 28 September 2015 at 12:18:28 UTC, Russel Winder wrote:
> As a single data point:
>
> == anonymous_fix.d ==
> 5050
>
> real    0m0.168s
> user    0m0.200s
> sys     0m0.380s
>
> == colvin_fix.d ==
> 5050
>
> real    0m0.036s
> user

Re: Parallel processing and further use of output

2015-09-26 Thread Zoidberg via Digitalmars-d-learn
> Here's a correct version:
>
> import std.parallelism, std.range, std.stdio, core.atomic;
>
> void main()
> {
>     shared ulong i = 0;
>     foreach (f; parallel(iota(1, 100+1)))
>     {
>         i.atomicOp!"+="(f);
>     }
>     i.writeln;
> }

Thanks! Works fine. So "shared" and "atomic" is a must?

Re: Parallel processing and further use of output

2015-09-26 Thread John Colvin via Digitalmars-d-learn
On Saturday, 26 September 2015 at 12:18:16 UTC, Zoidberg wrote:
> I've run into an issue, which I guess could be resolved easily,
> if I knew how...
>
> [CODE]
> ulong i = 0;
> foreach (f; parallel(iota(1, 100+1)))
> {
>     i += f;
> }
> thread_joinAll();
> i.writeln;
> [/CODE]
>
> It's

Re: Parallel processing and further use of output

2015-09-26 Thread Meta via Digitalmars-d-learn
On Saturday, 26 September 2015 at 12:33:45 UTC, anonymous wrote:
> foreach (f; parallel(iota(1, 100+1)))
> {
>     synchronized i += f;
> }

Is this valid syntax? I've never seen synchronized used like this before.
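
It is: a synchronized statement may govern either a single statement or a block. A minimal compilable sketch of both forms (the block form is commented out so the sum isn't added twice):

[CODE]
import std.parallelism, std.range, std.stdio;

void main()
{
    ulong i = 0;
    foreach (f; parallel(iota(1, 100 + 1)))
    {
        // A SynchronizedStatement takes a global lock around its body.
        // Single-statement form:
        synchronized i += f;
        // Equivalent block form:
        // synchronized { i += f; }
    }
    i.writeln;  // 5050
}
[/CODE]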

Re: Parallel processing and further use of output

2015-09-26 Thread anonymous via Digitalmars-d-learn
On Saturday 26 September 2015 14:18, Zoidberg wrote:

> I've run into an issue, which I guess could be resolved easily,
> if I knew how...
>
> [CODE]
> ulong i = 0;
> foreach (f; parallel(iota(1, 100+1)))
> {
>     i += f;
> }
> thread_joinAll();
>

Parallel processing and further use of output

2015-09-26 Thread Zoidberg via Digitalmars-d-learn
I've run into an issue, which I guess could be resolved easily,
if I knew how...

[CODE]
ulong i = 0;
foreach (f; parallel(iota(1, 100+1)))
{
    i += f;
}
thread_joinAll();
i.writeln;
[/CODE]

It's basically an example which adds all the numbers from 1 to

Re: Parallel processing and further use of output

2015-09-26 Thread anonymous via Digitalmars-d-learn
On Saturday, 26 September 2015 at 13:09:54 UTC, Meta wrote:
> On Saturday, 26 September 2015 at 12:33:45 UTC, anonymous wrote:
> > foreach (f; parallel(iota(1, 100+1)))
> > {
> >     synchronized i += f;
> > }
>
> Is this valid syntax? I've never seen synchronized used like this before.

I'm

Re: Parallel processing and further use of output

2015-09-26 Thread Jay Norwood via Digitalmars-d-learn
btw, on my corei5, in debug build,

reduce (using double): 11msec
non_parallel: 37msec
parallel with atomicOp: 123msec

so, that is the reason for using parallel reduce, assuming the
ulong range thing will get fixed.
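
For reference, a minimal runnable version of the double-based parallel reduce being timed here (the expression is the one quoted elsewhere in the thread):

[CODE]
import std.parallelism, std.range, std.stdio;

void main()
{
    // No shared accumulator at all: reduce gives each worker task its
    // own partial sum and combines the partials at the end.
    auto sum3 = taskPool.reduce!"a + b"(iota(1.0, 101.0));
    sum3.writeln;  // 5050
}
[/CODE]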

Re: Parallel processing and further use of output

2015-09-26 Thread Jay Norwood via Digitalmars-d-learn
This is a work-around to get a ulong result without having the
ulong as the range variable.

ulong getTerm(int i)
{
    return i;
}
auto sum4 = taskPool.reduce!"a + b"(std.algorithm.map!getTerm(iota(11)));

Re: Parallel processing and further use of output

2015-09-26 Thread John Colvin via Digitalmars-d-learn
On Saturday, 26 September 2015 at 17:20:34 UTC, Jay Norwood wrote:
> This is a work-around to get a ulong result without having the
> ulong as the range variable.
>
> ulong getTerm(int i)
> {
>     return i;
> }
> auto sum4 = taskPool.reduce!"a + b"(std.algorithm.map!getTerm(iota(11)));

or auto
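
The preview cuts off mid-sentence; one plausible shape for such an alternative (my sketch, not necessarily what was actually posted) replaces the named getTerm with an inline lambda, so reduce still sees a ulong range:

[CODE]
import std.algorithm : map;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    // Map int -> ulong inline; reduce then accumulates in ulong,
    // sidestepping the iota(1UL, ...) compile error.
    auto sum4 = taskPool.reduce!"a + b"(iota(11).map!(i => cast(ulong) i));
    sum4.writeln;  // 55, i.e. 0 + 1 + ... + 10
}
[/CODE]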

Re: Parallel processing and further use of output

2015-09-26 Thread Jay Norwood via Digitalmars-d-learn
std.parallelism.reduce documentation provides an example of a
parallel sum.

This works:
auto sum3 = taskPool.reduce!"a + b"(iota(1.0,101.0));

This results in a compile error:
auto sum3 = taskPool.reduce!"a + b"(iota(1UL,101UL));

I believe there was discussion of this problem recently

Re: Parallel processing and further use of output

2015-09-26 Thread Zoidberg via Digitalmars-d-learn
On Saturday, 26 September 2015 at 13:09:54 UTC, Meta wrote:
> On Saturday, 26 September 2015 at 12:33:45 UTC, anonymous wrote:
> > foreach (f; parallel(iota(1, 100+1)))
> > {
> >     synchronized i += f;
> > }
>
> Is this valid syntax? I've never seen synchronized used like this before.