Thanks a lot. This one divides the x-by-y region by rows. Suppose the dimensions are 8*12 and there are 4 parallel threads: the current method hands a 2*12 strip of rows to each of the 4 threads.

for(int i=1; i< N; i++) <==> foreach(i; iota(1, N))
so you can use: foreach(i; parallel(iota(1, N))) { ... }
The current reply answers my question, but I was just curious: can we have a method that divides the 2D region into blocks instead, e.g. 8*12 divided into one 4*6 block for each of the 4 threads?
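One possible way to get the 4*6 block split, as a rough sketch rather than a definitive answer: run the parallel foreach over tile indices instead of row indices, and let each iteration walk its own tile with ordinary nested loops. The function name tileOwners and its parameters are made up for illustration. The sketch assumes the dimensions divide evenly by the tile size. The work unit size of 1 passed to parallel only means tiles are handed out one at a time; how many threads actually run depends on the task pool (by default, the number of cores), so a tile is not pinned to a specific thread.

```d
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: returns a rows x cols grid in which each cell
// holds the index of the tile that processed it.
int[][] tileOwners(size_t rows, size_t cols, size_t tileRows, size_t tileCols)
{
    // Assumption for this sketch: tiles fit the region exactly.
    assert(rows % tileRows == 0 && cols % tileCols == 0);
    immutable tilesDown = rows / tileRows;
    immutable tilesAcross = cols / tileCols;

    auto owner = new int[][](rows, cols);

    // One parallel iteration per tile; work unit size 1 = one tile per task.
    foreach (t; parallel(iota(tilesDown * tilesAcross), 1))
    {
        immutable r0 = (t / tilesAcross) * tileRows; // first row of this tile
        immutable c0 = (t % tilesAcross) * tileCols; // first column of this tile

        // Tiles do not overlap, so each cell is written by exactly one task.
        foreach (r; r0 .. r0 + tileRows)
            foreach (c; c0 .. c0 + tileCols)
                owner[r][c] = cast(int) t;
    }
    return owner;
}

void main()
{
    // 8*12 region split into four 4*6 tiles (a 2 x 2 grid of tiles).
    foreach (row; tileOwners(8, 12, 4, 6))
        writeln(row);
}
```

With 8*12 and 4*6 tiles, the top-left block is filled with 0, the top-right with 1, the bottom-left with 2 and the bottom-right with 3. Replace the innermost assignment with the real per-cell work; as long as each task only writes inside its own tile, no locking should be needed.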