On 07/03/2012 15:26, Jed Brown wrote: > On Wed, Mar 7, 2012 at 09:12, Lawrence Mitchell > <lawrence.mitchell at ed.ac.uk <mailto:lawrence.mitchell at ed.ac.uk>> wrote: > > Moving forward, I've done a bunch of this macroisation, which makes the > code more PETSc-like. > > As you point out it's probably unwise to call omp_get_num_threads() on > every loop iteration, so I went for the magic names in the end. A > typical simple loop now looks like: > > VecOMPParallelBegin(xin, private(i) shared(x)); for (i=__start; > i<__end;i++) x[i] = work(x[i]); VecOMPParallelEnd(); > > In the --without-openmp case, this transforms into: > > do { PetscInt __start = 0, __end = xin->map->n; for ...; } while (0); > > Rather than adding an nthreads slot to vectors, I put it in the generic > PetscObject struct -- we're threading matrices too. > > > We are going with PetscLayout for distribution information with the > pthreads implementations.
Ah, that's a better approach. So probably the recently created seqpthread/tmap.c wants moving. > We have also discussed making "thread communicator" information (how many > threads and their affinities) an attribute of an MPI_Comm. I'd be > interested to hear your opinions on that. I'm not sure quite what that buys you, except in the case you describe below where two third-party libraries have different threading models. In that case stashing the threading information on one communicator per library makes data management slightly easier, I guess. You'd probably still need to manage transferring the data between layouts by hand though. > All, should we consider moving all the "kernels" into their own > functions and just doing parallel region dispatch differently for OpenMP > and pthreads? I think that is the only reasonable way to have just one > version of the code. If we committed to always using separate > side-effect-free kernels, we might be able to reuse many of them with TBB > and OpenCL as well. It would probably be a good idea, even if only to cut down on typo-bugs. I imagine, to take as an example VecMax_Seq it would look something like this: VecMax_SeqKernel(PetscScalar *xx, PetscInt start, PetscInt end, PetscInt *idx, PetscReal *z) { PetscInt i; PetscInt j; PetscReal max; max = PetscRealPart(xx[start]); j = start; for ( i=start+1; i < end; i++ ) { if (PetscRealPart(xx[i]) > max) { max = PetscRealPart(xx[i]); j = i; } } if (idx) *idx = j; } And then you have three (?) wrapper implementations: VecMax_Seq(Vec xin, PetscInt *idx, PetscReal *z) { PetscScalar *xx; PetscFunctionBegin; VecGetArrayRead(xin, &xx); VecMax_SeqKernel(xx, 0, xin->map->n, idx, z); VecRestoreArrayRead(xin, &xx); PetscFunctionReturn(0); } VecMax_SeqOMP(Vec xin, PetscInt *idx, PetscReal *z) { PetscScalar *xx; PetscInt *jt; PetscReal *zt; PetscFunctionBegin; // you could just call VecMax_Seq here if nthread == 1 VecGetArrayRead(xin, &xx); PetscMalloc(xin->map->tmap->nthread * sizeof(PetscReal), &zt); PetscMalloc(xin->map->tmap->nthread * sizeof(PetscInt), &jt); VecOMPParallelBegin(xin, shared(xx, jt, zt)); PetscInt thread = PetscGetThreadNum(); zt[thread] = PETSC_MIN_REAL; jt[thread] = -1; VecMax_SeqKernel(xx + __start, __end, jt + thread, zt + thread); VecOMPParallelEnd(); // do reduction to find max value/index ...; PetscFunctionReturn(0); } And the Pthread version looks similar but has to wrap the kernel invocation in another level as is currently done, I guess. > Is there ever a case where we want to have _both_ OpenMP and Pthreads > objects in the same application? Maybe, e.g. when servicing two > multiphysics applications, one of which chooses OpenMP and one of which > chooses Pthreads (perhaps on different communicators so we don't have to > fight with over-subscription). See above. I think that currently the OMP and pthread implementations would play fine with one-another, modulo over-subscription of threads. I think addressing the issue should probably be motivated by a concrete use case. > > > I don't know how to write the correct tests for configure to probe for > the existance of _Pragma, so that's still missing. > > > I'll do it. Thanks. Lawrence -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.