Surely the group_comm object does not exist on processes outside the group, and the Expression object construction can only happen within the group? I don't see how anything else makes sense. But a clear docstring is always good.

Btw, can we assert that the JIT signatures match across the group? I'm a bit nervous about bugs in non-uniform MPI programs, and that would be a good early indicator of something funny happening.

Martin
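[Editor's note: a minimal sketch of the cross-group check suggested here, assuming mpi4py. The helper name assert_uniform_signature and the idea that the JIT signature is available as a plain string at the call site are assumptions for illustration, not existing DOLFIN API.]

    from mpi4py import MPI

    def assert_uniform_signature(comm, signature):
        """Raise if the JIT signature differs between ranks of comm.

        signature is whatever string the JIT layer is about to hash and
        compile (an assumption for this sketch). The call is collective
        on comm, so every rank in the group must reach it.
        """
        # Gather the signature from every rank of the group onto all ranks.
        all_signatures = comm.allgather(signature)
        if len(set(all_signatures)) != 1:
            raise RuntimeError("JIT signature differs across the group "
                               "communicator: %r" % set(all_signatures))

Being collective, such a check catches ranks that reach the JIT call with different code to compile; ranks that never reach the call at all would still deadlock, so it is an early indicator rather than a guarantee.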
On 26 Nov 2014 09:43, "Garth N. Wells" <[email protected]> wrote:

> On Wed, 26 Nov, 2014 at 8:32 AM, Johan Hake <[email protected]> wrote:
>
>> On Wed, Nov 26, 2014 at 9:22 AM, Garth N. Wells <[email protected]> wrote:
>>
>>> On Wed, 26 Nov, 2014 at 7:50 AM, Johan Hake <[email protected]> wrote:
>>>
>>>> On Wed, Nov 26, 2014 at 8:34 AM, Garth N. Wells <[email protected]> wrote:
>>>>
>>>>> On Tue, 25 Nov, 2014 at 9:48 PM, Johan Hake <[email protected]> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> I just pushed some fixes to the JIT interface of DOLFIN. Now one can JIT on different MPI groups.
>>>>>
>>>>> Nice.
>>>>>
>>>>>> Previously JITing was only done on rank 1 of the mpi_comm_world. Now it is done on rank 1 of any passed group communicator.
>>>>>
>>>>> Do you mean rank 0?
>>>>
>>>> Yes, of course.
>>>>
>>>>>> There is no demo atm showing this, but a test has been added:
>>>>>>
>>>>>> test/unit/python/jit/test_jit_with_mpi_groups.py
>>>>>>
>>>>>> Here an expression, a subdomain, and a form are constructed on different ranks using groups. It is somewhat tedious, as one needs to initialize PETSc with the same group, otherwise PETSc will deadlock during initialization (the moment a PETSc linear algebra object is constructed).
>>>>>
>>>>> This is ok. It's arguably a design flaw that we don't make the user handle MPI initialisation manually.
>>>>
>>>> Sure, it is just somewhat tedious. You cannot start your typical script by importing dolfin.
>>>>
>>>>>> The procedure in Python for this is:
>>>>>>
>>>>>> 1) Construct MPI groups using mpi4py
>>>>>> 2) Initialize petsc4py using the groups
>>>>>> 3) Wrap the groups as petsc4py comms (DOLFIN only supports petsc4py, not mpi4py)
>>>>>> 4) import dolfin
>>>>>> 5) Do group-specific stuff:
>>>>>>    a) Functions and forms: no change needed, as the communicator is passed via the mesh
>>>>>>    b) domain = CompiledSubDomain("...", mpi_comm=group_comm)
>>>>>>    c) e = Expression("...", mpi_comm=group_comm)
>>>>>
>>>>> It's not so clear whether passing the communicator means that the Expression is only defined/available on group_comm, or if group_comm simply controls who does the JIT. Could you clarify this?
>>>>
>>>> My knowledge of MPI is not that good. I have only tried to access (and construct) the Expression on ranks included in that group. Also, when I tried to construct one using a group communicator on a rank that is not included in the group, I got an error when calling MPI_size on it. There is probably a perfectly reasonable explanation for this.
>>>
>>> Could you clarify what goes on behind the scenes with the communicator? Is it only used in a call to get the process rank? What do the ranks other than zero do?
>>
>> Not sure what you want to know. Instead of using mpi_comm_world to construct meshes, you use the group communicator. This communicator has its own local group of ranks. JITing is still done on rank 0 of the local group, which might be, and most often is, different from the rank 0 process of mpi_comm_world.
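[Editor's note: a sketch of the five steps in the quoted procedure above, not a copy of the actual test in test/unit/python/jit/test_jit_with_mpi_groups.py. The two-way split, the expression string "sin(x[0])" and the subdomain string "on_boundary" are made up for illustration, and the petsc4py.init(comm=...) and PETSc.Comm(...) calls assume a petsc4py version that supports them.]

    # 1) Construct MPI groups using mpi4py: split the world into two halves.
    from mpi4py import MPI
    world = MPI.COMM_WORLD
    color = 0 if world.rank < world.size // 2 else 1
    group_comm = world.Split(color=color, key=world.rank)

    # 2) Initialise petsc4py with the group communicator *before* importing
    #    DOLFIN, so PETSc is not initialised on MPI_COMM_WORLD behind our back.
    import petsc4py
    petsc4py.init(comm=group_comm)

    # 3) Wrap the mpi4py group as a petsc4py communicator (the thread says
    #    DOLFIN accepts petsc4py, not mpi4py, communicators); the exact
    #    wrapping call may differ between petsc4py versions.
    from petsc4py import PETSc
    petsc_group_comm = PETSc.Comm(group_comm)

    # 4) Only now import DOLFIN.
    from dolfin import Expression, CompiledSubDomain

    # 5) Group-specific objects: JIT compilation happens on rank 0 of the
    #    group communicator instead of rank 0 of MPI_COMM_WORLD.
    e = Expression("sin(x[0])", mpi_comm=petsc_group_comm)
    domain = CompiledSubDomain("on_boundary", mpi_comm=petsc_group_comm)

The two-way split in step 1 is only one possible layout; any disjoint split of MPI_COMM_WORLD should work, as long as PETSc and DOLFIN see the group communicator consistently.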
> I just want to be clear (and have in the docstring) that
>
> e = Expression("...", mpi_comm=group_comm)
>
> is valid only on group_comm (if this is the case), or make clear that the communicator only determines the process that does the JIT.
>
> If we required all Expressions to have a domain/mesh, as Martin advocates, things would be clearer.
>
>> The group communicator works exactly like the world communicator, but now on just a subset of the processes. There were some sharp edges, with deadlocks as a consequence, when barriers were taken on the world communicator. This happens by default when dolfin is imported and PETSc gets initialized with the world communicator. So we need to initialize PETSc using the group communicator. Other than that there are no real differences.
>
> That doesn't sound right. PETSc initialisation does not take a communicator. It is collective on MPI_COMM_WORLD, but each PETSc object takes a communicator at construction, which can be something other than MPI_COMM_WORLD or MPI_COMM_SELF.
>
> Garth
>
>> Johan
>>
>>> Garth
>>>
>>>>>> Please try it out and report any sharp edges. A demo would also be fun to include :)
>>>>>
>>>>> We could run tests on different communicators to speed them up on machines with high core counts!
>>>>
>>>> True!
>>>>
>>>> Johan
>>>>
>>>>> Garth
>>>>>
>>>>>> Johan
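[Editor's note: the per-object communicator point Garth makes above can be illustrated with a small petsc4py sketch. The sizes and the two-way split are illustrative, and it assumes a petsc4py build with mpi4py support, so that mpi4py communicators are accepted wherever a comm argument is expected.]

    from mpi4py import MPI
    import petsc4py
    petsc4py.init()  # initialise PETSc; per-object comms are chosen below
    from petsc4py import PETSc

    # A subcommunicator covering half of the world.
    world = MPI.COMM_WORLD
    color = 0 if world.rank < world.size // 2 else 1
    sub_comm = world.Split(color=color, key=world.rank)

    # Each PETSc object picks its communicator at construction time.
    x = PETSc.Vec().create(comm=sub_comm)  # collective only over sub_comm
    x.setSizes(100)
    x.setFromOptions()
    x.set(1.0)

    y = PETSc.Vec().create(comm=PETSc.COMM_SELF)  # purely local vector
    y.setSizes(10)
    y.setFromOptions()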
