> On 26 Nov 2014, at 10:28, Garth N. Wells <[email protected]> wrote:
>
>> On Wed, 26 Nov, 2014 at 9:09 AM, Johan Hake <[email protected]> wrote:
>>> On Wed, Nov 26, 2014 at 9:43 AM, Garth N. Wells <[email protected]> wrote:
>>>> On Wed, 26 Nov, 2014 at 8:32 AM, Johan Hake <[email protected]> wrote:
>>>>> On Wed, Nov 26, 2014 at 9:22 AM, Garth N. Wells <[email protected]> wrote:
>>>>>> On Wed, 26 Nov, 2014 at 7:50 AM, Johan Hake <[email protected]> wrote:
>>>>>>> On Wed, Nov 26, 2014 at 8:34 AM, Garth N. Wells <[email protected]> wrote:
>>>>>>>> On Tue, 25 Nov, 2014 at 9:48 PM, Johan Hake <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> I just pushed some fixes to the JIT interface of DOLFIN. Now one can
>>>>>>>> JIT on different MPI groups.
>>>>>>>
>>>>>>> Nice.
>>>>>>>
>>>>>>>> Previously JITing was only done on rank 1 of the mpi_comm_world. Now
>>>>>>>> it is done on rank 1 of any passed group communicator.
>>>>>>>
>>>>>>> Do you mean rank 0?
>>>>>>
>>>>>> Yes, of course.
>>>>>>
>>>>>>>> There is no demo atm showing this, but a test has been added:
>>>>>>>>
>>>>>>>>   test/unit/python/jit/test_jit_with_mpi_groups.py
>>>>>>>>
>>>>>>>> Here an expression, a subdomain, and a form are constructed on
>>>>>>>> different ranks using groups. It is somewhat tedious, as one needs to
>>>>>>>> initialize PETSc with the same group, otherwise PETSc will deadlock
>>>>>>>> during initialization (the moment a PETSc linear algebra object is
>>>>>>>> constructed).
>>>>>>>
>>>>>>> This is ok. It's arguably a design flaw that we don't make the user
>>>>>>> handle MPI initialisation manually.
>>>>>>
>>>>>> Sure, it is just somewhat tedious. You cannot start your typical script
>>>>>> with importing dolfin.
>>>>>>
>>>>>>>> The procedure in Python for this is:
>>>>>>>>
>>>>>>>> 1) Construct MPI groups using mpi4py
>>>>>>>> 2) Initialize petsc4py using the groups
>>>>>>>> 3) Wrap groups to petsc4py comms (dolfin only supports petsc4py, not
>>>>>>>>    mpi4py)
>>>>>>>> 4) import dolfin
>>>>>>>> 5) Do group-specific stuff:
>>>>>>>>    a) Functions and forms: no change needed, as the communicator
>>>>>>>>       is passed via the mesh
>>>>>>>>    b) domain = CompiledSubDomain("...", mpi_comm=group_comm)
>>>>>>>>    c) e = Expression("...", mpi_comm=group_comm)
>>>>>>>
>>>>>>> It's not so clear whether passing the communicator means that the
>>>>>>> Expression is only defined/available on group_comm, or if group_comm is
>>>>>>> simply to control who does the JIT. Could you clarify this?
>>>>>>
>>>>>> My knowledge of MPI is not that good. I have only tried to access (and
>>>>>> construct) the Expression on ranks included in that group. Also, when I
>>>>>> tried to construct one using a group communicator on a rank that is not
>>>>>> included in the group, I got an error when calling MPI_size on it. There
>>>>>> is probably a perfectly reasonable explanation for this.
>>>>>
>>>>> Could you clarify what goes on behind the scenes with the communicator?
>>>>> Is it only used in a call to get the process rank? What do the ranks
>>>>> other than zero do?
>>>>
>>>> Not sure what you want to know. Instead of using mpi_comm_world to
>>>> construct meshes you use the group communicator. This communicator has its
>>>> own local group of ranks. JITing is still done on rank 0 of the local
>>>> group, which might be, and most often is, different from rank 0 of
>>>> mpi_comm_world.
>>>
>>> I just want to be clear (and have in the docstring) that
>>>
>>>   e = Expression("...", mpi_comm=group_comm)
>>>
>>> is valid only on group_comm (if this is the case), or make clear that the
>>> communicator only determines the process that does the JIT.
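[Illustrative aside: a minimal, untested Python sketch of the five-step procedure quoted above. The import order, petsc4py.init(comm=...) and the mpi_comm keyword come from the thread; the Split-based group construction, the PETSc.Comm wrapping in step 3 and the UnitSquareMesh call are assumptions about the 2014-era mpi4py/petsc4py/DOLFIN interfaces and may need adjusting.]

    from mpi4py import MPI

    # 1) Construct MPI group communicators with mpi4py, here by splitting
    #    COMM_WORLD into even and odd ranks (the split is arbitrary).
    world = MPI.COMM_WORLD
    group_comm = world.Split(world.rank % 2, world.rank)

    # 2) Initialize petsc4py with the group communicator before dolfin is
    #    imported, so PETSc is never initialized on mpi_comm_world.
    import petsc4py
    petsc4py.init(comm=group_comm)

    # 3) Wrap the mpi4py communicator as a petsc4py communicator
    #    (dolfin only supports petsc4py, not mpi4py); the PETSc.Comm
    #    constructor used here is an assumption.
    from petsc4py import PETSc
    petsc_group_comm = PETSc.Comm(group_comm)

    # 4) Only now import dolfin.
    from dolfin import UnitSquareMesh, Expression, CompiledSubDomain

    # 5) Group-specific objects: Functions and forms pick the communicator
    #    up from the mesh; JITed objects take it via the mpi_comm keyword.
    mesh = UnitSquareMesh(petsc_group_comm, 8, 8)
    e = Expression("x[0]", mpi_comm=petsc_group_comm)
    domain = CompiledSubDomain("on_boundary", mpi_comm=petsc_group_comm)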
>> I see now what you mean. I can update the docstring. As far as I understand,
>> it should be that the expression is only valid on group_comm, and that rank
>> 0 of that group takes care of the JIT.
>
> OK, could you make this clear in the docstring?
>
>>> If we required all Expressions to have a domain/mesh, as Martin advocates,
>>> things would be clearer.
>>
>> Sure, but the same question is there for the mesh too. Is it available on
>> ranks that are not in the group?
>
> I think in this case it is clear - a mesh lives only on the processes
> belonging to its communicator. The ambiguity with an Expression is that it
> doesn't have any data that lives on processes.
>
>>>> The group communicator works exactly like the world communicator, but now
>>>> on just a subset of the processes. There were some sharp edges, with
>>>> deadlocks as a consequence, when barriers were taken on the world
>>>> communicator. This happens by default when dolfin is imported and PETSc
>>>> gets initialized with the world communicator. So we need to initialize
>>>> PETSc using the group communicator. Other than that there are no real
>>>> differences.
>>>
>>> That doesn't sound right. PETSc initialisation does not take a
>>> communicator. It is collective on MPI_COMM_WORLD, but each PETSc object
>>> takes a communicator at construction, which can be something other than
>>> MPI_COMM_WORLD or MPI_COMM_SELF.
>>
>> Well, for all I know PETSc can be initialized with an mpi_comm. In petsc4py
>> that is done by:
>>
>>   import petsc4py
>>   petsc4py.init(comm=group_1)
>>   import petsc4py.PETSc as petsc
>>
>> It turned out that this was required for the Function constructor not to
>> deadlock. The line:
>>
>>   _vector = factory.create_vector();
>>
>> initializes PETSc with world_comm, which obviously deadlocks.
>
> There must be something else wrong. PetscInitialize does not take a
> communicator:
>
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html
>
> Why does petsc4py want one? It doesn't make sense to initialise it with a
> communicator - a communicator belongs to objects.
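[Illustrative aside: Garth's point that "a communicator belongs to objects" can be seen directly in petsc4py. In the untested sketch below, PETSc is initialized collectively as usual, and individual Vec objects are then created on a sub-communicator; the even/odd split and the vector sizes are arbitrary choices for the example.]

    from mpi4py import MPI
    from petsc4py import PETSc   # importing petsc4py.PETSc initializes PETSc

    # Split the world into two halves (arbitrary example split).
    world = MPI.COMM_WORLD
    sub_comm = world.Split(world.rank % 2, world.rank)

    # This vector lives only on the processes of sub_comm ...
    x = PETSc.Vec().create(comm=sub_comm)
    x.setSizes(10)
    x.setUp()

    # ... while this one spans every process PETSc knows about.
    y = PETSc.Vec().create(comm=PETSc.COMM_WORLD)
    y.setSizes(10)
    y.setUp()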
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PETSC_COMM_WORLD.html

Simone

> Garth
>
>> You might say that this could be avoided by initializing PETSc on all ranks
>> with the world communicator before constructing a Function on a group.
>> However, it still deadlocks during construction. Here I have just assumed it
>> deadlocks at the same line, but I need to double check this. And when I
>> initialized PETSc using the group communicator it just worked. So somewhere
>> a collective call on mpi_comm_world is executed when constructing a
>> PETScVector.
>>
>> Johan
>>
>>> Garth
>>>
>>>> Johan
>>>>
>>>>> Garth
>>>>>
>>>>>>>> Please try it out and report any sharp edges. A demo would also be
>>>>>>>> fun to include :)
>>>>>>>
>>>>>>> We could run tests on different communicators to speed them up on
>>>>>>> machines with high core counts!
>>>>>>
>>>>>> True!
>>>>>>
>>>>>> Johan
>>>>>>>
>>>>>>> Garth
>>>>>>>>
>>>>>>>> Johan

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
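[Illustrative aside, hedged: the PETSC_COMM_WORLD man page Simone links above notes that PETSc's world communicator may be set to a subset of MPI_COMM_WORLD before PetscInitialize is called. The untested sketch below assumes that petsc4py.init(comm=...) is a convenience for exactly that, which would reconcile Garth's point (PetscInitialize itself takes no communicator) with Johan's observation that initializing petsc4py with the group communicator avoids the deadlock. This reading has not been checked against the petsc4py sources.]

    from mpi4py import MPI

    # Split MPI_COMM_WORLD into two halves (arbitrary example split).
    world = MPI.COMM_WORLD
    sub_comm = world.Split(world.rank % 2, world.rank)

    import petsc4py
    petsc4py.init(comm=sub_comm)    # assumed: PETSC_COMM_WORLD := sub_comm
    from petsc4py import PETSc

    # Objects created with the default communicator would now live on
    # sub_comm, so a default-constructed vector no longer involves a
    # collective call over all of MPI_COMM_WORLD.
    v = PETSc.Vec().create()        # comm defaults to PETSc.COMM_WORLD
    v.setSizes(10)
    v.setUp()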
