>> That summary misses the whole point of the errors I am seeing.
>>
>> The code runs fine locally AND under Sun Grid Engine, if you only
>> spawn TWO processes but not FOUR or EIGHT.
>
> Well the 'np 2' runs could be scheduled on your local node [or a
> single SMP remote node].

Well, they "could be", yes: they are not though.

Look, you need to trust me when I tell you things (except for version
numbers, ha ha). I would not be bothering you if I had not looked into
this to a reasonable extent before deciding to bother you.

I am in control of where the jobs are running.

> And I suspect there is something wrong in your OpenMPI+SunGridEngine
> config that's triggering this problem.

I am happy to accept that and I even suggested that might be the case.

I am happy to go and look around the OpenMPI and SGE sources, if that
turns out to be the case.

However, I came to the PETSc list for some insight from the PETSc error
messages. If they can confirm/reject the notion that it might be an
SGE/OpenMPI issue and not a PETSc one then I will have gained
information.

> I don't know exactly how though..

So far, nothing has been confirmed either way.

> [the basic petsc examples are supposed to work in any valid
> MPI environment].

I don't doubt for a minute that they are supposed to.

I am also aware that few people are likely to be using this software
stack on NetBSD and thus there may be some gaps in your map of "valid
MPI environments".

> ok - mpi is shared. Can you confirm that the exact same version of
> openmpi is installed on all the nodes - and that there are no minor
> version differences that could trigger this?

Just take that as read.

Are you saying that the error messages PETSc is throwing out ARE
consistent with a slightly mismatched MPI, then?

I am building an OpenMPI with some debugging in at present. I'll get
back to you once I have rolled it out across the nodes and have some
more info.

In the meantime, if you can think of anything I can tickle PETSc with,
you being familiar with PETSc, so as to get some error messages that
might tell you something, do let me know.

-- 
Kevin M. Buckley                    Room:  CO327
School of Engineering and           Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand
