Re: [OMPI devel] Confusion about slots

2016-03-23 Thread Aurélien Bouteiller
To add to what Ralf said, you probably do not want to use hyperthreads for HPC workloads, as that generally results in very poor performance (as you noticed). Set the number of slots to the number of physical cores (not hyperthreads); that yields optimal results 95% of the time. Aurélien --

Re: [OMPI devel] Confusion about slots

2016-03-23 Thread Federico Reghenzani
OK, I investigated further today, and it seems "--map-by hwthread" does not remove the problem. However, if I specify "node0 slots=32" in the hostfile, it runs much slower than if I specify only "node0". In both cases I run mpirun with -np 32. So I'm quite sure I haven't understood what slots are.

Re: [OMPI devel] MPI Error

2016-03-23 Thread Gilles Gouaillardet
Dominic, I can only recommend that you write a small self-contained program that writes the data in parallel, and then check from task 0 only that the data was written as you expected. Feel free to take some time reading MPI I/O tutorials. If you are still struggling with your code, I will try to help

Re: [OMPI devel] MPI Error

2016-03-23 Thread Dominic Kedelty
I am open to any suggestions to make the code better, especially if the way it's coded now is wrong. I believe what the MPI_TYPE_INDEXED is trying to do is this... I have a domain of, for example, 8 hexahedral elements (a 2x2x2 cell domain) that has 27 unique connectivity nodes (3x3x3 nodes). In this
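The 2x2x2 cell, 3x3x3 node layout described above can be sketched in plain C. This is only an illustration, not Dominic's code: the helper names node_id and hex_connectivity are hypothetical, and the node numbering (i fastest, then j, then k) and the corner ordering are assumptions.

```c
#include <assert.h>

/* Assumed numbering: nodes are ordered with i fastest, then j, then k.
 * nnx and nny are the node counts per direction in one layer. */
long node_id(long i, long j, long k, long nnx, long nny)
{
    assert(i >= 0 && i < nnx && j >= 0 && j < nny && k >= 0);
    return i + nnx * (j + nny * k);
}

/* Fill conn[0..7] with the 8 corner nodes of cell (ci, cj, ck) in a grid
 * with ncx x ncy cells per layer, i.e. (ncx+1) x (ncy+1) nodes per layer.
 * The corner order (i, then j, then k) is an assumption for illustration. */
void hex_connectivity(long ci, long cj, long ck, long ncx, long ncy, long conn[8])
{
    long nnx = ncx + 1;
    long nny = ncy + 1;
    int n = 0;
    for (int dk = 0; dk <= 1; ++dk)
        for (int dj = 0; dj <= 1; ++dj)
            for (int di = 0; di <= 1; ++di)
                conn[n++] = node_id(ci + di, cj + dj, ck + dk, nnx, nny);
}
```

With ncx = ncy = 2, the eight cells produce 8 x 8 = 64 connectivity entries that reference only 27 distinct nodes, because neighbouring cells share corners; under this numbering the centre node 13 appears in every cell.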

Re: [OMPI devel] MPI Error

2016-03-23 Thread Gilles Gouaillardet
Dominic, with MPI_Type_indexed, array_of_displacements is an int[], so yes, there is a risk of overflow. On the other hand, with MPI_Type_create_hindexed, array_of_displacements is an MPI_Aint[]. Note: array_of_displacements Displacement for each block, in multiples of
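A minimal sketch of that overflow risk, written without MPI so it compiles anywhere. It assumes a 32-bit int and uses int64_t as a stand-in for MPI_Aint; the helper names displacement_fits_int and byte_displacement are hypothetical and not part of any MPI API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a displacement can be stored in the
 * int[] array that MPI_Type_indexed takes, 0 if it would overflow. */
int displacement_fits_int(int64_t disp)
{
    return disp >= INT_MIN && disp <= INT_MAX;
}

/* Hypothetical helper: displacement in bytes as a 64-bit value, the kind of
 * quantity passed to MPI_Type_create_hindexed (int64_t stands in for
 * MPI_Aint here; that substitution is an assumption of this sketch). */
int64_t byte_displacement(int64_t elem_index, int64_t elem_size)
{
    assert(elem_index >= 0 && elem_size > 0);
    return elem_index * elem_size;
}
```

Checking every displacement with something like displacement_fits_int before building the int[] array turns silent wrap-around into an explicit failure; when the check fails, byte displacements held in a wider type are the alternative.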

Re: [OMPI devel] MPI Error

2016-03-23 Thread Dominic Kedelty
Hi Gilles, I believe I have found the problem. Initially I thought it might be an MPI issue, since the error occurred inside an MPI function. However, I am now sure that the problem is an overflow of 4-byte signed integers. I am dealing with computational domains that have a