Hello, I have an algorithm that recursively generates trees of different sizes. At the moment I only have the sequential version.

We have 4 identical computers here, each with an 8-core Xeon and 4 GB of RAM. They have Hyper-Threading, so each node shows up as 16 processors, and I can launch a total of 64 parallel threads.

My question is: what would be the best approach when using MPI? Running with -np 64 may not be a good idea, because I would not be taking advantage of the proximity of the cores within a node, which could speed up memory operations. I mean, it might be better to have 4 MPI processes, one per node, each spawning 15 threads locally (16 per process counting the main thread). I can mix MPI with local threads, right?

I don't have much experience with MPI. The larger algorithms I have written so far were in CUDA, which I found much easier.

Any suggestions or help are welcome.

Cristobal

--
Cristobal
<http://www.youtube.com/neoideo>