Hi, I'm having a hard time figuring out what could be causing the slow I/O behaviour that I see: the same Fortran code run on three different clusters behaves quite similarly in terms of I/O times, except for one of the variables in the code, where writes are two orders of magnitude slower on one of the machines (last set of timings in this e-mail). So I hope that somebody with more in-depth knowledge of Parallel HDF5 can give me a hand with it.
This is the situation. Our code writes two types of variables to files. The first type are 3D variables that have been decomposed with a 3D decomposition across the processors, and I use hyperslabs to select where each part should go. Using arrays of size 200x200x200 decomposed across 64 processors, I get similar times for the reading and writing routines (each file 794MB) on the three clusters I have access to:

Cluster 1:
------
READING 0.1231E+01
WRITING 0.1600E+01

Cluster 2:
------
READING 0.1973E+01
WRITING 0.2544E+01

Cluster 3:
------
READING 0.1274E+01
WRITING 0.5895E+01

As you can see there is some variation, but I would be happy with this sort of behaviour.

The other type of data that I write to disk are the outside layers of the 3D cube. So, for example, in the 200x200x200 cube above, I have six outside layers, two in each dimension. The depth of these layers can vary, but in this example I'm using 24 cells, so an X layer would in this case be 24x200x200. But for each of these layers I need to save 24 variables, so in reality I end up with 4D arrays. In this particular example, the outside layers are 4D arrays of size 24x200x200x24 in X, 200x24x200x24 in Y, and 200x200x24x24 in Z. So now the fun begins.
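To make the first pattern concrete, the hyperslab selection for the 3D variables works roughly like this (a Python sketch rather than our actual Fortran code; the linear-rank-to-process-coordinates mapping shown here is an assumption, not necessarily what our code does):

```python
# Hypothetical sketch (not the original Fortran): per-rank hyperslab
# offset and count for a 200x200x200 array decomposed across a
# 4x4x4 = 64-rank process grid, each rank owning a 50x50x50 block.
def hyperslab(rank, dims=(200, 200, 200), procs=(4, 4, 4)):
    """Return (start, count) for this rank's block of the global array."""
    # Map the linear rank to 3D process coordinates, first index
    # varying fastest (an assumed convention for this illustration).
    px = rank % procs[0]
    py = (rank // procs[0]) % procs[1]
    pz = rank // (procs[0] * procs[1])
    coords = (px, py, pz)
    # Each rank's block: equal-sized slabs, offset by process coordinate.
    start = tuple(dims[d] // procs[d] * coords[d] for d in range(3))
    count = tuple(dims[d] // procs[d] for d in range(3))
    return start, count

# Rank 0 owns the 50x50x50 corner block starting at (0, 0, 0);
# rank 63 owns the opposite corner starting at (150, 150, 150).
```

In the real code the (start, count) pair would then be handed to H5Sselect_hyperslab for each rank's portion of the dataset.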
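One way I tried to reason about how the three layer orientations differ on disk: assuming Fortran column-major storage (first index varies fastest) and that each rank writes its 50x50 face patch at full layer depth, one can compute the length of the contiguous runs each write produces. This is only a back-of-envelope sketch, and the block shapes per rank are my assumption:

```python
# Hypothetical back-of-envelope sketch (not from the original code):
# length, in elements, of the contiguous run one rank's block write
# covers in each 4D layer dataset, assuming Fortran column-major order
# and that each rank writes a 50x50 face patch at full depth (24).
def run_length(dataset_dims, block_counts):
    """Elements written contiguously before the selection must skip."""
    run = 1
    for d, c in zip(dataset_dims, block_counts):
        run *= c
        if c < d:      # this dimension is not fully selected,
            break      # so the contiguous run breaks here
    return run

layers = {
    # orientation: (layer dataset dims, one rank's assumed block)
    "X": ((24, 200, 200, 24), (24, 50, 50, 24)),
    "Y": ((200, 24, 200, 24), (50, 24, 50, 24)),
    "Z": ((200, 200, 24, 24), (50, 50, 24, 24)),
}
for name, (dims, block) in layers.items():
    print(name, run_length(dims, block))
# By this metric the X layer gets runs of 24*50 = 1200 elements, while
# Y and Z both get runs of 50 -- so element layout alone does not
# obviously single out Z, which is part of what puzzles me.
```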
If I tell my code to save only the X outside layers, I end up with files of 1.2GB, and the times on the three clusters where I've been running these tests are:

Cluster 1:
-----
READING 0.1270E+01
WRITING 0.2088E+01

Cluster 2:
-----
READING 0.2214E+01
WRITING 0.3826E+01

Cluster 3:
-----
READING 0.1279E+01
WRITING 0.7138E+01

If I save only the outside layers in Y, I also get 1.2GB files, and the times:

Cluster 1:
-----
READING 0.1207E+01
WRITING 0.1832E+01

Cluster 2:
-----
READING 0.1606E+01
WRITING 0.3895E+01

Cluster 3:
-----
READING 0.1264E+01
WRITING 0.6670E+01

But if I ask it to save only the outside layers in Z, I again get 1.2GB files, but the times are:

Cluster 1:
-----
READING 0.7905E+00
WRITING 0.2190E+01

Cluster 2:
-----
READING 0.1856E+01
WRITING 0.8722E+02

Cluster 3:
-----
READING 0.1252E+01
WRITING 0.2372E+03

What can be so different about the Z dimension to produce such different I/O behaviour on the three clusters? (Needless to say, the code is exactly the same, and the input data is exactly the same.)

Any pointers are more than welcome,
--
Ángel de Vicente
http://www.iac.es/galeria/angelv/
