Re: [gmx-users] compiling issue
Actually, I can't figure out where my setting is wrong. Here is my cmake command:

    CC=$IMPI_PATH/mpiicc CXX=$IMPI_PATH/mpiicpc \
    CMAKE_PREFIX_PATH=$FFTW3DIR:$CUDAPATH/include:$CUDAPATH/lib64:$CUDAPATH/bin \
    $CMAKE \
        -DFFTW_INCLUDE_DIR=$FFTW3DIR/include \
        -DFFTW_LIBRARY=$FFTW3DIR/lib \
        -DGMX_MPI=ON \
        -DGMX_GPU=ON [-DCUDA_TOOLKIT_ROOT_DIR=$CUDAPATH] \
        -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR \
        -DGMX_X11=OFF \
        -DGMX_THREADS=OFF \
        -DGMX_OPENMP=ON \
        -DBUILD_SHARED_LIBS=ON \
        -DGMX_PREFER_STATIC_LIBS=OFF \
        ../$APP

Has anyone experienced this issue already?

On 01/10/2015 09:00 AM, Éric Germaneau wrote:

Alright, thank you. Will check this.

On 01/09/2015 10:17 PM, Mark Abraham wrote:

Picking up standard library includes from the system gcc and not being able to assemble SIMD sounds like you haven't set up the whole compiler environment properly. Odds are excellent that gcc will run faster anyway...

Mark

On Jan 9, 2015 1:24 AM, Éric Germaneau german...@sjtu.edu.cn wrote:

So sorry, I forgot to mention I use *GMX 5.0.4*. [...]
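Following Mark's point about the compiler environment, here is a minimal sketch of what "setting up the whole compiler environment" for icc + Intel MPI typically looks like before invoking cmake; the install prefixes, versions, and module name are illustrative assumptions, not details from this thread:

    # Illustrative paths; adjust to the local Intel install (assumption).
    source /opt/intel/composer_xe_2013_sp1/bin/compilervars.sh intel64
    source /opt/intel/impi/4.1.3/bin64/mpivars.sh
    # icc borrows headers from whichever gcc toolchain is first in PATH and
    # uses its binutils for assembling, so put a reasonably recent one there.
    module load gcc/4.8    # hypothetical module name
    which gcc && as --version | head -n1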
Re: [gmx-users] compiling issue
Alright, thank you. Will check this.

On 01/09/2015 10:17 PM, Mark Abraham wrote:

Picking up standard library includes from the system gcc and not being able to assemble SIMD sounds like you haven't set up the whole compiler environment properly. Odds are excellent that gcc will run faster anyway...

Mark

On Jan 9, 2015 1:24 AM, Éric Germaneau german...@sjtu.edu.cn wrote:

So sorry, I forgot to mention I use *GMX 5.0.4*. [...]
Re: [gmx-users] compiling issue
So sorry, I forgot to mention I use *GMX 5.0.4*.

On 01/09/2015 08:21 AM, Éric Germaneau wrote:

Dear all,

I'm trying to build GMX on an Intel CentOS release 6.6 machine using icc 14.0 and CUDA 6.5. [...]
[gmx-users] compiling issue
Dear all,

I'm trying to build GMX on an Intel CentOS release 6.6 machine using icc 14.0 and CUDA 6.5. Here are the errors I get:

    [  1%] Built target mdrun_objlib
    In file included from /usr/local/cuda/include/crt/device_runtime.h(251),
                     from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/include/stddef.h(212):
    /usr/local/cuda/include/crt/storage_class.h(61): remark #7: unrecognized token
      #define __storage_auto__device__
    @@@ COMPILER @@@ ERROR @@@
    ...
    /usr/local/cuda/include/crt/host_runtime.h(121): remark #82: storage class is not first
      static void nv_dummy_param_ref(void *param) { volatile static void * *__ref __attribute__((unused)); __ref = (volatile void * *)param; }
    ...
    Scanning dependencies of target cuda_tools
    Linking CXX static library ../../../../lib/libcuda_tools.a
    [  1%] Built target cuda_tools
    [  2%] Building NVCC (Device) object src/gromacs/mdlib/nbnxn_cuda/CMakeFiles/nbnxn_cuda.dir/nbnxn_cuda_generated_nbnxn_cuda_data_mgmt.cu.o
    [  2%] Building NVCC (Device) object src/gromacs/mdlib/nbnxn_cuda/CMakeFiles/nbnxn_cuda.dir/nbnxn_cuda_generated_nbnxn_cuda.cu.o
    /usr/local/cuda/include/crt/host_runtime.h(121): remark #82: storage class is not first
      static void nv_dummy_param_ref(void *param) { volatile static void * *__ref __attribute__((unused)); __ref = (volatile void * *)param;
    ...
    /tmp/iccZVwEChas_.s: Assembler messages:
    /tmp/iccZVwEChas_.s:375: Error: suffix or operands invalid for `vpaddd'
    /tmp/iccZVwEChas_.s:467: Error: no such instruction: `vpbroadcastd %xmm0,%ymm0'
    /tmp/iccZVwEChas_.s:628: Error: suffix or operands invalid for `vpxor'
    /tmp/iccZVwEChas_.s:629: Error: suffix or operands invalid for `vpcmpeqd'
    /tmp/iccZVwEChas_.s:630: Error: no such instruction: `vpbroadcastd %xmm0,%ymm0'
    /tmp/iccZVwEChas_.s:709: Error: suffix or operands invalid for `vpcmpeqd'
    /tmp/iccZVwEChas_.s:711: Error: suffix or operands invalid for `vpxor'
    /tmp/iccZVwEChas_.s:712: Error: suffix or operands invalid for `vpsubd'
    /tmp/iccZVwEChas_.s:713: Error: suffix or operands invalid for `vpaddd'
    /tmp/iccZVwEChas_.s:1620: Error: no such instruction: `shlx %r8d,%eax,%r11d'
    /tmp/iccZVwEChas_.s:2000: Error: no such instruction: `shlx %r8d,%eax,%r10d'
    /tmp/iccZVwEChas_.s:2107: Error: no such instruction: `shlx %r9d,%eax,%eax'
    /tmp/iccZVwEChas_.s:2485: Error: suffix or operands invalid for `vpaddd'
    /tmp/iccZVwEChas_.s:3255: Error: suffix or operands invalid for `vpaddd'
    /tmp/iccZVwEChas_.s:3650: Error: suffix or operands invalid for `vpaddd'
    /tmp/iccZVwEChas_.s:4154: Error: suffix or operands invalid for `vpaddd'
    CMake Error at gpu_utils_generated_memtestG80_core.cu.o.cmake:264 (message):
      Error generating file
      /home/eric/soft/science/opensource/gromacs/build-5.0.4/src/gromacs/gmxlib/gpu_utils/CMakeFiles/gpu_utils.dir//./gpu_utils_generated_memtestG80_core.cu.o
    make[2]: *** [src/gromacs/gmxlib/gpu_utils/CMakeFiles/gpu_utils.dir/gpu_utils_generated_memtestG80_core.cu.o] Error 1
    make[1]: *** [src/gromacs/gmxlib/gpu_utils/CMakeFiles/gpu_utils.dir/all] Error 2
    make: *** [all] Error 2

The CPU version compiles smoothly. Any hint here?

Éric.

--
Éric Germaneau (艾海克), Specialist
Center for High Performance Computing
Shanghai Jiao Tong University
Room 205 Network Center, 800 Dongchuan Road, Shanghai 200240 China
M: german...@sjtu.edu.cn
P: +86-136-4161-6480
W: http://hpc.sjtu.edu.cn
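A side note on the assembler errors above: vpbroadcastd and shlx are AVX2/BMI2 instructions, and the GNU as shipped with CentOS 6 (binutils around 2.20) predates them, so icc is emitting code the system assembler cannot handle. A small diagnostic sketch, with nothing in it specific to this machine:

    # Which assembler gets invoked, and is it new enough for AVX2/BMI2?
    which as
    as --version | head -n1    # roughly binutils >= 2.22 is needed
    # Feed one of the failing instructions straight to the assembler:
    echo 'vpbroadcastd %xmm0,%ymm0' | as -o /dev/null && echo "AVX2 OK"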
Re: [gmx-users] multinode issue
Dear Mark, Dear Szilárd,

Thank you for your help. I did try different I_MPI... options without success. Something I can't figure out is that I can run jobs with two or more OpenMP threads per MPI process, but not with just one. It crashes with one OpenMP thread per MPI process, even if I disable I_MPI_PIN.

Éric.

On 12/06/2014 02:54 AM, Szilárd Páll wrote:

On a second thought (and a quick googling), it _seems_ that this is an issue caused by the following:

- the OpenMP runtime gets initialized outside mdrun and its threads (or just the master thread) get their affinity set;
- mdrun then executes the sanity check, at which point omp_get_num_procs() reports 1 CPU, most probably because the master thread is bound to a single core.

This alone should not be a big deal as long as the affinity settings get correctly overridden in mdrun. However, this can have the ugly side-effect that, if mdrun's affinity setting gets disabled (if mdrun detects the externally set affinities it backs off, or if not all cores/hardware threads are used), all compute threads will inherit the previously set affinity and multiple threads will run on the same core.

Note that this warning should typically not cause a crash, but it is telling you that something is not quite right, so it may be best to start with eliminating this warning (hints: I_MPI_PIN for Intel MPI, -cc for Cray's aprun, --cpu-bind for slurm).

Cheers,
Szilárd

On Fri, Dec 5, 2014 at 7:35 PM, Szilárd Páll pall.szil...@gmail.com wrote:

I don't think this is a sysconf issue. As you seem to have 16-core (hw thread?) nodes, it looks like sysconf returned the correct value (16), but the OpenMP runtime actually returned 1. This typically means that the OpenMP runtime was initialized outside mdrun and for some reason (which I'm not sure about) it returns 1. My guess is that your job scheduler is multi-threading aware and by default assumes one core/hardware thread per rank, so you may want to set some rank depth/width option.

Szilárd

On Fri, Dec 5, 2014 at 1:37 PM, Éric Germaneau german...@sjtu.edu.cn wrote:

Thank you Mark,

Yes, this was the end of the log. I tried another input and got the same issue:

    Number of CPUs detected (16) does not match the number reported by OpenMP (1).
    Consider setting the launch configuration manually!
    Reading file yukuntest-70K.tpr, VERSION 4.6.3 (single precision)
    [16:node328] unexpected disconnect completion event from [0:node299]
    Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
    internal ABORT - process 16

Actually, I'm running some tests for our users; I'll talk with the admin about how to return information to the standard sysconf() routine in the usual way.

Thank you,

Éric.

On 12/05/2014 07:38 PM, Mark Abraham wrote:

On Fri, Dec 5, 2014 at 9:15 AM, Éric Germaneau german...@sjtu.edu.cn wrote:

Dear all,

I use impi, and when I submit a job (via LSF) to more than one node I get the following message:

    Number of CPUs detected (16) does not match the number reported by OpenMP (1).

That suggests this machine has not been set up to return information to the standard sysconf() routine in the usual way. What kind of machine is this?

    Consider setting the launch configuration manually!
    Reading file test184000atoms_verlet.tpr, VERSION 4.6.2 (single precision)

I hope that's just a 4.6.2-era .tpr, but nobody should be using 4.6.2 mdrun, because there was a bug in only that version affecting precisely these kinds of issues...

    [16:node319] unexpected disconnect completion event from [11:node328]
    Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
    internal ABORT - process 16

I submit doing

    mpirun -np 32 -machinefile nodelist $EXE -v -deffnm $INPUT

The machinefile looks like this:

    node328:16
    node319:16

I'm running the release 4.6.7. I do not set anything about OpenMP for this job; I'd like to have 32 MPI processes. Using one node it works fine. Any hints here?

Everything seems fine. What was the end of the .log file? Can you run another MPI test program thus?

Mark
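As a concrete version of the hints above, here is a sketch of an explicit launch configuration for Intel MPI; the environment variables are standard Intel MPI ones, but the values are only one plausible setup for these 16-core nodes, and mdrun_mpi stands in for $EXE:

    # One MPI rank per core, one OpenMP thread per rank (values illustrative).
    export OMP_NUM_THREADS=1
    export I_MPI_PIN=on
    export I_MPI_PIN_DOMAIN=omp    # size pinning domains by OMP_NUM_THREADS
    mpirun -np 32 -machinefile nodelist mdrun_mpi -ntomp 1 -v -deffnm $INPUT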
Re: [gmx-users] multinode issue
Thanks Mark for having tried to help.

On 12/06/2014 10:08 PM, Mark Abraham wrote:

On Sat, Dec 6, 2014 at 9:29 AM, Éric Germaneau german...@sjtu.edu.cn wrote:

Dear Mark, Dear Szilárd,

Thank you for your help. I did try different I_MPI... options without success. Something I can't figure out is that I can run jobs with two or more OpenMP threads per MPI process, but not with just one. It crashes with one OpenMP thread per MPI process, even if I disable I_MPI_PIN.

OK, well, that points to something being configured incorrectly in IMPI, rather than any of the other theories. Try OpenMPI ;-)

Mark

On 12/06/2014 02:54 AM, Szilárd Páll wrote: [...]
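Szilárd's pinning theory is also easy to test directly: print the affinity mask each rank actually receives from the launcher. A sketch assuming Linux /proc and Intel MPI's PMI_RANK variable:

    # A rank confined to a single CPU here explains omp_get_num_procs() == 1.
    mpirun -np 32 -machinefile nodelist bash -c \
        'echo "$(hostname) rank ${PMI_RANK:-?}: $(grep Cpus_allowed_list /proc/self/status)"'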
[gmx-users] Issue building 5.0.2 release for GPU
Dear all,

I'm building the 5.0.2 release the same way as the 4.6.x releases but can't compile it for GPU. I use CUDA 6.5 with icc/14.0.2 and impi/4.1.3.048. Here is my cmake command:

    CC=mpiicc CXX=mpiicpc \
    CMAKE_PREFIX_PATH=$FFTW3DIR:$CUDAPATH/include:$CUDAPATH/lib64:$CUDAPATH/bin \
    $CMAKE \
        -DFFTW_INCLUDE_DIR=$FFTW3DIR/include \
        -DFFTW_LIBRARY=$FFTW3DIR/lib \
        -DGMX_MPI=ON \
        -DGMX_GPU=ON [-DCUDA_TOOLKIT_ROOT_DIR=$CUDAPATH] \
        -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR \
        -DGMX_X11=OFF \
        -DGMX_THREADS=OFF \
        -DBUILD_SHARED_LIBS=OFF \
        -DGMX_PREFER_STATIC_LIBS=ON \
        ../$APP

Here is the kind of error I get:

    /tmp/tmpxft_88e3_-9_copyrite_gpu.compute_20.cudafe1.stub.c(6): remark #82: storage class is not first
      static void __nv_cudaEntityRegisterCallback(void **__T20){{ volatile static void * *__ref __attribute__((unused)); __ref = (volatile void * *)__T20; };__nv_save_fatbinhandle_for_managed_rt(__T20);}
      ^
    In file included from /path/to/gromacs/gromacs-5.0.2/src/external/boost/boost/config.hpp(35),
                     from /path/to/gromacs/gromacs-5.0.2/src/external/boost/boost/smart_ptr/scoped_ptr.hpp(14),
                     from /path/to/gromacs/gromacs-5.0.2/src/external/boost/boost/scoped_ptr.hpp(14),
                     from /path/to/gromacs/gromacs-5.0.2/src/gromacs/utility/common.h(50),
                     from /path/to/gromacs/gromacs-5.0.2/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda_data_mgmt.cu(63):
    /path/to/gromacs/gromacs-5.0.2/src/external/boost/boost/config/compiler/intel.hpp(40): warning #47: incompatible redefinition of macro "BOOST_COMPILER" (declared at line 11 of /path/to/gromacs/gromacs-5.0.2/src/external/boost/boost/config/compiler/nvcc.hpp)
      #define BOOST_COMPILER "Intel C++ version " BOOST_STRINGIZE(BOOST_INTEL_CXX_VERSION)
      ^

Do I have to install Boost? I actually just tried, but it's horrible with Intel MPI. Any suggestions?

Thanks,

Éric.
Re: [gmx-users] Issue building 5.0.2 release for GPU
Hey Roland,

Thank you for your reply. I'll do that. Hopefully this will be fixed in the next release.

Éric.

On 10/17/2014 03:02 PM, Roland Schulz wrote:

Hi,

this seems to be the boost bug https://svn.boost.org/trac/boost/ticket/10420. Please file a Gromacs redmine issue. As a temporary solution, use gcc 4.7. It should be as fast as ICC, at least when used with a GPU.

Roland

On Fri, Oct 17, 2014 at 2:23 AM, Éric Germaneau german...@sjtu.edu.cn wrote: [...]
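For reference, a sketch of Roland's temporary workaround: rebuild with gcc instead of icc. It assumes Intel MPI's gcc-driven wrappers (mpicc/mpicxx); the remaining flags simply mirror the command quoted above:

    # gcc >= 4.7 sidesteps the icc/nvcc/boost interaction (per Roland).
    CC=mpicc CXX=mpicxx \
    CMAKE_PREFIX_PATH=$FFTW3DIR:$CUDAPATH/include:$CUDAPATH/lib64:$CUDAPATH/bin \
    $CMAKE \
        -DFFTW_INCLUDE_DIR=$FFTW3DIR/include \
        -DFFTW_LIBRARY=$FFTW3DIR/lib \
        -DGMX_MPI=ON \
        -DGMX_GPU=ON \
        -DCUDA_TOOLKIT_ROOT_DIR=$CUDAPATH \
        -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR \
        ../$APP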