I am usually able to find the answer to my problems by searching the archive, but I've run up against one that I can't suss out.
Here is the error I am getting:

bison-opt: relocation error: /home/pbme002/opt/gcc-4.8.2-tpls/openmpi-1.8.4/lib/libmpi.so.1: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference

The problem is that it is not consistent. It happens to a random few jobs in a series that runs the same job on different data sets. The jobs that fail with this error run fine when a second attempt is made.

I am the admin for this cluster. The user is running their own build of OpenMPI rather than the system OpenMPI, so I can't say for certain that it was compiled correctly. Still, it strikes me as odd that jobs would fail with the above error and then run perfectly fine on a second attempt.

I'm looking for any help sussing out what could be causing this issue.

Regards,
Mark L. Potter