I'm seeing comm_spawn hang here:

[bend001][[52890,1],0][coll_ml_module.c:3030:mca_coll_ml_comm_query] COLL-ML ml_coll_schedule_setup exit with error
[bend001][[52890,1],1][coll_ml_module.c:3030:mca_coll_ml_comm_query] COLL-ML ml_coll_schedule_setup exit with error

Setting -mca coll ^ml allows things to run to completion just fine, so it appears that coll/ml is having a problem with comm_spawn.

Ralph
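
For anyone wanting to try the same workaround, here is a minimal sketch of how the exclusion might be applied. It assumes a POSIX-like shell and relies on Open MPI's convention of reading MCA parameters from OMPI_MCA_<name> environment variables. The binary name ./spawn_test is hypothetical and stands in for whatever comm_spawn test is being run; the original report does not say which one was used.

```shell
# Exclude the coll/ml component for every Open MPI job launched from this shell.
# The value is quoted because '^' is treated specially by some shells.
export OMPI_MCA_coll='^ml'

# Per-run form of the same setting, as described in the message above.
# ./spawn_test is a hypothetical binary name, so the line is left commented out.
# mpirun -np 1 -mca coll '^ml' ./spawn_test
```

The environment variable form may be convenient when the job is launched through a wrapper script that does not pass extra mpirun options along.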