OK, I'm totally flummoxed here. I'm an ISV delivering a C program that can use MPI for its inter-node communications. It has been deployed on a number (dozens) of small clusters and has been working pretty well over the last few months. That is, until someone tried to change the static IP address and netmask of the cluster's PUBLIC Ethernet interface to a "special" address for their university. Now my program "hangs" in some early MPI communications and I have to CTRL-C to get out of the process. I got things working again by specifying "--mca btl_tcp_if_include eth0" as an argument to mpiexec (eth0 = private TCP).
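For anyone who wants the same workaround without editing every mpiexec command line, here is a minimal sketch of two other ways to set the same MCA parameter. As I understand the Open MPI FAQ, an MCA parameter can also be given as an environment variable named OMPI_MCA_<parameter>, or in the per-user file $HOME/.openmpi/mca-params.conf (one of the mca_param_files paths that ompi_info reports). I'm assuming that eth0 is the private interface on every node and that the home directory is shared, or that the file is copied to each node; neither has been checked beyond my own clusters.

```shell
#!/bin/sh
# Sketch only: two alternatives to passing "--mca btl_tcp_if_include eth0"
# on the mpiexec command line. eth0 is assumed to be the private interface.

# Option 1: environment variable, set in the shell that launches mpiexec.
export OMPI_MCA_btl_tcp_if_include=eth0

# Option 2: per-user MCA parameter file (needs to be visible on every node).
mkdir -p "$HOME/.openmpi"
printf 'btl_tcp_if_include = eth0\n' >> "$HOME/.openmpi/mca-params.conf"

# With either option in place, the original command would be run unchanged:
#   mpiexec -np 2 -host master,node1 myProgram
```

A value given with --mca on the command line should still take precedence over both, so this does not get in the way of one-off experiments with other interfaces.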
Any idea WHY changing the public address messes things up so badly? While I have a workaround, it kinda caught me by surprise, and that usually means there's something going on I don't understand. I thought I was being hit by this:

http://www.open-mpi.org/faq/?category=tcp#tcp-routability

But my process doesn't fail, it just gets... stuck.

Here's the routing table for the head node and a compute node:

239.2.11.71     0.0.0.0         255.255.255.255 UH    0      0        0 eth0
128.0.0.0       0.0.0.0         255.255.255.0   U     0      0        0 eth1
172.76.76.240   0.0.0.0         255.255.255.240 U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth1
224.0.0.0       0.0.0.0         240.0.0.0       U     0      0        0 eth0
0.0.0.0         128.0.0.10      0.0.0.0         UG    0      0        0 eth1
-----
239.2.11.71     0.0.0.0         255.255.255.255 UH    0      0        0 eth0
172.76.76.240   0.0.0.0         255.255.255.240 U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
224.0.0.0       0.0.0.0         240.0.0.0       U     0      0        0 eth0
0.0.0.0         172.76.76.254   0.0.0.0         UG    0      0        0 eth0

What I know so far:
- As I test this, there is nothing other than a switch plugged into eth1 and nothing else plugged into that (i.e. it gives me a link light, but no one to talk to).
- "mpiexec -np 2 -host master,node1 myProgram" hangs.
- MPI_Init is completing. I write out one log file per process, and messages from "mpiexec -d" seem to support that conclusion.
- I'm pretty sure my first Bcast works, but I seem to be getting stuck in my first Allreduce.
- If I run strace on a process, it looks like it is sitting in a poll loop.
- "mpiexec -np 2 -host master,node1 --mca btl_tcp_if_include eth0 myProgram" doesn't hang.
- If I run mpiexec from the head node and just specify a host list that does NOT include the head node, things work just fine: "mpiexec -np 2 -host node1,node2 myProgram" doesn't hang.
- I have strace outputs from each of the scenarios above from each node, but cannot make heads or tails of them.
- If I take down the public interface (ifdown eth1), things also work.
- I can ping and ssh from any node to any node without any problem, so I don't think it's network related.
- A non-MPI job launches and exits just fine ("mpiexec -np 2 -host master,node1 hostname" works).

Details:
- Open MPI 1.2.5, RedHat 4, 64-bit OS
- Gigabit Ethernet, no high-speed interfaces
- Original working public IP: 192.168.1.1 / 16
- Public IP address that breaks stuff: 128.0.0.1 / 24
- Internal address: 172.76.76.240 / 28, with the head node being .254 and the nodes .241, .242, .243 and .244
- I built the 1.2.5 "multi" RPMs using the shell script and spec file on the Open MPI site and installed the runtime using "rpm -Uvh ..."
- All addresses are static.
- Clusters are generally 5 nodes (master plus four compute nodes), but this shows up on just two.

Per the FAQ, here's my ifconfig and ompi_info...

[adminrig@vnode ~]$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr xxxxx
          inet addr:172.76.76.254  Bcast:172.76.76.255  Mask:255.255.255.240
          inet6 addr: fe80::xxxx/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:65860 errors:0 dropped:0 overruns:0 frame:0
          TX packets:51860 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8077348 (7.7 MiB)  TX bytes:17135962 (16.3 MiB)
          Base address:0x2000 Memory:c8200000-c8220000

eth1      Link encap:Ethernet  HWaddr xxxxx
          inet addr:128.0.0.1  Bcast:128.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::xxxx/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:257 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:16880 (16.4 KiB)
          Base address:0x2020 Memory:c8220000-c8240000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:192297 errors:0 dropped:0 overruns:0 frame:0
          TX packets:192297 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:32801704 (31.2 MiB)  TX bytes:32801704 (31.2 MiB)

ompi_info below:

Open MPI: 1.2.5
Open MPI SVN revision: r16989
Open RTE: 1.2.5
Open RTE SVN revision: r16989
OPAL: 1.2.5
OPAL SVN revision: r16989
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.5)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: svc (MCA v1.0, API v1.0,
Component v1.2.5) MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5) MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5) MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0) MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5) MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5) MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5) MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.5) MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5) MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5) MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5) MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5) MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5) MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5) MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5) MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.5) MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.5) MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.5) MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.5) MCA sds: env (MCA v1.0, API v1.0, Component v1.2.5) MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.5) MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.5) MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.5) MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.5) Prefix: /usr Bindir: /usr/bin Libdir: /usr/lib64 Incdir: /usr/include Pkglibdir: /usr/lib64/openmpi Sysconfdir: /etc Configured architecture: x86_64-redhat-linux-gnu Configured by: root Configured on: Wed Jun 11 20:04:56 EDT 2008 Configure host: newcluster4.cluster Built by: root Built on: Wed Jun 11 20:07:25 EDT 2008 Built host: newcluster4.cluster C bindings: yes C++ bindings: yes Fortran77 bindings: no Fortran90 bindings: no Fortran90 bindings size: na C compiler: gcc C compiler absolute: /usr/lib64/ccache/bin/gcc C char size: 1 C bool size: 1 C short size: 2 C int size: 4 C long size: 8 C float size: 4 C double size: 8 C pointer size: 8 C char align: 1 C bool align: 1 C int align: 4 C float align: 4 C 
double align: 8 C++ compiler: g++ C++ compiler absolute: /usr/lib64/ccache/bin/g++ Fortran77 compiler: gfortran Fortran77 compiler abs: /usr/bin/gfortran Fortran90 compiler: none Fortran90 compiler abs: none Fort integer size: 4 Fort logical size: 4 Fort logical value true: 0 Fort real size: skipped Fort dbl prec size: skipped Fort cplx size: skipped Fort dbl cplx size: skipped Fort integer align: skipped Fort real align: skipped Fort dbl prec align: skipped Fort cplx align: skipped Fort dbl cplx align: skipped C profiling: yes C++ profiling: yes Fortran77 profiling: no Fortran90 profiling: no C++ exceptions: no Thread support: posix (mpi: no, progress: no) Build CFLAGS: -DNDEBUG -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -m64 -mtune=nocona -finline-functions -fno-strict-aliasing -pthread Build CXXFLAGS: -DNDEBUG -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -m64 -mtune=nocona -finline-functions -pthread Build FFLAGS: -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -m64 -mtune=nocona Build FCFLAGS: -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -m64 -mtune=nocona Build LDFLAGS: -export-dynamic Build LIBS: -lnsl -lutil -lm Wrapper extra CFLAGS: -pthread Wrapper extra CXXFLAGS: -pthread Wrapper extra FFLAGS: Wrapper extra FCFLAGS: Wrapper extra LDFLAGS: Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl Internal debug support: no MPI parameter check: runtime Memory profiling support: no Memory debugging support: no libltdl support: yes Heterogeneous support: yes mpirun default --prefix: no MCA mca: parameter "mca_param_files" (current value: "/home/adminrig/.openmpi/mca-params.conf:/etc/openmpi-mca-params.conf") Path for MCA configuration files containing default parameter values MCA mca: parameter "mca_component_path" (current value: "/usr/lib64/openmpi:/home/adminrig/.openmpi/components") Path where to look for Open MPI and ORTE components MCA mca: parameter "mca_verbose" (current value: <none>) Top-level verbosity parameter MCA 
mca: parameter "mca_component_show_load_errors" (current value: "1") Whether to show errors for components that failed to load or not MCA mca: parameter "mca_component_disable_dlopen" (current value: "0") Whether to attempt to disable opening dynamic components or not MCA mpi: parameter "mpi_param_check" (current value: "1") Whether you want MPI API parameters checked at run-time or not. Possible values are 0 (no checking) and 1 (perform checking at run-time) MCA mpi: parameter "mpi_yield_when_idle" (current value: "0") Yield the processor when waiting for MPI communication (for MPI processes, will default to 1 when oversubscribing nodes) MCA mpi: parameter "mpi_event_tick_rate" (current value: "-1") How often to progress TCP communications (0 = never, otherwise specified in microseconds) MCA mpi: parameter "mpi_show_handle_leaks" (current value: "0") Whether MPI_FINALIZE shows all MPI handles that were not freed or not MCA mpi: parameter "mpi_no_free_handles" (current value: "0") Whether to actually free MPI objects when their handles are freed MCA mpi: parameter "mpi_show_mca_params" (current value: "0") Whether to show all MCA parameter value during MPI_INIT or not (good for reproducability of MPI jobs) MCA mpi: parameter "mpi_show_mca_params_file" (current value: <none>) If mpi_show_mca_params is true, setting this string to a valid filename tells Open MPI to dump all the MCA parameter values into a file suitable for reading via the mca_param_files parameter (good for reproducability of MPI jobs) MCA mpi: parameter "mpi_paffinity_alone" (current value: "0") If nonzero, assume that this job is the only (set of) process(es) running on each node and bind processes to processors, starting with processor ID 0 MCA mpi: parameter "mpi_keep_peer_hostnames" (current value: "1") If nonzero, save the string hostnames of all MPI peer processes (mostly for error / debugging output messages). This can add quite a bit of memory usage to each MPI process. 
MCA mpi: parameter "mpi_abort_delay" (current value: "0") If nonzero, print out an identifying message when MPI_ABORT is invoked (hostname, PID of the process that called MPI_ABORT) and delay for that many seconds before exiting (a negative delay value means to never abort). This allows attaching of a debugger before quitting the job. MCA mpi: parameter "mpi_abort_print_stack" (current value: "0") If nonzero, print out a stack trace when MPI_ABORT is invoked MCA mpi: parameter "mpi_preconnect_all" (current value: "0") Whether to force MPI processes to create connections / warmup with *all* peers during MPI_INIT (vs. making connections lazily -- upon the first MPI traffic between each process peer pair) MCA mpi: parameter "mpi_preconnect_oob" (current value: "0") Whether to force MPI processes to fully wire-up the OOB system between MPI processes. MCA mpi: parameter "mpi_leave_pinned" (current value: "0") Whether to use the "leave pinned" protocol or not. Enabling this setting can help bandwidth performance when repeatedly sending and receiving large messages with the same buffers over RDMA-based networks. MCA mpi: parameter "mpi_leave_pinned_pipeline" (current value: "0") Whether to use the "leave pinned pipeline" protocol or not. MCA mpi: parameter "mpi_warn_if_thread_multiple" (current value: "1") Whether to show a warning when MPI_THREAD_MULTIPLE is used or not. MCA mpi: parameter "mpi_warn_if_progress_threads" (current value: "1") Whether to show a warning when progress threads are used or not. 
MCA orte: parameter "orte_debug" (current value: "0") Top-level ORTE debug switch MCA orte: parameter "orte_no_daemonize" (current value: "0") Whether to properly daemonize the ORTE daemons or not MCA orte: parameter "orte_base_user_debugger" (current value: "totalview @mpirun@ -a @mpirun_args@ : ddt -n @np@ -start @executable@ @executable_argv@ @single_app@ : fxp @mpirun@ -a @mpirun_args@") Sequence of user-level debuggers to search for in orterun MCA orte: parameter "orte_abort_timeout" (current value: "10") Time to wait [in seconds] before giving up on aborting an ORTE operation MCA orte: parameter "orte_timing" (current value: "0") Request that critical timing loops be measured MCA opal: parameter "opal_signal" (current value: "6,7,8,11") If a signal is received, display the stack trace frame MCA backtrace: parameter "backtrace" (current value: <none>) Default selection set of components for the backtrace framework (<none> means "use all components that can be found") MCA backtrace: parameter "backtrace_base_verbose" (current value: "0") Verbosity level for the backtrace framework (0 = no verbosity) MCA backtrace: parameter "backtrace_execinfo_priority" (current value: "0") MCA memory: parameter "memory" (current value: <none>) Default selection set of components for the memory framework (<none> means "use all components that can be found") MCA memory: parameter "memory_base_verbose" (current value: "0") Verbosity level for the memory framework (0 = no verbosity) MCA memory: parameter "memory_ptmalloc2_priority" (current value: "0") MCA paffinity: parameter "paffinity_base_verbose" (current value: "0") Verbosity level of the paffinity framework MCA paffinity: parameter "paffinity" (current value: <none>) Default selection set of components for the paffinity framework (<none> means "use all components that can be found") MCA paffinity: parameter "paffinity_linux_priority" (current value: "10") Priority of the linux paffinity component MCA paffinity: information 
"paffinity_linux_have_cpu_set_t" (value: "1") Whether this component was compiled on a system with the type cpu_set_t or not (1 = yes, 0 = no) MCA paffinity: information "paffinity_linux_CPU_ZERO_ok" (value: "1") Whether this component was compiled on a system where CPU_ZERO() is functional or broken (1 = functional, 0 = broken/not available) MCA paffinity: information "paffinity_linux_sched_setaffinity_num_params" (value: "3") The number of parameters that sched_set_affinity() takes on the machine where this component was compiled MCA maffinity: parameter "maffinity_base_verbose" (current value: "0") Verbosity level of the maffinity framework MCA maffinity: parameter "maffinity" (current value: <none>) Default selection set of components for the maffinity framework (<none> means "use all components that can be found") MCA maffinity: parameter "maffinity_first_use_priority" (current value: "10") Priority of the first_use maffinity component MCA timer: parameter "timer" (current value: <none>) Default selection set of components for the timer framework (<none> means "use all components that can be found") MCA timer: parameter "timer_base_verbose" (current value: "0") Verbosity level for the timer framework (0 = no verbosity) MCA timer: parameter "timer_linux_priority" (current value: "0") MCA allocator: parameter "allocator" (current value: <none>) Default selection set of components for the allocator framework (<none> means "use all components that can be found") MCA allocator: parameter "allocator_base_verbose" (current value: "0") Verbosity level for the allocator framework (0 = no verbosity) MCA allocator: parameter "allocator_basic_priority" (current value: "0") MCA allocator: parameter "allocator_bucket_num_buckets" (current value: "30") MCA allocator: parameter "allocator_bucket_priority" (current value: "0") MCA coll: parameter "coll" (current value: <none>) Default selection set of components for the coll framework (<none> means "use all components that can 
be found") MCA coll: parameter "coll_base_verbose" (current value: "0") Verbosity level for the coll framework (0 = no verbosity) MCA coll: parameter "coll_basic_priority" (current value: "10") Priority of the basic coll component MCA coll: parameter "coll_basic_crossover" (current value: "4") Minimum number of processes in a communicator before using the logarithmic algorithms MCA coll: parameter "coll_self_priority" (current value: "75") MCA coll: parameter "coll_sm_priority" (current value: "0") Priority of the sm coll component MCA coll: parameter "coll_sm_control_size" (current value: "4096") Length of the control data -- should usually be either the length of a cache line on most SMPs, or the size of a page on machines that support direct memory affinity page placement (in bytes) MCA coll: parameter "coll_sm_bootstrap_filename" (current value: "shared_mem_sm_bootstrap") Filename (in the Open MPI session directory) of the coll sm component bootstrap rendezvous mmap file MCA coll: parameter "coll_sm_bootstrap_num_segments" (current value: "8") Number of segments in the bootstrap file MCA coll: parameter "coll_sm_fragment_size" (current value: "8192") Fragment size (in bytes) used for passing data through shared memory (will be rounded up to the nearest control_size size) MCA coll: parameter "coll_sm_mpool" (current value: "sm") Name of the mpool component to use MCA coll: parameter "coll_sm_comm_in_use_flags" (current value: "2") Number of "in use" flags, used to mark a message passing area segment as currently being used or not (must be >= 2 and <= comm_num_segments) MCA coll: parameter "coll_sm_comm_num_segments" (current value: "8") Number of segments in each communicator's shared memory message passing area (must be >= 2, and must be a multiple of comm_in_use_flags) MCA coll: parameter "coll_sm_tree_degree" (current value: "4") Degree of the tree for tree-based operations (must be => 1 and <= min(control_size, 255)) MCA coll: information 
"coll_sm_shared_mem_used_bootstrap" (value: "216") Amount of shared memory used in the shared memory bootstrap area (in bytes) MCA coll: parameter "coll_sm_info_num_procs" (current value: "4") Number of processes to use for the calculation of the shared_mem_size MCA information parameter (must be => 2) MCA coll: information "coll_sm_shared_mem_used_data" (value: "548864") Amount of shared memory used in the shared memory data area for info_num_procs processes (in bytes) MCA coll: parameter "coll_tuned_priority" (current value: "30") Priority of the tuned coll component MCA coll: parameter "coll_tuned_pre_allocate_memory_comm_size_limit" (current value: "32768") Size of communicator were we stop pre-allocating memory for the fixed internal buffer used for message requests etc that is hung off the communicator data segment. I.e. if you have a 100'000 nodes you might not want to pre-allocate 200'000 request handle slots per communicator instance! MCA coll: parameter "coll_tuned_init_tree_fanout" (current value: "4") Inital fanout used in the tree topologies for each communicator. This is only an initial guess, if a tuned collective needs a different fanout for an operation, it build it dynamically. This parameter is only for the first guess and might save a little time MCA coll: parameter "coll_tuned_init_chain_fanout" (current value: "4") Inital fanout used in the chain (fanout followed by pipeline) topologies for each communicator. This is only an initial guess, if a tuned collective needs a different fanout for an operation, it build it dynamically. 
This parameter is only for the first guess and might save a little time MCA coll: parameter "coll_tuned_use_dynamic_rules" (current value: "0") Switch used to decide if we use static (compiled/if statements) or dynamic (built at runtime) decision function rules MCA io: parameter "io_base_freelist_initial_size" (current value: "16") Initial MPI-2 IO request freelist size MCA io: parameter "io_base_freelist_max_size" (current value: "64") Max size of the MPI-2 IO request freelist MCA io: parameter "io_base_freelist_increment" (current value: "16") Increment size of the MPI-2 IO request freelist MCA io: parameter "io" (current value: <none>) Default selection set of components for the io framework (<none> means "use all components that can be found") MCA io: parameter "io_base_verbose" (current value: "0") Verbosity level for the io framework (0 = no verbosity) MCA io: parameter "io_romio_priority" (current value: "10") Priority of the io romio component MCA io: parameter "io_romio_delete_priority" (current value: "10") Delete priority of the io romio component MCA io: parameter "io_romio_enable_parallel_optimizations" (current value: "0") Enable set of Open MPI-added options to improve collective file i/o performance MCA mpool: parameter "mpool" (current value: <none>) Default selection set of components for the mpool framework (<none> means "use all components that can be found") MCA mpool: parameter "mpool_base_verbose" (current value: "0") Verbosity level for the mpool framework (0 = no verbosity) MCA mpool: parameter "mpool_rdma_rcache_name" (current value: "vma") The name of the registration cache the mpool should use MCA mpool: parameter "mpool_rdma_rcache_size_limit" (current value: "0") the maximum size of registration cache in bytes. 
0 is unlimited (default 0) MCA mpool: parameter "mpool_rdma_print_stats" (current value: "0") print pool usage statistics at the end of the run MCA mpool: parameter "mpool_rdma_priority" (current value: "0") MCA mpool: parameter "mpool_sm_allocator" (current value: "bucket") Name of allocator component to use with sm mpool MCA mpool: parameter "mpool_sm_max_size" (current value: "536870912") Maximum size of the sm mpool shared memory file MCA mpool: parameter "mpool_sm_min_size" (current value: "134217728") Minimum size of the sm mpool shared memory file MCA mpool: parameter "mpool_sm_per_peer_size" (current value: "33554432") Size (in bytes) to allocate per local peer in the sm mpool shared memory file, bounded by min_size and max_size MCA mpool: parameter "mpool_sm_verbose" (current value: "0") Enable verbose output for mpool sm component MCA mpool: parameter "mpool_sm_priority" (current value: "0") MCA mpool: parameter "mpool_base_use_mem_hooks" (current value: "0") use memory hooks for deregistering freed memory MCA mpool: parameter "mpool_use_mem_hooks" (current value: "0") (deprecated, use mpool_base_use_mem_hooks) MCA mpool: parameter "mpool_base_disable_sbrk" (current value: "0") use mallopt to override calling sbrk (doesn't return memory to OS!) 
MCA mpool: parameter "mpool_disable_sbrk" (current value: "0") (deprecated, use mca_mpool_base_disable_sbrk) MCA pml: parameter "pml" (current value: <none>) Default selection set of components for the pml framework (<none> means "use all components that can be found") MCA pml: parameter "pml_base_verbose" (current value: "0") Verbosity level for the pml framework (0 = no verbosity) MCA pml: parameter "pml_cm_free_list_num" (current value: "4") Initial size of request free lists MCA pml: parameter "pml_cm_free_list_max" (current value: "-1") Maximum size of request free lists MCA pml: parameter "pml_cm_free_list_inc" (current value: "64") Number of elements to add when growing request free lists MCA pml: parameter "pml_cm_priority" (current value: "30") CM PML selection priority MCA pml: parameter "pml_ob1_free_list_num" (current value: "4") MCA pml: parameter "pml_ob1_free_list_max" (current value: "-1") MCA pml: parameter "pml_ob1_free_list_inc" (current value: "64") MCA pml: parameter "pml_ob1_priority" (current value: "20") MCA pml: parameter "pml_ob1_eager_limit" (current value: "131072") MCA pml: parameter "pml_ob1_send_pipeline_depth" (current value: "3") MCA pml: parameter "pml_ob1_recv_pipeline_depth" (current value: "4") MCA bml: parameter "bml" (current value: <none>) Default selection set of components for the bml framework (<none> means "use all components that can be found") MCA bml: parameter "bml_base_verbose" (current value: "0") Verbosity level for the bml framework (0 = no verbosity) MCA bml: parameter "bml_r2_show_unreach_errors" (current value: "1") Show error message when procs are unreachable MCA bml: parameter "bml_r2_priority" (current value: "0") MCA rcache: parameter "rcache" (current value: <none>) Default selection set of components for the rcache framework (<none> means "use all components that can be found") MCA rcache: parameter "rcache_base_verbose" (current value: "0") Verbosity level for the rcache framework (0 = no verbosity) MCA 
rcache: parameter "rcache_vma_priority" (current value: "0") MCA btl: parameter "btl_base_debug" (current value: "0") If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output MCA btl: parameter "btl" (current value: <none>) Default selection set of components for the btl framework (<none> means "use all components that can be found") MCA btl: parameter "btl_base_verbose" (current value: "0") Verbosity level for the btl framework (0 = no verbosity) MCA btl: parameter "btl_self_free_list_num" (current value: "0") Number of fragments by default MCA btl: parameter "btl_self_free_list_max" (current value: "-1") Maximum number of fragments MCA btl: parameter "btl_self_free_list_inc" (current value: "32") Increment by this number of fragments MCA btl: parameter "btl_self_eager_limit" (current value: "131072") Eager size fragmeng (before the rendez-vous ptotocol) MCA btl: parameter "btl_self_min_send_size" (current value: "262144") Minimum fragment size after the rendez-vous MCA btl: parameter "btl_self_max_send_size" (current value: "262144") Maximum fragment size after the rendez-vous MCA btl: parameter "btl_self_min_rdma_size" (current value: "2147483647") Maximum fragment size for the RDMA transfer MCA btl: parameter "btl_self_max_rdma_size" (current value: "2147483647") Maximum fragment size for the RDMA transfer MCA btl: parameter "btl_self_exclusivity" (current value: "65536") Device exclusivity MCA btl: parameter "btl_self_flags" (current value: "10") Active behavior flags MCA btl: parameter "btl_self_priority" (current value: "0") MCA btl: parameter "btl_sm_free_list_num" (current value: "8") MCA btl: parameter "btl_sm_free_list_max" (current value: "-1") MCA btl: parameter "btl_sm_free_list_inc" (current value: "64") MCA btl: parameter "btl_sm_exclusivity" (current value: "65535") MCA btl: parameter "btl_sm_latency" (current value: "100") MCA btl: parameter "btl_sm_max_procs" (current value: "-1") MCA btl: parameter "btl_sm_sm_extra_procs" 
(current value: "2")
MCA btl: parameter "btl_sm_mpool" (current value: "sm")
MCA btl: parameter "btl_sm_eager_limit" (current value: "4096")
MCA btl: parameter "btl_sm_max_frag_size" (current value: "32768")
MCA btl: parameter "btl_sm_size_of_cb_queue" (current value: "128")
MCA btl: parameter "btl_sm_cb_lazy_free_freq" (current value: "120")
MCA btl: parameter "btl_sm_priority" (current value: "0")
MCA btl: parameter "btl_tcp_if_include" (current value: <none>)
MCA btl: parameter "btl_tcp_if_exclude" (current value: "lo")
MCA btl: parameter "btl_tcp_free_list_num" (current value: "8")
MCA btl: parameter "btl_tcp_free_list_max" (current value: "-1")
MCA btl: parameter "btl_tcp_free_list_inc" (current value: "32")
MCA btl: parameter "btl_tcp_sndbuf" (current value: "131072")
MCA btl: parameter "btl_tcp_rcvbuf" (current value: "131072")
MCA btl: parameter "btl_tcp_endpoint_cache" (current value: "30720")
MCA btl: parameter "btl_tcp_exclusivity" (current value: "0")
MCA btl: parameter "btl_tcp_eager_limit" (current value: "65536")
MCA btl: parameter "btl_tcp_min_send_size" (current value: "65536")
MCA btl: parameter "btl_tcp_max_send_size" (current value: "131072")
MCA btl: parameter "btl_tcp_min_rdma_size" (current value: "131072")
MCA btl: parameter "btl_tcp_max_rdma_size" (current value: "2147483647")
MCA btl: parameter "btl_tcp_flags" (current value: "122")
MCA btl: parameter "btl_tcp_priority" (current value: "0")
MCA btl: parameter "btl_base_include" (current value: <none>)
MCA btl: parameter "btl_base_exclude" (current value: <none>)
MCA btl: parameter "btl_base_warn_component_unused" (current value: "1")
    This parameter is used to turn on warning messages when certain NICs are not used
MCA mtl: parameter "mtl" (current value: <none>)
    Default selection set of components for the mtl framework (<none> means "use all components that can be found")
MCA mtl: parameter "mtl_base_verbose" (current value: "0")
    Verbosity level for the mtl framework (0 = no verbosity)
MCA topo: parameter "topo" (current value: <none>)
    Default selection set of components for the topo framework (<none> means "use all components that can be found")
MCA topo: parameter "topo_base_verbose" (current value: "0")
    Verbosity level for the topo framework (0 = no verbosity)
MCA osc: parameter "osc" (current value: <none>)
    Default selection set of components for the osc framework (<none> means "use all components that can be found")
MCA osc: parameter "osc_base_verbose" (current value: "0")
    Verbosity level for the osc framework (0 = no verbosity)
MCA osc: parameter "osc_pt2pt_no_locks" (current value: "0")
    Enable optimizations available only if MPI_LOCK is not used.
MCA osc: parameter "osc_pt2pt_eager_limit" (current value: "16384")
    Max size of eagerly sent data
MCA osc: parameter "osc_pt2pt_priority" (current value: "0")
MCA errmgr: parameter "errmgr_base_verbose" (current value: "0")
    Verbosity level for the errmgr framework
MCA errmgr: parameter "errmgr" (current value: <none>)
    Default selection set of components for the errmgr framework (<none> means "use all components that can be found")
MCA errmgr: parameter "errmgr_hnp_debug" (current value: "0")
MCA errmgr: parameter "errmgr_hnp_priority" (current value: "0")
MCA errmgr: parameter "errmgr_orted_debug" (current value: "0")
MCA errmgr: parameter "errmgr_orted_priority" (current value: "0")
MCA errmgr: parameter "errmgr_proxy_debug" (current value: "0")
MCA errmgr: parameter "errmgr_proxy_priority" (current value: "0")
MCA gpr: parameter "gpr_base_verbose" (current value: "0")
    Verbosity level for the gpr framework
MCA gpr: parameter "gpr_base_maxsize" (current value: "2147483647")
MCA gpr: parameter "gpr_base_blocksize" (current value: "512")
MCA gpr: parameter "gpr" (current value: <none>)
    Default selection set of components for the gpr framework (<none> means "use all components that can be found")
MCA gpr: parameter "gpr_null_priority" (current value: "0")
MCA gpr: parameter "gpr_proxy_debug" (current value: "0")
MCA gpr: parameter "gpr_proxy_priority" (current value: "0")
MCA gpr: parameter "gpr_replica_debug" (current value: "0")
MCA gpr: parameter "gpr_replica_isolate" (current value: "0")
MCA gpr: parameter "gpr_replica_priority" (current value: "0")
MCA iof: parameter "iof_base_window_size" (current value: "4096")
MCA iof: parameter "iof_base_service" (current value: "0.0.0")
MCA iof: parameter "iof_base_verbose" (current value: "0")
    Verbosity level for the iof framework
MCA iof: parameter "iof" (current value: <none>)
    Default selection set of components for the iof framework (<none> means "use all components that can be found")
MCA iof: parameter "iof_proxy_priority" (current value: "0")
MCA iof: parameter "iof_svc_priority" (current value: "0")
MCA ns: parameter "ns_base_verbose" (current value: "0")
    Verbosity level for the ns framework
MCA ns: parameter "ns" (current value: <none>)
    Default selection set of components for the ns framework (<none> means "use all components that can be found")
MCA ns: parameter "ns_proxy_debug" (current value: "0")
MCA ns: parameter "ns_proxy_maxsize" (current value: "2147483647")
MCA ns: parameter "ns_proxy_blocksize" (current value: "512")
MCA ns: parameter "ns_proxy_priority" (current value: "0")
MCA ns: parameter "ns_replica_debug" (current value: "0")
MCA ns: parameter "ns_replica_isolate" (current value: "0")
MCA ns: parameter "ns_replica_maxsize" (current value: "2147483647")
MCA ns: parameter "ns_replica_blocksize" (current value: "512")
MCA ns: parameter "ns_replica_priority" (current value: "0")
MCA oob: parameter "oob" (current value: <none>)
    Default selection set of components for the oob framework (<none> means "use all components that can be found")
MCA oob: parameter "oob_base_verbose" (current value: "0")
    Verbosity level for the oob framework (0 = no verbosity)
MCA oob: parameter "oob_tcp_peer_limit" (current value: "-1")
MCA oob: parameter "oob_tcp_peer_retries" (current value: "60")
MCA oob: parameter "oob_tcp_debug" (current value: "0")
MCA oob: parameter "oob_tcp_sndbuf" (current value: "131072")
MCA oob: parameter "oob_tcp_rcvbuf" (current value: "131072")
MCA oob: parameter "oob_tcp_if_include" (current value: <none>)
    Comma-delimited list of TCP interfaces to use
MCA oob: parameter "oob_tcp_if_exclude" (current value: <none>)
    Comma-delimited list of TCP interfaces to exclude
MCA oob: parameter "oob_tcp_connect_sleep" (current value: "1")
    Enable (1) / disable (0) random sleep for connection wireup
MCA oob: parameter "oob_tcp_listen_mode" (current value: "event")
    Mode for HNP to accept incoming connections: event, listen_thread
MCA oob: parameter "oob_tcp_listen_thread_max_queue" (current value: "10")
    High water mark for queued accepted socket list size
MCA oob: parameter "oob_tcp_listen_thread_max_time" (current value: "10")
    Maximum amount of time (in milliseconds) to wait between processing accepted socket list
MCA oob: parameter "oob_tcp_accept_spin_count" (current value: "10")
    Number of times to let accept return EWOULDBLOCK before updating accepted socket list
MCA oob: parameter "oob_tcp_priority" (current value: "0")
MCA oob: parameter "oob_base_include" (current value: <none>)
    Components to include for oob framework selection
MCA oob: parameter "oob_base_exclude" (current value: <none>)
    Components to exclude for oob framework selection
MCA ras: parameter "ras_base_verbose" (current value: "0")
    Enable debugging for the RAS framework (nonzero = enabled)
MCA ras: parameter "ras" (current value: <none>)
MCA ras: parameter "ras_dash_host_priority" (current value: "5")
    Selection priority for the dash_host RAS component
MCA ras: parameter "ras_gridengine_debug" (current value: "0")
    Enable debugging output for the gridengine ras component
MCA ras: parameter "ras_gridengine_priority" (current value: "100")
    Priority of the gridengine ras component
MCA ras: parameter "ras_gridengine_verbose" (current value: "0")
    Enable verbose output for the gridengine ras component
MCA ras: parameter "ras_gridengine_show_jobid" (current value: "0")
    Show the JOB_ID of the Grid Engine job
MCA ras: parameter "ras_localhost_priority" (current value: "0")
    Selection priority for the localhost RAS component
MCA ras: parameter "ras_slurm_priority" (current value: "75")
    Priority of the slurm ras component
MCA rds: parameter "rds_base_verbose" (current value: "0")
    Verbosity level for the rds framework
MCA rds: parameter "rds" (current value: <none>)
MCA rds: parameter "rds_hostfile_debug" (current value: "0")
    Toggle debug output for hostfile RDS component
MCA rds: parameter "rds_hostfile_path" (current value: "/etc/openmpi-default-hostfile")
    ORTE Host filename
MCA rds: parameter "rds_hostfile_priority" (current value: "0")
MCA rds: parameter "rds_proxy_priority" (current value: "0")
MCA rds: parameter "rds_resfile_debug" (current value: "0")
    Toggle debug output for resfile RDS component
MCA rds: parameter "rds_resfile_name" (current value: <none>)
    ORTE Resource filename
MCA rds: parameter "rds_resfile_priority" (current value: "0")
MCA rmaps: parameter "rmaps_base_verbose" (current value: "0")
    Verbosity level for the rmaps framework
MCA rmaps: parameter "rmaps_base_schedule_policy" (current value: "unspec")
    Scheduling Policy for RMAPS. [slot | node]
MCA rmaps: parameter "rmaps_base_pernode" (current value: "0")
    Launch one ppn as directed
MCA rmaps: parameter "rmaps_base_n_pernode" (current value: "-1")
    Launch n procs/node
MCA rmaps: parameter "rmaps_base_no_schedule_local" (current value: "0")
    If false, allow scheduling MPI applications on the same node as mpirun (default).
    If true, do not schedule any MPI applications on the same node as mpirun
MCA rmaps: parameter "rmaps_base_no_oversubscribe" (current value: "0")
    If true, then do not allow oversubscription of nodes - mpirun will return an error if there aren't enough nodes to launch all processes without oversubscribing
MCA rmaps: parameter "rmaps_base_display_map" (current value: "0")
    Whether to display the process map after it is computed
MCA rmaps: parameter "rmaps" (current value: <none>)
    Default selection set of components for the rmaps framework (<none> means "use all components that can be found")
MCA rmaps: parameter "rmaps_round_robin_debug" (current value: "1")
    Toggle debug output for Round Robin RMAPS component
MCA rmaps: parameter "rmaps_round_robin_priority" (current value: "1")
    Selection priority for Round Robin RMAPS component
MCA rmgr: parameter "rmgr_base_verbose" (current value: "0")
    Verbosity level for the rmgr framework
MCA rmgr: parameter "rmgr" (current value: <none>)
    Default selection set of components for the rmgr framework (<none> means "use all components that can be found")
MCA rmgr: parameter "rmgr_proxy_priority" (current value: "0")
MCA rmgr: parameter "rmgr_urm_priority" (current value: "0")
MCA rml: parameter "rml_base_debug" (current value: "0")
    Verbosity level for the rml famework
MCA rml: parameter "rml" (current value: <none>)
    Default selection set of components for the rml framework (<none> means "use all components that can be found")
MCA rml: parameter "rml_base_verbose" (current value: "0")
    Verbosity level for the rml framework (0 = no verbosity)
MCA rml: parameter "rml_oob_priority" (current value: "0")
MCA pls: parameter "pls_base_reuse_daemons" (current value: "0")
    If nonzero, reuse daemons to launch dynamically spawned processes.
    If zero, do not reuse daemons (default)
MCA pls: parameter "pls" (current value: <none>)
    Default selection set of components for the pls framework (<none> means "use all components that can be found")
MCA pls: parameter "pls_base_verbose" (current value: "0")
    Verbosity level for the pls framework (0 = no verbosity)
MCA pls: parameter "pls_gridengine_debug" (current value: "0")
    Enable debugging of gridengine pls component
MCA pls: parameter "pls_gridengine_verbose" (current value: "0")
    Enable verbose output of the gridengine qrsh -inherit command
MCA pls: parameter "pls_gridengine_priority" (current value: "100")
    Priority of the gridengine pls component
MCA pls: parameter "pls_gridengine_orted" (current value: "orted")
    The command name that the gridengine pls component will invoke for the ORTE daemon
MCA pls: parameter "pls_proxy_priority" (current value: "0")
MCA pls: parameter "pls_rsh_debug" (current value: "0")
    Whether or not to enable debugging output for the rsh pls component (0 or 1)
MCA pls: parameter "pls_rsh_num_concurrent" (current value: "128")
    How many pls_rsh_agent instances to invoke concurrently (must be > 0)
MCA pls: parameter "pls_rsh_force_rsh" (current value: "0")
    Force the launcher to always use rsh, even for local daemons
MCA pls: parameter "pls_rsh_orted" (current value: "orted")
    The command name that the rsh pls component will invoke for the ORTE daemon
MCA pls: parameter "pls_rsh_priority" (current value: "10")
    Priority of the rsh pls component
MCA pls: parameter "pls_rsh_delay" (current value: "1")
    Delay (in seconds) between invocations of the remote agent, but only used when the "debug" MCA parameter is true, or the top-level MCA debugging is enabled (otherwise this value is ignored)
MCA pls: parameter "pls_rsh_reap" (current value: "1")
    If set to 1, wait for all the processes to complete before exiting. Otherwise, quit immediately -- without waiting for confirmation that all other processes in the job have completed.
MCA pls: parameter "pls_rsh_assume_same_shell" (current value: "1")
    If set to 1, assume that the shell on the remote node is the same as the shell on the local node. Otherwise, probe for what the remote shell.
MCA pls: parameter "pls_rsh_agent" (current value: "ssh : rsh")
    The command used to launch executables on remote nodes (typically either "ssh" or "rsh")
MCA pls: parameter "pls_slurm_debug" (current value: "0")
    Enable debugging of slurm pls
MCA pls: parameter "pls_slurm_priority" (current value: "75")
    Default selection priority
MCA pls: parameter "pls_slurm_orted" (current value: "orted")
    Command to use to start proxy orted
MCA pls: parameter "pls_slurm_args" (current value: <none>)
    Custom arguments to srun
MCA sds: parameter "sds" (current value: <none>)
    Default selection set of components for the sds framework (<none> means "use all components that can be found")
MCA sds: parameter "sds_base_verbose" (current value: "0")
    Verbosity level for the sds framework (0 = no verbosity)
MCA sds: parameter "sds_env_priority" (current value: "0")
MCA sds: parameter "sds_pipe_priority" (current value: "0")
MCA sds: parameter "sds_seed_priority" (current value: "0")
MCA sds: parameter "sds_singleton_priority" (current value: "0")
MCA sds: parameter "sds_slurm_priority" (current value: "0")
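
In case it helps anyone skimming the dump above: the only entries that bear on interface selection are the `*_if_include` / `*_if_exclude` ones, and they are easy to lose in the noise. Below is a rough Python sketch that pulls them out of the dump text. The helper names `parse_params` and `interface_params` are mine, not part of Open MPI, and the regular expression simply assumes every entry has the `MCA <framework>: parameter "<name>" (current value: ...)` shape seen above; a different ompi_info version may print something else.

```python
import re

# One entry of the dump looks like:
#   MCA btl: parameter "btl_tcp_if_exclude" (current value: "lo")
# \s+ is used instead of a literal space so that entries wrapped
# across a line break by the mail client still match.
ENTRY = re.compile(
    r'MCA\s+(\w+):\s+parameter\s+"([^"]+)"\s+'
    r'\(current\s+value:\s+(<none>|"[^"]*")\)'
)


def parse_params(text):
    """Return {parameter_name: value}; value is None where the dump shows <none>."""
    params = {}
    for _framework, name, raw in ENTRY.findall(text):
        # Strip the surrounding double quotes from quoted values.
        params[name] = None if raw == "<none>" else raw[1:-1]
    return params


def interface_params(params):
    """Keep only the parameters that include or exclude network interfaces."""
    suffixes = ("_if_include", "_if_exclude")
    return {k: v for k, v in params.items() if k.endswith(suffixes)}


if __name__ == "__main__":
    # A few entries copied from the dump above, one of them wrapped mid-entry.
    sample = (
        'MCA btl: parameter "btl_tcp_if_include" (current value: <none>) '
        'MCA btl: parameter "btl_tcp_if_exclude" (current\nvalue: "lo") '
        'MCA btl: parameter "btl_tcp_sndbuf" (current value: "131072") '
        'MCA oob: parameter "oob_tcp_if_include" (current value: <none>)'
    )
    for name, value in sorted(interface_params(parse_params(sample)).items()):
        print(name, "=", value)
```

Run against the dump above, this should leave just four entries (`btl_tcp_if_include`, `btl_tcp_if_exclude`, `oob_tcp_if_include`, `oob_tcp_if_exclude`), which makes it plain that by default only `lo` is excluded for the TCP BTL and nothing restricts it to eth0.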