Here is the output:
##############################################################################
[devel11:80858] [[2965,0],0] rmaps:base set policy with NULL device NONNULL
[devel11:80858] mca:rmaps:select: checking available component mindist
[devel11:80858] mca:rmaps:select: Querying component [mindist]
[devel11:80858] mca:rmaps:select: checking available component ppr
[devel11:80858] mca:rmaps:select: Querying component [ppr]
[devel11:80858] mca:rmaps:select: checking available component rank_file
[devel11:80858] mca:rmaps:select: Querying component [rank_file]
[devel11:80858] mca:rmaps:select: checking available component resilient
[devel11:80858] mca:rmaps:select: Querying component [resilient]
[devel11:80858] mca:rmaps:select: checking available component round_robin
[devel11:80858] mca:rmaps:select: Querying component [round_robin]
[devel11:80858] mca:rmaps:select: checking available component seq
[devel11:80858] mca:rmaps:select: Querying component [seq]
[devel11:80858] [[2965,0],0]: Final mapper priorities
[devel11:80858] Mapper: ppr Priority: 90
[devel11:80858] Mapper: seq Priority: 60
[devel11:80858] Mapper: resilient Priority: 40
[devel11:80858] Mapper: mindist Priority: 20
[devel11:80858] Mapper: round_robin Priority: 10
[devel11:80858] Mapper: rank_file Priority: 0
[devel11:80858] mca:rmaps: mapping job [2965,1]
[devel11:80858] mca:rmaps: setting mapping policies for job [2965,1] nprocs 48
[devel11:80858] mca:rmaps[169] mapping not set by user - using bynuma
[devel11:80858] mca:rmaps:ppr: job [2965,1] not using ppr mapper PPR NULL policy PPR NOTSET
[devel11:80858] [[2965,0],0] rmaps:seq called on job [2965,1]
[devel11:80858] mca:rmaps:seq: job [2965,1] not using seq mapper
[devel11:80858] mca:rmaps:resilient: cannot perform initial map of job [2965,1] - no fault groups
[devel11:80858] mca:rmaps:mindist: job [2965,1] not using mindist mapper
[devel11:80858] mca:rmaps:rr: mapping job [2965,1]
[devel11:80858] [[2965,0],0] Starting with 2 nodes in list
[devel11:80858] [[2965,0],0] Filtering thru apps
[devel11:80858] [[2965,0],0] Retained 2 nodes in list
[devel11:80858] [[2965,0],0] node miriel025 has 24 slots available
[devel11:80858] [[2965,0],0] node miriel026 has 24 slots available
[devel11:80858] AVAILABLE NODES FOR MAPPING:
[devel11:80858] node: miriel025 daemon: 1
[devel11:80858] node: miriel026 daemon: 2
[devel11:80858] [[2965,0],0] Starting bookmark at node miriel025
[devel11:80858] [[2965,0],0] Starting at node miriel025
[devel11:80858] mca:rmaps:rr: mapping no-span by NUMANode for job [2965,1] slots 48 num_procs 48
[devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node miriel025
[devel11:80858] mca:rmaps:rr: calculated nprocs 24
[devel11:80858] mca:rmaps:rr: assigning nprocs 24
[devel11:80858] mca:rmaps:rr: found 4 NUMANode objects on node miriel026
[devel11:80858] mca:rmaps:rr: calculated nprocs 24
[devel11:80858] mca:rmaps:rr: assigning nprocs 24
[devel11:80858] mca:rmaps:base: computing vpids by slot for job [2965,1]
[devel11:80858] mca:rmaps:base: assigning rank 0 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 1 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 2 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 3 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 4 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 5 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 6 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 7 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 8 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 9 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 10 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 11 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 12 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 13 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 14 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 15 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 16 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 17 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 18 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 19 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 20 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 21 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 22 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 23 to node miriel025
[devel11:80858] mca:rmaps:base: assigning rank 24 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 25 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 26 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 27 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 28 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 29 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 30 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 31 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 32 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 33 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 34 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 35 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 36 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 37 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 38 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 39 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 40 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 41 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 42 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 43 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 44 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 45 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 46 to node miriel026
[devel11:80858] mca:rmaps:base: assigning rank 47 to node miriel026
[devel11:80858] [[2965,0],0] rmaps:base:compute_usage
[devel11:80858] mca:rmaps: compute bindings for job [2965,1] with policy CORE[4008]
[devel11:80858] [[2965,0],0] bind_depth: 6 map_depth 2
[devel11:80858] mca:rmaps: bind downward for job [2965,1] with bindings CORE
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],0] BITMAP 0
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],0][miriel025] TO socket 0[core 0[hwt 0]]: [B/././././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],1] BITMAP 12
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],1][miriel025] TO socket 0[core 6[hwt 0]]: [././././././B/././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],2] BITMAP 1
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],2][miriel025] TO socket 1[core 12[hwt 0]]: [./././././././././././.][B/././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],3] BITMAP 13
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],3][miriel025] TO socket 1[core 18[hwt 0]]: [./././././././././././.][././././././B/././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],4] BITMAP 2
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],4][miriel025] TO socket 0[core 1[hwt 0]]: [./B/./././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],5] BITMAP 14
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],5][miriel025] TO socket 0[core 7[hwt 0]]: [./././././././B/./././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],6] BITMAP 3
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],6][miriel025] TO socket 1[core 13[hwt 0]]: [./././././././././././.][./B/./././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],7] BITMAP 15
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],7][miriel025] TO socket 1[core 19[hwt 0]]: [./././././././././././.][./././././././B/./././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],8] BITMAP 4
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],8][miriel025] TO socket 0[core 2[hwt 0]]: [././B/././././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],9] BITMAP 16
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],9][miriel025] TO socket 0[core 8[hwt 0]]: [././././././././B/././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],10] BITMAP 5
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],10][miriel025] TO socket 1[core 14[hwt 0]]: [./././././././././././.][././B/././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],11] BITMAP 17
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],11][miriel025] TO socket 1[core 20[hwt 0]]: [./././././././././././.][././././././././B/././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],12] BITMAP 6
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],12][miriel025] TO socket 0[core 3[hwt 0]]: [./././B/./././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],13] BITMAP 18
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],13][miriel025] TO socket 0[core 9[hwt 0]]: [./././././././././B/./.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],14] BITMAP 7
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],14][miriel025] TO socket 1[core 15[hwt 0]]: [./././././././././././.][./././B/./././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],15] BITMAP 19
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],15][miriel025] TO socket 1[core 21[hwt 0]]: [./././././././././././.][./././././././././B/./.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],16] BITMAP 8
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],16][miriel025] TO socket 0[core 4[hwt 0]]: [././././B/././././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],17] BITMAP 20
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],17][miriel025] TO socket 0[core 10[hwt 0]]: [././././././././././B/.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],18] BITMAP 9
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],18][miriel025] TO socket 1[core 16[hwt 0]]: [./././././././././././.][././././B/././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],19] BITMAP 21
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],19][miriel025] TO socket 1[core 22[hwt 0]]: [./././././././././././.][././././././././././B/.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],20] BITMAP 10
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],20][miriel025] TO socket 0[core 5[hwt 0]]: [./././././B/./././././.][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],21] BITMAP 22
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],21][miriel025] TO socket 0[core 11[hwt 0]]: [./././././././././././B][./././././././././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],22] BITMAP 11
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],22][miriel025] TO socket 1[core 17[hwt 0]]: [./././././././././././.][./././././B/./././././.]
[devel11:80858] [[2965,0],0] GOT 1 CPUS
[devel11:80858] [[2965,0],0] PROC [[2965,1],23] BITMAP 23
[devel11:80858] [[2965,0],0] BOUND PROC [[2965,1],23][miriel025] TO socket 1[core 23[hwt 0]]: [./././././././././././.][./././././././././././B]
[miriel025:60980] MCW rank 11 not bound
[miriel025:60990] MCW rank 21 not bound
[miriel025:60981] MCW rank 12 not bound
[miriel025:60979] MCW rank 10 not bound
[miriel025:60977] MCW rank 8 not bound
[miriel025:60970] MCW rank 1 not bound
[miriel025:60972] MCW rank 3 not bound
[miriel025:60984] MCW rank 15 not bound
[miriel026:163985] MCW rank 34 not bound
[miriel026:163993] MCW rank 42 not bound
[miriel026:163981] MCW rank 30 not bound
[miriel026:163983] MCW rank 32 not bound
[miriel025:60975] MCW rank 6 not bound
[miriel025:60986] MCW rank 17 not bound
[miriel025:60992] MCW rank 23 not bound
[miriel025:60973] MCW rank 4 not bound
[miriel025:60978] MCW rank 9 not bound
[miriel025:60969] MCW rank 0 not bound
[miriel025:60991] MCW rank 22 not bound
[miriel025:60974] MCW rank 5 not bound
[miriel025:60982] MCW rank 13 not bound
[miriel025:60989] MCW rank 20 not bound
[miriel025:60988] MCW rank 19 not bound
[miriel025:60983] MCW rank 14 not bound
[miriel025:60987] MCW rank 18 not bound
[miriel025:60976] MCW rank 7 not bound
[miriel026:163996] MCW rank 45 not bound
[miriel026:163979] MCW rank 28 not bound
[miriel026:163990] MCW rank 39 not bound
[miriel026:163976] MCW rank 25 not bound
[miriel026:163997] MCW rank 46 not bound
[miriel025:60971] MCW rank 2 not bound
[miriel026:163995] MCW rank 44 not bound
[miriel026:163987] MCW rank 36 not bound
[miriel026:163982] MCW rank 31 not bound
[miriel025:60985] MCW rank 16 not bound
[miriel026:163980] MCW rank 29 not bound
[miriel026:163975] MCW rank 24 not bound
[miriel026:163978] MCW rank 27 not bound
[miriel026:163992] MCW rank 41 not bound
[miriel026:163991] MCW rank 40 not bound
[miriel026:163998] MCW rank 47 not bound
[miriel026:163986] MCW rank 35 not bound
[miriel026:163984] MCW rank 33 not bound
[miriel026:163989] MCW rank 38 not bound
[miriel026:163994] MCW rank 43 not bound
[miriel026:163988] MCW rank 37 not bound
[miriel026:163977] MCW rank 26 not bound
##############################################################################
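
In this output the mapper prints "BOUND PROC" for a rank, yet the same rank later reports "MCW rank N not bound". To cross-check the two parts of such a log, here is a minimal sketch. It is a hypothetical helper of my own, not part of Open MPI; the function name and the two regular expressions are assumptions based only on the line shapes shown in this thread.

```python
import re
import sys

# Line shapes assumed from the rmaps_base_verbose and -report-bindings
# output quoted in this thread; this is not an Open MPI interface.
MAPPER_BOUND = re.compile(r"BOUND PROC \[\[\d+,\d+\],(\d+)\]\[[^\]]+\] TO ")
RANK_REPORT = re.compile(r"MCW rank (\d+) (not bound|bound to)")


def ranks_losing_binding(log_text):
    """Return ranks the mapper bound that later reported 'not bound'."""
    # Ranks for which the mapper logged a computed binding.
    planned = {int(m.group(1)) for m in MAPPER_BOUND.finditer(log_text)}
    # What each launched process reported: True if bound, False if not.
    reported_bound = {
        int(m.group(1)): m.group(2) == "bound to"
        for m in RANK_REPORT.finditer(log_text)
    }
    # Ranks with no report at all are skipped, not counted as failures.
    return sorted(r for r in planned if reported_bound.get(r) is False)


if __name__ == "__main__":
    # Read a captured log from standard input and print the affected ranks.
    print(ranks_losing_binding(sys.stdin.read()))
```

The patterns do not depend on line breaks, so the script also works on a log that has been re-wrapped by a mail client.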
On 13/04/2017 at 16:31, r...@open-mpi.org wrote:
> Try adding "-mca rmaps_base_verbose 5" and see what that output tells us - I
> assume you have a debug build configured, yes (i.e., added --enable-debug to
> the configure line)?
>
>
>> On Apr 13, 2017, at 7:28 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:
>>
>> When I run this command from the compute node, I also get that, but not
>> when I run it from a login node (with the same machine file).
>>
>>
>> Cyril.
>>
>> On 13/04/2017 at 16:22, r...@open-mpi.org wrote:
>>> We are asking all these questions because we cannot replicate your problem
>>> - so we are trying to help you figure out what is different or missing from
>>> your machine. When I run your cmd line on my system, I get:
>>>
>>> [rhc002.cluster:55965] MCW rank 24 bound to socket 0[core 0[hwt 0-1]]:
>>> [BB/../../../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 25 bound to socket 1[core 12[hwt 0-1]]:
>>> [../../../../../../../../../../../..][BB/../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 26 bound to socket 0[core 1[hwt 0-1]]:
>>> [../BB/../../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 27 bound to socket 1[core 13[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../BB/../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 28 bound to socket 0[core 2[hwt 0-1]]:
>>> [../../BB/../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 29 bound to socket 1[core 14[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../BB/../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 30 bound to socket 0[core 3[hwt 0-1]]:
>>> [../../../BB/../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 31 bound to socket 1[core 15[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../BB/../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 32 bound to socket 0[core 4[hwt 0-1]]:
>>> [../../../../BB/../../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 33 bound to socket 1[core 16[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../BB/../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 34 bound to socket 0[core 5[hwt 0-1]]:
>>> [../../../../../BB/../../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 35 bound to socket 1[core 17[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../BB/../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 36 bound to socket 0[core 6[hwt 0-1]]:
>>> [../../../../../../BB/../../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 37 bound to socket 1[core 18[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../BB/../../../../..]
>>> [rhc002.cluster:55965] MCW rank 38 bound to socket 0[core 7[hwt 0-1]]:
>>> [../../../../../../../BB/../../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 39 bound to socket 1[core 19[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../BB/../../../..]
>>> [rhc002.cluster:55965] MCW rank 40 bound to socket 0[core 8[hwt 0-1]]:
>>> [../../../../../../../../BB/../../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 41 bound to socket 1[core 20[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../BB/../../..]
>>> [rhc002.cluster:55965] MCW rank 42 bound to socket 0[core 9[hwt 0-1]]:
>>> [../../../../../../../../../BB/../..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 43 bound to socket 1[core 21[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../BB/../..]
>>> [rhc002.cluster:55965] MCW rank 44 bound to socket 0[core 10[hwt 0-1]]:
>>> [../../../../../../../../../../BB/..][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 45 bound to socket 1[core 22[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../../BB/..]
>>> [rhc002.cluster:55965] MCW rank 46 bound to socket 0[core 11[hwt 0-1]]:
>>> [../../../../../../../../../../../BB][../../../../../../../../../../../..]
>>> [rhc002.cluster:55965] MCW rank 47 bound to socket 1[core 23[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../../../BB]
>>> [rhc001:197743] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]:
>>> [BB/../../../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 1 bound to socket 1[core 12[hwt 0-1]]:
>>> [../../../../../../../../../../../..][BB/../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]:
>>> [../BB/../../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 3 bound to socket 1[core 13[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../BB/../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 4 bound to socket 0[core 2[hwt 0-1]]:
>>> [../../BB/../../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 5 bound to socket 1[core 14[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../BB/../../../../../../../../..]
>>> [rhc001:197743] MCW rank 6 bound to socket 0[core 3[hwt 0-1]]:
>>> [../../../BB/../../../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 7 bound to socket 1[core 15[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../BB/../../../../../../../..]
>>> [rhc001:197743] MCW rank 8 bound to socket 0[core 4[hwt 0-1]]:
>>> [../../../../BB/../../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 9 bound to socket 1[core 16[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../BB/../../../../../../..]
>>> [rhc001:197743] MCW rank 10 bound to socket 0[core 5[hwt 0-1]]:
>>> [../../../../../BB/../../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 11 bound to socket 1[core 17[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../BB/../../../../../..]
>>> [rhc001:197743] MCW rank 12 bound to socket 0[core 6[hwt 0-1]]:
>>> [../../../../../../BB/../../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 13 bound to socket 1[core 18[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../BB/../../../../..]
>>> [rhc001:197743] MCW rank 14 bound to socket 0[core 7[hwt 0-1]]:
>>> [../../../../../../../BB/../../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 15 bound to socket 1[core 19[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../BB/../../../..]
>>> [rhc001:197743] MCW rank 16 bound to socket 0[core 8[hwt 0-1]]:
>>> [../../../../../../../../BB/../../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 17 bound to socket 1[core 20[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../BB/../../..]
>>> [rhc001:197743] MCW rank 18 bound to socket 0[core 9[hwt 0-1]]:
>>> [../../../../../../../../../BB/../..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 19 bound to socket 1[core 21[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../BB/../..]
>>> [rhc001:197743] MCW rank 20 bound to socket 0[core 10[hwt 0-1]]:
>>> [../../../../../../../../../../BB/..][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 21 bound to socket 1[core 22[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../../BB/..]
>>> [rhc001:197743] MCW rank 22 bound to socket 0[core 11[hwt 0-1]]:
>>> [../../../../../../../../../../../BB][../../../../../../../../../../../..]
>>> [rhc001:197743] MCW rank 23 bound to socket 1[core 23[hwt 0-1]]:
>>> [../../../../../../../../../../../..][../../../../../../../../../../../BB]
>>>
>>> Exactly as expected. You might check that you have libnuma and
>>> libnuma-devel installed.
>>>
>>>
>>>> On Apr 13, 2017, at 6:50 AM, gil...@rist.or.jp wrote:
>>>>
>>>> OK thanks,
>>>>
>>>> We've had some issues in the past when Open MPI assumed that the (login)
>>>> node running mpirun has the same topology as the other (compute) nodes.
>>>> I just wanted to rule out this scenario.
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> ----- Original Message -----
>>>>> I am using the 6886c12 commit.
>>>>> I have no particular option for the configuration.
>>>>> I launch my application in the same way as I presented in my first email;
>>>>> here is the exact line: mpirun -np 48 -machinefile mf -bind-to core
>>>>> -report-bindings ./a.out
>>>>>
>>>>> lstopo does give the same output on both types of nodes. What is the
>>>>> purpose of that?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> Cyril.
>>>>>
>>>>> On 13/04/2017 at 15:24, gil...@rist.or.jp wrote:
>>>>>> Also, can you please run
>>>>>> lstopo
>>>>>> on both your login and compute nodes?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> Can you be a bit more specific?
>>>>>>>
>>>>>>> - What version of Open MPI are you using?
>>>>>>> - How did you configure Open MPI?
>>>>>>> - How are you launching Open MPI applications?
>>>>>>>
>>>>>>>
>>>>>>>> On Apr 13, 2017, at 9:08 AM, Cyril Bordage <cyril.bord...@inria.fr> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Now this bug also happens when I launch my mpirun command from the
>>>>>>>> compute node.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cyril.
>>>>>>>>
>>>>>>>> On 06/04/2017 at 05:38, r...@open-mpi.org wrote:
>>>>>>>>> I believe this has been fixed now - please let me know
>>>>>>>>>
>>>>>>>>>> On Mar 30, 2017, at 1:57 AM, Cyril Bordage <cyril.bordage@inria.fr> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am using the git version of Open MPI with "-bind-to core -report-bindings"
>>>>>>>>>> and I get this for all processes:
>>>>>>>>>> [miriel010:160662] MCW rank 0 not bound
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> When I use an old version I get:
>>>>>>>>>> [miriel010:44921] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>>>>>>>>> [B/././././././././././.][./././././././././././.]
>>>>>>>>>>
>>>>>>>>>> From git bisect the culprit seems to be: 48fc339
>>>>>>>>>>
>>>>>>>>>> This bug happens only when I launch my mpirun command from a login node
>>>>>>>>>> and not from a compute node.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cyril.
>>>>>>>>>> _______________________________________________
>>>>>>>>>> devel mailing list
>>>>>>>>>> devel@lists.open-mpi.org
>>>>>>>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Squyres
>>>>>>> jsquy...@cisco.com