Re: [OMPI users] Open MPI internal error

2017-09-29 Thread Richard Graham
suggested by Rich. Thanks for the suggestions, and I'll write if I have new hints (it may take some days for the runs to potentially freeze) Ludovic From: users <users-boun...@lists.open-mpi.org> on behalf of Richard Graham

Re: [OMPI users] Open MPI internal error

2017-09-28 Thread Richard Graham
I just talked with George, who brought me up to speed on this particular problem. I would suggest a couple of things: - Look at the HW error counters, and see if you have many retransmits. This would indicate a potential issue with the particular HW in use, such as a cable that is n

Re: [OMPI users] Open MPI Java Error

2017-02-08 Thread Graham, Nathaniel Richard
Hello Thyago, What is your configure command? Do you know if you are using psm or psm2? -Nathan -- Nathaniel Graham HPC-DES Los Alamos National Laboratory From: users on behalf of Mota, Thyago Sent: Wednesday, February 8, 2017 10:01 AM To: users@lists.ope

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-11-15 Thread Graham, Nathaniel Richard
1:22877] ***and potentially your MPI job) [titan01.service:22872] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal [titan01.service:22872] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages On 09/13/2016 08:06 PM, Gra

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-23 Thread Graham, Nathaniel Richard
RRORS_ARE_FATAL (processes in this win will now abort, [titan01:22877] ***and potentially your MPI job) [titan01.service:22872] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal [titan01.service:22872] Set MCA parameter "orte_base_help_aggregate" to 0 t

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-23 Thread Graham, Nathaniel Richard
elp_aggregate" to 0 to see all help / error messages On 09/13/2016 08:06 PM, Graham, Nathaniel Richard wrote: Since you are getting the same errors with C as you are with Java, this is an issue with C, not the Java bindings. However, in the most recent output, you are using ./a.out to run t

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-15 Thread Graham, Nathaniel Richard
-- Nathaniel Graham HPC-DES Los Alamos National Laboratory From: users on behalf of Graham, Nathaniel Richard Sent: Wednesday, September 14, 2016 12:55 PM To: Open MPI Users Subject: Re: [OMPI users] Java-OpenMPI returns with SIGSEGV Thanks for reporting

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-14 Thread Graham, Nathaniel Richard
and potentially your MPI job) [titan01.service:22872] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal [titan01.service:22872] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages On 09/13/2016 08:06 PM, Graham, Nathaniel Ri

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-13 Thread Graham, Nathaniel Richard
use the wrapper to default methods. Gundram On 09/07/2016 06:49 PM, Graham, Nathaniel Richard wrote: Hello Gundram, It looks like the test that is failing is TestMpiRmaCompareAndSwap.java. Is that the one that is crashing? If so, could you try to run the C test from: http

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-07 Thread Graham, Nathaniel Richard
Hello Gundram, It looks like the test that is failing is TestMpiRmaCompareAndSwap.java. Is that the one that is crashing? If so, could you try to run the C test from: http://git.mpich.org/mpich.git/blob/c77631474f072e86c9fe761c1328c3d4cb8cc4a5:/test/mpi/rma/compare_and_swap.c#l1 Ther

Re: [OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Graham, Nathaniel Richard
Hello Siegmar and Gilles, I made a reply where Gilles suggested, but figured I'd leave a note here in case the other was missed. -Nathan -- Nathaniel Graham HPC-DES Los Alamos National Laboratory From: users on behalf of Gilles Gouaillardet Sent: Monday, Au

Re: [OMPI users] OS X El Capitan 10.11.6 ld: symbol(s) not found for architecture x86_64

2016-08-23 Thread Richard G French
Problem solved! I had to remove a macports directory from my path that had the improper MPI library, and at last the code is working. Thanks so much to everyone for your friendly and prompt suggestions - I appreciate it very much. Dick On Tue, Aug 23, 2016 at 4:51 PM, Richard G French wrote

Re: [OMPI users] OS X El Capitan 10.11.6 ld: symbol(s) not found for architecture x86_64

2016-08-23 Thread Richard G French
016 at 21:43, Richard G French wrote: > > > Hi, all - > > I'm trying to build the SPH code Gadget2 (http://wwwmpa.mpa-garching.mpg.de/gadget/) under OS X 10.11.6 and I am getting the following type of > error: > > > > 222 rfrench@cosmos> make > > >

Re: [OMPI users] OS X El Capitan 10.11.6 ld: symbol(s) not found for architecture x86_64

2016-08-23 Thread Richard G French
can do > > nm library_file_name|grep ompi_mpi_byte > > And that will tell you if that library contains ompi_mpi_byte > > Doug > > On Aug 23, 2016, at 2:30 PM, Richard G French > wrote: > > Thanks for the suggestion, Doug - but I can't seem to find the missing >

Re: [OMPI users] OS X El Capitan 10.11.6 ld: symbol(s) not found for architecture x86_64

2016-08-23 Thread Richard G French
Thanks for the suggestion, Doug - but I can't seem to find the missing function ompi_mpi_byte in any of those other libraries. I'll keep looking! I wonder if I failed to configure mpich properly when I built it. Dick On Tue, Aug 23, 2016 at 4:01 PM, Douglas L Reeder wrote: > Ri

[OMPI users] OS X El Capitan 10.11.6 ld: symbol(s) not found for architecture x86_64

2016-08-23 Thread Richard G French
Any suggestions would be welcome. Thanks! Dick French -- Richard G. French McDowell and Whiting Professor of Astrophysics Chair of the Astronomy Department, Wellesley College Director of the Whitin Observatory Cassini Mission to Saturn Radio Science Team Leader Wellesley, MA 024

Re: [OMPI users] FORTH and MPI

2016-06-27 Thread Richard C. Wagner
John: that does sound very interesting! As others have said - haven't heard of FORTH for many years, As I recall there was a UK PC programmed in FORTH. The Dragon I think. I've been using Forth in scientific applications and control systems since 1989. Take a look at my website, http://www.

Re: [OMPI users] Shared Libraries

2016-06-24 Thread Richard C. Wagner
Ralph and Jeff: Thanks for your replies to my questions about compiling a 32-bit MPI library for Forth. Ralph wrote: IIRC, you would need to write a wrapper to let Forth access C-based functions, yes? You could configure and build OMPI as a 32-bit library, and libmpi.so is C, so that isn't an

[OMPI users] Shared Libraries

2016-06-22 Thread Richard C. Wagner
Hi Everyone: I'm trying to employ MPI in an unconventional programming language, Forth, running over Debian Linux. The Forth I have can import a Linux shared library in the .so file format and then compile in the executable functions as externals. The question: how to do it? I'm looking to a

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Richard Shaw
On 8 May 2014 16:59, Rob Latham wrote: > > Richard: may I add this test case to ROMIO's test suite? I'm always on > the hunt for small self-contained tests. > Please do. I'm glad it's being so useful - it seems to be hitting a surprising amount of bugs of di

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Richard Shaw
On 7 May 2014 16:25, Jeff Squyres (jsquyres) wrote: > > "Periodically". > > Hopefully, the fix will be small and we can just pull that one fix down to > OMPI. Okay, thanks for letting me know Jeff. Richard

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Richard Shaw
ntime I'll investigate using MPICH 3.1 with ROMIO vs. OpenMPI with OMPIO. Thanks, Richard

Re: [OMPI users] ROMIO bug reading darrays

2014-05-07 Thread Richard Shaw
.open-mpi.org/community/lists/users/2012/07/19762.php). Have those been pulled into OpenMPI? I've been staying clear of ROMIO for a while (in favour of OMPIO), to avoid those issues. Thanks, Richard On 7 May 2014 12:36, Rob Latham wrote: > > > On 05/05/2014 09:20 PM, Richard Sha

[OMPI users] ROMIO bug reading darrays

2014-05-05 Thread Richard Shaw
ROMIO and OMPIO if I set the block shape to 2x2. This was run on OS X using 1.8.2a1r31632. I have also run this on Linux with OpenMPI 1.7.4, and OMPIO is still correct, but using ROMIO I just get segfaults. Thanks, Richard #include #include #include #define NSIDE 5 #define NBLOCK 3 #defi

Re: [OMPI users] Extent of Distributed Array Type?

2014-04-10 Thread Richard Shaw
Thanks Ralph! In future, I'll try and remember to follow up on these things :) Cheers, Richard On 10 April 2014 11:16, Ralph Castain wrote: > Not really - it's the responsibility of the developer to file the CMR. > Some folks are good about it, and some aren't. In

Re: [OMPI users] Extent of Distributed Array Type?

2014-04-10 Thread Richard Shaw
Okay. Thanks for having a look Ralph! For future reference, is there a better process I can go through if I find bugs like this that makes sure they don't get forgotten? Thanks, Richard On 10 April 2014 00:39, Ralph Castain wrote: > Wow - that's an ancient one. I'll see if

Re: [OMPI users] Extent of Distributed Array Type?

2014-04-09 Thread Richard Shaw
's still in the SVN trunk, but hasn't made it into any of the intervening releases (neither stables 1.6.2-, 1.8; nor feature releases 1.7 onwards). Will this end up in the 1.9 series? Richard On 24 July 2012 19:02, Richard Shaw wrote: > Thanks George, I'm glad it wasn't

[OMPI users] MPI_ERR_BUFFER with MPI_SENDRECEV

2014-02-19 Thread Samuel Richard
Hello In a code written in Fortran, I have a problem with this part : if (num_node == 0) then ... else down_node = num_node-1 ! send to down receive from down CALL MPI_SENDRECV(tab1(3 :4,:,:), size( tab1(3 :4, : ,:)), & & MPI_REAL8, down_node, 101, tab1(1:2, : ,:), size(tab1(1:2, : ,

Re: [OMPI users] Addendum to: Assembler instruction errors for push and pop during make

2013-08-21 Thread Richard Haney
and definitive solution to resolve this issue fairly soon, I think I may take a little vacation and perhaps come back to this question later. Richard Haney On Tue, Aug 20, 2013 at 7:41 PM, Jeremiah Willcock wrote: > The file win_compat.h seems to be very strange (many #defines of function >

Re: [OMPI users] Addendum to: Assembler instruction errors for push and pop during make

2013-08-20 Thread Richard Haney
Ah! Thanks, Jeff. Here is a link to the relevant zip file " openmpi-1.6.5_configure_and_make.zip". It contains the modified configure (essentially replacing the compound "if" statement that assigns ompi_cv_asm_arch="IA

Re: [OMPI users] Addendum to: Assembler instruction errors for push and pop during make

2013-08-20 Thread Richard Haney
"Aw snap!". So it looks like that resource will not be much help either. Richard Haney On Tue, Aug 20, 2013 at 12:58 PM, Jeff Squyres (jsquyres) < jsquy...@cisco.com> wrote: > FWIW, the only restriction we have on the OMPI list is a max size of the > message (including attach

Re: [OMPI users] Addendum to: Assembler instruction errors for push and pop during make

2013-08-20 Thread Richard Haney
Re-sending: On Tue, Aug 20, 2013 at 12:16 PM, Richard Haney wrote: > Thanks Jeremiah, > > Your comments were most helpful. > > I did find where configure sets > > ompi_cv_asm_arch="IA32" > > and subsequently prints out [sic] > > checking for as

[OMPI users] Addendum to: Assembler instruction errors for push and pop during make

2013-08-19 Thread Richard Haney
erhaps I could use CCAS and perhaps CCASFLAGS to invoke as.exe specifically for the assembly .s source processing, but, if I use the flag --32 for the .s assembly programs, would this also create 32-bit/64-bit incompatibilities? -- *- Richard Haney* -- Forwarded message -- From:

[OMPI users] Assembler instruction errors for push and pop during make

2013-08-18 Thread Richard Haney
gcc can understand. Is there some flag I can set to tell gcc that a particular assembly language (dialect) is being used? And, if so, can I set it for make without having to re-run configure? -- *- Richard Haney*

Re: [OMPI users] 2 GB limitation of MPI_File_write_all

2012-11-01 Thread Richard Shaw
owing your hints and trying to fix the bug myself, but I've been short on time so haven't gotten around to it yet. Richard On Saturday, 20 October, 2012 at 10:12 AM, Rayson Ho wrote: > Hi Eric, > > Sounds like it's also related to this problem reported by Scinet back in

Re: [OMPI users] mpi test program "ring" failed: blocked at MPI_Send

2012-09-25 Thread Richard
Tom might be correct, I checked my system. Using rpm -qa, I did not find Xen, but found libvirt. At 2012-09-25 21:38:23,"Tom Bryan (tombry)" wrote: >On 9/25/12 9:10 AM, "Jeff Squyres (jsquyres)" wrote: > >>>problem, so i fixed it using "--mca btl_tcp_if_include bond0" because I >>>know this is

Re: [OMPI users] mpi test program "ring" failed: blocked at MPI_Send

2012-09-25 Thread Richard
Jeff, It was a typo in my last post, I did use "--mca btl_tcp_if_exclude virbr0" and it did not work. At 2012-09-25 21:10:24,"Jeff Squyres" wrote: >On Sep 25, 2012, at 2:56 PM, Richard wrote: > >> thanks a lot ! >> using "--mca btl_if_exclude

Re: [OMPI users] mpi test program "ring" failed: blocked at MPI_Send

2012-09-25 Thread Richard
thanks a lot ! using "--mca btl_if_exclude virbr0" does not work, but you have pointed out the problem, so i fixed it using "--mca btl_tcp_if_include bond0" because I know this is the high speed network interface I should use on each node. At 2012-09-25 20:30:16,"Jeff Squyres" wrote: >On Sep

[OMPI users] mpi test program "ring" failed: blocked at MPI_Send

2012-09-25 Thread Richard
, in the second round of pass, B failed to send message to C. I checked firewall config using chkconfig --list iptables on all the nodes. none of them are set as "on". Attached is all the information needed, my openmpi version is 1.6.1. thanks for your help. Richard At 2012-09-2

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
I used "chkconfig --list iptables"; none of the computers is set to "on". At 2012-09-25 17:54:53,"Jeff Squyres" wrote: >Have you disabled firewalls on your nodes (e.g., iptables)? > >On Sep 25, 2012, at 11:08 AM, Richard wrote: > >> sometimes the

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
refused (111 At 2012-09-25 16:53:50,Richard wrote: if I tried the ring program, the first round of pass is fine, but the second round is blocked at some node. here is the message printed out Process 0 sending 10 to 1, tag 201 (3 processes in ring) Process 0 sent to 1 rank 1, message 10,start

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
nk %d, message %d,start===\n", rank, message); MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD); printf("rank %d, message %d,end-\n", rank, message); At 2012-09-25 16:30:01,Richard wrote: Hi Jody, thanks for your suggestion

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
Hi Jody, thanks for your suggestion, and you are right. If I use the ring example, the same thing happens. I have added a printf statement, and it seems that all three processes have reached the line calling "PMPI_Allreduce". Any further suggestions? Thanks. Richard Message: 12 List-P

[OMPI users] mpi job is blocked

2012-09-25 Thread Richard
est cannot tell me what exactly it is. Can anyone help me? thanks. Richard

Re: [OMPI users] Can't read more than 2^31 bytes with MPI_File_read, regardless of type?

2012-08-07 Thread Richard Shaw
he creation and reading of multiple darray types, each designed to read in the correct number of blocks less than 2^31 bytes. This seems like it could be a bit fragile. Thanks again, Richard
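The approach described above (several reads, each sized to stay under 2^31 bytes) comes down to simple arithmetic, sketched here in plain Python without any MPI calls. This is only an illustration under stated assumptions, not the poster's code: `split_into_chunks`, `n_blocks`, `block_bytes` and `MAX_BYTES` are hypothetical names, and the per-read limit is assumed to be 2^31 - 1 bytes.

```python
MAX_BYTES = 2**31 - 1  # assumed per-read limit (largest signed 32-bit value)


def split_into_chunks(n_blocks, block_bytes, max_bytes=MAX_BYTES):
    """Return a list of block counts, one per read, each fitting in max_bytes.

    n_blocks    -- total number of blocks to read (hypothetical parameter)
    block_bytes -- size of one block in bytes (hypothetical parameter)
    """
    if block_bytes <= 0 or block_bytes > max_bytes:
        raise ValueError("a single block must fit within one read")
    blocks_per_chunk = max_bytes // block_bytes  # whole blocks per read
    chunks = []
    remaining = n_blocks
    while remaining > 0:
        take = min(blocks_per_chunk, remaining)
        chunks.append(take)
        remaining -= take
    return chunks
```

Each entry in the returned list would correspond to one darray type and one read. The fragility mentioned in the post remains: every chunk needs its own type and its own file offset, which this sketch does not compute.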

Re: [OMPI users] Extent of Distributed Array Type?

2012-07-24 Thread Richard Shaw
Thanks George, I'm glad it wasn't just me being crazy. I'll try and test that one soon. Cheers, Richard On Tuesday, 24 July, 2012 at 6:28 PM, George Bosilca wrote: > Richard, > > Thanks for identifying this issue and for the short example. I can confirm > your

Re: [OMPI users] Extent of Distributed Array Type?

2012-07-24 Thread Richard Shaw
ank 0, size=40, extent=80, lb=0 Rank 1, size=40, extent=88, lb=0 Can anyone else confirm this? Thanks Richard On Sunday, 15 July, 2012 at 6:21 PM, Richard Shaw wrote: > Hello, > > I'm getting thoroughly confused trying to work out what is the correct extent > of a block-cy

[OMPI users] Extent of Distributed Array Type?

2012-07-15 Thread Richard Shaw
'd be very grateful if someone could explain what the extent means for a darray type? And why it isn't the global array size? Thanks, Richard == OpenMPI (v1.4.4 and 1.6) == $ mpirun -np 4 ./testextent Rank 0, size=288, extent=800, lb=0 Rank 1, size=192, extent=824, lb=0 Rank 2, size=19

Re: [OMPI users] Problem running an mpi application on nodes with more than one interface

2012-02-17 Thread Richard Bardwell
face Did you have both of the ethernet ports on the same subnet, or were they on different subnets? On Feb 17, 2012, at 5:36 AM, Richard Bardwell wrote: I had exactly the same problem. Trying to run mpi between 2 separate machines, with each machine having 2 ethernet ports, causes really we

Re: [OMPI users] Problem running an mpi application on nodes with more than one interface

2012-02-17 Thread Richard Bardwell
I had exactly the same problem. Trying to run mpi between 2 separate machines, with each machine having 2 ethernet ports, causes really weird behaviour on the most basic code. I had to disable one of the ethernet ports on each of the machines and it worked just fine after that. No idea why though !

[OMPI users] MPI_Waitall strange behaviour on remote nodes

2012-02-14 Thread Richard Bardwell
In trying to debug an MPI_Waitall hang on a remote node, I created a simple code to test. If we run the simple code below on 2 nodes on a local machine, we send the number 1 and receive number 1 back. If we run the same code on a local node and a remote node, we send number 1 but get 32767 back.

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
nts, or conditional blocks that are only invoked during interactive logins, for example. On Feb 14, 2012, at 5:40 AM, Richard Bardwell wrote: Jeff, I wiped out all versions of openmpi on all the nodes including the distro installed version. I reinstalled version 1.4.4 on all nodes. I no

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
OK, I get it running if I specify /usr/local/bin/mpiexec instead of just mpiexec. Now, the program hangs at the first MPI_Waitall on the remote node. The program runs just fine if both nodes are on the same machine. Any ideas how to debug this ? Many Thanks Richard - Original Message

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
points to /usr/local/lib where the file exists. Any ideas ? Many Thanks Richard - Original Message - From: "Jeff Squyres" To: "Open MPI Users" Sent: Monday, February 13, 2012 6:28 PM Subject: Re: [OMPI users] MPI orte_init fails on remote nodes You might want to

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
Any ideas ?? Thanks Richard - Original Message - From: "Gustavo Correa" To: "Open MPI Users" Sent: Monday, February 13, 2012 4:22 PM Subject: Re: [OMPI users] MPI orte_init fails on remote nodes On Feb 13, 2012, at 11:02 AM, Richard Bardwell wrote: Ralph I had

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
My mistake Ralph, should have done a make uninstall instead ! Thanks Richard - Original Message - From: Ralph Castain To: Open MPI Users Sent: Monday, February 13, 2012 3:41 PM Subject: Re: [OMPI users] MPI orte_init fails on remote nodes You need to clean out the old

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
: Re: [OMPI users] MPI orte_init fails on remote nodes You need to clean out the old attempt - that is a stale file Sent from my iPad On Feb 13, 2012, at 7:36 AM, "Richard Bardwell" wrote: OK, I installed 1.4.4, rebuilt the exec and guess what .. I now get some weird

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
OK, I installed 1.4.4, rebuilt the exec and guess what .. I now get some weird errors as below: mca: base: component_find: unable to open /usr/local/lib/openmpi/mca_ras_dash_host along with a few other files even though the .so / .la files are all there ! - Original Message - Fro

[OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
Gentlemen I am struggling to get MPI working when the hostfile contains different nodes. I get the error below. Any ideas ?? I can ssh without password between the two nodes. I am running 1.2.8 MPI on both machines. Any help most appreciated ! MPITEST/v8_mpi_test> mpiexec -n 2 --debug-da

Re: [OMPI users] IO performance

2012-02-06 Thread Richard Walsh
-5Monday/3A-Canon/canon-paper.pdf Cheers, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY W: 718-982-3319 M: 612-382-4620 Miracles are delivered to order by great intelligence, or when it is absent, through the passage of time and a series of mere

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2012-01-31 Thread Richard Walsh
entry point your code crashed in: opal_memory_ptmalloc2_int_malloc is renamed to: rename.h:#define _int_malloc opal_memory_ptmalloc2_int_malloc in the malloc.c routine in 1.5.5. Perhaps you should lower the optimization level to zero and see what you get. Sincerely, rbw Richard Walsh Parallel Appli

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2012-01-30 Thread Richard Walsh
y. I would also try things with the very latest release. Those are my thoughts ... good luck. rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY W: 718-982-3319 M: 612-382-4620 Miracles are delivered to order by great intelligence, or when it is

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2012-01-04 Thread Richard Walsh
1.5.5 release or add the above section to the malloc.c code. Note earlier releases have a slightly different directory location for the 'memory.c' code, but it is easy to find. Thanks Tim ... !! Sincerely, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Sta

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2012-01-03 Thread Richard Walsh
' version was released by Intel JUST BEFORE SC11 in October of 2011. Thanks, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY W: 718-982-3319 M: 612-382-4620 Right, as the world goes, is only in question between equals in power, while the strong

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2012-01-03 Thread Richard Walsh
produced also >>RUNS FINE<<. So ... it looks to me like there is something wrong with using the 'opal' wrapper generated-used in the Intel build. Can someone make a suggestion ... ?? I would like to use the wrappers of course. Thanks, rbw Richard Walsh Parallel Application

Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2011-12-20 Thread Richard Walsh
All, I have not heard anything back on the inquiry below, so I take it that no one has had any issues with Intel's latest compiler release, or perhaps has not tried it yet. Thanks, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY W: 718-982-3

[OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

2011-12-16 Thread Richard Walsh
3 Build 20110811 Copyright (C) 1985-2011 Intel Corporation. All rights reserved. Has anyone else encountered this problem ... ?? Suggestions ... ?? Thanks, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY W: 718-982-3319 M: 612-382-4620 Right, as the

Re: [OMPI users] How closely tied is a specific release of OpenMPI to the host operating system and other system software?

2011-02-01 Thread Richard Walsh
rk for someone ... around here is it me ... ;-) ... rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382-4620 Reason does give the heart pause; As the heart gives reason fits. Yet, to live where reason always rules; Is to k

[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM has retired

2011-01-01 Thread Richard Treumann
I am out of the office until 03/31/2011. I will be out of the office from now on - I will not see email or check phone messages. Good luck to you all Contact my former team leader: Charles J. Archer E-mail: arch...@us.ibm.com Phone: 553-0346 / 1-507-253-0346 OR My former manager: Carl F. (Carl

Re: [OMPI users] Method for worker to determine its "rank" on a single machine?

2010-12-10 Thread Richard Treumann
It seems to me the MPI_Get_processor_name description is too ambiguous to make this 100% portable. I assume most MPI implementations simply use the hostname so all processes on the same host will return the same string. The suggestion would work then. However, it would also be reasonable for a
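The suggestion being discussed (derive a per-machine rank from the processor name) can be sketched as below. This is a minimal illustration in plain Python, assuming every process on the same host reports an identical string, which the post above points out is not guaranteed. `local_ranks` and `names` are hypothetical; in a real program the list would have to be gathered from all ranks first.

```python
from collections import defaultdict


def local_ranks(names):
    """Map each global rank to its rank among processes with the same name.

    names[i] is the string reported by global rank i (assumed here to be
    identical for all processes that share a host).
    """
    seen = defaultdict(int)  # how many ranks have reported each name so far
    result = []
    for name in names:
        result.append(seen[name])
        seen[name] += 1
    return result
```

For example, `local_ranks(["nodeA", "nodeA", "nodeB", "nodeA"])` gives `[0, 1, 0, 2]`. If an implementation returned a distinct string per process, every entry would be 0, which is the portability gap the post describes.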

Re: [OMPI users] curious behavior during wait for broadcast: 100% cpu

2010-12-08 Thread Richard Treumann
Also - HPC clusters are commonly dedicated to running parallel jobs with exactly one process per CPU. HPC is about getting computation done and letting a CPU time slice among competing processes always has overhead (CPU time not spent on the computation). Unless you are trying to run extra pr

[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM is out of the office until 01/02/2001. (returning 11/01/2010)

2010-10-22 Thread Richard Treumann
I am out of the office until 11/01/2010. I will be out of the office on vacation the last week of Oct. Back Nov 1. I will not see any email. Note: This is an automated response to your message "[OMPI users] OPEN MPI data transfer error" sent on 10/22/10 15:19:05. This is the only notification

Re: [OMPI users] busy wait in MPI_Recv

2010-10-20 Thread Richard Treumann
Brian Most HPC applications are run with one processor and one working thread per MPI process. In this case, the node is not being used for other work so if the MPI process does release a processor, there is nothing else important for it to do anyway. In these applications, the blocking MPI c

Re: [OMPI users] a question about [MPI]IO on systems without network filesystem

2010-10-19 Thread Richard Treumann
Subject: Re: [OMPI users] a question about [MPI]IO on systems without network filesystem Sent by: users-boun...@open-mpi.org On Thu, Sep 30, 2010 at 09:00:31AM -0400, Richard Treumann wrote: > It is possible for MPI-IO to be implemented in a way that lets a single > process or the set

Re: [OMPI users] hdf5 build error using openmpi and Intel Fortran

2010-10-08 Thread Richard Walsh
--with-zlib=/share/apps/zlib/1.2.3/lib --with-szlib=/share/apps/szip/2.1/lib --disable-shared With some tweaking I was able to build the whole WRF and NCAR-NCL stack here. Regards, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382

Re: [OMPI users] Shared memory

2010-10-06 Thread Richard Treumann
When you use MPI message passing in your application, the MPI library decides how to deliver the message. The "magic" is simply that when sender process and receiver process are on the same node (shared memory domain) the library uses shared memory to deliver the message from process to process

Re: [OMPI users] a question about [MPI]IO on systems without network filesystem

2010-09-30 Thread Richard Treumann
I will add to what Terry said by mentioning that the MPI implementation has no awareness of ordinary POSIX or Fortran disk I/O routines. It cannot help on those. Any automated help the MPI implementation can provide would only apply to MPI_File_xxx disk I/O. These are implemented by the MPI

Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-24 Thread Richard Treumann
Amb It sounds like you have more workers than you can keep fed. Workers are finishing up and requesting their next assignment but sit idle because there are so many other idle workers too. Load balance does not really matter if the choke point is the master. The work is being done as fast as

Re: [OMPI users] Question about Asynchronous collectives

2010-09-23 Thread Richard Treumann
Sent by: users-boun...@open-mpi.org Sorry Richard, what is CC issue order on the communicator?, in particular, "CC", what does it mean? 2010/9/23 Richard Treumann request_1 and request_2 are just local variable names. The only thing that determines matching order is CC issue or

Re: [OMPI users] Question about Asynchronous collectives

2010-09-23 Thread Richard Treumann
request_1 and request_2 are just local variable names. The only thing that determines matching order is CC issue order on the communicator. At each process, some CC is issued first and some CC is issued second. The first issued CC at each process will try to match the first issued CC at the
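The matching rule stated above (collectives pair up by issue order on the communicator, not by the name of the request variable) can be modelled with a toy sketch. It is plain Python with no MPI involved; `match_by_issue_order` and the operation labels are hypothetical and only illustrate positional matching.

```python
def match_by_issue_order(issued):
    """Pair up collectives positionally across ranks.

    issued[r] is the ordered list of collective names that rank r issued
    on one communicator. Returns one tuple per position:
    (position, operations seen at that position, whether they all agree).
    """
    depth = min(len(ops) for ops in issued)
    matches = []
    for i in range(depth):
        ops_at_i = [ops[i] for ops in issued]
        consistent = all(op == ops_at_i[0] for op in ops_at_i)
        matches.append((i, ops_at_i, consistent))
    return matches
```

If rank 0 issues `["bcast", "reduce"]` and rank 1 issues `["reduce", "bcast"]`, position 0 pairs a bcast with a reduce and the sketch flags it as inconsistent, whatever the request variables were called.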

Re: [OMPI users] Continued functionality across a SLES10 to SLES11 upgrade ...

2010-09-22 Thread Richard Walsh
be the most palatable (and clever idea). Thanks much. I will report back. Regards, rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382-4620 Reason does give the heart pause; As the heart gives reason fits. Yet, to live where

[OMPI users] Continued functionality across a SLES10 to SLES11 upgrade ...

2010-09-20 Thread Richard Walsh
OpenMPI 1.4.1 binaries under SLES 11. I would have thought NOTHING, but maybe that is not quite right. Perhaps we can run using GE under SLES 11 with the old binaries until I get things recompiled (ugh!) under SLES 11? Thanks, Richard Walsh Richard Walsh Parallel Applications and Systems Manager CUNY

Re: [OMPI users] send and receive buffer the same on root

2010-09-16 Thread Richard Treumann
Tony You are depending on luck. The MPI Standard allows the implementation to assume that send and recv buffers are distinct unless MPI_IN_PLACE is used. Any MPI implementation may have more than one algorithm for a given MPI collective communication operation and the policy for switching al

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann
> > users-boun...@open-mpi.org > > Please respond to Open MPI Users > > Richard Treumann wrote: > > Hi Ashley > > I understand the problem with descriptor flooding can be serious in > an application with unidirectional data dependency. Perhaps we have >

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann
> to: > > Open MPI Users > > 09/09/2010 05:37 PM > > Sent by: > > users-boun...@open-mpi.org > > Please respond to Open MPI Users > > > On 9 Sep 2010, at 21:40, Richard Treumann wrote: > > > > > Ashley > > > > Can you provide an e

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley Can you provide an example of a situation in which these semantically redundant barriers help? I may be missing something but my statement for the text book would be "If adding a barrier to your MPI program makes it run faster, there is almost certainly a flaw in it that is better solv

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
I was pointing out that most programs have some degree of elastic synchronization built in. Tasks (or groups or components in a coupled model) seldom only produce data. They also consume what other tasks produce and that limits the potential skew. If step n for a task (or group or coupled compo

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley's observation may apply to an application that iterates on many to one communication patterns. If the only collective used is MPI_Reduce, some non-root tasks can get ahead and keep pushing iteration results at tasks that are nearer the root. This could overload them and cause some extra

[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM is out of the office until 01/02/2001. (returning 09/07/2010)

2010-08-30 Thread Richard Treumann
I am out of the office until 09/07/2010. I will be out of the office on vacation the week before Labor Day. I will not see any email. Note: This is an automated response to your message "[OMPI users] random IB failures when running medium core counts" sent on 8/30/10 12:22:19. This is the onl

Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?

2010-08-23 Thread Richard Treumann
Network saturation could produce arbitrarily long delays, but the total data load we are talking about is really small. It is the responsibility of an MPI library to do one of the following: 1) Use a reliable message protocol for each message (e.g. Infiniband RC or TCP/IP) 2) detect lost packets and

Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?

2010-08-23 Thread Richard Treumann
It is hard to imagine how a total data load of 41,943,040 bytes could be a problem. That is really not much data. By the time the BCAST is done, each task (except root) will have received a single half meg message from one sender. That is not much. IMB does shift the root so some tasks may be i
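A quick check of the arithmetic behind that figure. It assumes "half meg" means 512 KiB (524,288 bytes) and that the total counts one such message per non-root task; both are readings of the message, not values it states, and the function name is made up for illustration.

```python
def receivers_for(total_bytes, msg_bytes):
    """Return (number of whole messages, leftover bytes) for a given total load."""
    return divmod(total_bytes, msg_bytes)


TOTAL_BYTES = 41_943_040      # total data load quoted in the message
MSG_BYTES = 512 * 1024        # assumption: "half meg" = 524,288 bytes

receivers, remainder = receivers_for(TOTAL_BYTES, MSG_BYTES)
total_mib = TOTAL_BYTES / (1024 * 1024)

print(f"{receivers} messages of {MSG_BYTES} bytes, {remainder} bytes left over")
print(f"total load: {total_mib} MiB")
```

Under those assumptions the total divides evenly into 80 half-meg messages, about 40 MiB across the whole job, which supports the point that the data volume alone is small.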

Re: [OMPI users] Accessing to the send buffer

2010-08-18 Thread Richard Treumann
As of MPI 2.2 there is no longer a restriction against read access to a live send buffer. The wording was changed so that it now prohibits the user only from "modifying" the buffer. You can look at the Communication Modes subsection in chapter 3, but you will need to compare MPI 2.1 and 2.2 carefully to see the change. The

Re: [OMPI users] Does OpenMPI 1.4.1 support the MPI_IN_PLACE designation ...

2010-08-17 Thread Richard Walsh
into one of the routines that works but this will save me the trouble. Swapping the two modules as suggested works. Thanks! rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382-4620 Reason does give the heart pause; As the heart gives

[OMPI users] Does OpenMPI 1.4.1 support the MPI_IN_PLACE designation ...

2010-08-16 Thread Richard Walsh
mpi.h header. Any thoughts? rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382-4620 Reason does give the heart pause; As the heart gives reason fits. Yet, to live where reason always rules; Is to kill one's heart with

Re: [OMPI users] A Problem with RAxML

2010-08-16 Thread Richard Walsh
... rbw Richard Walsh Parallel Applications and Systems Manager CUNY HPC Center, Staten Island, NY 718-982-3319 612-382-4620 Reason does give the heart pause; As the heart gives reason fits. Yet, to live where reason always rules; Is to kill one's heart with

Re: [OMPI users] MPI_Bcast issue

2010-08-12 Thread Richard Treumann
- yes I know this should > not happen, the question is why. > > --- On Wed, 11/8/10, Richard Treumann wrote: > > From: Richard Treumann > Subject: Re: [OMPI users] MPI_Bcast issue > To: "Open MPI Users" > Received: Wednesday, 11 August, 2010, 11:34 PM >

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Richard Treumann
Randolf I am confused about using multiple, concurrent mpirun operations. If there are M uses of mpirun and each starts N tasks (carried out under pvm or any other way) I would expect you to have M completely independent MPI jobs with N tasks (processes) each. You could have some root in eac

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Richard Treumann
Sorry - I missed the statement that all works when you add sleeps. That probably rules out any possible error in the way MPI_Bcast was used. Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 F

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Richard Treumann
I did not take the time to try to fully understand your approach, so this may sound like a dumb question: do you have an MPI_Bcast ROOT process in every MPI_COMM_WORLD, and does every non-ROOT MPI_Bcast call correctly identify the rank of ROOT in its MPI_COMM_WORLD? An MPI_Bcast call when the

Re: [OMPI users] Accessing to the send buffer

2010-08-02 Thread Richard Treumann
For reading the data from an Isend buffer to cause problems, the underlying hardware would need to have a very unusual characteristic that the MPI implementation is exploiting. People have imagined hardware characteristics that could make reading an Isend buffer a problem but I have never heard
