Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-23 Thread Howard Pritchard via users
Hi All, I opened a new issue to track the coll_perf failure in case it's not related to the HDF5 problem reported earlier. https://github.com/open-mpi/ompi/issues/8246 Howard On Mon, 23 Nov 2020 at 12:14, Dave Love via users < users@lists.open-mpi.org> wrote: > Mark Dixon via users wri

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-15 Thread Howard Pritchard via users
:02457] *** MPI_ERR_UNKNOWN: unknown error* > > *[osboxes:02457] *** MPI_ERRORS_ARE_FATAL (processes in this communicator > will now abort,* > > *[osboxes:02457] ***and potentially your MPI job)* > > *[osboxes:02458] 1 more process has sent help message help-orted.txt / >

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-14 Thread Howard Pritchard via users
> I need spawn only in “worker”. Is there a way or workaround for doing this > without mpirun? > > Thanks a lot for your assistance. > > > > Martín > > > > > > > > > > *From: *Howard Pritchard > *Sent: *lunes, 10 de agosto de 2020 19:13 >

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-13 Thread Howard Pritchard via users
rkaround for doing this > without mpirun? > Thanks a lot for your assistance. > > Martín > > > > > *From: *Howard Pritchard > *Sent: *Monday, August 10, 2020 19:13 > *To: *Martín Morales > *Cc: *Open MPI Users > *Subject: *Re: [OMPI users] OMPI 4.0.4 crashe

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-10 Thread Howard Pritchard via users
Hi Howard. Unfortunately the issue persists in OMPI 4.0.5rc1. Do I have > to post this on the bug section? Thanks and regards. > > > > Martín > > > > *From: *Howard Pritchard > *Sent: *Monday, August 10, 2020 14:44 > *To: *Open MPI Users > *Cc: *Martín M

Re: [OMPI users] OMPI 4.0.4 crashes (or hangs) with dynamically processes allocation. OMPI 4.0.1 don't.

2020-08-10 Thread Howard Pritchard via users
Hello Martin, Between Open MPI 4.0.1 and Open MPI 4.0.4 we upgraded the internal PMIx version that introduced a problem with spawn for the 4.0.2-4.0.4 versions. This is supposed to be fixed in the 4.0.5 release. Could you try the 4.0.5rc1 tarball and see if that addresses the problem you're seein

Re: [OMPI users] Differences 4.0.3 -> 4.0.4 (Regression?)

2020-08-08 Thread Howard Pritchard via users
Hello Michael, Not sure what could be causing this in terms of delta between v4.0.3 and v4.0.4. Two things to try - add --debug-daemons and --mca pmix_base_verbose 100 to the mpirun line and compare output from the v4.0.3 and v4.0.4 installs - perhaps try using the --enable-mpirun-prefix-by-defau
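
A minimal sketch of the comparison suggested above, assuming the two releases are installed under separate prefixes. The prefixes `/opt/openmpi-4.0.3` and `/opt/openmpi-4.0.4`, the log file names, and the `diag_cmd` helper are hypothetical; only the `--debug-daemons` and `--mca pmix_base_verbose 100` flags come from the message itself.

```shell
# Build the diagnostic mpirun command line for one install prefix.
# diag_cmd is a hypothetical helper, not part of Open MPI.
diag_cmd() {
    printf '%s/bin/mpirun --debug-daemons --mca pmix_base_verbose 100 -np 2 hostname\n' "$1"
}

# Hypothetical install prefixes; substitute your own (paths without spaces).
for prefix in /opt/openmpi-4.0.3 /opt/openmpi-4.0.4; do
    log="diag-$(basename "$prefix").log"
    if [ -x "$prefix/bin/mpirun" ]; then
        # Capture stdout and stderr together so the two runs can be diffed.
        $(diag_cmd "$prefix") > "$log" 2>&1 || true
    else
        echo "skipping $prefix: no mpirun found" >&2
    fi
done

# Compare the two logs only if both runs actually happened.
if [ -f diag-openmpi-4.0.3.log ] && [ -f diag-openmpi-4.0.4.log ]; then
    diff diag-openmpi-4.0.3.log diag-openmpi-4.0.4.log || true
fi
```

Running a trivial program such as `hostname` keeps the application out of the picture, so any difference in the logs should come from the launcher and PMIx layers.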

Re: [OMPI users] OMPI returns error 63 on AMD 7742 when utilizing 100+ processors per node

2020-01-29 Thread Howard Pritchard via users
itialize with >100 processes per > node. I get the same error message for multiple different codes, so the > error code is mpi related rather than being program specific. > > > > Collin > > > > *From:* Howard Pritchard > *Sent:* Monday, January 27, 2020 11:20

Re: [OMPI users] OMPI returns error 63 on AMD 7742 when utilizing 100+ processors per node

2020-01-27 Thread Howard Pritchard via users
Hello Collin, Could you provide more information about the error? Is there any output from either Open MPI or, maybe, UCX, that could provide more information about the problem you are hitting? Howard On Mon, 27 Jan 2020 at 08:38, Collin Strassburger via users < users@lists.open-m

Re: [OMPI users] Do idle MPI threads consume clock cycles?

2019-02-25 Thread Howard Pritchard
Hello Mark, You may want to check out this package: https://github.com/lanl/libquo Another option would be to use an MPI_Ibarrier in the application, with all the MPI processes except rank 0 going into a loop that tests for completion of the barrier and sleeps between tests. Once rank 0

Re: [OMPI users] OpenMPI v4.0.0 signal 11 (Segmentation fault)

2019-02-20 Thread Howard Pritchard
oebe:07408] [ 9] IMB-MPI1[0x401d49] >> > [phoebe:07408] *** End of error message *** >> > IMB-MPI1[0x4022ea] >> > [titan:07169] [ 8] >> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fc025d5a3d5] >> > [titan:07169] [ 9] IMB-MPI1[0x401d49] >> >

Re: [OMPI users] OpenMPI v4.0.0 signal 11 (Segmentation fault)

2019-02-20 Thread Howard Pritchard
Hi Adam, As a sanity check, if you try to use --mca btl self,vader,tcp do you still see the segmentation fault? Howard On Wed, 20 Feb 2019 at 08:50, Adam LeBlanc < alebl...@iol.unh.edu> wrote: > Hello, > > When I do a run with OpenMPI v4.0.0 on Infiniband with this command: > mpirun --

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-20 Thread Howard Pritchard
Hi Matt, Definitely do not include the ucx option for an Omni-Path cluster. Actually, if you accidentally installed UCX in its default location on the system, switch to this config option: --with-ucx=no Otherwise you will hit https://github.com/openucx/ucx/issues/750 Howard Gilles Gouaillard

Re: [OMPI users] Segmentation fault using openmpi-master-201901030305-ee26ed9

2019-01-04 Thread Howard Pritchard
Hi Siegmar, I observed this problem yesterday myself and should have a fix in to master later today. Howard On Fri, 4 Jan 2019 at 05:30, Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de> wrote: > Hi, > > I've installed (tried to install) openmpi-master-201901030305-ee26ed9 on > my "

Re: [OMPI users] Unable to build Open MPI with external PMIx library support

2018-12-17 Thread Howard Pritchard
th", is set and doesn't point to any location > that includes at least one usable plugin for this framework. > > Please check your installation and environment. > ------ > > Regards, > Eduardo >

Re: [OMPI users] Unable to build Open MPI with external PMIx library support

2018-12-15 Thread Howard Pritchard
Hi Eduardo, Could you post the config.log for the build with internal PMIx so we can figure that out first? Howard Eduardo Rothe via users wrote on Fri, 14 Dec 2018 at 09:41: > Open MPI: 4.0.0 > PMIx: 3.0.2 > OS: Debian 9 > > I'm building a debian package for Open MPI and either I get the fo

Re: [OMPI users] [Open MPI Announce] Open MPI 4.0.0 Released

2018-11-14 Thread Howard Pritchard
Hi Bert, If you'd prefer to return to the land of convenience and don't need to mix MPI and OpenSHMEM, then you may want to try the path I outlined in the email archived at the following link https://www.mail-archive.com/users@lists.open-mpi.org/msg32274.html Howard On Tue, 13 Nov 2018 at 23

Re: [OMPI users] [Open MPI Announce] Open MPI 4.0.0 Released

2018-11-13 Thread Howard Pritchard
Hello Bert, What OS are you running on your notebook? If you are running Linux, and you have root access to your system, then you should be able to resolve the Open SHMEM support issue by installing the XPMEM device driver on your system, and rebuilding UCX so it picks up XPMEM support. The sou

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 3.1.0 Lock Up on POWER9 w/ CUDA9.2

2018-07-02 Thread Howard Pritchard
Hi Si, Could you add --disable-builtin-atomics to the configure options and see if the hang goes away? Howard 2018-07-02 8:48 GMT-06:00 Jeff Squyres (jsquyres) via users < users@lists.open-mpi.org>: > Simon -- > > You don't currently have another Open MPI installation in your PATH / > LD_LIBR

Re: [OMPI users] A couple of general questions

2018-06-14 Thread Howard Pritchard
Hello Charles You are heading in the right direction. First you might want to run the libfabric fi_info command to see what capabilities you picked up from the libfabric RPMs. Next you may well not actually be using the OFI mtl. Could you run your app with export OMPI_MCA_mtl_base_verbose=100
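
The two checks described above could be scripted roughly as follows. This is a sketch under assumptions: `./my_app` is a hypothetical application binary, and the `grep` filter is only one way to pick the MTL selection messages out of the verbose output.

```shell
# Step 1: ask libfabric which providers and capabilities it found.
if command -v fi_info >/dev/null 2>&1; then
    fi_info || true
else
    echo "fi_info not found in PATH; is libfabric installed?" >&2
fi

# Step 2: make the MTL framework report which component it selects.
export OMPI_MCA_mtl_base_verbose=100

# ./my_app is a hypothetical binary name; run only if everything is present.
if command -v mpirun >/dev/null 2>&1 && [ -x ./my_app ]; then
    mpirun -np 2 ./my_app 2>&1 | grep -i mtl || true
fi
```

If the verbose output never mentions the ofi component, the application is most likely not using the OFI MTL at all, which is the possibility raised in the message.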

Re: [OMPI users] Problem running with UCX/oshmem on single node?

2018-05-09 Thread Howard Pritchard
Hi Craig, You are experiencing problems because you don't have a transport installed that UCX can use for oshmem. You either need to go and buy a connectx4/5 HCA from mellanox (and maybe a switch), and install that on your system, or else install xpmem (https://github.com/hjelmn/xpmem). Note ther

Re: [OMPI users] Debug build of v3.0.1 tarball

2018-05-04 Thread Howard Pritchard
../../opal/.libs/libopen-pal.so: undefined reference to > `pthread_atfork' > > collect2: error: ld returned 1 exit status > > make[2]: *** [opal_wrapper] Error 1 > > > > Also setting LDFLAGS fixes that up. Just wondering whether I’m going > about it the right way in t

Re: [OMPI users] Debug build of v3.0.1 tarball

2018-05-04 Thread Howard Pritchard
Hi Adam, Sorry, I didn't notice you did try the --enable-debug flag. That should not have led to the link error building the opal dso. Did you do a make clean after rerunning configure? Howard 2018-05-04 8:22 GMT-06:00 Howard Pritchard : > Hi Adam, > > Did you try using the

Re: [OMPI users] Debug build of v3.0.1 tarball

2018-05-04 Thread Howard Pritchard
Hi Adam, Did you try using the --enable-debug configure option along with your CFLAGS options? You may want to see if that simplifies your build. In any case, we'll fix the problems you found. Howard 2018-05-03 15:00 GMT-06:00 Moody, Adam T. : > Hello Open MPI team, > > I'm looking for the re

Re: [OMPI users] Eager RDMA causing slow osu_bibw with 3.0.0

2018-04-05 Thread Howard Pritchard
Hello Ben, Thanks for the info. You would probably be better off installing UCX on your cluster and rebuilding your Open MPI with the --with-ucx configure option. Here's what I'm seeing with Open MPI 3.0.1 on a ConnectX5 based cluster using ob1/openib BTL: mpirun -map-by ppr:1:node -np 2 ./osu

Re: [OMPI users] OpenMPI with Portals4 transport

2018-02-08 Thread Howard Pritchard
46.97 > 524288 87.55 > 1048576 168.89 > 2097152 331.40 > 4194304 654.08 > > > On Feb 7, 2018, at 9:04 PM, Howard Pritchard wrote: > > HI Brian, > > As a sanity check, can you see if the ob1 pml works

Re: [OMPI users] Using OpenSHMEM with Shared Memory

2018-02-07 Thread Howard Pritchard
Hi Ben, I'm afraid this is bad news for using UCX. The problem is that when UCX was configured/built, it did not find a transport for doing one-sided put/get transfers. If you're feeling lucky, you may want to install xpmem (https://github.com/hjelmn/xpmem) and rebuild UCX. This requires buildi

Re: [OMPI users] OpenMPI with Portals4 transport

2018-02-07 Thread Howard Pritchard
Hi Brian, As a sanity check, can you see if the ob1 pml works okay, i.e. mpirun -n 2 --mca pml ob1 --mca btl self,vader,openib ./osu_latency Howard 2018-02-07 11:03 GMT-07:00 brian larkins : > Hello, > > I’m doing some work with Portals4 and am trying to run some MPI programs > using the Por

Re: [OMPI users] Using OpenSHMEM with Shared Memory

2018-02-07 Thread Howard Pritchard
Hi Ben, Could you set these environment variables and post the output? export OMPI_MCA_spml=ucx export OMPI_MCA_spml_base_verbose=100 then run your test? Also, what OS are you using? Howard 2018-02-06 20:10 GMT-07:00 Jeff Hammond : > > On Tue, Feb 6, 2018 at 3:58 PM Benjamin Brock > wrot
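
For reference, `OMPI_MCA_<name>=<value>` in the environment is the counterpart of passing `--mca <name> <value>` on the command line. A small sketch of setting the two variables requested above, assuming a hypothetical test binary `./shmem_test` and a hypothetical helper named `mca_export`:

```shell
# Export an MCA parameter through the environment.
# mca_export is a hypothetical convenience wrapper, not an Open MPI tool.
mca_export() {
    export "OMPI_MCA_$1=$2"
}

mca_export spml ucx
mca_export spml_base_verbose 100

# Show what will be handed to the run.
env | grep '^OMPI_MCA_spml'

# ./shmem_test is a hypothetical binary name; keep the output for posting.
if command -v oshrun >/dev/null 2>&1 && [ -x ./shmem_test ]; then
    oshrun -np 2 ./shmem_test 2>&1 | tee spml-verbose.log || true
fi
```

Saving the output with `tee` makes it easy to attach the full verbose log to a reply on the list.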

Re: [OMPI users] About my GPU performance using Openmpi-2.0.4

2017-12-13 Thread Howard Pritchard
Hi Phanikumar, It’s unlikely the warning message you are seeing is related to GPU performance. Have you tried adding --with-verbs=no to your config line? That should quash the openib complaint. Howard Phanikumar Pentyala wrote on Mon, 11 Dec 2017 at 22:43: > Dear users and developers, > > Cur

Re: [OMPI users] [EXTERNAL] Re: Using shmem_int_fadd() in OpenMPI's SHMEM

2017-11-22 Thread Howard Pritchard
Hi Ben, Actually I did some checking about the brew install for OFI libfabric. It looks like if your brew is up to date, it will pick up libfabric 1.5.2. Howard 2017-11-22 15:21 GMT-07:00 Howard Pritchard : > HI Ben, > > Even on one box, the yoda component doesn't work any mo

Re: [OMPI users] [EXTERNAL] Re: Using shmem_int_fadd() in OpenMPI's SHMEM

2017-11-22 Thread Howard Pritchard
Hi Ben, Even on one box, the yoda component doesn't work any more. If you want to do OpenSHMEM programming on your MacBook Pro (like I do) and you don't want to set up a VM to use UCX, then you can use the Sandia OpenSHMEM implementation. https://github.com/Sandia-OpenSHMEM/SOS You will need to inst

Re: [OMPI users] [EXTERNAL] Re: Using shmem_int_fadd() in OpenMPI's SHMEM

2017-11-22 Thread Howard Pritchard
ss rank 1 with PID 0 on node shepard-lsm1 > exited on signal 11 (Segmentation fault). > > -- > > [shepard-lsm1:49499] 1 more process has sent help message > help-mpi-btl-openib.txt / no active ports found > >

Re: [OMPI users] Using shmem_int_fadd() in OpenMPI's SHMEM

2017-11-20 Thread Howard Pritchard
Hi Ben, What version of Open MPI are you trying to use? Also, could you describe your system? If it's a cluster, what sort of interconnect is being used? Howard 2017-11-20 14:13 GMT-07:00 Benjamin Brock : > What's the proper way to use shmem_int_fadd() in OpenMPI's SHMEM? > > A

Re: [OMPI users] Problems building OpenMPI 2.1.1 on Intel KNL

2017-11-20 Thread Howard Pritchard
Hello Ake, Would you mind opening an issue on Github so we can track this? https://github.com/open-mpi/ompi/issues There's a template to show what info we need to fix this. Thanks very much for reporting this, Howard 2017-11-20 3:26 GMT-07:00 Åke Sandgren : > Hi! > > When the xppsl-libmemki

Re: [OMPI users] OMPI 2.1.2 and SLURM compatibility

2017-11-17 Thread Howard Pritchard
Hello Bennet, What you are trying to do using srun as the job launcher should work. Could you post the contents of /etc/slurm/slurm.conf for your system? Could you also post the output of the following command: ompi_info --all | grep pmix to the mail list? The config.log from your build would

Re: [OMPI users] [OMPI devel] Open MPI 2.0.4rc2 available for testing

2017-11-02 Thread Howard Pritchard
ecursive] Error 1 > make[2]: Leaving directory '/export2/src/openmpi-2.0.4/op > enmpi-2.0.4rc2-Linux.x86_64.64_cc/opal/mca/pmix/pmix112' > Makefile:2301: recipe for target 'all-recursive' failed > make[1]: *** [all-recursive] Error 1 > make[1]: Leaving directory 

Re: [OMPI users] Strange benchmarks at large message sizes

2017-09-19 Thread Howard Pritchard
Hello Cooper Could you rerun your test with the following env. variable set export OMPI_MCA_coll=self,basic,libnbc and see if that helps? Also, what type of interconnect are you using - ethernet, IB, ...? Howard 2017-09-19 8:56 GMT-06:00 Cooper Burns : > Hello, > > I have been running some

Re: [OMPI users] openmpi-2.1.2rc2: warnings from "make" and "make check"

2017-08-30 Thread Howard Pritchard
Hi Siegmar, Opened issue 4151 to track this. Thanks, Howard 2017-08-21 7:13 GMT-06:00 Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de>: > Hi, > > I've installed openmpi-2.1.2rc2 on my "SUSE Linux Enterprise Server 12.2 > (x86_64)" with Sun C 5.15 (Oracle Developer Studio 12.6) and gcc-7.

Re: [OMPI users] openmpi-master-201708190239-9d3f451: warnings from "make" and "make check"

2017-08-30 Thread Howard Pritchard
Hi Siegmar, I opened issue 4151 to track this. This is relevant to a project to get Open MPI to build with -Werror. Thanks very much, Howard 2017-08-21 7:27 GMT-06:00 Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de>: > Hi, > > I've installed openmpi-master-201708190239-9d3f451 on my "SU

Re: [OMPI users] pmix, lxc, hpcx

2017-05-26 Thread Howard Pritchard
Hi John, In the 2.1.x release stream a shared memory capability was introduced into the PMIx component. I know nothing about LXC containers, but it looks to me like there's some issue when PMIx tries to create these shared memory segments. I'd check to see if there's something about your contain

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
Forgot you probably need an equal sign after btl arg Howard Pritchard wrote on Wed, 22 Mar 2017 at 18:11: > Hi Goetz > > Thanks for trying these other versions. Looks like a bug. Could you post > the config.log output from your build of the 2.1.0 to the list? > > Also cou

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
) Howard Götz Waschk wrote on Wed, 22 Mar 2017 at 13:09: On Wed, Mar 22, 2017 at 7:46 PM, Howard Pritchard wrote: > Hi Goetz, > > Would you mind testing against the 2.1.0 release or the latest from the > 1.10.x series (1.10.6)? Hi Howard, after sending my mail I have tested bot

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-22 Thread Howard Pritchard
Hi Goetz, Would you mind testing against the 2.1.0 release or the latest from the 1.10.x series (1.10.6)? Thanks, Howard 2017-03-22 6:25 GMT-06:00 Götz Waschk : > Hi everyone, > > I'm testing a new machine with 32 nodes of 32 cores each using the IMB > benchmark. It is working fine with 512 p

Re: [OMPI users] Shared Windows and MPI_Accumulate

2017-03-03 Thread Howard Pritchard
master on my laptop. Please let me > know if I can help with anything else. > > Thanks, > Joseph > > On 03/01/2017 11:24 PM, Howard Pritchard wrote: > > Hi Joseph, > > I built this test with craypich (Cray MPI) and it passed. I also tried > with Open MPI master an

Re: [OMPI users] sharedfp/lockedfile collision between multiple program instances

2017-03-03 Thread Howard Pritchard
Hi Edgar, Please open an issue too so we can track the fix. Howard Edgar Gabriel wrote on Fri, 3 Mar 2017 at 07:45: > Nicolas, > > thank you for the bug report, I can confirm the behavior. I will work on > a patch and will try to get that into the next release, should hopefully > not be too

Re: [OMPI users] Shared Windows and MPI_Accumulate

2017-03-01 Thread Howard Pritchard
Hi Joseph, I built this test with craypich (Cray MPI) and it passed. I also tried with Open MPI master and the test passed. I also tried with 2.0.2 and can't seem to reproduce on my system. Could you post the output of config.log? Also, how intermittent is the problem? Thanks, Howard 20

Re: [OMPI users] Issues with different IB adapters and openmpi 2.0.2

2017-02-27 Thread Howard Pritchard
Hi Orion, Does the problem occur if you only use font2 and 3? Do you have MXM installed on the font1 node? The 2.x series is using PMIx and it could be that this is impacting the PML sanity check. Howard Orion Poplawski wrote on Mon, 27 Feb 2017 at 14:50: > We have a couple nodes with differen

Re: [OMPI users] MPI_THREAD_MULTIPLE: Fatal error on MPI_Win_create

2017-02-18 Thread Howard Pritchard
Hi Joseph, What OS are you using when running the test? Could you try running with export OMPI_MCA_osc=^pt2pt and export OMPI_MCA_osc_base_verbose=10 This error message was put in to this OMPI release because this part of the code has known problems when used multi-threaded. Joseph Schuchart

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Howard Pritchard
Hi Anastasia, Definitely check the mpirun when in batch environment but you may also want to upgrade to Open MPI 2.0.2. Howard r...@open-mpi.org wrote on Wed, 15 Feb 2017 at 07:49: > Nothing immediate comes to mind - all sbatch does is create an allocation > and then run your script in it.

Re: [OMPI users] OpenMPI not running any job on Mac OS X 10.12

2017-02-06 Thread Howard Pritchard
stead of ORTE_SUCCESS -- On Thu, Feb 2, 2017 at 12:29 PM, Howard Pritchard wrote: Hi Michel Try adding --enable-static to the configure. That fixed the problem for me. Howard Michel Lesoinne wrote on Wed, 1 Feb 2017

Re: [OMPI users] Open MPI over RoCE using breakout cable and switch

2017-02-03 Thread Howard Pritchard
> > > > > Hello Howard, > > Here is the error output after building with debug enabled. These CX4 > Mellanox cards view each port as a separate device and I am using port 1 on > the card which is device mlx5_0. > > > > Thank you, > > Brendan > > > >

Re: [OMPI users] OpenMPI not running any job on Mac OS X 10.12

2017-02-02 Thread Howard Pritchard
Hi Michel, Try adding --enable-static to the configure. That fixed the problem for me. Howard Michel Lesoinne wrote on Wed, 1 Feb 2017 at 19:07: > I have compiled OpenMPI 2.0.2 on a new Macbook running OS X 10.12 and have > been trying to run simple program. > I configured openmpi with > ../

Re: [OMPI users] OpenMPI not running any job on Mac OS X 10.12

2017-02-02 Thread Howard Pritchard
Hi Michel, I reproduced this problem on my Mac too: pn1249323:~/ompi/examples (v2.0.x *)$ mpirun -np 2 ./ring_c [pn1249323.lanl.gov:94283] mca_base_component_repository_open: unable to open mca_patcher_overwrite: File not found (ignored) [pn1249323.lanl.gov:94283] mca_base_component_repository

Re: [OMPI users] OpenMPI not running any job on Mac OS X 10.12

2017-02-02 Thread Howard Pritchard
Hi Michel, It's somewhat unusual to use the --disable-shared configure option. That may be causing this. Could you try to build without using this option and see if you still see the problem? Thanks, Howard Michel Lesoinne wrote on Wed, 1 Feb 2017 at 21:07: > I have compiled OpenMPI 2.0.2

Re: [OMPI users] Error using hpcc benchmark

2017-01-31 Thread Howard Pritchard
Hi Wodel, The RandomAccess part of HPCC is probably causing this. Perhaps set a PSM env. variable: export PSM_MQ_RECVREQS_MAX=1000 or something like that. Alternatively, launch the job using mpirun --mca pml ob1 --host to avoid use of psm. Performance will probably suffer with this option

Re: [OMPI users] Open MPI over RoCE using breakout cable and switch

2017-01-24 Thread Howard Pritchard
2017-01-23 8:23 GMT-07:00 Brendan Myers : > Hello Howard, > > Thank you for looking into this. Attached is the output you requested. > Also, I am using Open MPI 2.0.1. > > > > Thank you, > > Brendan > > > > *From:* users [mailto:users-boun...@lists.open-

Re: [OMPI users] Open MPI over RoCE using breakout cable and switch

2017-01-20 Thread Howard Pritchard
Hi Brendan, I doubt this kind of config has gotten any testing with OMPI. Could you rerun with --mca btl_base_verbose 100 added to the command line and post the output to the list? Howard Brendan Myers wrote on Fri, 20 Jan 2017 at 15:04: > Hello, > > I am attempting to get Open MPI to ru

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-09 Thread Howard Pritchard
Comm_spawn > [loki:13586] *** reported by process [2873294849,0] > [loki:13586] *** on communicator MPI_COMM_WORLD > [loki:13586] *** MPI_ERR_UNKNOWN: unknown error > [loki:13586] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will > now abort, > [loki:13586] ***a

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-08 Thread Howard Pritchard
Hi Siegmar, Could you post the configure options you use when building the 2.0.2rc3? Maybe that will help in trying to reproduce the segfault you are observing. Howard 2017-01-07 2:30 GMT-07:00 Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de>: > Hi, > > I have installed openmpi-2.0.2rc3 o

Re: [OMPI users] segmentation fault with openmpi-2.0.2rc2 on Linux

2017-01-03 Thread Howard Pritchard
[loki:05572] mca: base: close: component self closed > [loki:05572] mca: base: close: unloading component self > [loki:05572] mca: base: close: component tcp closed > [loki:05572] mca: base: close: unloading component tcp > loki spawn 125 > > > Kind regards and thank you v

Re: [OMPI users] segmentation fault with openmpi-2.0.2rc2 on Linux

2017-01-02 Thread Howard Pritchard
Hi Siegmar, I've attempted to reproduce this using gnu compilers and the version of this test program(s) you posted earlier in 2016 but am unable to reproduce the problem. Could you double check that the slave program can be successfully run when launched directly by mpirun/mpiexec? It might also

Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v

2016-12-23 Thread Howard Pritchard
Hi Paul, Thanks very much for the Christmas present. The Open MPI README has been updated to include a note about issues with the Intel 16.0.3-4 compiler suites. Enjoy the holidays, Howard 2016-12-23 3:41 GMT-07:00 Paul Kapinos : > Hi all, > > we discussed this issue with Intel compiler support and

Re: [OMPI users] device failed to appear .. Connection timed out

2016-12-08 Thread Howard Pritchard
ke sure you have the hfi1 module/driver loaded. >> >> In addition, please confirm the links are in active state on all the >> nodes `opainfo` >> >> >> >> _MAC >> >> >> >> *From:* users [mailto:users-boun...@lists.open-mpi.org] *On B

Re: [OMPI users] device failed to appear .. Connection timed out

2016-12-08 Thread Howard Pritchard
Hello Daniele, Could you post the output from the ompi_info command? I'm noticing on the RPMs that came with the rhel7.2 distro on one of our systems that it was built to support psm2/hfi-1. Two things, could you try running applications with mpirun --mca pml ob1 (all the rest of your args) and se

Re: [OMPI users] Follow-up to Open MPI SC'16 BOF

2016-11-22 Thread Howard Pritchard
Hi Jeff, I don't think it was the use of memkind itself, but a need to refactor the way Open MPI is using info objects that was the issue. I don't recall the details. Howard 2016-11-22 16:27 GMT-07:00 Jeff Hammond : > >> >>1. MPI_ALLOC_MEM integration with memkind >> >> It would sense to

[OMPI users] Follow-up to Open MPI SC'16 BOF

2016-11-22 Thread Howard Pritchard
whether to go with a v2.2.x release next year or to go from v2.1.x to v3.x in late 2017 or early 2018 at the link below: https://www.open-mpi.org/sc16/ Thanks very much, Howard -- Howard Pritchard HPC-DES Los Alamos National Laboratory ___ users mailing

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-18 Thread Howard Pritchard
Hi Christof, Thanks for trying out 2.0.1. Sorry that you're hitting problems. Could you try to run the tests using the 'ob1' PML in order to bypass PSM2? mpirun --mca pml ob1 (all the rest of the args) and see if you still observe the failures? Howard 2016-11-18 9:32 GMT-07:00 Christof Köhle

Re: [OMPI users] How to verify RDMA traffic (RoCE) is being sent over a fabric when running OpenMPI

2016-11-08 Thread Howard Pritchard
Hi Brenda, I should clarify as my response may confuse folks. We had configured the connectx4 cards to use ethernet/RoCE rather than IB transport for these measurements. Howard 2016-11-08 16:08 GMT-07:00 Howard Pritchard : > Hi Brenda, > > What type of ethernet device (is this a Mel

Re: [OMPI users] How to verify RDMA traffic (RoCE) is being sent over a fabric when running OpenMPI

2016-11-08 Thread Howard Pritchard
Hi Brenda, What type of ethernet device (is this a Mellanox HCA?) and ethernet switch are you using? The mpirun configure options look correct to me. Is it possible that you have all the mpi processes on a single node? It should be pretty obvious from the SendRecv IMB test if you're using RoCE.

Re: [OMPI users] how to tell if pmi or pmi2 is being used?

2016-10-13 Thread Howard Pritchard
Hi David, If you are using srun, you can export OMPI_MCA_pmix_base_verbose=10 and there will be output to show which SLURM pmi library you are using. Howard 2016-10-13 12:55 GMT-06:00 David Shrader : > That is really good to know. Thanks! > David > > > On 10/13/2016 12:27 PM, r...@open-mpi.org

Re: [OMPI users] Regression: multiple memory regions in dynamic windows

2016-08-25 Thread Howard Pritchard
Hi Joseph, Thanks for reporting this problem. There's an issue now (#2012) https://github.com/open-mpi/ompi/issues/2012 to track this. Howard 2016-08-25 7:44 GMT-06:00 Christoph Niethammer : > Hello, > > The Error is not 100% reproducible for me every time but seems to > disappear entirely i

Re: [OMPI users] shmem_init problem with v2.0.0

2016-07-28 Thread Howard Pritchard
Hello Chao, Could you send the output of ompi_info and your configure options to the mail list? Also, if you could describe your system that would be useful. Thanks for trying the 2.0.0 release. Howard 2016-07-28 11:53 GMT-06:00 Chao Liu : > Hi, > > I installed the latest version of openmpi,

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-07-08 Thread Howard Pritchard
Hi Gundram, Could you configure without the disable dlopen option and retry? Howard On Friday, 8 July 2016, Gilles Gouaillardet wrote: > the JVM sets its own signal handlers, and it is important openmpi does > not override them. > this is what previously happened with PSM (infinipath) but t

Re: [OMPI users] problem with exceptions in Java interface

2016-05-24 Thread Howard Pritchard
Hi Siegmar, Sorry for the delay, I seem to have missed this one. It looks like there's an error in the way the native methods are processing java exceptions. The code correctly builds up an exception message for cases where MPI 'c' returns non-success but not if the problem occurred in one of th

Re: [OMPI users] mpirun java

2016-05-23 Thread Howard Pritchard
0* > > * Env[4]: OMPI_MCA_orte_peer_init_barrier_id=1* > > * Env[5]: OMPI_MCA_orte_peer_fini_barrier_id=2* > > * Env[6]: TMPDIR=/var/folders/5t/6tqp003x4fn09fzgtx46tjdhgn/T/* > > * Env[7]: __CF_USER_TEXT_ENCODING=0x1F5:0x0:0x4* > > > What do you think? > >

Re: [OMPI users] mpirun java

2016-05-23 Thread Howard Pritchard
Hello Claudio, mpirun should be combining your java.library.path option with the one needed to add the Open MPI's java bindings as well. Which version of Open MPI are you using? Could you first try to compile the Ring.java code in ompi/examples and run it with the following additional mpirun par

Re: [OMPI users] libfabric verb provider for iWARP RNIC

2016-04-04 Thread Howard Pritchard
Hi Durga, I'd suggest reposting this to the libfabric-users mail list. You can join that list at http://lists.openfabrics.org/mailman/listinfo/libfabric-users I'd suggest including the output of config.log. If you installed ofed in non-canonical location, you may need to give an explicit path as

Re: [OMPI users] Java MPI Code for NAS Benchmarks

2016-03-11 Thread Howard Pritchard
Hello Saliya, Sorry I did not see this email earlier. There are a bunch of java test codes including performance tests like those used in the paper at https://github.com/open-mpi/ompi-java-test Howard 2016-02-27 23:01 GMT-07:00 Saliya Ekanayake : > Hi, > > I see this paper from Oscar refers to a J

Re: [OMPI users] Issues Building Open MPI static with Intel Fortran 16

2016-01-22 Thread Howard Pritchard
Hi Matt, If you don't need oshmem, you could try again with --disable-oshmem added to the config line Howard 2016-01-22 12:15 GMT-07:00 Matt Thompson : > All, > > I'm trying to duplicate an issue I had with ESMF long ago (not sure if I > reported it here or at ESMF, but...). It had been a whil

Re: [OMPI users] How to allocate more memory to java OpenMPI

2016-01-19 Thread Howard Pritchard
Hi Ibrahim, Are you using a 32bit or 64bit JVM? I don't think this is an Open MPI issue, but likely something owing to your app or your java setup. You may want to check out http://javaeesupportpatterns.blogspot.com/2012/09/outofmemoryerror-unable-to-create-new.html If you'd like to post the jav

Re: [OMPI users] problem with execstack and openmpi-v1.10.1-140-g31ff573

2016-01-14 Thread Howard Pritchard
Hi Siegmar, Would you mind posting your MsgSendRecvMain to the mail list? I'd like to see if I can reproduce it on my linux box. Thanks, Howard 2016-01-14 7:30 GMT-07:00 Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de>: > Hi, > > I've successfully built openmpi-v1.10.1-140-g31ff573 on

Re: [OMPI users] RMA operations with java buffers

2016-01-13 Thread Howard Pritchard
Hi Marko, You can probably find examples of what you'd like to do on github: https://github.com/open-mpi/ompi-java-test There are numerous MPI-2 RMA examples in the one-sided subdirectory. If you've never used github before, just click on the download as zip button in the upper right hand corner

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
ce for use either on edison or cori. Howard 2015-11-19 17:11 GMT-07:00 Howard Pritchard : > Hi Jeff H. > > Why don't you just try configuring with > > ./configure --prefix=my_favorite_install_dir > --with-libfabric=install_dir_for_libfabric > make -j 8 install > >

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
Hi Jeff H. Why don't you just try configuring with ./configure --prefix=my_favorite_install_dir --with-libfabric=install_dir_for_libfabric make -j 8 install and see what happens? Make sure before you configure that you have PrgEnv-gnu or PrgEnv-intel module loaded. Those were the configure/com

Re: [OMPI users] mpijavac doesn't compile any thing

2015-11-19 Thread Howard Pritchard
Hi Ibrahim, If you just try to compile with javac, do you at least see an "error: package mpi... does not exist" message? Adding the "-verbose" option may also help with diagnosing the problem. If javac doesn't get that far, then your problem is with the Java install. Howard 2015-11-19 6:45 GMT-

Re: [OMPI users] mpijavac doesn't compile any thing

2015-11-18 Thread Howard Pritchard
Hello Ibrahim, As a sanity check, could you try to compile the Hello.java in examples? mpijavac --verbose Hello.java You should see something like: /usr/bin/javac -cp /global/homes/h/hpp/ompi_install/lib/mpi.jar:/global/homes/h/hpp/ompi_install/lib/shmem.jar Hello.java You may also want to doub

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Howard Pritchard
gh the default is for Open MPI to use mtl/psm on that network. > > Please forgive my ignorance, the amount of different options is rather > overwhelming.. > > Marcin > > > > On 09/30/2015 04:26 PM, Howard Pritchard wrote: > > Hello Marcin > > What configure

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Howard Pritchard
Hello Marcin, What configure options are you using besides --with-libfabric? Could you post your config.log file to the list? Looks like you only install fi_ext_usnic.h if you could build the usnic libfabric provider. When you configured libfabric what providers were listed at the end of configure ru

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-15 Thread Howard Pritchard
abled > > Cheers, > > Gilles > > On Saturday, August 15, 2015, Howard Pritchard > wrote: > >> Hi Jeff, >> >> I don't know why Gilles keeps picking on the persistent request problem >> and mixing >> it up with this user bug. I do think fo

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-14 Thread Howard Pritchard
eatly surprised if > he had InfiniPath on his systems where he ran into this segv issue...? > > > > On Aug 14, 2015, at 1:08 PM, Howard Pritchard > wrote: > > > > Hi Gilles, > > > > Good catch! Nate we hadn't been testing on a infinipath system. >

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-14 Thread Howard Pritchard
/www.dropbox.com/sh/pds5c5wecfpb2wk/AAAcz17UTDQErmrUqp2SPjpqa?dl=0 > > *You can run it with and without MPI:* > > > java MPITestBroke data/ > > mpirun -np 1 java MPITestBroke data/ > > *Attached is a text file of what I see when I run it with mpirun and your > debug

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-13 Thread Howard Pritchard
see when I run it with mpirun and your > debug flag. Lots of debug lines.* > > > Nate > > > > > > On Wed, Aug 12, 2015 at 11:09 AM, Howard Pritchard > wrote: > >> Hi Nate, >> >> Sorry for the delay in getting back to you. >> >> W

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-12 Thread Howard Pritchard
jre > --enable-libgcj-multifile --enable-java-maintainer-mode > --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib > --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 > --build=x86_64-redhat-linux > Thread model: posix > gcc version 4.4.7 20120313

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-06 Thread Howard Pritchard
of data and does various > processing to it. > > Attached is a tweets.tgz file that you can uncompress to have an input > directory. The text file is just the same line over and over again. Run it > as: > > *java MPITestBroke tweets/* > > > Nate > > > >

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-05 Thread Howard Pritchard
an input > directory. The text file is just the same line over and over again. Run it > as: > > *java MPITestBroke tweets/* > > > Nate > > > > > > On Wed, Aug 5, 2015 at 8:29 AM, Howard Pritchard > wrote: > >> Hi Nate, >> >> Sorry

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-05 Thread Howard Pritchard
>> that's the proper way to do it. Attached is my config log. The behavior >>> when running our code appears to be the same. The output is the same error >>> I pasted in my email above. It occurs when calling MPI.init(). >>> >>> I'm not great at debug

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-04 Thread Howard Pritchard
nit(). > > I'm not great at debugging this sort of stuff, but happy to try things out > if you need me to. > > Nate > > > On Tue, Aug 4, 2015 at 5:09 AM, Howard Pritchard > wrote: > >> Hello Nate, >> >> As a first step to addressing this, cou

Re: [OMPI users] segfault on java binding from MPI.init()

2015-08-04 Thread Howard Pritchard
Hello Nate, As a first step to addressing this, could you please try using gcc rather than the Intel compilers to build Open MPI? We've been doing a lot of work recently on the java bindings, etc. but have never tried using any compilers other than gcc when working with the java bindings. Thanks

Re: [OMPI users] Running with native ugni on a Cray XC

2015-06-30 Thread Howard Pritchard
> Cray, Inc. > ------ > *From:* users [users-boun...@open-mpi.org] on behalf of Howard Pritchard [ > hpprit...@gmail.com] > *Sent:* Thursday, June 25, 2015 11:00 PM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] Running with native ugni on a Cray XC >
