[OMPI users] freeing attributes on communicators
Hello all. I'm using openmpi-1.3 in this example, linux, gcc-4.3.2, configured
with nothing special. If I run the following simple C code under valgrind,
single process, I get some errors about reading and writing already-freed
memory:

---
#include <stdio.h>
#include <mpi.h>

int delete_fn(MPI_Comm comm, int keyval, void *attr, void *extra)
{
    MPI_Keyval_free(&keyval);
    return 0;
}

int main(int argc, char **argv)
{
    MPI_Comm duped;
    int keyval;

    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_SELF, &duped);
    MPI_Keyval_create(MPI_NULL_COPY_FN, delete_fn, &keyval, NULL);
    MPI_Attr_put(MPI_COMM_SELF, keyval, NULL);
    MPI_Attr_put(duped, keyval, NULL);
    MPI_Comm_free(&duped);
    MPI_Finalize();
    return 0;
}
---

My main question here: Am I doing something wrong, or have I managed to
confuse openmpi's reference counts somehow?

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] MPI_File_write_ordered does not truncate files
On Wed, Feb 18, 2009 at 02:24:03PM -0700, Ralph Castain wrote:
> Hi Rob
>
> Guess I'll display my own ignorance here:
>
>>> MPI_File_open( MPI_COMM_WORLD, "foo.txt",
>>>                MPI_MODE_CREATE | MPI_MODE_WRONLY,
>>>                MPI_INFO_NULL, &fh );
>
> Since the file was opened with MPI_MODE_CREATE, shouldn't it have been
> truncated so the prior contents were removed? I think that's the root of
> the confusion here. It appears that MPI_MODE_CREATE doesn't cause the
> opened file to be truncated, but instead just leaves it "as-is".
>
> Is that correct?

"The modes MPI_MODE_RDONLY, MPI_MODE_RDWR, MPI_MODE_WRONLY, MPI_MODE_CREATE,
and MPI_MODE_EXCL have identical semantics to their POSIX counterparts."

MPI_MODE_CREATE behaves like O_CREAT. There is no MPI-IO flag corresponding
to O_TRUNC. Guess you'd have to call MPI_FILE_SET_SIZE after MPI_FILE_OPEN.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
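To illustrate that suggestion, here is a minimal sketch (the helper name and
variables are made up for the example): open with MPI_MODE_CREATE, then
explicitly truncate with the collective MPI_File_set_size call.

---
#include <mpi.h>

/* Open (creating if necessary) and then explicitly truncate the file,
 * since MPI_MODE_CREATE alone leaves any existing contents in place. */
void open_truncated(MPI_Comm comm, char *path, MPI_File *fh)
{
    MPI_File_open(comm, path,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, fh);
    /* collective: every process in comm must make this call */
    MPI_File_set_size(*fh, 0);
}
---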
Re: [OMPI users] MPI_Type_create_darray causes MPI_File_set_view to crash when ndims=2, array_of_gsizes[0]>array_of_gsizes[1]
On Fri, Oct 31, 2008 at 11:19:39AM -0400, Antonio Molins wrote:
> Hi again,
>
> The problem in a nutshell: it looks like, when I use
> MPI_Type_create_darray with an argument array_of_gsizes where
> array_of_gsizes[0]>array_of_gsizes[1], the datatype returned goes
> through MPI_Type_commit() just fine, but then it causes
> MPI_File_set_view to crash!! Any idea as to why this is happening?

Hi. It sounds from your description (and is confirmed by your backtrace)
like this is a ROMIO bug. Do you happen to have a small test case for this
crash?

Thanks
==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
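For reference, a minimal test case along the lines described might look like
the sketch below. This is not the original poster's program; the global
sizes, distribution arguments, and filename are illustrative only.

---
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    int gsizes[2]   = { 8, 4 };   /* gsizes[0] > gsizes[1] */
    int distribs[2] = { MPI_DISTRIBUTE_BLOCK, MPI_DISTRIBUTE_BLOCK };
    int dargs[2]    = { MPI_DISTRIBUTE_DFLT_DARG, MPI_DISTRIBUTE_DFLT_DARG };
    int psizes[2]   = { 0, 0 };
    MPI_Datatype filetype;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* factor the processes into a 2-D grid */
    MPI_Dims_create(nprocs, 2, psizes);

    MPI_Type_create_darray(nprocs, rank, 2, gsizes, distribs, dargs, psizes,
                           MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);   /* reportedly succeeds */

    MPI_File_open(MPI_COMM_WORLD, "darray-test.out",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    /* the reported crash happens in the set_view call */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
---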
Re: [OMPI users] ADIOI_GEN_DELETE
On Thu, Oct 23, 2008 at 12:41:45AM -0200, Davi Vercillo C. Garcia (ダヴィ) wrote:
> Hi,
>
> I'm trying to run a code using OpenMPI and I'm getting this error:
>
> ADIOI_GEN_DELETE (line 22): **io No such file or directory
>
> I don't know why this occurs, I only know this happens when I use more
> than one process.

Hey, sorry, I don't check in here very often, but I'm the "ROMIO guy" around
these parts.

This is a harmless warning message. You see this with more than one process
because one process "won" and deleted the file, and the other N-1 processes
then try to delete a file that doesn't exist.

If you ignore errors from MPI_File_delete, then you won't see this error :>

MPI_FILE_DELETE is not a collective operation, so you can also just call
this from one process.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
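A sketch of that last suggestion (the helper name and the trailing barrier
are my own additions): have a single rank perform the non-collective delete
so the other N-1 processes never see the "No such file or directory" warning.

---
#include <mpi.h>

/* MPI_File_delete is not collective, so let one rank do the cleanup. */
void delete_once(char *path, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);
    if (rank == 0)
        MPI_File_delete(path, MPI_INFO_NULL);
    /* optional: only needed if later code assumes the file is already gone */
    MPI_Barrier(comm);
}
---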
Re: [OMPI users] bug in MPI_File_get_position_shared ?
On Sat, Aug 16, 2008 at 08:05:14AM -0400, Jeff Squyres wrote:
> On Aug 13, 2008, at 7:06 PM, Yvan Fournier wrote:
>
>> I seem to have encountered a bug in MPI-IO, in which
>> MPI_File_get_position_shared hangs when called by multiple processes in
>> a communicator. It can be illustrated by the following simple test case,
>> in which a file is simply created with C IO, and opened with MPI-IO.
>> (defining or undefining MY_MPI_IO_BUG on line 5 enables/disables the
>> bug). From the MPI2 documentation, it seems that all processes should be
>> able to call MPI_File_get_position_shared, but if more than one process
>> uses it, it fails. Setting the shared pointer helps, but this should not
>> be necessary, and the code still hangs (in more complete code, after
>> writing data).
>>
>> I encounter the same problem with Open MPI 1.2.6 and MPICH2 1.0.7, so
>> I may have misread the documentation, but I suspect a ROMIO bug.
>
> Bummer. :-(
>
> It would be best to report this directly to the ROMIO maintainers via
> romio-ma...@mcs.anl.gov. They lurk on this list, but they may not be
> paying attention to every mail.

Hi, that would be me, and yup, as you can see I don't check in too often.

Just to wrap this up, I'm glad you found workarounds. Shared file pointers
have a certain seductive quality about them, but they are a pain in the neck
to implement in the library. You will almost assuredly scale to larger
numbers of processors and achieve higher I/O bandwidth if you do just a
little bit of work: keep track of file offsets on your own and, instead of
doing independent I/O, use MPI_File_read_at_all or MPI_File_write_at_all.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
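As a rough sketch of the suggested alternative (the layout assumed here, one
contiguous block of doubles per rank, is purely for illustration): each
process computes its own offset and uses the collective explicit-offset call
instead of a shared file pointer.

---
#include <mpi.h>

/* Each rank writes 'count' doubles at an offset derived from its rank,
 * rather than relying on the shared file pointer. */
void write_block(MPI_File fh, int rank, const double *buf, int count)
{
    MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);
    MPI_File_write_at_all(fh, offset, (void *)buf, count, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
}
---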
Re: [OMPI users] Parallel I/O with MPI-1
On Wed, Jul 23, 2008 at 09:47:56AM -0400, Robert Kubrick wrote:
> HDF5 supports parallel I/O through MPI-I/O. I've never used it, but I
> think the API is easier than direct MPI-I/O, maybe even easier than raw
> read/writes given its support for hierarchical objects and metadata.

In addition to the API provided by parallel HDF5 and parallel-NetCDF, these
high-level libraries offer a self-describing portable file format. Pretty
nice when collaborating with others. Plus there are a host of viewers for
these file formats, so that's another thing you don't have to worry about.

> HDF5 supports multiple storage models and it supports MPI-IO.
> HDF5 has an open interface to access raw storage. This enables HDF5
> files to be written to a variety of media, including sequential files,
> families of files, memory, Unix sockets (i.e., a network).
> New "Virtual File" drivers can be added to support new storage access
> mechanisms.
> HDF5 also supports MPI-IO with Parallel HDF5. When building HDF5,
> parallel support is included by configuring with the --enable-parallel
> option. A tutorial for Parallel HDF5 is included with the HDF5 Tutorial
> at:
> /HDF5/Tutor/

It's a very good tutorial. Do read the parallel I/O chapter closely,
especially the parts about enabling collective I/O via property lists and
transfer templates. For many HDF5 workloads today, collective I/O is the key
to getting good performance (this was not always the case back in the bad
old days of MPICH1 and LAM, but has been since at least the HDF5-1.6 series).

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
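For reference, requesting collective transfers in parallel HDF5 looks roughly
like the sketch below (a sketch assuming the H5Pset_fapl_mpio /
H5Pset_dxpl_mpio property-list interface; dataset creation and the actual
H5Dwrite call are omitted, and the helper name is made up).

---
#include <hdf5.h>
#include <mpi.h>

/* Create a file for parallel access and build a transfer property list
 * that asks for collective I/O on subsequent H5Dread/H5Dwrite calls. */
hid_t create_parallel_file(const char *filename, MPI_Comm comm,
                           hid_t *dxpl_out)
{
    /* file access property list: use the MPI-IO virtual file driver */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
    hid_t file = H5Fcreate(filename, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);

    /* dataset transfer property list: request collective transfers */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    *dxpl_out = dxpl;

    return file;
}
---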
Re: [OMPI users] Parallel I/O with MPI-1
On Wed, Jul 23, 2008 at 02:24:03PM +0200, Gabriele Fatigati wrote:
> > You could always effect your own parallel IO (e.g., use MPI sends and
> > receives to coordinate parallel reads and writes), but why? It's already
> > done in the MPI-IO implementation.
>
> Just a moment: you're saying that I can do fwrite without any lock?
> OpenMPI does this?

You use MPI to describe your I/O regions. In fact, these I/O regions can
even overlap (something that you can't do efficiently with lock-based
approaches). Even better, if you do your I/O "collectively", the MPI library
will optimize the heck out of your accesses.

When I was learning all this way back when, it took me a long time to get
all the details straight (memory types, file views, tiling, independent vs.
collective), but a few readings of the I/O chapter of "Using MPI-2" set me
straight.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
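A small sketch of what "describing your I/O regions" and writing collectively
can look like (the one-contiguous-slice-per-rank layout and the helper name
are assumptions made for the example):

---
#include <mpi.h>

/* Each rank owns a contiguous slice of a 1-D array of doubles. The file
 * view selects that slice; the write is collective, so the MPI-IO layer can
 * merge and reorder the requests without any application-level locking. */
void write_slice(MPI_File fh, int rank, const double *buf, int ndoubles)
{
    MPI_Offset disp = (MPI_Offset)rank * ndoubles * sizeof(double);
    MPI_File_set_view(fh, disp, MPI_DOUBLE, MPI_DOUBLE, "native",
                      MPI_INFO_NULL);
    MPI_File_write_all(fh, (void *)buf, ndoubles, MPI_DOUBLE,
                       MPI_STATUS_IGNORE);
}
---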
Re: [OMPI users] Problem with NFS + PVFS2 + OpenMPI
On Thu, May 29, 2008 at 04:48:49PM -0300, Davi Vercillo C. Garcia wrote: > > Oh, I see you want to use ordered i/o in your application. PVFS > > doesn't support that mode. However, since you know how much data each > > process wants to write, a combination of MPI_Scan (to compute each > > processes offset) and MPI_File_write_at_all (to carry out the > > collective i/o) will give you the same result with likely better > > performance (and has the nice side effect of working with pvfs). > > I don't understand very well this... what do I need to change in my code ? MPI_File_write_ordered has an interesting property (which you probably know since you use it, but i'll spell it out anyway): writes end up in the file in rank-order, but are not necessarily carried out in rank-order. Once each process knows the offsets and lengths of the writes the other process will do, that process can writes its data. Observe that rank 0 can write immediately. Rank 1 only needs to know how much data rank 0 will write. and so on. Rank N can compute its offset by knowing how much data the proceeding N-1 processes want to write. The most efficent way to collect this is to use MPI_Scan and collect a sum of data: http://www.mpi-forum.org/docs/mpi-11-html/node84.html#Node84 Once you've computed these offsets, MPI_File_write_at_all has enough information to cary out a collective write of the data. ==rob -- Rob Latham Mathematics and Computer Science DivisionA215 0178 EA2D B059 8CDF Argonne National Lab, IL USA B29D F333 664A 4280 315B
Re: [OMPI users] Problem with NFS + PVFS2 + OpenMPI
On Thu, May 29, 2008 at 04:24:18PM -0300, Davi Vercillo C. Garcia wrote:
> Hi,
>
> I'm trying to run my program in my environment and some problems are
> happening. My environment is based on PVFS2 over NFS (PVFS is mounted
> over NFS partition), OpenMPI and Ubuntu. My program uses MPI-IO and
> BZ2 development libraries. When I tried to run, a message appears:
>
> File locking failed in ADIOI_Set_lock. If the file system is NFS, you
> need to use NFS version 3, ensure that the lockd daemon is running on
> all the machines, and mount the directory with the 'noac' option (no
> attribute caching).
> [campogrande05.dcc.ufrj.br:05005] MPI_ABORT invoked on rank 0 in
> communicator MPI_COMM_WORLD with errorcode 1
> mpiexec noticed that job rank 1 with PID 5008 on node campogrande04
> exited on signal 15 (Terminated).

Hi. NFS has some pretty sloppy consistency semantics. If you want parallel
I/O to NFS, you have to turn off some caches (the 'noac' option in your
error message) and work pretty hard to flush client-side caches (which ROMIO
does for you using fcntl locks). If you do this, note that your performance
will be really bad, but you'll get correct results.

Your NFS-exported PVFS volumes will give you pretty decent serial i/o
performance, since NFS caching only helps in that case.

I'd suggest, though, that you try using straight PVFS for your MPI-IO
application, as long as the parallel clients have access to all of the pvfs
servers (if tools like pvfs2-ping and pvfs2-ls work, then you do). You'll
get better performance for a variety of reasons and can continue to keep
your NFS-exported PVFS volumes up at the same time.

Oh, I see you want to use ordered i/o in your application. PVFS doesn't
support that mode. However, since you know how much data each process wants
to write, a combination of MPI_Scan (to compute each process's offset) and
MPI_File_write_at_all (to carry out the collective i/o) will give you the
same result with likely better performance (and has the nice side effect of
working with pvfs).

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] MPI_Request and attributes
On Fri, Nov 02, 2007 at 12:18:54PM +0100, Oleg Morajko wrote:
> Is there any standard way of attaching/retrieving attributes to MPI_Request
> object?
>
> Eg. Typically there are dynamic user data created when starting the
> asynchronous operation and freed when it completes. It would be convenient
> to associate them with the request object itself to simplify the code.

You might find that generalized requests offer what you want, if you don't
mind spawning threads. You don't get to hook an attribute onto an
MPI_Request object, but you do get a void * pointer which the implementation
then associates with your user-defined request. This void * could point to a
structure containing the original MPI_Request and your user data.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
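A skeleton of the generalized-request idea (the struct, helper names, and
trivial callback bodies below are illustrative only; a real implementation
would have a progress thread call MPI_Grequest_complete when the underlying
operation finishes):

---
#include <stdlib.h>
#include <mpi.h>

/* The extra_state pointer handed to MPI_Grequest_start stands in for the
 * "attribute" you would like to hang off the request. */
struct req_state {
    MPI_Request inner;      /* the original request */
    void       *user_data;  /* whatever you want associated with it */
};

static int query_fn(void *extra_state, MPI_Status *status)
{
    MPI_Status_set_elements(status, MPI_BYTE, 0);
    MPI_Status_set_cancelled(status, 0);
    status->MPI_SOURCE = MPI_UNDEFINED;
    status->MPI_TAG = MPI_UNDEFINED;
    return MPI_SUCCESS;
}

static int free_fn(void *extra_state)
{
    free(extra_state);
    return MPI_SUCCESS;
}

static int cancel_fn(void *extra_state, int complete)
{
    return MPI_SUCCESS;
}

MPI_Request start_tracked(MPI_Request inner, void *user_data)
{
    struct req_state *st = malloc(sizeof(*st));
    st->inner = inner;
    st->user_data = user_data;

    MPI_Request greq;
    MPI_Grequest_start(query_fn, free_fn, cancel_fn, st, &greq);
    /* elsewhere: a thread calls MPI_Grequest_complete(greq) when done */
    return greq;
}
---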
Re: [OMPI users] mpiio romio etc
On Fri, Sep 14, 2007 at 02:31:51PM -0400, Jeff Squyres wrote:
> Ok. Maybe we'll just make a hard-coded string somewhere "ROMIO from
> MPICH2 vABC, on AA/BB/" or somesuch. That'll at least give some
> indication of what version you've got.

That sort-of reminds me: ROMIO (well, all of MPICH2) is going to move to SVN
"one of these days". Once we've done that, you'll be able to sync up with
both MPICH2 releases and our development branch. I think it wouldn't be a
problem for us to tag ROMIO whenever you sync up with it.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] mpiio romio etc
On Fri, Sep 14, 2007 at 02:16:46PM -0400, Jeff Squyres wrote:
> Rob -- is there a public constant/symbol somewhere where we can
> access some form of ROMIO's version number? If so, we can also make
> that query-able via ompi_info.

There really isn't. We used to have a VERSION variable in configure.in, but
more often than not it would be out of date. When you sync with ROMIO, you
could update a datestamp maybe? Just throwing out ideas.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] mpiio romio etc
On Fri, Sep 07, 2007 at 10:18:55AM -0400, Brock Palen wrote:
> Is there a way to find out which ADIO options romio was built with?

Not easily. You can use 'nm' and look at the symbols :>

> Also does OpenMPI's romio come with pvfs2 support included? What
> about Lustre or GPFS.

OpenMPI has shipped with PVFS v2 support for a long time. Not sure how you
enable it, though. --with-filesystems=ufs+nfs+pvfs2 might work for OpenMPI
as it does for MPICH2.

All versions of ROMIO support Lustre and GPFS the same way: with the
"generic unix filesystem" (UFS) driver. Weikuan Yu at ORNL has been working
on a native "AD_LUSTRE" driver and some improvements to ROMIO collective
I/O; these are likely to be in the next ROMIO release.

For GPFS, the only optimized MPI-IO implementation is IBM's MPI for AIX.
You're likely to see decent performance with the UFS driver, though.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] DataTypes with "holes" for writing files
On Tue, Jul 10, 2007 at 04:36:01PM +, jody wrote:
> Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
> [aim-nano_02:9] MPI_ABORT invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1

Hi Jody:
OpenMPI uses an old version of ROMIO. You get this error because the
ADIOI_Count_contiguous_blocks routine in this version of ROMIO does not
understand all MPI datatypes. You can verify that this is the case by
building your test against MPICH2, which should succeed.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] nfs romio
On Mon, Jul 02, 2007 at 12:49:27PM -0500, Adams, Samuel D Contr AFRL/HEDR wrote:
> Anyway, so if anyone can tell me how I should configure my NFS server,
> or OpenMPI to make ROMIO work properly, I would appreciate it.

Well, as Jeff said, the only safe way to run NFS servers for ROMIO is by
disabling all caching, which in turn will dramatically slow down
performance. Since NFS is performing so slowly for you, I'd suggest taking
this opportunity to deploy a parallel file system. PVFS, Lustre, or GPFS
might make fine choices.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] MPI_Type_create_subarray fails!
On Tue, Jan 30, 2007 at 04:55:09PM -0500, Ivan de Jesus Deras Tabora wrote:
> Then I find all the references to the MPI_Type_create_subarray and
> create a little program just to test that part of the code, the code I
> created is:
...
> After running this little program using mpirun, it raises the same error.

This small program runs fine under MPICH2. Either you have found a bug in
OpenMPI (it was passed a datatype it should be able to handle), or a bug in
MPICH2 (it accepted a datatype it should have complained about).

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] external32 i/o not implemented?
On Tue, Jan 09, 2007 at 02:53:24PM -0700, Tom Lund wrote:
> Rob,
>    Thank you for your informative reply. I had no luck finding the
> external32 data representation in any of several mpi implementations and
> thus I do need to devise an alternative strategy. Do you know of a good
> reference explaining how to combine HDF5 with mpi?

Sure. Start here:

http://hdf.ncsa.uiuc.edu/HDF5/PHDF5/

You might also find the Parallel-NetCDF project (disclaimer: I work on that
project) interesting:

http://www.mcs.anl.gov/parallel-netcdf/

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] external32 i/o not implemented?
On Mon, Jan 08, 2007 at 02:32:14PM -0700, Tom Lund wrote:
> Rainer,
>    Thank you for taking time to reply to my query. Do I understand
> correctly that external32 data representation for i/o is not
> implemented? I am puzzled since the MPI-2 standard clearly indicates
> the existence of external32 and has lots of words regarding how nice
> this feature is for file interoperability. So do both Open MPI and
> MPIch2 not adhere to the standard in this regard? If this is really the
> case, how difficult is it to define a custom data representation that is
> 32-bit big endian on all platforms? Do you know of any documentation
> that explains how to do this?
>    Thanks again.

Hi Tom. You do understand correctly: I do not know of an MPI-IO
implementation that supports external32.

When you say "custom data representation", do you mean an MPI-IO
user-defined data representation? An alternate approach would be to use a
higher-level library like parallel-netcdf or HDF5 (configured for parallel
I/O). Those libraries already define a file format and implement all the
necessary data conversion routines, and they have a wealth of ancillary
tools and programs to work with their respective file formats.

Additionally, those higher-level libraries will offer you more features than
MPI-IO, such as the ability to define attributes on variables and datafiles.
Even better, there is the potential that these libraries might offer some
clever optimizations for your workload, saving you the effort. Further, you
can use those higher-level libraries on top of any MPI-IO implementation,
not just OpenMPI or MPICH2.

This is a little bit of a diversion from your original question, but to sum
it up, I'd say one potential answer to the lack of external32 is to use a
higher-level library and sidestep the issue of MPI-IO data representations
altogether.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
Re: [OMPI users] pvfs2 and romio
On Mon, Aug 14, 2006 at 10:57:34AM -0400, Brock Palen wrote:
> We will be evaluating pvfs2 (www.pvfs.org) in the future. Are there
> any special considerations to take to get romio support with openmpi
> with pvfs2?

Hi. Since I wrote the ad_pvfs2 driver for ROMIO, and spend a lot of time on
PVFS2 in general, I've got a special interest in this thread :>

I hope your evaluation went well. I don't know how well the PVFS2 support in
OpenMPI has tracked "upstream". The last official ROMIO release was 1.2.5.1
(and that was ... gosh, 3 years ago at least. sorry!). In the meantime,
ROMIO's PVFS2 driver has seen a lot of changes. The two codes (ROMIO in
OpenMPI vs. ROMIO in MPICH2) are laid out differently enough that it's hard
to compare directly (too bad 'diff' isn't smarter about renamed files), but
I think OpenMPI has got at least the biggest bug fixes.

Do follow up on this thread to let us (or at least me) know how well OpenMPI
works with PVFS2. If you run into problems, I may be able to provide a
patch.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
Re: [OMPI users] MPI_Info for MPI_Open_port
On Tue, Jul 11, 2006 at 12:14:51PM -0400, Abhishek Agarwal wrote:
> Hello,
>
> Is there a way of providing a specific port number in MPI_Info when using a
> MPI_Open_port command so that clients know which port number to connect.

The other replies have covered this pretty well, but if you are dead-set on
using a tcp port (and not an MPI port), would MPI_Comm_join work for you?

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
[OMPI users] MPI_Comm_connect and singleton init
Hello

In playing around with process management routines, I found another issue.
This one might very well be operator error, or something implementation
specific.

I've got two processes (a and b), linked with openmpi, but started
independently (no mpiexec).

- A starts up and calls MPI_Init.
- A calls MPI_Open_port, prints out the port name to stdout, then calls
  MPI_Comm_accept and blocks.
- B takes as a command line argument the port name printed out by A. It
  calls MPI_Init and then passes that port name to MPI_Comm_connect.
- B gets the following error:

[leela.mcs.anl.gov:04177] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file
../../../orte/dps/dps_unpack.c at line 121
[leela.mcs.anl.gov:04177] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file
../../../orte/dps/dps_unpack.c at line 95
[leela.mcs.anl.gov:04177] *** An error occurred in MPI_Comm_connect
[leela.mcs.anl.gov:04177] *** on communicator MPI_COMM_WORLD
[leela.mcs.anl.gov:04177] *** MPI_ERR_UNKNOWN: unknown error
[leela.mcs.anl.gov:04177] *** MPI_ERRORS_ARE_FATAL (goodbye)
[leela.mcs.anl.gov:04177] [0,0,0] ORTE_ERROR_LOG: Not found in file
../../../../../orte/mca/pls/base/pls_base_proxy.c at line 183

- A is still waiting for someone to connect to it.

Did I pass MPI port strings between programs the correct way, or is
MPI_Publish_name/MPI_Lookup_name the preferred way to pass around this
information?

Thanks
==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
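For reference, the accept/connect pattern being exercised, as a sketch (the
helper names are mine; the real programs pass the port string via stdout and
argv as described above):

---
#include <stdio.h>
#include <mpi.h>

/* Process A: open a port, publish it on stdout, and wait for a connection. */
void server(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Open_port(MPI_INFO_NULL, port);
    printf("%s\n", port);          /* hand this string to B out of band */
    fflush(stdout);
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
}

/* Process B: connect using the port string supplied on the command line. */
void client(char *port)
{
    MPI_Comm inter;
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
}
---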
[OMPI users] comm_join and singleton init
Hi

I've got a bit of an odd bug here. I've been playing around with MPI process
management routines and I noticed the following behavior with openmpi-1.0.1:

Two processes (a and b), linked with ompi, but started independently (no
mpiexec, just started the programs directly).

- a and b: call MPI_Init
- a: open a unix network socket on 'fd'
- b: connect to a's socket
- a and b: call MPI_Comm_join over 'fd'
- a and b: call MPI_Intercomm_merge, get an intracommunicator

These steps all work fine. Now the odd part: a and b call MPI_Comm_rank and
MPI_Comm_size on the intracommunicator. Both (correctly) think Comm_size is
two, but both also think (incorrectly) that they are rank 1.

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
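The sequence in question, as a sketch (socket setup omitted; 'fd' is the
connected socket, and the 'high' argument to MPI_Intercomm_merge is chosen
arbitrarily here, which is an assumption on my part):

---
#include <mpi.h>

/* Both sides already hold a connected socket descriptor 'fd'. */
void join_and_merge(int fd)
{
    MPI_Comm inter, intra;
    int rank, size;

    MPI_Comm_join(fd, &inter);              /* build an intercommunicator */
    MPI_Intercomm_merge(inter, 0, &intra);  /* merge into an intracomm    */

    MPI_Comm_size(intra, &size);            /* reports 2, as expected     */
    MPI_Comm_rank(intra, &rank);            /* bug: both sides report 1   */
}
---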