UCX will disqualify itself unless it finds CUDA, ROCm, or an InfiniBand network to
use. To allow UCX to run on a regular shared-memory job without GPUs or IB, you
have to set the UCX_TLS environment variable explicitly to allow UCX to use
shared memory, e.g.:
mpirun -x UCX_T
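A complete invocation might look like the following sketch (the application name and process count are placeholders; sm and self are the UCX transports covering single-node shared memory, and the exact list can differ per system):
mpirun -x UCX_TLS=sm,self -np 4 ./my_app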
There was work done in ompio in that direction, but the code wasn't actually
committed into the main repository. It probably still exists in a branch
somewhere. If you are interested, please ping me directly and I can put you in
contact with the person who wrote the code and to clarify the
I can also offer to help if there are any questions regarding the ompio code,
but I do not have the bandwidth/resources to do that myself, and more
importantly, I do not have a platform to test the new component.
Edgar
From: users On Behalf Of Jeff Squyres
(js
Hi,
There are a few things that you could test to see whether they make a difference.
1. Try to modify the number of aggregators used in collective I/O (assuming
that the code uses collective I/O); see the example below. You could try, e.g.,
setting it to the number of nodes used (the algorithm determining the number
is the file I/O that you mentioned using MPI I/O for that? If yes, what
file system are you writing to?
Edgar
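As a sketch of item 1 above, the aggregator count can be forced from the command line (the application name and counts are placeholders, and io_ompio_num_aggregators is my assumption for the ompio MCA parameter name, which may differ between releases):
mpirun --mca io_ompio_num_aggregators 8 -np 256 ./my_app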
On 4/5/2018 10:15 AM, Noam Bernstein wrote:
On Apr 5, 2018, at 11:03 AM, Reuti wrote:
Hi,
Am 05.04.2018 um 16:16 schrieb Noam Bernstein :
Hi all - I have a code that uses MPI (va
ers,
Vahid
On Jan 22, 2018, at 6:05 PM, Edgar Gabriel <egabr...@central.uh.edu> wrote:
well, my final comment on this topic: as somebody suggested earlier
in this email chain, if you provide the input with the -i argument
instead of piping from standard input, things seem to work (the
application does not complain about the 'end of file while
reading crystal k points' error). So maybe that is the simplest solution.
Thanks
Edgar
On 1/22/2018 1:17 PM, Edgar Gabriel wrote:
after some further investigation, I am fairly confident that this is
not an MPI I/O problem.
uld be another bug. It does not
happen for me with intel14/openmpi-1.8.8.
Thanks for the update,
Vahid
On Jan 19, 2018, at 3:08 PM, Edgar Gabriel <egabr...@central.uh.edu> wrote:
ok, here is what I found out so far; I will have to stop here for today,
however:
1. I can in fact r
this is most likely a different issue. The bug in the original case appears
also on a local file system/disk; it doesn't have to be NFS.
That being said, I would urge you to submit a new issue (or a new email
thread); I would be more than happy to look into your problem as well,
since we sub
:44 AM, Vahid Askarpour wrote:
Hi Edgar,
Just to let you know that the nscf run with --mca io ompio crashed
like the other two runs.
Thank you,
Vahid
On Jan 19, 2018, at 12:46 PM, Edgar Gabriel <egabr...@central.uh.edu> wrote:
ok, thank you for the information. Two short questions
; and then “make
all install”. I did not enable or disable any other options.
Cheers,
Vahid
On Jan 19, 2018, at 10:23 AM, Edgar Gabriel <egabr...@central.uh.edu> wrote:
thanks, that is interesting. Since /scratch is a lustre file system,
Open MPI should actually utilize rom
why the latest Open MPI is crashing.
Cheers,
Gilles
On Fri, Jan 19, 2018 at 10:55 AM, Edgar Gabriel wrote:
I will try to reproduce this problem with 3.0.x, but it might take me a
couple of days to get to it.
Since it seemed to have worked with 2.0.x (except for the running out of file
handles problem), there is the suspicion that one of the fixes that we
introduced since then is the problem.
What file
the bug report!
Edgar
On 11/6/2017 8:25 AM, Edgar Gabriel wrote:
I'll have a look at it. I can confirm that I can replicate the problem,
and I do not see an obvious mistake in your code for 1 process
scenarios. Will keep you posted.
Thanks
Edgar
On 11/6/2017 7:52 AM, Christopher Brady
rateful.
Many Thanks
Chris Brady
Senior Research Software Engineer
University of Warwick
I'll have a look at it. I can confirm that I can replicate the problem,
and I do not see an obvious mistake in your code for 1 process
scenarios. Will keep you posted.
Thanks
Edgar
On 11/6/2017 7:52 AM, Christopher Brady wrote:
I have been working with a Fortran code that has to write a la
I opened an issue on this, hope to have the fix available next week.
https://github.com/open-mpi/ompi/issues/4334
Thanks
Edgar
On 10/12/2017 8:36 PM, Edgar Gabriel wrote:
try for now to switch to the romio314 component with OpenMPI. There is
an issue with NFS and OMPIO that I am aware of and working on, that
might trigger this behavior (although it should actually work for
collective I/O even in that case).
try to set something like
mpirun --mca io romio314 ..
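The same selection can also be made through the environment rather than the command line, e.g. (a sketch assuming a bash-like shell and a placeholder application name):
export OMPI_MCA_io=romio314
mpirun -np 16 ./my_app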
thank you for the report and the code, I will look into this. What file
system is that occurring on?
Until I find the problem, note that you could switch back to the
previous parallel I/O implementation (romio) by providing that as a
parameter to your mpirun command, e.g.
mpirun --mca io
I would argue that if the standard does not mention NULL as a valid
argument, we should probably remove it from the man pages. To be honest,
I cannot recall having seen code using that feature.
Thanks
Edgar
On 5/11/2017 8:46 PM, Bert Wesarg via users wrote:
Hi,
the MPI_File_set_view.3 man
narrow it down.
On 4/28/2017 8:46 AM, Edgar Gabriel wrote:
actually, reading through the email in more detail, I doubt
that it is OMPIO.
"I deem an output wrong if it doesn't follow from the parameters or if
the program crashes on execution.
The only difference between O
different behavior on error: OpenMPI will mostly write
wrong data but won't crash, whereas Intel MPI mostly crashes."
I will still look into that though.
Thanks
Edgar
On 4/28/2017 8:26 AM, Edgar Gabriel wrote:
Thank you for the detailed analysis, I will have a look into that. It
would
Can you try to just skip the --with-lustre option? The option really is
there to provide an alternative path if the Lustre libraries are not
installed in the default directories (e.g.
--with-lustre=/opt/lustre/). There is obviously a bug in that the
system did not recognize the missing arg
thank you for the report, it is on my to-do list. I will try to get the
configure logic to recognize which file to use later on; this should
hopefully be done for the 2.0.3 and 2.1.1 series.
Thanks
Edgar
On 3/13/2017 8:55 AM, Åke Sandgren wrote:
Hi!
The lustre support in ompi/mca/fs/lustre/fs_lus
Edgar Gabriel <egabr...@central.uh.edu> wrote on Fri., March 3, 2017 at 07:45:
Nicolas,
thank you for the bug report, I can confirm the behavior. I will work on
a patch and will try to get that into the next release, should hopefully
not be too complicated.
Thanks
Edgar
On 3/3/2017 7:36 AM, Nicolas Joly wrote:
Hi,
We just got hit by a problem with sharedfp/lockedfi
just wanted to give a brief update on this. The problem was in fact that
we did not correctly move the shared file pointer to the end of the file
when a file is opened in append mode. (The individual file pointer did
the right thing, however.) The patch itself is not overly complicated; I
filed a
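For reference, a minimal sketch of the pattern under discussion (the file name is a placeholder): every rank appends one line through the shared file pointer of a file opened in append mode, which only works correctly if that pointer starts at the end of the file.

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    char line[64];
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Open in append mode; the shared file pointer should start at EOF. */
    MPI_File_open(MPI_COMM_WORLD, "log.txt",
                  MPI_MODE_WRONLY | MPI_MODE_CREATE | MPI_MODE_APPEND,
                  MPI_INFO_NULL, &fh);

    snprintf(line, sizeof(line), "message from rank %d\n", rank);
    /* Each rank appends through the shared file pointer. */
    MPI_File_write_shared(fh, line, (int)strlen(line), MPI_CHAR,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}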
I will look into this; I have a suspicion about what might be wrong. Give
me a day or three.
Thanks
Edgar
On 1/18/2017 9:36 AM, Nicolas Joly wrote:
Hi,
We have a tool where all workers will use MPI_File_write_shared() on a
file that was opened with MPI_MODE_APPEND, mostly because rank 0 will
h
arily means it has a MPI_Barrier()
semantic.
Cheers,
Gilles
On 5/31/2016 11:18 PM, Edgar Gabriel wrote:
just for my understanding, which bug in ompio are you referring to? I am
only aware of a single (pretty minor) pending issue in the 2.x series
Thanks
Edgar
On 5/31/2016 1:28 AM, Gille
'm missing or is this a regression?
Thanks,
Cihan
What version of Open MPI did you execute your test with?
Edgar
On 4/7/2016 1:54 PM, david.froger...@mailoo.org wrote:
Hello,
Here is a simple `C` program reading a file in parallel with `MPI IO`:
#include
#include
#include "mpi.h"
#define N 10
main( int argc,
On 3/16/2016 7:06 AM, Éric Chamberland wrote:
On 2016-03-14 15:07, Rob Latham wrote:
On mpich's discussion list the point was made that libraries like HDF5
and (Parallel-)NetCDF provide not only the sort of platform
portability Eric desires, but also provide a self-describing file format.
==ro
directly calls the io/ompio component) and that was
fixed in master.
Can you confirm that?
Delphine,
The problem should disappear if you use romio instead of ompio
Cheers,
Gilles
On Wednesday, February 10, 2016, Edgar Gabriel wrote:
which version of Open MPI is this?
Thanks
Ed
advance for your help.
Regards
Delphine
--
Delphine Ramalingom Barbary | Scientific Computing Engineer (http://www.univ-reunion.fr)
Direction des Usages du Numérique (DUN)
Centre de Développement du Calcul Scientifique
TEL : 02 62 93 84 87- FAX : 02 62 93 81 06
used by ompio.
Cheers,
Gilles
On Tuesday, January 26, 2016, Edgar Gabriel <egabr...@central.uh.edu> wrote:
I can look into that, but just as a note, that code has been in master in
*all* fs components for roughly 5 years, so it's not necessarily new (it
just shows how often we compile with Solaris). Based on what I see in
opal/util/path.c, the function opal_path_nfs does something very
similar, but
offer,
that I can run the tests if you provide me the source code).
Thanks
Edgar
On 12/9/2015 9:30 AM, Edgar Gabriel wrote:
what does the mount command return?
On 12/9/2015 9:27 AM, Paul Kapinos wrote:
Dear Edgar,
On 12/09/15 16:16, Edgar Gabriel wrote:
I tested your code in master and v1.10
ok, forget it, I found the issue. I totally forgot that in the 1.10
series I have to manually force ompio (it is the default on master and
2.x). It fails now for me as well with v1.10; I will let you know what I find.
Thanks
Edgar
what does the mount command return?
On 12/9/2015 9:27 AM, Paul Kapinos wrote:
Dear Edgar,
On 12/09/15 16:16, Edgar Gabriel wrote:
I tested your code in master and v1.10 (on my local machine), and I get for
both versions of ompio exactly the same (correct) output that you had with
romio on Lustre. Did you run your test by any chance on a Lustre file system?
Thanks
Edgar
On 12/9/2015 8:06 AM, Edgar Gabriel wrote:
I will look at your test case and see what is going on in ompio. That
being said, the vast number of fixes and improvements that went into
ompio over the last two
://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193
==rob
Thanks for the replies
Hanousek Vít
Here are the communication operations occurring in the best case
scenario in Open MPI right now:
Comm_create:
- Communicator ID allocation: 2 Allreduce operations per round of
negotiations
- 1 Allreduce operation for 'activating' the communicator
Comm_split:
- 1 Allgather operation fo
(collective), this requires calculating the offset,
which is pre-determinable. It does not require "flock".
In summary, scenario 2 can avoid the "flock" requirement by using 2B, but
scenario 1 cannot.
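A minimal sketch of the explicit-offset flavor of 2B (the helper name is hypothetical; it assumes the file handle is already open and that the per-rank byte count fits in an int): each rank derives its offset from an exclusive prefix sum, so no shared file pointer, and hence no "flock", is needed.

#include <mpi.h>

/* Write 'count' bytes from 'buf' at a per-rank offset computed with an
 * exclusive prefix sum over the byte counts of all ranks. */
static void write_block(MPI_File fh, const void *buf, MPI_Offset count)
{
    int rank;
    MPI_Offset offset = 0;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Exscan(&count, &offset, 1, MPI_OFFSET, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0)
        offset = 0;   /* MPI_Exscan leaves rank 0's result undefined */

    /* Collective write at explicit offsets; no shared file pointer involved. */
    MPI_File_write_at_all(fh, offset, buf, (int)count, MPI_BYTE,
                          MPI_STATUS_IGNORE);
}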
Thanks for the report. OMPIO might support shared file pointers better
-- Edgar Ga
4 11:23 AM, Rob Latham wrote:
>
>
> On 05/15/2014 08:32 AM, Edgar Gabriel wrote:
>> could you try just for curiosity to force to use OMPIO? e.g.
>> mpirun --mca io ompio
>
> Edgar, what is in the air that there are now three bug reports against
> ROMIO's
in+0xf5)[0x7f5844a2ede5]
> [oriol-VirtualBox:13972] [15] ./binary[0x40d679]
> [oriol-VirtualBox:13972] *** End of error message ***
> --
> mpirun noticed that process rank 2 with PID 13969 on node oriol-VirtualBox
> exited on signal 6 (
&status);
>> }
>> MPI_File_close(&cFile);
>> MPI_Type_free(&dcarray);
>> }
>>
>> Best regards,
>>
>> Oriol
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number
I will resubmit a new patch; Rob sent me a pointer to the correct
solution. It's on my to-do list for tomorrow/this weekend.
Thanks
Edgar
On 3/27/2014 5:45 PM, Dave Love wrote:
> Edgar Gabriel writes:
>
>> not sure honestly. Basically, as suggested in this email chain earlier,
wrong underneath the hood.
Edgar
On 3/25/2014 9:21 AM, Rob Latham wrote:
>
>
> On 03/25/2014 07:32 AM, Dave Love wrote:
>> Edgar Gabriel writes:
>>
>>> I am still looking into the PVFS2 with ROMIO problem with the 1.6
>>> series, where (as I mentioned y
yes, the patch has been submitted to the 1.6 branch for review, not sure
what the precise status of it is. The problems found are more or less
independent of the PVFS2 version.
Thanks
Edga
On 3/25/2014 7:32 AM, Dave Love wrote:
> Edgar Gabriel writes:
>
>> I am still looking in
On 2/27/2014 9:44 AM, Dave Love wrote:
> Edgar Gabriel writes:
>
>> so we had ROMIO working with PVFS2 (not OrangeFS, which is however
>> registered as PVFS2 internally). We have one cluster which uses
>> OrangeFS, on that machine however we used OMPIO, not ROMIO.
>
sure whether Nathan did.
Thanks
Edgar
On 2/26/2014 4:52 PM, Latham, Robert J. wrote:
> On Tue, 2014-02-25 at 07:26 -0600, Edgar Gabriel wrote:
>> this was/is a bug in ROMIO, in which they assume a datatype is an int. I
>> fixed it originally in a previous version of Open MPI on the t
nousek Vít
>
>
>
> -- Original message --
> From: Edgar Gabriel
> To: us...@open-mpi.org
> Date: 26. 2. 2014 21:08:07
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
>
>
> not sure whether it's the problem or not, but usually have an additiona
ompile OpenMPI 1.6.5 now (with
> fixed "switch =>ifs" in ROMIO).
>
> I will test whether it works in the next hour (some configuration steps
> are needed).
>
> Thanks.
> Hanousek Vít
>
> -- Original message --
> From: Ed
ned 1 exit status
> > make[10]: *** [otfmerge-mpi] Error 1
> > (...)
> >
> > Now I really don't know what is wrong.
> > Is there anybody who has OpenMPI working with OrangeFS?
> >
> > Thanks for replies
> > HanousekVít
>
.
Thanks
Edgar
On 2/25/2014 7:34 AM, Jeff Squyres (jsquyres) wrote:
> Edgar --
>
> Is there a fix that we should CMR to the v1.6 branch?
>
>
> On Feb 25, 2014, at 8:26 AM, Edgar Gabriel wrote:
>
>> this was/is a bug in ROMIO, in which they assume a datatype is an int. I
d source code of OpenMPI on our cluster.
> Is this a bug in the source code that will be fixed, or am I doing something wrong?
>
> Thanks for reply
> Hanousek Vít
>
>
can you maybe detail more precisely what scenario you are particularly
worried about? I would think that the return code of the operation
should reliably indicate whether opening the file was successful or not
(i.e. MPI_SUCCESS vs. anything else).
Edgar
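A minimal sketch of checking that return code (the file name is a placeholder; file errors return by default, but the sketch sets MPI_ERRORS_RETURN explicitly on MPI_FILE_NULL so MPI_File_open reports failures instead of aborting):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    char msg[MPI_MAX_ERROR_STRING];
    int rc, len;

    MPI_Init(&argc, &argv);
    /* Default error handler for files opened later in this program. */
    MPI_File_set_errhandler(MPI_FILE_NULL, MPI_ERRORS_RETURN);

    rc = MPI_File_open(MPI_COMM_WORLD, "input.dat", MPI_MODE_RDONLY,
                       MPI_INFO_NULL, &fh);
    if (rc != MPI_SUCCESS) {
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI_File_open failed: %s\n", msg);
    } else {
        MPI_File_close(&fh);
    }
    MPI_Finalize();
    return 0;
}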
On 5/17/2013 4:00 AM, Peter van Hoof wrote:
> Dear use
tid % 2) == 0)
>> {
>>/* col_block[i][j] *= col_block[i][j] */
>>
>>*((double *) col_block + i * num_columns[group_w_mytid] + j) *=
>>*((double *) col_block + i * num_columns[group_w_mytid] + j);
>> }
>> else
>> {
>&
what file system is this on?
On 5/10/2012 12:37 PM, Ricardo Reis wrote:
>
> what is the communicator that you used to open the file? I am wondering
> whether it differs from the communicator used in MPI_Barrier, and some
> processes do not enter the Barrier at all...
>
> Thanks
> Edgar
>
>
> w
what is the communicator that you used to open the file? I am wondering
whether it differs from the communicator used in MPI_Barrier, and some
processes do not enter the Barrier at all...
Thanks
Edgar
On 5/10/2012 12:22 PM, Ricardo Reis wrote:
>
> Hi all
>
> I'm trying to run my code in a clu
've tried with and without this option. In both the
> result was the same... =(
>
> On Wed, Apr 4, 2012 at 5:04 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>
> did you try to start the program with the --mca coll ^inter switch that
> I mentioned? Collecti
Apr 4, 2012 at 5:04 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>
> did you try to start the program with the --mca coll ^inter switch that
> I mentioned? Collective dup for intercommunicators should work, its
> probably again the bcast over a communicator o
e
> tmp_inter_comm created without one process still has this process,
> because the other processes are waiting for it to call the Dup too.
>
> What do you think?
>
> On Wed, Mar 28, 2012 at 6:03 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
>
> it just uses a d
he client processes.
>
> I tried it with 1 server and 3 clients and it worked properly. After
> I removed 1 of the clients, it stopped working. So, the removal is
> affecting the functionality of Barrier, I guess.
>
> Anyone has an idea?
>
>
> On Mon,
; server, client and a .h containing some constants. I put some
> "prints" to show the behavior.
>
> Regards
>
> Rodrigo
>
>
On Tue, Mar 20, 2012 at 11:47 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
do you by any chance have the actual code or a small reproducer? It might be
much easier to hunt the problem down...
Thanks
Edgar
On 3/19/2012 8:12 PM, Rodrigo Oliveira wrote:
> Hi there.
>
> I am facing a very strange problem when using MPI_Barrier over an
> inter-communicator after some operations
you can have a look at Torsten Hoefler's Netgauge tool; this tool
can report the LogGP parameters.
http://unixer.de/research/netgauge/
Thanks
Edgar
On 10/26/2011 11:48 AM, Mudassar Majeed wrote:
> Dear MPI people,
>I want to use LogGP model with MPI to
>
On 6/7/2011 10:23 AM, George Bosilca wrote:
>
> On Jun 7, 2011, at 11:00 , Edgar Gabriel wrote:
>
>> George,
>>
>> I did not look over all the details of your test, but it looks to
>> me like you are violating one of the requirements of
>> intercomm
solve this
>> problem ?
>>
>> Best regards,
>>
>> Frédéric.
>>
>> PS: of course I did an extensive web search without finding anything
>> useful on my problem.
>>
On 8/8/2010 8:13 PM, Randolph Pullen wrote:
> Thanks, although “An intercommunicator cannot be used for collective
> communication.”, i.e., bcast calls.
yes it can. MPI-1 did not allow for collective operations on
intercommunicators, but the MPI-2 specification did introduce that notion.
Thank
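A minimal sketch of an MPI-2 broadcast over an intercommunicator (the helper name and arguments are hypothetical): in the group containing the root, the root passes MPI_ROOT and everyone else passes MPI_PROC_NULL; in the remote group, every process passes the root's rank in the other group.

#include <mpi.h>

/* Broadcast *value from rank 'root_rank' of the root group across the
 * intercommunicator 'inter'. 'am_root_group' is nonzero on processes that
 * belong to the group containing the root. */
static void inter_bcast(MPI_Comm inter, int am_root_group, int root_rank,
                        int *value)
{
    int my_rank;
    MPI_Comm_rank(inter, &my_rank);   /* rank within the local group */

    if (am_root_group)
        MPI_Bcast(value, 1, MPI_INT,
                  (my_rank == root_rank) ? MPI_ROOT : MPI_PROC_NULL, inter);
    else
        MPI_Bcast(value, 1, MPI_INT, root_rank, inter);
}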
> #7 0x08048a16 in main (argc=892352312, argv=0x32323038) at client.c:28
>
> I've tried both scenarios described: when hangs a client connecting
> from machines B and C. In both cases bt looks the same.
> How does it look like?
> Shall I repost that using a different subject as Ral
I could run over 130 processes with no
>>>> problems.
>>>> I'm sorry again that I've wasted your time. And thank you for the patch.
>>>>
>>>> 2010/7/21 Ralph Castain :
>>>>> We're having some problem replicating
Hm, so I am not sure how to approach this. First of all, the test case
works for me. I used up to 80 clients, and for both optimized and
non-optimized compilation. I ran the tests with trunk (not with 1.4
series, but the communicator code is identical in both cases). Clearly,
the patch from Ralph i
ication to basic mode, right ?
>
> sorry for my ignorance, but what's a NCA ?
sorry, I meant to type HCA (InfiniBand networking card)
Thanks
Edgar
>
> thanks,
> éloi
>
> On Thursday 15 July 2010 16:20:54 Edgar Gabriel wrote:
>> you could try first to use the algori
you could try first to use the algorithms in the basic module, e.g.
mpirun -np x --mca coll basic ./mytest
and see whether this makes a difference. I used to sometimes observe a
(similar?) problem in the openib btl triggered from the tuned
collective component, in cases where the OFED libraries
thanks for pointing the problem out. I checked the code; the problem
is in the MPI layer itself. The following check prevents us from doing
anything, e.g. in ompi/mpi/c/allgather.c:
    if ((MPI_IN_PLACE != sendbuf && 0 == sendcount) ||
        (0 == recvcount)) {
        return MPI_SUCCESS;
    }
-ls, make sure you can pvfs2-cp files to and from your volume.
If those 3 utilities work, then your OpenMPI installation should work
as well.
==rob
bug.c
The command to run was
orterun --np 8 --mca btl tcp,self -- ./a.out
-r
. Any
chance of getting a smaller code fragment which replicates the problem?
It can use the MUMPS library; I am fine with that, since I just
compiled and installed it with the current ompi trunk...
Thanks
Edgar
Edgar Gabriel wrote:
I would say the probability is large that it is due to the r
arty libraries with this version of OMPI.
could still work...
Jeff Squyres wrote:
On Dec 5, 2008, at 12:22 PM, Edgar Gabriel wrote:
I hope you are aware that *many* tools and applications actually
profile the Fortran MPI layer by intercepting the C function calls.
This allows them to not have to deal with f2c translation of MPI
objects
ng a cpu_set_t similar to
sched_setaffinity.
Thanks
Edgar
Jeff Squyres wrote:
Fair enough; let me know what you find. It would be good to understand
exactly why you're seeing what you're seeing...
On Dec 2, 2008, at 5:47 PM, Edgar Gabriel wrote:
it's on OpenSuSE 11 with kernel
in details.
Thanks
Edgar
Jeff Squyres wrote:
On Dec 2, 2008, at 11:27 AM, Edgar Gabriel wrote:
so I ran a couple of tests today and I cannot confirm your statement.
I wrote a simple test code where a process first sets an
affinity mask and then spawns a number of threads. The threads m
ity(), which is a
per-process binding. Not a per-thread binding.
On Nov 20, 2008, at 7:34 AM, Edgar Gabriel wrote:
if you look at recent versions of libnuma, there are two functions
called numa_run_on_node() and numa_run_on_node_mask(), which allow
thread-based assignments to CPUs
ways assign dedicated nodes to
users. While users can be told to be sure to turn it "off" when using
these calls, it seems inevitable that they will forget - and complaints
will appear.
Thanks
Ralph
On Nov 20, 2008, at 7:34 AM, Edgar Gabriel wrote:
if you look at recent vers
I would guess that you can, if the library is installed, and as far as I
know it is part of most recent Linux distributions...
Thanks
Edgar
Gabriele Fatigati wrote:
Thanks Edgar,
but can I use these libraries also on non-NUMA machines?
2008/11/20 Edgar Gabriel :
if you look at recent
if you look at recent versions of libnuma, there are two functions
called numa_run_on_node() and numa_run_on_node_mask(), which allow
thread-based assignments to CPUs
Thanks
Edgar
Gabriele Fatigati wrote:
Is there a way to assign one thread to one core? Also from code, not
necessary with
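A minimal sketch of the libnuma calls being described (the node number is an assumption; link with -lnuma): a thread restricts itself to the CPUs of one NUMA node with numa_run_on_node().

#include <numa.h>
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    int node = *(int *)arg;
    /* Restrict the calling thread to the CPUs of the given NUMA node. */
    if (numa_run_on_node(node) != 0)
        perror("numa_run_on_node");
    /* ... thread work ... */
    return NULL;
}

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma is not available on this system\n");
        return 1;
    }
    int node = 0;              /* assumed node; pick per thread as needed */
    pthread_t t;
    pthread_create(&t, NULL, worker, &node);
    pthread_join(t, NULL);
    return 0;
}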
is the code using shared file pointer operations (e.g.
MPI_File_write_shared/ordered)?
There was a fix which removed a warning/error about not being able to delete
the file when using shared file pointers around v1.2.6 (I don't
remember precisely when it hit the trunk), and I was wondering whethe
here is a patch that we use on our development version to silence that
warning; you have to apply it to
ompi/ompi/mca/io/romio/romio/mpi-io/io_romio_close.c
I would not like to commit that to the repository since I cannot
oversee whether it causes problems in some other settings/scenario/fil
done...
Jeff Squyres wrote:
Edgar --
Can you file a CMR for v1.2?
On Apr 10, 2008, at 8:10 AM, Edgar Gabriel wrote:
thanks for reporting the bug, it is fixed on the trunk. The problem
was
this time not in the algorithm, but in the checking of the
preconditions. If recvcount was zero and