G'day
Just a quick basic question: in the case of the TCP BTL, how do I limit the frag
size?
I do not want MPI to send a fragment larger than, let's say, 16K.
If I am not mistaken, shouldn't btl_tcp_min_send_size do the trick? If it
is supposed to do it, why do I see packets of
I accidentally ran a job with a hostfile in which one of the hosts was not
properly mounted. As a result I got an error and a segfault.
/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile
./mpi_p01 -t lt
bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or
directory
-
On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
- I don't think we can delete the MCA param ompi_paffinity_alone; it
exists in the v1.2 series and has historical precedent.
It will not be deleted,
It will just use the same infrastructure ( slot_list parameter and
opal_base functions ). It
Sorry, I missed this mail.
IIRC, the verbosity level for stream 0 is 0. It probably would not be
good to increase it; many places in the code use output stream 0.
Perhaps you could make a new stream with a different verbosity level
to do what you want...? See the docs in opal/util/output.
Jeff Squyres wrote:
On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
- I don't think we can delete the MCA param ompi_paffinity_alone; it
exists in the v1.2 series and has historical precedent.
It will not be deleted,
It will just use the same infrastructure ( slot_list parameter
OK,
I am putting it back.
> -Original Message-
> From: terry.don...@sun.com [mailto:terry.don...@sun.com]
> Sent: Monday, March 31, 2008 2:59 PM
> To: Open MPI Developers
> Cc: Lenny Verkhovsky; Sharon Melamed
> Subject: Re: [OMPI devel] RMAPS rank_file component patch and
> modificatio
Thanks Jeff. It appears to me that the first approach to reducing modex data
makes the most sense and has the largest impact - I would advocate pursuing
it first. We can look at further refinements later.
Along that line, one thing we also exchange in the modex (not IB specific)
is hostname and ar
On Mar 31, 2008, at 9:22 AM, Ralph H Castain wrote:
Thanks Jeff. It appears to me that the first approach to reducing
modex data
makes the most sense and has the largest impact - I would advocate
pursuing
it first. We can look at further refinements later.
Along that line, one thing we also
Ralph,
I've just noticed that the 'unity' routed component
seems to be broken when using more than one machine. I'm using Odin
and r18028 of the trunk, and have confirmed that this problem occurs
with SLURM and rsh. I think this break came in on Friday as that is
when some o
I figured out the issue - there is a simple and a hard way to fix this. So
before I do, let me see what makes sense.
The simple solution involves updating the daemons with contact info for the
procs so that they can send their collected modex info to the rank=0 proc.
This will measurably slow the
At the moment I only use unity with C/R. Mostly because I have not
verified that the other components work properly under the C/R
conditions. I can verify others, but that doesn't solve the problem
with the unity component. :/
It is not critical that these jobs launch quickly, but that they
The btl_tcp_min_send_size is not exactly what you expect it to be. It
drives only the send protocol (as implemented in Open MPI), and not the
put protocol the TCP BTL is using.
You can achieve what you want with 2 parameters:
1. btl_tcp_frag set to 9. This will force the send protocol over TCP
On 3/31/08 9:28 AM, "Josh Hursey" wrote:
> At the moment I only use unity with C/R. Mostly because I have not
> verified that the other components work properly under the C/R
> conditions. I can verify others, but that doesn't solve the problem
> with the unity component. :/
>
> It is not cri
On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:
On 3/31/08 9:28 AM, "Josh Hursey" wrote:
At the moment I only use unity with C/R. Mostly because I have not
verified that the other components work properly under the C/R
conditions. I can verify others, but that doesn't solve the probl
Okay - fixed with r18040
Thanks
Ralph
On 3/31/08 11:01 AM, "Josh Hursey" wrote:
>
> On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:
>
>>
>>
>>
>> On 3/31/08 9:28 AM, "Josh Hursey" wrote:
>>
>>> At the moment I only use unity with C/R. Mostly because I have not
>>> verified that the
I am unable to replicate the segfault. However, I was able to get the job to
hang. I fixed that behavior with r18044.
Perhaps you can test this again and let me know what you see. A gdb stack
trace would be more helpful.
Thanks
Ralph
On 3/31/08 5:13 AM, "Lenny Verkhovsky" wrote:
>
>
>
> I
Looks good. Thanks for the fix.
Cheers,
Josh
On Mar 31, 2008, at 1:43 PM, Ralph H Castain wrote:
Okay - fixed with r18040
Thanks
Ralph
On 3/31/08 11:01 AM, "Josh Hursey" wrote:
On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:
On 3/31/08 9:28 AM, "Josh Hursey" wrote:
At the mo
So does anyone know why the session directories are in $HOME instead
of /tmp?
I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?
-- Josh
I looked over the code and I don't see any problems with the changes.
The only thing I did was replace the getenv("HOME") with
opal_home_directory ...
Here is the logic for selecting the TMP directory:
if( NULL == (str = getenv("TMPDIR")) )
if( NULL == (str = getenv("TEMP")) )
Slightly OT but along the same lines..
We currently have an argument to mpirun to set the HNP tmpdir
(--tmpdir).
Why don't we have an mca param to set the tmpdir for all the orted's
and such?
- Galen
On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:
I looked over the code and I don't se
Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be
(re-)extended to use /tmp if it exists.
-- Josh
On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:
I looked over the code and I don't see any problems with the
cha
I confirm that this is new behavior.
Session directories have just started showing up in my $HOME as well,
and TMPDIR, TEMP, TMP have never been set on my cluster (for
interactive logins, anyway).
On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:
Nope. None of those environment variables ar
Taking a quick look at the commits, it seems that r18037 is
the most likely cause of this problem.
Previously the session directory was forced to "/tmp" if no
environment variables were set. This revision removes this logic and
uses the opal_tmp_directory(). Though I agree with this
TMPDIR and TMP are standard on Unix. If they are not defined ... one
cannot guess where the temporary files should be located.
Unfortunately, if we start using /tmp directly we might make the
wrong guess.
What does mktemp return on your system?
george.
On Mar 31, 2008, at 4:01 PM,
Here is the problem - the following code was changed in session_dir.c:
-#ifdef __WINDOWS__
-#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
-#else
-#define OMPI_DEFAULT_TMPDIR "/tmp"
-#endif
-
#define OMPI_PRINTF_FIX_STRING(a) ((NULL == a) ? "(null)" : a)
/
@@ -262,14 +257,8
I more than agree with Galen.
Aurelien
On Mar 31, 2008, at 16:00, Shipman, Galen M. wrote:
Slightly OT but along the same lines..
We currently have an argument to mpirun to set the HNP tmpdir
(--tmpdir).
Why don't we have an mca param to set the tmpdir for all the orted's
and such?
- Galen
O
Commit r18046 restores exactly the same logic as it was before r18037.
It redirects everything to /tmp if no special environment variable is
set.
george.
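
For readers following the thread, the fallback being discussed can be
sketched roughly as below. This is only an illustration, not the code from
r18046: the function name pick_tmp_dir is hypothetical, and the lookup order
TMPDIR, TEMP, TMP is an assumption (the thread shows TMPDIR and TEMP being
checked in that order and mentions TMP separately). The default values are
the ones from the OMPI_DEFAULT_TMPDIR definition quoted earlier.

```c
#include <stdlib.h>

/* Defaults taken from the OMPI_DEFAULT_TMPDIR definition quoted in the thread. */
#ifdef __WINDOWS__
#define DEFAULT_TMPDIR "C:\\TEMP"
#else
#define DEFAULT_TMPDIR "/tmp"
#endif

/*
 * Hypothetical helper (not an Open MPI function): return the first of
 * TMPDIR, TEMP, TMP that is set in the environment, or the platform
 * default if none of them is defined. The lookup order is an assumption.
 */
static const char *pick_tmp_dir(void)
{
    static const char *const vars[] = { "TMPDIR", "TEMP", "TMP" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); ++i) {
        const char *val = getenv(vars[i]);
        if (NULL != val) {
            return val;   /* an environment variable wins over the default */
        }
    }
    return DEFAULT_TMPDIR; /* nothing set: fall back to /tmp (or C:\TEMP) */
}
```

With none of the three variables set, as on the clusters described in this
thread, pick_tmp_dir() returns the default rather than the home directory.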
On Mar 31, 2008, at 4:09 PM, Josh Hursey wrote:
Taking a quick look at the commits it seems that r18037 looks like
the most likely cause
Thanks for the fix.
Cheers,
Josh
On Mar 31, 2008, at 4:17 PM, George Bosilca wrote:
Commit r18046 restores exactly the same logic as it was before
r18037. It redirects everything to /tmp if no special environment
variable is set.
george.
On Mar 31, 2008, at 4:09 PM, Josh Hursey wrote:
T
On Mon, Mar 31, 2008 at 10:15 PM, wrote:
> Author: bosilca
> Date: 2008-03-31 16:15:49 EDT (Mon, 31 Mar 2008)
> New Revision: 18046
> URL: https://svn.open-mpi.org/trac/ompi/changeset/18046
>
> Modified: trunk/opal/util/opal_environ.c
> +#ifdef __WINDOWS__
> +#define OMPI_DEFAULT_TMPDIR "C:
You're right ... I'll make the change asap.
Thanks,
george.
On Mar 31, 2008, at 5:39 PM, Bert Wesarg wrote:
On Mon, Mar 31, 2008 at 10:15 PM, wrote:
Author: bosilca
Date: 2008-03-31 16:15:49 EDT (Mon, 31 Mar 2008)
New Revision: 18046
URL: https://svn.open-mpi.org/trac/ompi/changeset/18