Hi Ralph,
On Jul 25, 2011, at 11:05 AM, Ralph Castain wrote:
On Jul 25, 2011, at 10:16 AM, Samuel K. Gutierrez wrote:
Hi Ralph,
It seems as if this issue is related to a missing shm_unlink
wrapper within Valgrind. I'm going to disable posix by default and
commit later today.
Hi Ralph,
It seems as if this issue is related to a missing shm_unlink wrapper
within Valgrind. I'm going to disable posix by default and commit
later today.
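
For readers unfamiliar with the call in question, here is a minimal sketch of the POSIX shared memory lifecycle that ends in shm_unlink. It is not the code of the Open MPI posix component; the function name, segment name, and size are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create, map, unlink, and use a POSIX shared
 * memory object. Returns 0 on success and -1 on failure. */
static int posix_shm_roundtrip(const char *name, size_t size)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1) {
        perror("shm_open");
        return -1;
    }
    /* A new object has length zero, so size it before mapping. */
    if (ftruncate(fd, (off_t)size) == -1) {
        perror("ftruncate");
        close(fd);
        shm_unlink(name);
        return -1;
    }
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED) {
        perror("mmap");
        shm_unlink(name);
        return -1;
    }
    /* Remove the name; the memory remains until the last munmap. */
    if (shm_unlink(name) == -1) {
        perror("shm_unlink");
        munmap(addr, size);
        return -1;
    }
    memset(addr, 0, size); /* the mapping is still usable here */
    return munmap(addr, size) == 0 ? 0 : -1;
}
```

On older glibc versions this may need to be linked with -lrt.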
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jul 23, 2011, at 8:54 PM, Samuel K. Gutierrez wrote:
Hi
Hi Ralph,
That's mine - I'll take a look.
Thanks,
Sam
> Whenever I run valgrind on orterun (or any OMPI tool), I get the following
> error msg:
>
> --
> A system call failed during shared memory initialization that should
>
In r24795.
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 15, 2011, at 10:01 AM, Samuel K. Gutierrez wrote:
WHAT:
Bring in new shared memory backing facility framework (shmem) and
its components. shmem is simply a framework for the manipulation of
shared memory
/coll_sm_module.c
M orte/mca/odls/base/odls_base_default_fns.c
M orte/tools/orte-info/orte-info.c
M orte/tools/orte-info/components.c
WHEN:
Before 1.7.
TIMEOUT:
Teleconference, Tues 21 June 2011
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
Here is the 'pgCC -V' output from versions that I have access to.
$ pgCC -V
pgCC 7.1-6 64-bit target on x86-64 Linux -tp gh-64
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2007, STMicroelectronics, Inc. All Rights Reserved.
$ pgCC -V
pgCC 9.0-3 64-bit ta
Hi Tim,
Great news! Happy calculating :-).
--
Samuel K. Gutierrez
Los Alamos National Laboratory
> Dear Samuel,
>
> Just as you replied I was trying that on the compute nodes. Surprise,
> surprise...the value returned as the hard and soft limits is 1024.
>
> Thanks for confir
Hi,
It sounds like Open MPI is hitting your system's open file descriptor
limit. If that's the case, one potential workaround is to have your
system administrator raise file descriptor limits.
On a compute node, what does "ulimit -a" show (using bash)?
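
The same limits that "ulimit -a" reports can also be queried from inside a program. A minimal sketch, assuming a POSIX system; the helper names are hypothetical and this is not Open MPI code:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard limits on open file
 * descriptors for the calling process. Returns 0 on success. */
static int get_fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    *soft = rl.rlim_cur; /* what "ulimit -S -n" shows in bash */
    *hard = rl.rlim_max; /* what "ulimit -H -n" shows in bash */
    return 0;
}

/* Print both limits; RLIM_INFINITY is reported as "unlimited". */
static void print_fd_limits(void)
{
    rlim_t soft, hard;
    if (get_fd_limits(&soft, &hard) != 0) {
        return;
    }
    if (soft == RLIM_INFINITY) {
        printf("soft: unlimited\n");
    } else {
        printf("soft: %llu\n", (unsigned long long)soft);
    }
    if (hard == RLIM_INFINITY) {
        printf("hard: unlimited\n");
    } else {
        printf("hard: %llu\n", (unsigned long long)hard);
    }
}
```

Running this on a compute node (rather than the login node) matters, since the two can be configured differently.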
Hope that
Same here.
--
Samuel K. Gutierrez
Los Alamos National Laboratory
> On Oct 11, 2010, at 11:41 PM, Ralph Castain wrote:
>
>> Does anyone know of a reason why mpirun can -not- be threaded, assuming
>> that all threads block and do not continuously chew cpu? Is there an
>>
Hi,
Just to be clear - do you see similar checkpoint performance
differences in 1.5rc6 and 1.4.2 with and without shared memory enabled?
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Sep 21, 2010, at 9:35 AM, > wrote:
Hello Samuel
This problem seems to be resol
Hi Ananda,
This issue should be resolved in r23781. Please let me know if it is
not.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Sep 20, 2010, at 11:26 AM, > wrote:
I have used following options to build:
./configure CC=/usr/bin/gcc CXX=/usr/bin/c++ F77=/usr/
Let me take a look at it. How did you configure your build?
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Sep 20, 2010, at 10:14 AM, > wrote:
Hi
I believe the new common shared memory component was committed to
the trunk sometime towards the later part of August
Will do.
Sam
On Aug 26, 2010, at 2:08 PM, Jeff Squyres wrote:
I think Sam already submitted CMRs for 1.5:
https://svn.open-mpi.org/trac/ompi/ticket/2545
Sam -- can you construct an equivalent for v1.4 and CC Paul so that
he knows not to follow up on it?
Thanks!
On Aug 26, 2010, at 3:
Code is in (see r23633). Note: mmap is still the default.
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Aug 12, 2010, at 11:37 AM, Samuel K. Gutierrez wrote:
Sorry, I should have included the link containing the discussion of
the plot.
http://www.open-mpi.org/community/lists
Sorry, I should have included the link containing the discussion of
the plot.
http://www.open-mpi.org/community/lists/devel/2010/06/8078.php
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Aug 12, 2010, at 11:20 AM, Terry Dontje wrote:
Sorry Rich, I didn't realize there was a
Hi Terry,
One more thing... Before testing on Solaris 10, could you please
update (I just committed a Solaris 10 fix).
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Aug 11, 2010, at 1:15 PM, Samuel K. Gutierrez wrote:
Hi Terry,
On Aug 11, 2010, at 12:34 PM
rry about where the OMPI session directory is rooted.
--
Samuel K. Gutierrez
Los Alamos National Laboratory
I'm still working on testing the code on Solaris but I don't imagine I
will see anything that will change my mind.
--td
Samuel K. Gutierrez wrote:
Hi Rich,
It's a mod
onent ?
Rich
On 8/10/10 10:52 AM, "Samuel K. Gutierrez" wrote:
Hi,
I wanted to give everyone a heads-up about a new POSIX shared memory
component that has been in the works for a while now and is ready to be
pushed into the trunk.
http://bitbucket.org/samuelkgutierrez/ompi_pos
Addressed some of George's code reuse concerns.
If there are no major objections by August 17th, I'll commit the code
after the Tuesday morning conference call.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
All,
LANL is currently looking for an individual to participate in support
and development efforts in parallel code benchmarking, performance
analysis, tuning, and tools integration. Duties will include parallel
environment optimization for programming models targeted on deployed
archite
On Jun 10, 2010, at 1:47 AM, Sylvain Jeaugey wrote:
On Wed, 9 Jun 2010, Jeff Squyres wrote:
On Jun 9, 2010, at 3:26 PM, Samuel K. Gutierrez wrote:
System V shared memory cleanup is a concern only if a process dies in
between shmat and shmctl IPC_RMID. Shared memory segment cleanup
should
segments.
System V shared memory cleanup is a concern only if a process dies in
between shmat and shmctl IPC_RMID. Shared memory segment cleanup
should happen automagically in most cases, including abnormal process
termination.
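
The ordering described above can be sketched as follows. This is an illustration under assumptions, not the sysv component's code: the helper name is hypothetical and a private key is used so the sketch stands alone.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Hypothetical helper: create a System V segment, attach it, and mark
 * it for removal right away. Returns 0 on success and -1 on failure. */
static int sysv_shm_roundtrip(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) {
        perror("shmget");
        return -1;
    }
    void *addr = shmat(id, NULL, 0);
    if (addr == (void *)-1) {
        perror("shmat");
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    /* If the process dies between the shmat above and the shmctl
     * below, the segment is left behind. That is the window at issue. */
    if (shmctl(id, IPC_RMID, NULL) == -1) {
        perror("shmctl");
        shmdt(addr);
        return -1;
    }
    /* The segment is now marked for destruction, but it stays usable
     * until the last attached process detaches or exits. */
    memset(addr, 0, size);
    return shmdt(addr) == 0 ? 0 : -1;
}
```

Once IPC_RMID has been issued, the kernel destroys the segment when the attach count drops to zero, including on abnormal termination of the attached processes.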
--
Samuel K. Gutierrez
Los Alamos National Laboratory
Righ
Now in the trunk (see r23260).
Thanks everyone!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 1, 2010, at 11:08 AM, Samuel K. Gutierrez wrote:
WHAT: New System V shared memory component.
WHY: https://svn.open-mpi.org/trac/ompi/ticket/1320
WHERE:
M ompi/mca/btl/sm
Thanks Sylvain!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 9, 2010, at 9:58 AM, Sylvain Jeaugey wrote:
As stated at the conf call, I did some performance testing on a
32-core node.
So, here is a graph showing 500 timings of an allreduce operation
(repeated 15,000 times
On Jun 2, 2010, at 11:58 AM, Samuel K. Gutierrez wrote:
Good point - I forgot about that.
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote:
Don't forget that the RML is also used to broadcast the
success/failure of the creation o
Good point - I forgot about that.
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote:
Don't forget that the RML is also used to broadcast the
success/failure of the creation of the shared memory segment.
If the RML goes away, be sure
Hi George,
That may work - I'll try it.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 2, 2010, at 10:59 AM, George Bosilca wrote:
How about ftok? The init function takes a file_name as argument,
and this file name is unique per instance of the shared memory
r
memory ID (generated by exactly one process). I'm not sure how we
would go about passing along the shared memory ID without RML, but any
ideas are greatly appreciated.
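
One idea raised in this thread is ftok, which lets every process derive the same System V key from a file path they all know, so the ID itself need not be passed around. A minimal sketch, assuming a hypothetical helper name and an arbitrary project id of 'O':

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/types.h>

/* Hypothetical helper: derive a System V IPC key from a path that all
 * cooperating processes agree on. The file must already exist.
 * Returns (key_t)-1 on failure. */
static key_t key_from_path(const char *path)
{
    /* 'O' is an arbitrary project id chosen for this sketch. */
    key_t key = ftok(path, 'O');
    if (key == (key_t)-1) {
        perror("ftok");
    }
    return key;
}
```

Each process could then pass the key to shmget to reach the same segment. Note that ftok does not guarantee distinct keys for distinct files, so collisions would still have to be handled.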
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
--
Jeff Squyres
jsquy...@cisco.com
For corporate
inalize).
Sam -- are the common parts really common? I.e., could they be
factored out? Or are they "just different enough" that factoring
them out would be a PITA?
I'm sure some refactoring could be done - let me take a look.
--
Samuel K. Gutierrez
Los Alamos National Labor
Hi all,
Configure option added: --enable-sysv (default: disabled).
For sysv testing purposes, please enable.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 1, 2010, at 11:11 AM, Samuel K. Gutierrez wrote:
Doh!
bitbucket repository: http://bitbucket.org
Hi Rich,
I'll add a configure-time option. This addition does not negatively
impact the performance of the current sm component.
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 1, 2010, at 11:35 AM, Graham, Richard L. wrote:
Can you be a bit more explicit, p
Doh!
bitbucket repository: http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Jun 1, 2010, at 11:08 AM, Samuel K. Gutierrez wrote:
WHAT: New System V shared memory component.
WHY: https://svn.open-mpi.org/trac/ompi/ticket
map - or a comma-delimited combination of them (order dependent).
The first component that is successfully selected is used.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
iated!
http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On May 5, 2010, at 7:53 AM, Samuel K. Gutierrez wrote:
On May 5, 2010, at 6:10 AM, Jeff Squyres wrote:
On May 4, 2010, at 9:53 AM, Ashley Pittman wrote:
Point noted. But act
just need to communicate via
RML to sync up, I suppose.
I need to think about it a little more, but I like this solution.
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
I should of course have said fork where I mentioned spawn above to avoid
any confusion; spawn has a specific mea
Hi All,
New configure-time test added - thanks for the suggestion, Jeff.
Update and give it a whirl.
Ethan - could you please try again? This time, I'm hoping sysv
support will be disabled ;-).
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On May 3, 2010, at 9:
Hi Jeff,
Sounds like a plan :-).
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On May 3, 2010, at 9:12 AM, Jeff Squyres wrote:
It might well be that you need a configure test to determine whether
this behavior occurs or not. Heck, it may even need to be a
run-time test
adequate level of sysv support.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote:
On May 2 2010, Ashley Pittman wrote:
On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote:
As to performance there should be no difference in use
nks for testing!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
> On Thu, Apr/29/2010 02:52:24PM, Samuel K. Gutierrez wrote:
>> Hi Ethan,
>> Bummer. What does the following command show?
>> sysctl -a | grep shm
>
> In this case, I think the Solaris equivalent
Hi Ethan,
Bummer. What does the following command show?
sysctl -a | grep shm
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Apr 29, 2010, at 1:32 PM, Ethan Mallove wrote:
Hi Samuel,
I'm trying to run off your HG clone, but I'm seeing issues with
c_
what my preliminary tests have shown. As it stands, I have
not come across a situation where the mmap SM component doesn't work
or is slower.
Hope that helps,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Apr 28, 2010, at 5:35 AM, Bogdan Costescu wrote:
On Tue, Apr 27, 20
an be
activated using: -mca mpi_common_sm sysv
Repository:
http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm
Input is greatly appreciated!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Apr 22, 2010, at 10:08 AM, Rainer Keller wrote:
Hello Oliver,
thanks for the update.
Just my $0.02: the upcoming Open MPI v1.5 will warn users if their
session directory is on NFS (or Lustre).
... or panfs :-)
Samuel K. Gutierrez
Best regards,
Rainer
On Thursday 22 April 2010
That's interesting... Works great now that carto is built. Why is
carto now required?
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Nov 5, 2009, at 4:11 PM, David Gunter wrote:
Oh, good catch. I'm not sure who updates the platform files or who
would have added
/lib64
I'll send the build log shortly.
Thanks!
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On Nov 5, 2009, at 3:07 PM, Jeff Squyres wrote:
How did you build?
I see one carto component named "auto_detect" in the 1.3.4 source
tree, but I don't see it in your omp
line 77
[rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not
found in file orterun.c at line 541
This may be an issue on our end regarding a runtime parameter that
isn't set correctly. See attached. Please let me know if you need
any more info.
Thanks!
--
Samuel K. Gutierrez
Los Al
Hi Jeff,
Sorry about the ambiguity. I just had another conversation with our
TotalView person and the problem -seems- to be unrelated to OMPI.
Guess I jumped the gun...
Thanks,
Samuel K. Gutierrez
On Sep 21, 2009, at 8:58 AM, Jeff Squyres wrote:
Can you more precisely define "
Hi,
According to our TotalView person, PGI and Intel versions of OMPI
1.3.3 are not working properly. She noted that 1.2.8 and 1.3.2 work
fine.
Thanks,
Samuel K. Gutierrez
On Sep 21, 2009, at 7:19 AM, Terry Dontje wrote:
Ralph Castain wrote:
I see it declared "extern" in
Hi Ashley,
My understanding is that this behavior would not be enabled by default
in the standard debug build. The "always convert to synchronous sends"
mode would be an additional configure-time option.
Samuel K. Gutierrez
Ashley Pittman wrote:
On Mon, 2009-08-24 at 13:27 -
Hi Jeff,
Sounds good to me.
Samuel K. Gutierrez
Jeff Squyres wrote:
The debug builds already have quite a bit of performance overhead. It
might be desirable to change this RFC to have a similar tri-state as
the MPI parameter checking:
- compiled out
- compiled in, always check
- compiled
Hi all,
How about exposing this functionality as a run-time parameter that is only
available in debug builds? This will make debugging easier and won't
impact the performance of optimized builds. Just an idea...
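
A rough sketch of what such a debug-only run-time switch could look like. The macro SKETCH_DEBUG_BUILD, the environment variable SKETCH_EXTRA_CHECKS, and the function name are all hypothetical; an actual implementation would presumably go through Open MPI's MCA parameter system rather than getenv.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical switch: in debug builds the behavior is chosen at run
 * time; in optimized builds the lookup is compiled out entirely and
 * the function always returns 0. */
static int extra_checks_enabled(void)
{
#ifdef SKETCH_DEBUG_BUILD
    /* Debug build: honor the (hypothetical) environment variable. */
    const char *value = getenv("SKETCH_EXTRA_CHECKS");
    return value != NULL && strcmp(value, "1") == 0;
#else
    /* Optimized build: no run-time cost beyond returning a constant. */
    return 0;
#endif
}
```

Compiling with -DSKETCH_DEBUG_BUILD turns the run-time lookup on; without it the compiler can fold the call away.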
Samuel K. Gutierrez
>
> - "Jeff Squyres" wrote:
>
Hi,
Let me begin by stating that I'm at most an Open MPI novice - but you
may want to try the addition of the --enable-debug configure option.
That is, for example:
./configure --enable-debug; make
Hope this helps.
Samuel K. Gutierrez
On Jun 12, 2009, at 3:27 AM, Leo P. wrote:
modifications were needed on my end. That
being said, my peruse.h does not include ompi_config.h, only mpi.h.
Thanks again,
Samuel K. Gutierrez
>
> Regards,
> Kiril
>
>
> On Mon, 2009-03-23 at 16:34 -0400, George Bosilca wrote:
>> You are absolutely right, the peer shou
Hi Kiril,
Appreciate the quick response.
> Hi Samuel,
>
> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
> "Samuel K. Gutierrez" wrote:
>> Hi All,
>>
>> I'm writing a simple profiling library which utilizes
>>PERUSE. My callback
>
> So am I
    if (spec->peer == rank) {
        return MPI_SUCCESS;
    }
    rrCounts[spec->peer]++;
    return MPI_SUCCESS;
}
Any insight is greatly appreciated.
Thanks,
Samuel K. Gutierrez