On Wed, Jul 30, 2014 at 8:53 PM, Paul Hargrove wrote:
[...]
> I have a clear answer to *what* is different (below) and am now looking
> into the why/how.
> It seems that 1.8.1 has included all dependencies into libmpi_usempif08
> while 1.8.2rc2 does not.
>
[...]
The difference appears to s
Hi Paul,
Thank you for your investigation. I'm sure it brings us very
close to fixing the problem, although I can't do that
myself. So I owe you something...
Please try Awamori, Okinawa's sake, which is very
good on such a hot day.
Tetsuya
> On Wed, Jul 30, 2014 at 8:53 PM, Paul Hargrove wrote:
>
Paul and all,
For what it's worth, with openmpi 1.8.2rc2 and the intel fortran
compiler version 14.0.3.174 :
$ nm libmpi_usempif08.so| grep -i sizeof
there is no such undefined symbol (mpi_f08_sizeof_).
As a temporary workaround, did you try to force the linker to use
libforce_usempif08_internal_mo
Paul,
in .../ompi/mpi/fortran/use-mpi-f08, can you create the following dumb
test program,
compile it, and run nm | grep f08 on the object:
$ cat foo.f90
program foo
use mpi_f08_sizeof
implicit none
real :: x
integer :: size, ierror
call MPI_Sizeof_real_s_4(x, size, ierror)
stop
end program
wi
Gilles,
Just as you speculate, PGI is creating a _-suffixed reference to the module
name:
$ pgf90 -c test.f90
$ nm -u test.o | grep f08
U mpi_f08_sizeof_
U mpi_f08_sizeof_mpi_sizeof_real_s_4_
You suggested the following work-around in a previous email:
$ INS
Doesn't namespacing obviate the need for this convoluted identifier scheme?
See, for example, UML package import and include behaviors.
-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Dave Goodell
(dgoodell)
Sent: Wednesday, July 30, 2014 3:35 PM
To: Open MP
WHAT: Change default behavior in openib to not call ibv_fork_init() even if
available.
WHY: There are some strange interactions with ummunotify that cause errors. In
addition, see the additional points below.
WHEN: After next weekly meeting, August 5, 2014
DETAILS: This change will just be a co
+2^1000
This information is absolutely necessary at this point. If someone has a
better solution they can provide it as an alternative RFC. Until then
this is how it should be done... Otherwise we lose uGNI support on the
trunk. Because we ARE NOT going to remove the mailbox size optimizatio
Hi Folks,
I think given the way we want to use the btl's in lower levels like opal,
it is pretty disgusting for a btl to need to figure out on its own something
like a "global job size". That's not its business. Can't we add some
attributes to the component's initialization method that provides
What is your definition of “global job size”?
George.
On Jul 31, 2014, at 11:06 , Pritchard Jr., Howard wrote:
> Hi Folks,
>
> I think given the way we want to use the btl's in lower levels like opal,
> it is pretty disgusting for a btl to need to figure out on its own something
> like a "gl
I'd like to suggest an alternative solution. A BTL can exploit whatever data it
wants, but should first test if the data is available. If the data is
*required*, then the BTL gracefully disqualifies itself. If the data is
*desirable* for optimization, then the BTL writer (if they choose) can inc
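
The required/desirable distinction proposed above can be sketched in C. This is only an illustration under stated assumptions: every name below (example_job_info_t, example_btl_init, the mailbox constants) is hypothetical and not part of the Open MPI BTL interface, and because the message is cut off, the fallback to default settings when optional data is missing is an assumption rather than the author's stated proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job description handed down by an upper layer.
 * None of these names are Open MPI APIs. */
typedef struct {
    bool   have_job_size;  /* was the size published at all? */
    size_t job_size;       /* only meaningful if have_job_size is true */
} example_job_info_t;

typedef enum {
    EXAMPLE_BTL_DISQUALIFIED,  /* required data missing: step aside */
    EXAMPLE_BTL_DEFAULTS,      /* optional data missing: run untuned */
    EXAMPLE_BTL_TUNED          /* data present: apply the optimization */
} example_btl_status_t;

/* Illustrative values only. */
#define EXAMPLE_DEFAULT_MAILBOX 64u
#define EXAMPLE_SMALL_MAILBOX   16u
#define EXAMPLE_LARGE_JOB       1024u

/* Test for the data first, then decide how to proceed. */
static example_btl_status_t
example_btl_init(const example_job_info_t *info, bool size_is_required,
                 size_t *mailbox_size)
{
    if (!info->have_job_size) {
        if (size_is_required) {
            /* Graceful disqualification: leave *mailbox_size untouched. */
            return EXAMPLE_BTL_DISQUALIFIED;
        }
        /* Assumed fallback: keep working with default settings. */
        *mailbox_size = EXAMPLE_DEFAULT_MAILBOX;
        return EXAMPLE_BTL_DEFAULTS;
    }
    /* Assumed heuristic: use smaller per-peer mailboxes for large jobs. */
    *mailbox_size = (info->job_size > EXAMPLE_LARGE_JOB)
                        ? EXAMPLE_SMALL_MAILBOX
                        : EXAMPLE_DEFAULT_MAILBOX;
    return EXAMPLE_BTL_TUNED;
}
```

A caller would treat EXAMPLE_BTL_DISQUALIFIED as "skip this component" rather than as an error, which is the point of the suggestion.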
The maximum number of peer processes that may be added over the course
of the job will suffice. So either the world or universe size. This is a
reasonable piece of information to expect the upper layers to provide to
the communication layer.
And the impact of providing this information is no less
I definitely think you misunderstood the scope of this RFC. The information
that is so important to you to configure the mailbox size is available to you
when you need it. This information is made available by the PML through the
call to add_procs, which comes with all the procs in the MPI_CO
Hi George,
The ompi_process_info.num_procs thing, which seems to have been an object
of some contention yesterday.
The ugni use of this is cloned off of the way I designed the mpich netmod.
Leveraging the size of the job was an easy way to scale the mailbox size.
If I'd been asked to have the ne
I do not like the fact that add_procs is called with every proc in the
MPI_COMM_WORLD. That needs to change, so, I will not rely on the number
of procs being added being the same as the world or universe size.
-Nathan
On Thu, Jul 31, 2014 at 09:22:00AM -0600, George Bosilca wrote:
>I definiti
Like I said, why don't we just do the following:
> I'd like to suggest an alternative solution. A BTL can exploit whatever data
> it wants, but should first test if the data is available. If the data is
> *required*, then the BTL gracefully disqualifies itself. If the data is
> *desirable* for
This approach will work now but we need to start thinking about how we
want to support multiple simultaneous btl users. Does each user call
add_procs with a single module (or set of modules) or does each user
call btl_component_init and get their own module? If we do the latter
then it might make
Fair enough - yeah, that is an issue I've been avoiding :-)
On Jul 31, 2014, at 9:14 AM, Nathan Hjelm wrote:
>
> This approach will work now but we need to start thinking about how we
> want to support multiple simultaneous btl users. Does each user call
> add_procs with a single module (or set
Yeah, I forgot that pure ANSI C doesn't really have namespaces, other than
to fully qualify modules and variables. Bummer.
Makes writing large, maintainable middleware more difficult.
-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Kenneth A.
Lloyd
Sent: Th
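
For readers unfamiliar with the convention mentioned above, here is a minimal C sketch of "fully qualifying" names by prefix in the absence of namespaces. All identifiers are hypothetical ("example" stands in for a project, "alpha" and "beta" for two components); they are not real Open MPI symbols.

```c
/* Both components want a function called "init". Without namespaces,
 * each global name carries its project/framework/component prefix. */
static int example_btl_alpha_init(void) { return 1; }
static int example_btl_beta_init(void)  { return 2; }

/* A table of function pointers lets callers use the short member name
 * "init" while the underlying symbols stay fully qualified. */
typedef struct {
    const char *name;
    int (*init)(void);
} example_component_t;

static const example_component_t example_components[] = {
    { "alpha", example_btl_alpha_init },
    { "beta",  example_btl_beta_init  },
};
```

The cost is that moving a component to a different framework or layer changes its prefix, and with it every symbol name.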
WHAT: Allow reservation of a symbol namespace that is independent of component
location.
WHY: All of the framework location/abstraction churn over the years has made it
challenging to maintain single source versions of MCA components (e.g., the
"usnic" BTL) that work with multiple versions of O
George --
Got 2 questions for ya:
1. I see some orte_* specific symbols/functions in ompi_mpi_init.c. Was that
intentional? Shouldn't that stuff be in the RTE framework, or some such?
2. In tracking down some stuff relating to process names, it looks like names
are now being set by ompi/pr
On Jul 31, 2014, at 18:26 , Jeff Squyres (jsquyres) wrote:
> George --
>
> Got 2 questions for ya:
>
> 1. I see some orte_* specific symbols/functions in ompi_mpi_init.c. Was that
> intentional? Shouldn’t that stuff be in the RTE framework, or some such?
Good catch. Fixed in r32384.
> 2.
All,
Here is the patch that changes the meaning of the atomics to make them always
return the previous value (similar to sync_fetch_and_<*>). I tested this with
the following atomics: OS X, gcc style intrinsics and AMD64.
I did not change the base assembly files used when GCC style assembly
ope
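
To illustrate the "return the previous value" semantics described above, here is a small C sketch built on the GCC-style builtin __sync_fetch_and_add. The wrapper names are hypothetical and are not the functions touched by the patch. The sketch assumes a compiler that provides the __sync builtins (GCC or Clang).

```c
#include <stdint.h>

/* Hypothetical wrapper: atomically adds delta to *addr and returns the
 * value *addr held BEFORE the addition. */
static int32_t example_fetch_and_add_32(volatile int32_t *addr, int32_t delta)
{
    return __sync_fetch_and_add(addr, delta);
}

/* A caller that needs the NEW value can derive it from the previous one,
 * so nothing is lost by always returning the old value. */
static int32_t example_add_and_fetch_32(volatile int32_t *addr, int32_t delta)
{
    return example_fetch_and_add_32(addr, delta) + delta;
}
```

The reverse derivation (old value from new) works for addition too, but not for operations such as AND or OR, which is one reason fetch-style semantics are the more general choice.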
On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca wrote:
> Paul, I know you have a pretty diverse range of computers. Can you try to
> compile and run a "make check" with the following patch?
I will see what I can do for ARMv7, MIPS, PPC and IA64 (or whatever subset
of those is still supported).
The
On Thu, Jul 31, 2014 at 4:22 PM, Paul Hargrove wrote:
>
> On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca
> wrote:
>
>> Paul, I know you have a pretty diverse range of computers. Can you try to
>> compile and run a "make check" with the following patch?
>
>
> I will see what I can do for ARMv7, MIP
Awesome, thanks Paul. Once the results are in, we will fix whatever is
needed for these less common architectures.
George.
On Thu, Jul 31, 2014 at 7:24 PM, Paul Hargrove wrote:
>
>
> On Thu, Jul 31, 2014 at 4:22 PM, Paul Hargrove wrote:
>
>>
>> On Thu, Jul 31, 2014 at 4:13 PM, George Bo
On Jul 31, 2014, at 3:41 PM, George Bosilca wrote:
>
> On Jul 31, 2014, at 18:26 , Jeff Squyres (jsquyres)
> wrote:
>
>> George --
>>
>> Got 2 questions for ya:
>>
>> 1. I see some orte_* specific symbols/functions in ompi_mpi_init.c. Was
>> that intentional? Shouldn’t that stuff be in
On the path to verifying George's atomics patch, I have started just by
verifying that I can still build the UNPATCHED trunk on each of the
platforms I listed.
I have tried two PPC64/Linux systems so far and am seeing the same problem
on both. Though I can pass "make check", both platforms SEGV on
Many thanks guys, this thread was most helpful in finding the fix.
Paul H. nailed 80% of it on the head in the post where he identified the
Makefile.am change. That Makefile.am change was due to three things:
1. Fixing a real bug (elsewhere in that commit)
2. My misunderstanding of how module f
Related question:
If I am understanding PGI's list of fixed-TPRs (bugs) then it looks like
one (certainly not the only) difference between 13.x and 14.1 is a fix to a
problem with PROCEDURE and zero-argument subroutines. As it happens, the
configure probe for PROCEDURE is a zero-argument subrout
Second related issue:
Can/should examples/hello_usempif08.f90 be extended to use more of the
module such that it would have illustrated the bug found with Tetsuya's
example code? I don't know about MTT, but my scripts for testing a
release candidate include running "make" in the example subdir.
Never mind my suggestion to revise examples/hello_usempif08.f90.
I've just determined that it is already sufficient to reproduce the problem.
(So now I need to see what's wrong in my testing scripts).
-Paul
On Thu, Jul 31, 2014 at 7:04 PM, Paul Hargrove wrote:
> Second related issue:
>
> Can/sho
Paul,
The ibm test suite from the non-public ompi-tests repository has several
tests for usempif08.
Cheers,
Gilles
On 2014/08/01 11:04, Paul Hargrove wrote:
> Second related issue:
>
> Can/should examples/hello_usempif08.f90 be extended to use more of the
> module such that it would have illust
George:
Have a failure with your patch applied on PPC64/Linux and gcc-4.4.6:
Making all in asm
make[2]: Entering directory
`/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/BLD/opal/asm'
CC asm.lo
In file included from
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369
$ INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c'
ld.so.1: ring_c: fatal: relocation error: file
/home/hargrove/OMPI/openmpi-trunk-solaris10-sparcT2-ss12u3-v8plus/INST/lib/openmpi/mca_pml_ob1.so:
symbol alloca: referenced symbol not found
This platform has worked in the past.
I will be tr
A missing include. Should be fixed by r32388.
Thanks,
George.
On Thu, Jul 31, 2014 at 11:15 PM, Paul Hargrove wrote:
>
> $ INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c'
> ld.so.1: ring_c: fatal: relocation error: file
> /home/hargrove/OMPI/openmpi-trunk-solaris10-sparcT2-ss12u
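
As background on how a missing include can produce an unresolved "alloca" symbol: on some platforms the compiler only treats alloca as a built-in when <alloca.h> is included; otherwise the call may be compiled as a reference to an ordinary external function that no library provides. The C sketch below shows the include in place. The function name is hypothetical, and whether this is the actual cause on the platform reported here is not established by these messages.

```c
#include <alloca.h>  /* lets the compiler map alloca() to its built-in */
#include <string.h>

/* Hypothetical helper: copies src into stack-allocated scratch space and
 * returns the length of the copy. The scratch buffer is released
 * automatically when the function returns. */
static size_t example_copy_len(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating NUL */
    char *tmp = alloca(len);
    memcpy(tmp, src, len);
    return strlen(tmp);
}
```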
Anything you can suggest to resolve that problem would be most appreciated,
Paul. It's been reported before, but we have no idea what it is looking for.
On Jul 31, 2014, at 8:15 PM, Paul Hargrove wrote:
>
> $ INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c'
> ld.so.1: ring_c: fatal: re
FWIW: we had Siegmar try that and it didn't solve the problem. Paul?
On Jul 31, 2014, at 8:28 PM, svn-commit-mai...@open-mpi.org wrote:
> Author: bosilca (George Bosilca)
> Date: 2014-07-31 23:28:23 EDT (Thu, 31 Jul 2014)
> New Revision: 32388
> URL: https://svn.open-mpi.org/trac/ompi/changeset/
Yes, I fear this will require some effort to chase all the breakage down given
that (to my knowledge, at least) we lack PPC machines in the devel group.
On Jul 31, 2014, at 5:46 PM, Paul Hargrove wrote:
> On the path to verifying George's atomics patch, I have started just by
> verifying that