Hi,
In v1.5, when mpirun is called with both the "-bind-to-core" and
"-npersocket" options, and the npersocket value leads to fewer procs than
sockets allocated on one node, we get a segfault
Testing environment:
openmpi v1.5
2 nodes with four 8-core sockets each
mpirun -n 10 -bind-to-core -npersock
On 8 Nov 2011, at 00:59, George Bosilca wrote:
> A started process is defined as being our mpirun. In Open MPI
> MPIR_partial_attach_ok is defined, so the tool will suppose that we provide a
> means to synchronize the processes not based on MPIR_debug_gate. Therefore
> only one behavior if acc
Looks fine to me - CMR filed. Thanks!
On Nov 8, 2011, at 1:01 AM, nadia.derbey wrote:
> Hi,
>
> In v1.5, when mpirun is called with both the "-bind-to-core" and
> "-npersocket" options, and the npersocket value leads to fewer procs than
> sockets allocated on one node, we get a segfault
>
> Test
On Nov 8, 2011, at 4:48 AM, Ashley Pittman wrote:
> I agree that this is not clear; I don't think this spec is well understood
> by anyone, indeed it wasn't originally written with the intention of becoming
> a specification at all. I've looked at it a couple of times but never used
> this
On Nov 7, 2011, at 8:34 PM, Ralph Castain wrote:
> Best guess: from what I've seen, most debuggers don't seem to conform to what
> the MPI Forum has "accepted". It doesn't appear that the vendors and debugger
> developers pay too much attention to that document, possibly because it (a)
> came a
On Nov 7, 2011, at 9:48 PM, Nathan T. Hjelm wrote:
> In retrospect I should have done an RFC for the 3rd change with a short
> timeout. At the time (operating on little sleep) it seemed like the commits
> would have minimal impact. Please let me know if the commits have any
> negative impact.
FWIW
I think the only possibly controversial change in this commit is changing
MPIR_Breakpoint() to return (void) instead of (void*). Oddly, I see that
MPICH2 has 2 different prototypes for MPIR_Breakpoint -- one returns (void*),
another returns (int). Assuming that MPICH2 works fine with the debug
Sure, I can do that. My only concern is with sending between hosts of
different endianness.
For example, if seg_key is 128 bits wide and the key32 is 64 bits then we
might run into this:
Host 1: (big endian)
Set seg_key.key32[0] = 0x
would result in seg_key: 0x 0x 0x1
> george.
>
>PS: Regarding the hand-copy instead of the memcpy, we tried to avoid using
>memcpy in performance critical codes, especially when we know the size of
>the data and the alignment. This relieves the compiler of adding ugly
>intrinsics,
>allowing it to nicely pipeline to load/stores. An
On Nov 8, 2011, at 07:52 , Jeff Squyres wrote:
> To be clear: that document simply standardizes what MPI implementations are
> supposed to provide in their MPIR implementation (prior to this, MPI
> implementations tended to have subtle differences between their MPIR
> implementations, which we
On Nov 8, 2011, at 8:25 AM, George Bosilca wrote:
>
> On Nov 8, 2011, at 07:52 , Jeff Squyres wrote:
>
>> To be clear: that document simply standardizes what MPI implementations are
>> supposed to provide in their MPIR implementation (prior to this, MPI
>> implementations tended to have subtl
On Tue, 8 Nov 2011 06:36:03 -0800, Rolf vandeVaart
wrote:
>> george.
>>
>>PS: Regarding the hand-copy instead of the memcpy, we tried to avoid
> using
>>memcpy in performance critical codes, especially when we know the size of
>>the data and the alignment. This relieves the compiler of adding u
On Nov 8, 2011, at 10:25 AM, George Bosilca wrote:
> However, based on what we have in the trunk today, Open MPI doesn't follow
> that document. As Ralph pointed out, the current version works with several
> tools (tv, stat, padb) as is, so that means the tools do not really follow
> that docu
On Nov 8, 2011, at 8:37 AM, Jeff Squyres wrote:
> On Nov 8, 2011, at 10:25 AM, George Bosilca wrote:
>
>> However, based on what we have in the trunk today, Open MPI doesn't follow
>> that document. As Ralph pointed out, the current version works with several
>> tools (tv, stat, padb) as is,
The good news is that the issue reported in R25290 is fixed in the latest Intel compiler release (2011.7.256). The bad news is that both the 2011.6.233 and 2011.7.256 releases identify themselves as V12.1.0 from the command line. (I reported this bug to Intel already.) They can only be reliably
Folks,
Wednesday November 15th at 12:15 PST, we will have an Open MPI BOF. We will
have two guest speakers: Rolf vandeVaart from NVIDIA and Shinji Sumimoto from
the K-computer. If you are at SC, you are all invited to participate in this
annual event. Blend for a moment with our user community,
MPIR_Breakpoint, as the name indicates, is just a breakpoint used by the
startup process or the MPI application to signal changes to the debugger. No
return value, nothing more than a breakpoint.
I wonder how the volatile got there; there is no such requirement on variables
that cannot be ch
Elements in an array are always stored in the expected [increasing] order,
regardless of the endianness of the architecture. Moreover, due to the alignment
rules, all members in a union will start at the same address.
It turns out there is no endianness conversion on the keys, so I suppose both
p
Larry,
Thanks for following up with us on this. I think your patch is cleaner than what
we currently have in the trunk, so I went ahead and pushed it to the trunk
(25461). I will request a push in 1.5 and 1.4 as well.
Regards,
george.
On Nov 8, 2011, at 13:57 , Larry Baker wrote:
> The good
I think the volatiles are there to ensure the compiler doesn't optimise away
reads or function calls, which has been a problem with this interface in the
past.
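
As a rough illustration of the point about reads being optimised away, here is a minimal sketch of a gate that a debugger is expected to flip from outside the process. The names example_debug_gate and example_wait_for_gate are hypothetical and are not taken from the Open MPI source; the assumption is only that some external agent writes the variable. Without the volatile qualifier a compiler may legally hoist the load out of the loop, because nothing inside the loop modifies the variable.

```c
#include <stddef.h>

/* Hypothetical gate variable; a debugger would set it from outside the process. */
volatile int example_debug_gate = 0;

/*
 * Poll the gate until it becomes nonzero or max_spins polls have failed.
 * Returns the number of failed polls before the gate opened, or -1 on timeout.
 * Because the gate is volatile, every loop iteration performs a real load.
 */
long example_wait_for_gate(long max_spins)
{
    long spins = 0;

    while (example_debug_gate == 0) {
        if (spins >= max_spins) {
            return -1;   /* gave up: the gate never opened */
        }
        ++spins;
    }
    return spins;
}
```

Calling example_wait_for_gate(10) while the gate is still 0 times out and returns -1; once the gate has been set to 1 the same call returns 0 right away.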
On 8 Nov 2011, at 22:18, George Bosilca wrote:
> MPIR_Breakpoint, as the name indicates, is just a breakpoint used by the
> startup
I guess people should check the commit before …
No way the volatile will do any good here:
-ORTE_DECLSPEC extern volatile char MPIR_executable_path[MPIR_MAX_PATH_LENGTH];
-ORTE_DECLSPEC extern volatile char MPIR_server_arguments[MPIR_MAX_ARG_LENGTH];
+ORTE_DECLSPEC extern char MPIR_executable_path
In theory, might a sufficiently smart compiler and linker eliminate some
MPIR_* variables after optimization? If that could potentially be true,
then perhaps the volatile qualifier would prevent such a removal, which
would break the existence check(s) by the debugger? Just a thought.
-Paul
Ok, that makes sense. Is there a reason why the members were all set to be
the same size?
Maybe seg_key should be:
union {
uint8_t key8;
uint16_t key16;
uint32_t key32;
uint64_t key64;
struct { uint64_t value[2]; } key128;
};
-Nathan
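
For reference, a small sketch of how a union shaped like the one proposed above behaves in C. The type name example_seg_key_t and the helper functions are hypothetical stand-ins, not the actual Open MPI definitions. It checks two properties that matter for the discussion: every member starts at offset 0, and the union is as large as its widest member (16 bytes here, from key128).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed seg_key layout; not the real Open MPI type. */
typedef union {
    uint8_t  key8;
    uint16_t key16;
    uint32_t key32;
    uint64_t key64;
    struct { uint64_t value[2]; } key128;
} example_seg_key_t;

/* Returns 1 when every member of the union starts at offset 0, else 0. */
int example_members_share_address(void)
{
    return offsetof(example_seg_key_t, key8)   == 0 &&
           offsetof(example_seg_key_t, key16)  == 0 &&
           offsetof(example_seg_key_t, key32)  == 0 &&
           offsetof(example_seg_key_t, key64)  == 0 &&
           offsetof(example_seg_key_t, key128) == 0;
}

/* Size of the whole union in bytes; it is driven by the widest member. */
size_t example_seg_key_size(void)
{
    return sizeof(example_seg_key_t);
}
```

Since all members alias the same leading bytes, writing a narrow member such as key32 and then reading a wider one such as key64 gives a value that depends on the host byte order, which is the case to watch when keys travel between hosts of different endianness.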
On Tue, 8 Nov 2011 17:22:48 -0500, George Bosilca
On Nov 8, 2011, at 17:56 , Paul H. Hargrove wrote:
> In theory, might a sufficiently smart compiler and linker eliminate some
> MPIR_* variables after optimization?
Even if a compiler can optimize out symbols from an application, I doubt they
are allowed to apply the same optimization on librar
That makes sense to me.
-Original Message-
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
Behalf Of Nathan T. Hjelm
Sent: Tuesday, November 08, 2011 8:36 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Remote key sizes
On Tue, 8 Nov 2011 06:36:03 -0800, Rol
On Nov 8, 2011, at 3:56 PM, Paul H. Hargrove wrote:
> In theory, might a sufficiently smart compiler and linker eliminate some
> MPIR_* variables after optimization? If that could potentially be true, then
> perhaps the volatile qualifier would prevent such a removal, which would
> break the
I do not recall, and from the code there is no obvious reason. However, being
able to store multiple smaller members might be a good enough reason.
Btw, we don't use the key8 at all. I guess we can clean that code up to only
keep key32 and key64, possibly with the count to match up the right s
On Nov 8, 2011, at 18:32 , Ralph Castain wrote:
> That was the experience - after thrashing for quite some time, we finally
> found that the volatile qualifiers fixed the problem. Hence my request that
> people check to see if anything is broken.
I will therefore propose to forever ban all com
Now this thread is starting to read like an episode of The Big Bang Theory.
One possible guess as to how/why MPICH has managed w/o "volatile" would
be that they may pass less aggressive optimization flags to the
compilers. It is then a question of which MPI implementation is
supporting a cho
On Nov 8, 2011, at 10:36 , Nathan T. Hjelm wrote:
> On Tue, 8 Nov 2011 06:36:03 -0800, Rolf vandeVaart
> wrote:
>>> george.
>>>
>>> PS: Regarding the hand-copy instead of the memcpy, we tried to avoid
>> using
>>> memcpy in performance critical codes, especially when we know the size of
>>> the
On 11/8/11 5:25 PM, "George Bosilca" wrote:
>2. one sided: A quick look in the OSC seems to indicate there is some
>special handling to be done in the RDMA one. Look at
>ompi_osc_rdma_sendreq_t in osc_rdma_sendreq.h, it is using a trick to
>store the remote segments. First, the mca_btl_base_segm