On 11/22/2011 6:59 PM, Lukas Razik wrote:
Roland Dreier <rol...@purestorage.com> wrote:
On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik <li...@razik.name> wrote:
#0 0xfffff8010229ba9c in mca_pml_ob1_send_request_start_copy
(sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551
551 hdr->hdr_match.hdr_ctx =
sendreq->req_send.req_base.req_comm->c_contextid;
(gdb) backtrace
If you can get into gdb here, I guess it would be useful to print the
address of hdr->hdr_match.hdr_ctx and
sendreq->req_send.req_base.req_comm->c_contextid to see which one is
misaligned.
Not sure of the gdb syntax... does it work to just do
p &hdr->hdr_match.hdr_ctx
p &sendreq->req_send.req_base.req_comm->c_contextid
Oh, sorry that I didn't do that before...
The values are:
&hdr->hdr_match.hdr_ctx = (uint16_t *) 0xad7393
&sendreq->req_send.req_base.req_comm->c_contextid = (uint32_t *) 0x201c20
So hdr_ctx is the bad one...
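To make that reading of the two addresses explicit, here is a minimal sketch in C. It assumes the usual rule that a uint16_t must sit on a 2-byte boundary and a uint32_t on a 4-byte boundary on a strict-alignment machine; the helper name is_aligned is made up for illustration and is not part of Open MPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an Open MPI function.
 * Returns true when addr is a multiple of alignment.
 * alignment must be a nonzero power of two (2 for uint16_t, 4 for uint32_t).
 */
bool is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* For a power-of-two alignment, the low bits of an aligned address are 0. */
    return (addr & (alignment - 1)) == 0;
}
```

With the values above, is_aligned(0xad7393, sizeof(uint16_t)) is false because the address is odd, while is_aligned(0x201c20, sizeof(uint32_t)) is true because 0x20 is a multiple of 4.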
Regards,
Lukas
PS:
I can never remember the gdb syntax, so I use the nice kdbg. *g*
http://net.razik.de/linux/T5120/kdbg-openmpi-1.4.4-osu_latency-02.png
Lukas,
Can you try running the benchmark with coalescing off? To do that, add
the following option to your mpirun line:
"-mca btl_openib_use_message_coalescing 0"
thanks,
--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com