Hi Rolf,
I applied your patch. The full output is rather big, even gzipped it is
more than 10 MB, which is not good for the mailing list, but the head and
tail are below for a 7- and an 8-processor run.
It seems that the send_requests free list is growing fast, roughly
4000-fold in just 10 minutes.
Do you know of a method to bound the list so that it does not grow
excessively?
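(So far the only knob I have found is the pml_ob1_free_list_max
parameter mentioned in my first mail, used along the lines of

    mpirun --mca pml_ob1_free_list_max 1024 -np 7 ./myapp

where ./myapp stands in for the real application and 1024 is an
arbitrary limit, but as described below the application then just stops
once the limit is reached.)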
thanks
Max
7 Processor run
------------------
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11236]
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11236]
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11236]
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11236]
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
......
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243]
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243]
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243]
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243]
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243]
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324,
maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
8 Processor run
--------------------
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11315]
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11315]
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11315]
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11315]
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
...
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11322]
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11322]
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11322]
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11322]
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708,
maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0,
recv_pending=0, send_pending=0, comm_pending=0
On 12.09.2013 17:04, Rolf vandeVaart wrote:
Can you apply this patch and try again? It will print out the sizes of the
free lists after every 100 calls into mca_pml_ob1_send. It would be
interesting to see which one is growing.
This might give us some clues.
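(Roughly, the patch adds something of this shape inside
mca_pml_ob1_send; this is a minimal sketch only, with the free-list
field names assumed from the ompi_free_list_t of that era rather than
quoted from the actual patch:

    /* every 100th call into mca_pml_ob1_send, dump the free-list
       sizes; field names here are assumptions, not verbatim */
    static int call_count = 0;
    if ((++call_count % 100) == 0) {
        opal_output(0, "Freelist=send_requests, numAlloc=%d, maxAlloc=%d",
                    (int)mca_pml_ob1.send_requests.fl_num_allocated,
                    (int)mca_pml_ob1.send_requests.fl_max_to_alloc);
        /* ... and likewise for rdma_frags, recv_frags, pending_pckts,
           send_ranges_pckts and recv_requests ... */
    }

This matches the numAlloc/maxAlloc lines in the output above.)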
Rolf
-----Original Message-----
From: Max Staufer [mailto:max.stau...@gmx.net]
Sent: Thursday, September 12, 2013 3:53 AM
To: Rolf vandeVaart
Subject: Re: [OMPI devel] Nearly unlimited growth of pml free list
Hi Rolf,
the heap snapshots I take tell me where and when the memory was
allocated, and a simple source trace tells me that the calling routine
was mca_pml_ob1_send and that all of the ~100000 individual allocations
during the run happened because of an MPI_ALLREDUCE call in exactly one
place in the code.
The tool I use for this is MemorySCAPE, but I think Valgrind can tell you
the same thing. However, I have not been able to reproduce the problem in
a simpler program yet. I suspect it has something to do with the locking
mechanism of the list elements; I don't know enough about OMPI to comment
on that, but it looks like the list is growing because all of its
elements are locked.
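(If it helps, the Valgrind counterpart to those snapshots should be
something along the lines of

    mpirun -np 7 valgrind --tool=massif ./myapp

with ./myapp again a placeholder for the real application; massif
records where heap memory is allocated over time.)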
Really, any help is appreciated.
Max
PS:
If I mimic ALLREDUCE with 2*Nproc SEND and RECV calls (aggregating on
proc 0 and then sending back out to all procs), I get the same kind of
behaviour.
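To make that concrete, the mimic is essentially the following (a
minimal, self-contained C sketch of what I described; the real code is
a recursive Fortran routine and all names here are made up):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, nproc, i;
        double local, sum = 0.0, tmp;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nproc);
        local = (double)rank;                /* dummy payload */

        if (rank == 0) {
            /* aggregate: nproc-1 receives on proc 0 */
            sum = local;
            for (i = 1; i < nproc; i++) {
                MPI_Recv(&tmp, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                sum += tmp;
            }
            /* fan out: nproc-1 sends from proc 0 */
            for (i = 1; i < nproc; i++)
                MPI_Send(&sum, 1, MPI_DOUBLE, i, 1, MPI_COMM_WORLD);
        } else {
            MPI_Send(&local, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
            MPI_Recv(&sum, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        printf("rank %d: sum = %g\n", rank, sum);
        MPI_Finalize();
        return 0;
    }

Altogether that is roughly 2*Nproc point-to-point calls per reduction,
which is where the 2*Nproc above comes from.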
On 11.09.2013 17:12, Rolf vandeVaart wrote:
Hi Max:
You say that the function keeps "allocating memory in the pml free list."
How do you know that is happening?
Do you know which free list it is happening on? There are something like
eight free lists associated with the pml ob1, so it would be interesting
to know which one you observe growing.
Rolf
-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Max
Staufer
Sent: Wednesday, September 11, 2013 10:23 AM
To: de...@open-mpi.org
Subject: [OMPI devel] Nearly unlimited growth of pml free list
Hi All,
as I already asked this in the users list and was told that it is not
the right place, I am asking here: I came across a misbehaviour of
Open MPI versions 1.4.5 and 1.6.5 alike.
The mca_pml_ob1_send function keeps allocating memory in the pml free
list, and it does so indefinitely. In my case the list grew to about
100 GB.
I can control the maximum using the pml_ob1_free_list_max parameter, but
then the application just stops working when that number of entries in
the list is reached.
The interesting part is that the growth happens in only a single place
in the code, which is a RECURSIVE SUBROUTINE, and the function called
there is MPI_ALLREDUCE(..., MPI_SUM).
Apparently it is not easy to create a test program that shows the same
behaviour; recursion alone is not enough.
Is there an MCA parameter that allows limiting the total list size
without making the application stop? Or is there a way to enforce the
lock on the free-list entries?
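(The available ob1 parameters can be listed with

    ompi_info --param pml ob1

in case there is a knob I have overlooked.)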
Thanks for all the help
Max