Bad news.
I've tried the latest patch both with and without the prior one, but it
hasn't changed anything. I've also tried the old code with the
OMPI_DPM_BASE_MAXJOBIDS constant changed to 80, but that didn't help
either.
While looking through the sources of openmpi-1.4.2 I couldn't find any
call to the function ompi_dpm_base_mark_dyncomm.


2010/7/12 Ralph Castain <r...@open-mpi.org>:
> Just so you don't have to wait for 1.4.3 release, here is the patch (doesn't 
> include the prior patch).
>
>
>
>
> On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:
>
>> 2010/7/12 Ralph Castain <r...@open-mpi.org>:
>>> Dug around a bit and found the problem!!
>>>
>>> I have no idea who did this or why, but somebody set a limit of 64
>>> separate jobids in the dynamic init called by ompi_comm_set, which builds
>>> the intercommunicator. Unfortunately, they hard-wired the array size but
>>> never checked that size before adding to it.
>>>
>>> So after 64 calls to connect_accept, you are overwriting other areas of the 
>>> code. As you found, hitting 66 causes it to segfault.
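For illustration, the failure mode described above boils down to a pattern
like the following; the names and the constant here are made up to show the
bug class, not the actual ompi_comm_set internals:

    /* Illustrative sketch only -- not the real Open MPI source. */
    #include <stdint.h>

    #define MAX_JOBIDS 64                 /* hard-wired limit */

    static uint32_t jobids[MAX_JOBIDS];   /* fixed-size array */
    static int      num_jobids = 0;

    /* Buggy pattern: no bounds check, so the 65th addition writes past the
     * end of the array and corrupts whatever happens to live next to it. */
    void record_jobid_buggy(uint32_t jobid)
    {
        jobids[num_jobids++] = jobid;
    }

    /* Fixed pattern: check the size before adding to the array. */
    int record_jobid_checked(uint32_t jobid)
    {
        if (num_jobids >= MAX_JOBIDS) {
            return -1;                    /* fail cleanly instead of overwriting memory */
        }
        jobids[num_jobids++] = jobid;
        return 0;
    }

An overrun like this would also explain why the symptom can shift or vanish
between builds (for example with vs. without -g): what gets overwritten
depends on the memory layout.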
>>>
>>> I'll fix this on the developer's trunk (I'll also add that original patch 
>>> to it). Rather than my searching this thread in detail, can you remind me 
>>> what version you are using so I can patch it too?
>>
>> I'm using 1.4.2
>> Thanks a lot; I'm looking forward to the patch.
>>
>>>
>>> Thanks for your patience with this!
>>> Ralph
>>>
>>>
>>> On Jul 12, 2010, at 7:20 AM, Grzegorz Maj wrote:
>>>
>>>> 1024 is not the problem: changing it to 2048 hasn't changed anything.
>>>> Following your advice I ran my process under gdb. Unfortunately I
>>>> didn't get anything more than:
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> [Switching to Thread 0xf7e4c6c0 (LWP 20246)]
>>>> 0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>>
>>>> (gdb) bt
>>>> #0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>> #1  0xf7e3ba95 in connect_accept () from
>>>> /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
>>>> #2  0xf7f62013 in PMPI_Comm_connect () from 
>>>> /home/gmaj/openmpi/lib/libmpi.so.0
>>>> #3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43
>>>>
>>>> What's more, when I added a breakpoint on ompi_comm_set in the 66th
>>>> process and stepped through a couple of instructions, one of the other
>>>> processes crashed (as usual, on ompi_comm_set) before the 66th did.
>>>>
>>>> Finally I decided to recompile openmpi with the -g flag for gcc. With
>>>> that build the 66-process issue is gone! I ran my applications exactly
>>>> the same way as before (without even recompiling them) and successfully
>>>> ran over 130 processes. When I switch back to the openmpi build without
>>>> -g, it segfaults again.
>>>>
>>>> Any ideas? I'm really confused.
>>>>
>>>>
>>>>
>>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>>> I would guess the #files limit of 1024. However, if it behaves the same 
>>>>> way when spread across multiple machines, I would suspect it is somewhere 
>>>>> in your program itself. Given that the segfault is in your process, can 
>>>>> you use gdb to look at the core file and see where and why it fails?
>>>>>
>>>>> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>>>>>
>>>>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>>>>>
>>>>>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>>>>>
>>>>>>>> Hi Ralph,
>>>>>>>> sorry for the late response; I couldn't find free time to play
>>>>>>>> with this until now. I've finally applied the patch you prepared,
>>>>>>>> launched my processes the way you described, and I think it's working
>>>>>>>> as you expected: none of my processes runs the orted daemon and they
>>>>>>>> can all perform MPI operations. Unfortunately I'm still hitting the
>>>>>>>> 65-process issue :(
>>>>>>>> Maybe I'm doing something wrong.
>>>>>>>> I'm attaching my source code. If anybody could have a look at it, I
>>>>>>>> would be grateful.
>>>>>>>>
>>>>>>>> When I run that code with clients_count <= 65 everything works fine:
>>>>>>>> all the processes create a common grid, exchange some information and
>>>>>>>> disconnect.
>>>>>>>> When I set clients_count > 65 the 66th process crashes on
>>>>>>>> MPI_Comm_connect (segmentation fault).
>>>>>>>
>>>>>>> I didn't have time to check the code, but my guess is that you are 
>>>>>>> still hitting some kind of file descriptor or other limit. Check to see 
>>>>>>> what your limits are - usually "ulimit" will tell you.
>>>>>>
>>>>>> My limitations are:
>>>>>> time(seconds)        unlimited
>>>>>> file(blocks)         unlimited
>>>>>> data(kb)             unlimited
>>>>>> stack(kb)            10240
>>>>>> coredump(blocks)     0
>>>>>> memory(kb)           unlimited
>>>>>> locked memory(kb)    64
>>>>>> process              200704
>>>>>> nofiles              1024
>>>>>> vmemory(kb)          unlimited
>>>>>> locks                unlimited
>>>>>>
>>>>>> Which one do you think could be responsible for that?
>>>>>>
>>>>>> I tried running all 66 processes on one machine and also spreading
>>>>>> them across several machines, and it always crashes the same way on
>>>>>> the 66th process.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Another thing I would like to know: is it normal that a process
>>>>>>>> calling MPI_Comm_connect or MPI_Comm_accept while the other side is
>>>>>>>> not yet ready eats up a full CPU?
>>>>>>>
>>>>>>> Yes - the waiting process is polling in a tight loop waiting for the 
>>>>>>> connection to be made.
>>>>>>>
>>>>>>>>
>>>>>>>> Any help would be appreciated,
>>>>>>>> Grzegorz Maj
>>>>>>>>
>>>>>>>>
>>>>>>>> 2010/4/24 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>> Actually, OMPI is distributed with a daemon that does pretty much 
>>>>>>>>> what you
>>>>>>>>> want. Check out "man ompi-server". I originally wrote that code to 
>>>>>>>>> support
>>>>>>>>> cross-application MPI publish/subscribe operations, but we can 
>>>>>>>>> utilize it
>>>>>>>>> here too. Have to blame me for not making it more publicly known.
>>>>>>>>> The attached patch upgrades ompi-server and modifies the singleton 
>>>>>>>>> startup
>>>>>>>>> to provide your desired support. This solution works in the following
>>>>>>>>> manner:
>>>>>>>>> 1. launch "ompi-server -report-uri <filename>". This starts a 
>>>>>>>>> persistent
>>>>>>>>> daemon called "ompi-server" that acts as a rendezvous point for
>>>>>>>>> independently started applications.  The problem with starting 
>>>>>>>>> different
>>>>>>>>> applications and wanting them to MPI connect/accept lies in the need 
>>>>>>>>> to have
>>>>>>>>> the applications find each other. If they can't discover contact info 
>>>>>>>>> for
>>>>>>>>> the other app, then they can't wire up their interconnects. The
>>>>>>>>> "ompi-server" tool provides that rendezvous point. I don't like that
>>>>>>>>> comm_accept segfaulted - should have just error'd out.
>>>>>>>>> 2. set OMPI_MCA_orte_server=file:<filename> in the environment where 
>>>>>>>>> you
>>>>>>>>> will start your processes. This will allow your singleton processes 
>>>>>>>>> to find
>>>>>>>>> the ompi-server. I automatically also set the envar to connect the MPI
>>>>>>>>> publish/subscribe system for you.
>>>>>>>>> 3. run your processes. As they think they are singletons, they will 
>>>>>>>>> detect
>>>>>>>>> the presence of the above envar and automatically connect themselves 
>>>>>>>>> to the
>>>>>>>>> "ompi-server" daemon. This provides each process with the ability to 
>>>>>>>>> perform
>>>>>>>>> any MPI-2 operation.
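As a rough sketch of what two independently started singletons can then do
once OMPI_MCA_orte_server points at the running ompi-server -- the service
name "my-rendezvous" and the server/client roles are made up for
illustration, and this is not the attached client.c/server.c:

    /* Hypothetical example: with ompi-server as the rendezvous point,
     * singletons can use MPI-2 name publishing plus connect/accept. */
    #include <mpi.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        char     port[MPI_MAX_PORT_NAME];
        MPI_Comm inter;

        MPI_Init(&argc, &argv);

        if (argc > 1 && 0 == strcmp(argv[1], "server")) {
            /* One side opens a port and publishes it under an agreed name. */
            MPI_Open_port(MPI_INFO_NULL, port);
            MPI_Publish_name("my-rendezvous", MPI_INFO_NULL, port);
            MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
            MPI_Unpublish_name("my-rendezvous", MPI_INFO_NULL, port);
            MPI_Close_port(port);
        } else {
            /* The other side looks the port up via ompi-server and connects. */
            MPI_Lookup_name("my-rendezvous", MPI_INFO_NULL, port);
            MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        }

        MPI_Comm_disconnect(&inter);
        MPI_Finalize();
        return 0;
    }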
>>>>>>>>> I tested this on my machines and it worked, so hopefully it will meet 
>>>>>>>>> your
>>>>>>>>> needs. You only need to run one "ompi-server" period, so long as you 
>>>>>>>>> locate
>>>>>>>>> it where all of the processes can find the contact file and can open 
>>>>>>>>> a TCP
>>>>>>>>> socket to the daemon. There is a way to knit multiple ompi-servers 
>>>>>>>>> into a
>>>>>>>>> broader network (e.g., to connect processes that cannot directly 
>>>>>>>>> access a
>>>>>>>>> server due to network segmentation), but it's a tad tricky - let me 
>>>>>>>>> know if
>>>>>>>>> you require it and I'll try to help.
>>>>>>>>> If you have trouble wiring them all into a single communicator, you 
>>>>>>>>> might
>>>>>>>>> ask separately about that and see if one of our MPI experts can 
>>>>>>>>> provide
>>>>>>>>> advice (I'm just the RTE grunt).
>>>>>>>>> HTH - let me know how this works for you and I'll incorporate it into 
>>>>>>>>> future
>>>>>>>>> OMPI releases.
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>>>>>>>>>
>>>>>>>>> Hi Ralph,
>>>>>>>>> I'm Krzysztof and I'm working with Grzegorz Maj on this small
>>>>>>>>> project/experiment of ours.
>>>>>>>>> We would definitely like to give your patch a try, but could you
>>>>>>>>> please explain your solution a little more?
>>>>>>>>> Would you still like us to start one mpirun per MPI grid, and then
>>>>>>>>> have the processes we start join the MPI communicator?
>>>>>>>>> That is of course a good solution.
>>>>>>>>> But it would be especially preferable to have one daemon running
>>>>>>>>> persistently on our "entry" machine that can handle several MPI grid
>>>>>>>>> starts. Can your patch help us in that way too?
>>>>>>>>> Thanks for your help!
>>>>>>>>> Krzysztof
>>>>>>>>>
>>>>>>>>> On 24 April 2010 03:51, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>
>>>>>>>>>> In thinking about this, my proposed solution won't entirely fix the
>>>>>>>>>> problem - you'll still wind up with all those daemons. I believe I 
>>>>>>>>>> can
>>>>>>>>>> resolve that one as well, but it would require a patch.
>>>>>>>>>>
>>>>>>>>>> Would you like me to send you something you could try? Might take a 
>>>>>>>>>> couple
>>>>>>>>>> of iterations to get it right...
>>>>>>>>>>
>>>>>>>>>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>>>>>>>>>
>>>>>>>>>>> Hmmm....I -think- this will work, but I cannot guarantee it:
>>>>>>>>>>>
>>>>>>>>>>> 1. launch one process (can just be a spinner) using mpirun that 
>>>>>>>>>>> includes
>>>>>>>>>>> the following option:
>>>>>>>>>>>
>>>>>>>>>>> mpirun -report-uri file
>>>>>>>>>>>
>>>>>>>>>>> where file is some filename that mpirun can create and insert its
>>>>>>>>>>> contact info into. This can be a relative or absolute path. This
>>>>>>>>>>> process must remain alive throughout your application - it doesn't
>>>>>>>>>>> matter what it does. Its purpose is solely to keep mpirun alive.
>>>>>>>>>>>
>>>>>>>>>>> 2. set OMPI_MCA_dpm_orte_server=FILE:file in your environment, where
>>>>>>>>>>> "file" is the filename given above. This will tell your processes 
>>>>>>>>>>> how to
>>>>>>>>>>> find mpirun, which is acting as a meeting place to handle the 
>>>>>>>>>>> connect/accept
>>>>>>>>>>> operations.
>>>>>>>>>>>
>>>>>>>>>>> Now run your processes, and have them connect/accept to each other.
>>>>>>>>>>>
>>>>>>>>>>> The reason I cannot guarantee this will work is that these processes
>>>>>>>>>>> will all have the same rank && name since they all start as 
>>>>>>>>>>> singletons.
>>>>>>>>>>> Hence, connect/accept is likely to fail.
>>>>>>>>>>>
>>>>>>>>>>> But it -might- work, so you might want to give it a try.
>>>>>>>>>>>
>>>>>>>>>>> On Apr 23, 2010, at 8:10 AM, Grzegorz Maj wrote:
>>>>>>>>>>>
>>>>>>>>>>>> To be more precise: by 'server process' I mean some process that I
>>>>>>>>>>>> could run once on my system and that would help in creating those
>>>>>>>>>>>> groups.
>>>>>>>>>>>> My typical scenario is:
>>>>>>>>>>>> 1. run N separate processes, each without mpirun
>>>>>>>>>>>> 2. connect them into MPI group
>>>>>>>>>>>> 3. do some job
>>>>>>>>>>>> 4. exit all N processes
>>>>>>>>>>>> 5. goto 1
>>>>>>>>>>>>
>>>>>>>>>>>> 2010/4/23 Grzegorz Maj <ma...@wp.pl>:
>>>>>>>>>>>>> Thank you, Ralph, for your explanation.
>>>>>>>>>>>>> Apart from the descriptors issue, is there any other way to
>>>>>>>>>>>>> solve my problem, i.e. to run a number of processes separately,
>>>>>>>>>>>>> without mpirun, and then collect them into an MPI intracomm
>>>>>>>>>>>>> group? If, for example, I needed to run some 'server process'
>>>>>>>>>>>>> (even using mpirun) for this task, that would be OK. Any ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Grzegorz Maj
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>>> Okay, but here is the problem. If you don't use mpirun, and are 
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> operating in an environment we support for "direct" launch 
>>>>>>>>>>>>>> (i.e., starting
>>>>>>>>>>>>>> processes outside of mpirun), then every one of those processes 
>>>>>>>>>>>>>> thinks it is
>>>>>>>>>>>>>> a singleton - yes?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What you may not realize is that each singleton immediately
>>>>>>>>>>>>>> fork/exec's an orted daemon that is configured to behave just 
>>>>>>>>>>>>>> like mpirun.
>>>>>>>>>>>>>> This is required in order to support MPI-2 operations such as
>>>>>>>>>>>>>> MPI_Comm_spawn, MPI_Comm_connect/accept, etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So if you launch 64 processes that think they are singletons, 
>>>>>>>>>>>>>> then
>>>>>>>>>>>>>> you have 64 copies of orted running as well. This eats up a lot 
>>>>>>>>>>>>>> of file
>>>>>>>>>>>>>> descriptors, which is probably why you are hitting this 65 
>>>>>>>>>>>>>> process limit -
>>>>>>>>>>>>>> your system is probably running out of file descriptors. You 
>>>>>>>>>>>>>> might check your
>>>>>>>>>>>>>> system limits and see if you can get them revised upward.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, I know. The problem is that I have to launch my processes
>>>>>>>>>>>>>>> using a special mechanism provided by the environment in which
>>>>>>>>>>>>>>> I'm working, and unfortunately I can't use mpirun.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>>>>> Guess I don't understand why you can't use mpirun - all it 
>>>>>>>>>>>>>>>> does is
>>>>>>>>>>>>>>>> start things, provide a means to forward io, etc. It mainly 
>>>>>>>>>>>>>>>> sits there
>>>>>>>>>>>>>>>> quietly without using any cpu unless required to support the 
>>>>>>>>>>>>>>>> job.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds like it would solve your problem. Otherwise, I know of 
>>>>>>>>>>>>>>>> no
>>>>>>>>>>>>>>>> way to get all these processes into comm_world.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>> I'd like to dynamically create a group of processes communicating
>>>>>>>>>>>>>>>>> via MPI. Those processes need to be run without mpirun and create
>>>>>>>>>>>>>>>>> an intracommunicator after startup. Any ideas how to do this
>>>>>>>>>>>>>>>>> efficiently?
>>>>>>>>>>>>>>>>> I came up with a solution in which the processes connect one by one
>>>>>>>>>>>>>>>>> using MPI_Comm_connect, but unfortunately all the processes that are
>>>>>>>>>>>>>>>>> already in the group need to call MPI_Comm_accept. This means that
>>>>>>>>>>>>>>>>> when the n-th process wants to connect, I need to collect all the
>>>>>>>>>>>>>>>>> n-1 processes on the MPI_Comm_accept call. After I've run about 40
>>>>>>>>>>>>>>>>> processes, every subsequent call takes more and more time, which I'd
>>>>>>>>>>>>>>>>> like to avoid.
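A minimal sketch of that incremental scheme, assuming each newcomer somehow
learns the port name the current group has opened (the helper names are
illustrative, not the attached code):

    /* Hypothetical helpers, called after MPI_Init. The existing group calls
     * accept_one() collectively; the joining process calls connect_as_newcomer(). */
    #include <mpi.h>

    /* Existing group side: every process already in `group` must participate. */
    static MPI_Comm accept_one(MPI_Comm group, char *port)
    {
        MPI_Comm inter, merged;
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, group, &inter);
        MPI_Intercomm_merge(inter, 0, &merged);   /* existing ranks ordered first */
        MPI_Comm_disconnect(&inter);
        return merged;                            /* the group, now one process larger */
    }

    /* Newcomer side: starts alone and joins through the published port. */
    static MPI_Comm connect_as_newcomer(char *port)
    {
        MPI_Comm inter, merged;
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Intercomm_merge(inter, 1, &merged);   /* newcomer ranks ordered last */
        MPI_Comm_disconnect(&inter);
        return merged;
    }

Because every accept/merge is collective over all n-1 existing processes, the
cost of adding the n-th process grows with the group size, which matches the
slowdown described above.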
>>>>>>>>>>>>>>>>> Another problem with this solution is that when I try to connect the
>>>>>>>>>>>>>>>>> 66th process, the root of the existing group segfaults on
>>>>>>>>>>>>>>>>> MPI_Comm_accept. Maybe it's my bug, but it's weird, as everything
>>>>>>>>>>>>>>>>> works fine for at most 65 processes. Is there any limitation I don't
>>>>>>>>>>>>>>>>> know about?
>>>>>>>>>>>>>>>>> My last question is about MPI_COMM_WORLD. When I run my processes
>>>>>>>>>>>>>>>>> without mpirun, their MPI_COMM_WORLD is the same as MPI_COMM_SELF.
>>>>>>>>>>>>>>>>> Is there any way to change MPI_COMM_WORLD and set it to the
>>>>>>>>>>>>>>>>> intracommunicator that I've created?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Grzegorz Maj
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> <client.c><server.c>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>
