Re: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()
George,

Thank you for the response. In my opinion, our do/while() loop inside OMPI_FREE_LIST_GET_MT is better for our MPI+OpenMP hybrid application than using OMPI_FREE_LIST_WAIT_MT, because with OMPI_FREE_LIST_WAIT_MT, MPI_Irecv() would be suspended in opal_progress() until one of the MPI_Irecv() requests from another thread completes. And this is not a case where the list has reached the free_list_max_num limit: the other threads consumed all items from the free list before one thread completed ompi_free_list_grow(), so the thread executing ompi_free_list_grow() got NULL.

Sorry for my poor English.

Alexey.

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of George Bosilca
Sent: Wednesday, September 16, 2015 10:18 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()

On Wed, Sep 16, 2015 at 3:11 PM, Владимир Трущин wrote:

Sorry, "We saw the following problem in OMPI_FREE_LIST_GET_MT…".

That's exactly what the WAIT macro is supposed to solve: wait (grow the free list and call opal_progress) until an item becomes available.

George.

From: Владимир Трущин [mailto:vdtrusc...@compcenter.org]
Sent: Wednesday, September 16, 2015 10:09 PM
To: 'Open MPI Developers'
Subject: RE: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()

George,

You are right. The sequence of calls in our test is MPI_Irecv -> mca_pml_ob1_irecv -> MCA_PML_OB1_RECV_REQUEST_ALLOC. We will try to use OMPI_FREE_LIST_WAIT_MT.

We saw the following problem in OMPI_FREE_LIST_WAIT_MT. It returned NULL when thread A was suspended after the call to ompi_free_list_grow(). In the meantime, other threads took all items from the free list at the first call to opal_atomic_lifo_pop() in the macro. So when thread A resumed and made the second opal_atomic_lifo_pop() call in the macro, it returned NULL.

Best regards,
Vladimir.

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of George Bosilca
Sent: Wednesday, September 16, 2015 7:00 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()

Alexey,

This is not necessarily the fix for all cases. Most of the internal uses of the free list can easily accommodate the fact that no more elements are available. Based on your description of the problem, I would assume you encounter it once MCA_PML_OB1_RECV_REQUEST_ALLOC is called. In this particular case the problem is that we call OMPI_FREE_LIST_GET_MT and the upper level is unable to correctly deal with a returned item that is NULL. The real fix here is to use the blocking version of the free-list accessor (similar to the send case), OMPI_FREE_LIST_WAIT_MT.

It is also possible that I misunderstood your problem. If the solution above doesn't work, can you describe exactly where the NULL return of OMPI_FREE_LIST_GET_MT is creating an issue?

George.

On Wed, Sep 16, 2015 at 9:03 AM, Алексей Рыжих wrote:

Hi all,

We experimented with an MPI+OpenMP hybrid application (MPI_THREAD_MULTIPLE support level) where several threads submit a lot of MPI_Irecv() requests simultaneously, and encountered an intermittent OMPI_ERR_TEMP_OUT_OF_RESOURCE error after MCA_PML_OB1_RECV_REQUEST_ALLOC() because OMPI_FREE_LIST_GET_MT() returned NULL.
Investigating this bug, we found that sometimes the thread calling ompi_free_list_grow() has no free items left in the LIFO list on exit, because other threads retrieved all the new items via opal_atomic_lifo_pop(). So we suggest changing OMPI_FREE_LIST_GET_MT() as below:

#define OMPI_FREE_LIST_GET_MT(fl, item)                                        \
{                                                                              \
    item = (ompi_free_list_item_t*) opal_atomic_lifo_pop(&((fl)->super));      \
    if( OPAL_UNLIKELY(NULL == item) ) {                                        \
        if(opal_using_threads()) {                                             \
            int rc;                                                            \
            opal_mutex_lock(&((fl)->fl_lock));                                 \
            do                                                                 \
            {                                                                  \
                rc = ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);        \
                if( OPAL_UNLIKELY(rc != OMPI_SUCCESS)) break;                  \
                                                                               \
                item = (ompi_free_list_item_t*) opal_atomic_lifo_pop(&((fl)->super)); \
                                                                               \
            } while (!item);                                                   \
            opal_mutex_unlock(&((fl)->fl_lock));                               \
        } else {                                                               \
            ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);                 \
            item
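To make the intent of that change concrete outside the OMPI tree, here is a small self-contained sketch of the same retry-until-grow-fails pattern (illustration only: every name is invented, and a plain mutex-protected list stands in for OMPI's lock-free LIFO, which is what makes the original race possible in the first place):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct item { struct item *next; } item_t;

typedef struct {
    item_t         *head;
    pthread_mutex_t lock;       /* stands in for both the atomic LIFO and fl_lock */
    size_t          allocated;  /* items created so far                           */
    size_t          max;        /* stands in for free_list_max_num                */
} free_list_t;

static item_t *fl_pop(free_list_t *fl)
{
    pthread_mutex_lock(&fl->lock);
    item_t *it = fl->head;
    if (it != NULL)
        fl->head = it->next;
    pthread_mutex_unlock(&fl->lock);
    return it;
}

/* Returns 0 if at least one new item was added, -1 otherwise. */
static int fl_grow(free_list_t *fl, size_t n)
{
    int grew = -1;
    pthread_mutex_lock(&fl->lock);
    for (size_t i = 0; i < n && fl->allocated < fl->max; i++) {
        item_t *it = malloc(sizeof(*it));
        if (it == NULL)
            break;
        it->next = fl->head;
        fl->head  = it;
        fl->allocated++;
        grew = 0;
    }
    pthread_mutex_unlock(&fl->lock);
    return grew;
}

/* GET that only gives up once the list truly cannot grow any further,
 * instead of when other threads merely won the race for the freshly
 * grown items (the situation described in the thread above). */
static item_t *fl_get_retry(free_list_t *fl)
{
    item_t *it = fl_pop(fl);
    while (it == NULL && fl_grow(fl, 16) == 0)
        it = fl_pop(fl);
    return it;
}

int main(void)
{
    free_list_t fl = { NULL, PTHREAD_MUTEX_INITIALIZER, 0, 64 };
    item_t *it = fl_get_retry(&fl);
    printf("got item: %p\n", (void *)it);
    return 0;
}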
Re: [OMPI devel] --enable-sparse-groups build broken
No, it was not. Will fix.

-Nathan

On Wed, Sep 16, 2015 at 07:26:58PM -0700, Ralph Castain wrote:
>    Yes - Nathan made some changes related to the add_procs code. I doubt that
>    configure option was checked...
>
>    On Wed, Sep 16, 2015 at 7:13 PM, Jeff Squyres (jsquyres) wrote:
>
>      Did something change in the group structure in the last 24-48 hours?
>
>      --enable-sparse-groups builds are currently broken:
>
>      make[2]: Entering directory `/home/jsquyres/git/ompi/ompi/debuggers'
>        CC       libdebuggers_la-ompi_debuggers.lo
>      In file included from ../../ompi/communicator/communicator.h:38:0,
>                       from ../../ompi/mca/pml/base/pml_base_request.h:32,
>                       from ompi_debuggers.c:67:
>      ../../ompi/group/group.h: In function `ompi_group_get_proc_ptr':
>      ../../ompi/group/group.h:366:52: error: `peer_id' undeclared (first use
>      in this function)
>           return ompi_group_dense_lookup (group, peer_id, allocate);
>                                                  ^
>      ../../ompi/group/group.h:366:52: note: each undeclared identifier is
>      reported only once for each function it appears in
>
>      Can someone have a look?
>
>      Thanks.
>
>      --
>      Jeff Squyres
>      jsquy...@cisco.com
>      For corporate legal information go to:
>      http://www.cisco.com/web/about/doing_business/legal/cri/
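For context, the error above has the classic shape of a branch that is only compiled under one configure option still using an identifier that was renamed elsewhere. A schematic of that failure mode (not the actual ompi/group/group.h code; every name below is invented for illustration):

/* Only the sparse-groups configuration fails to build, because the branch
 * guarded by the (invented) ENABLE_SPARSE_GROUPS macro still references the
 * old parameter name after a refactor renamed it. */
#include <stdio.h>

struct proc  { int rank; };
struct group { struct proc procs[4]; };

static struct proc *dense_lookup(struct group *g, int rank, int allocate)
{
    (void)allocate;
    return &g->procs[rank];
}

static inline struct proc *get_proc_ptr(struct group *g, int rank, int allocate)
{
#if defined(ENABLE_SPARSE_GROUPS)
    /* the parameter used to be called "peer_id"; this branch was not updated,
     * so it triggers: error: 'peer_id' undeclared */
    return dense_lookup(g, peer_id, allocate);
#else
    return dense_lookup(g, rank, allocate);   /* default path still compiles */
#endif
}

int main(void)
{
    struct group grp = { { {0}, {1}, {2}, {3} } };
    printf("rank of proc 2: %d\n", get_proc_ptr(&grp, 2, 0)->rank);
    return 0;
}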
Re: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()
On Sep 16, 2015, at 12:02 PM, George Bosilca wrote:
>
> ./opal/mca/btl/usnic/btl_usnic_compat.h:161:    OMPI_FREE_LIST_GET_MT(list, (item))

FWIW: this one exists because we use the same usnic BTL code between master and v1.8/v1.10. We have some configury that figures out in which tree the usNIC BTL is being compiled, and reacts accordingly. Hence, this OMPI_FREE_LIST_GET_MT is only used when compiling in v1.8/v1.10, and is ignored in master/v2.x.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
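For readers curious what such a shim can look like, here is a hypothetical sketch (the real btl_usnic_compat.h is not reproduced here; the guard macro name and the master/v2.x call shown are assumptions):

/* Hypothetical compat shim: map one BTL-local macro onto whichever free-list
 * API the tree being compiled against provides, selected at configure time.
 * BTL_IN_OMPI_V1X is an invented macro standing in for the BTL's configury. */
#if BTL_IN_OMPI_V1X
  /* v1.8/v1.10: the old ompi_free_list_t API */
  #define USNIC_COMPAT_FREE_LIST_GET(list, item) \
      OMPI_FREE_LIST_GET_MT(list, (item))
#else
  /* master/v2.x: the reworked opal_free_list_t API (assumed call) */
  #define USNIC_COMPAT_FREE_LIST_GET(list, item) \
      ((item) = opal_free_list_get_mt(list))
#endif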
[OMPI devel] regression running mpi applications with dvm
Hi (Ralph),

Over the last months I have been focussing on exec throughput, and not so much on the application payload (read: mainly using /bin/sleep ;-)
As things are stabilising now, I returned my attention to "real" applications, only to discover that launching MPI applications (built with the same Open MPI version) within a DVM doesn't work anymore (see error below).

I've been doing a binary search, but that turned out to be not so trivial because of other problems in the window of the change. So far I've narrowed it down to:

64ec498 - Mar 5  - works on my laptop (but not on the Crays)
b67b361 - Mar 28 - works once per DVM launch on my laptop, but consecutive orte-submits get a connect error
b209c9e - Mar 30 - same MPI_Init issue as HEAD

Going further into mid-March I run into build issues with verbs, runtime issues with default binding complaining about missing libnumactl, runtime tcp oob errors, etc. So I don't know whether the binary search will yield much more than I was able to dig up now.

What can I do to get closer to debugging the actual issue?

Thanks!

Mark


OMPI_PREFIX=/Users/mark/proj/openmpi/installed/HEAD
OMPI_MCA_orte_hnp_uri=723386368.0;usock;tcp://192.168.0.103:56672
OMPI_MCA_ess=tool
[netbook:70703] Job [11038,3] has launched
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "(null)" (-43) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[netbook:70704] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
Re: [OMPI devel] regression running mpi applications with dvm
Ouch - this is on current master HEAD?

I'm on travel right now, but I'll be back Fri evening and can look at it this weekend. Probably something silly that needs to be fixed.
Re: [OMPI devel] regression running mpi applications with dvm
> On 17 Sep 2015, at 20:34 , Ralph Castain wrote:
>
> Ouch - this is on current master HEAD?

Yep!

> I'm on travel right now, but I'll be back Fri evening and can look at it this
> weekend. Probably something silly that needs to be fixed.

Thanks! Obviously I didn't check every single version between March and now, but it's safe to assume that it didn't work in between either, I guess.
Re: [OMPI devel] regression running mpi applications with dvm
Might not - there has been a very large amount of change over the last few months, and I confess I haven't been checking the DVM regularly. So let me take a step back and look at that code. I'll also include the extensions you requested on the other email - I didn't forget them, just somewhat overwhelmed lately.
Re: [OMPI devel] regression running mpi applications with dvm
> On 17 Sep 2015, at 20:48 , Ralph Castain wrote:
> Might not - there has been a very large amount of change over the last few
> months, and I confess I haven't been checking the DVM regularly. So let me
> take a step back and look at that code.

Ok.

> I'll also include the extensions you requested on the other email - I didn't
> forget them, just somewhat overwhelmed lately

Don't worry too much about these, at least not on the short term, I actually worked around those ... still have to reply to that mail though, let me do that straight away!
[OMPI devel] papers/reports about Open MPI collective algorithms
Hi,

Are there any technical reports or papers summarizing the collective algorithms used in Open MPI, such as MPI_Barrier, MPI_Bcast, and MPI_Alltoall?

Dahai
Re: [OMPI devel] Interaction between orterun and user program
Ralph is the guy who needs to answer this for you -- he's on travel at the moment; his response may be a little delayed...

> On Sep 16, 2015, at 4:17 AM, Kay Khandan (Hamed) wrote:
>
> Hello everyone,
>
> My name is Kay. I'm a huge "oom-pi" fan, but only recently have I been looking at it from a devel perspective.
>
> I would appreciate it if somebody could show me the entry point into understanding how orterun and the user program interact, and more importantly how to change the way they interact.
>
> The reason: I am making a plugin for MPI support in another message-passing system. This plugin is loaded from a dynamic library sometime after the process is started and is run on a separate thread. Therefore, (1) it does not receive any command-line arguments, and (2) it is not allowed to use the standard pipes (file descriptors 0, 1, 2). With that in mind, I'd like to interface this plugin, from inside the so-called ARE (which is the name of the runtime environment for this particular message-passing system), to our old friend ORTE. I have the option to run "are" as a user program launched by orterun:
>
> $ orterun are ./actual-user-program
>
> It might be wishful thinking, but I am also kind of hoping that I could get orterun out of the way altogether by embedding a part of its implementation directly inside that plugin.
>
> I'd appreciate hearing your insights.
>
> Best,
> -- Kay

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] The issue with OMPI_FREE_LIST_GET_MT()
Alexey,

There is a conceptual difference between GET and WAIT: one can return NULL while the other cannot. If you want a solution with do {} while, I think the best place is specifically in the PML OB1 recv functions (around the OMPI_FREE_LIST_GET_MT) and not inside the OMPI_FREE_LIST_GET_MT macro itself.

George.
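A minimal sketch of that caller-side do {} while (an assumption of what such a change might look like, not the actual ob1 code; the helper name and the way the free list is passed in are invented, and the include paths follow the v1.8-era tree discussed in this thread):

#include "ompi/class/ompi_free_list.h"   /* OMPI_FREE_LIST_GET_MT (v1.8-era API) */
#include "opal/runtime/opal_progress.h"  /* opal_progress()                      */

/* Sketch only: retry the non-blocking GET at the call site.  Each GET attempt
 * already tries to grow the list, so when another thread merely won the race
 * for the freshly grown items the loop exits on the next attempt; the
 * opal_progress() call is only reached once the list really cannot grow any
 * further (free_list_max_num hit), which addresses the concern about
 * suspending MPI_Irecv() in opal_progress() unnecessarily. */
static inline ompi_free_list_item_t *
recv_request_alloc_retry(ompi_free_list_t *fl)  /* fl: the recv-request free list */
{
    ompi_free_list_item_t *item;

    do {
        OMPI_FREE_LIST_GET_MT(fl, item);
        if (OPAL_UNLIKELY(NULL == item)) {
            opal_progress();  /* let in-flight requests complete and return items */
        }
    } while (NULL == item);

    return item;
}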
Re: [OMPI devel] orte-dvm and orte_max_vm_size
Hi Ralph,

Sorry for the late reply, something along the lines of "swamped" ;-)

> On 03 Sep 2015, at 16:04 , Ralph Castain wrote:
> The purpose of orte_max_vm_size is to subdivide the allocation - i.e., for a
> given mpirun execution, you can specify to only use a certain number of the
> allocated nodes. If you want to further limit the VM to specific nodes in the
> allocation, then you would use the -host option.

*nods* Thanks, that's also how I interpreted it.

> It's a little more complicated for your use-case as orte-dvm defines the VM,
> not orte-submit. The latter simply tells orte-dvm to launch an application -
> the daemons have already been established by orte-dvm and cannot change. So
> if you want to setup orte-dvm and then submit to only some of the nodes, you
> would have to use the -host option. Note that -host supports an extended
> syntax for this purpose - you can ask for a specific number of "empty" nodes,
> you can tell it to use only so many slots on a node, etc.

Ack. My question originated from running the dvm on a limited set.

> I'm confused by your examples because the max_vm_size values don't seem
> right. If you have a VM of size 1 or 2, then max_vm_size can only be 1 or 2.
> You can't have a max_vm_size larger than the number of available nodes. This
> is probably the source of the problem you are seeing - I can add some
> protection to ensure this doesn't happen.

I screwed up my write-up; the actual calls were correct, but I understand your confusion :-)
(In my code I have a "reservation size", which I mixed up with the VM size in my original mail.)

> We don't appear to support either -host or -np as MCA params.
> I'm not sure -np would make sense,

I probably agree with that.

> but we could add a param for -host.

Yeah, that would help.

> We do have a param for the default hostfile, but that probably wouldn't help
> here.

I was expecting such a thing actually, which also raised my MCA question.

> We can certainly extend the orte-dvm and orte-submit cmd lines. I only
> brought over a minimal set at first in order to get things running quickly,
> but no problem with increasing capability. Just a question of finding a
> little time.

Fully understandable!

> For ompi_info, try doing "ompi_info -l 9" to get the full output of params.

Right, I tried that. So either I don't understand it completely or it doesn't work as expected, as I don't manage to get e.g. "orte_max_vm_size" in the output from that.
(I also believe that -all sets the level to 9 already.)

Thanks!

Mark

>
>
>> On Sep 3, 2015, at 5:08 AM, Mark Santcroos wrote:
>>
>> Hi,
>>
>> I've been running into some funny issue with using orte-dvm (Hi Ralph ;-)
>> and trying to define the size of the created vm, and for that I use "--mca
>> orte_max_vm_size" which in general seems to work.
>>
>> In this example I have a PBS job of 4 nodes and want to run the DVM on < 4
>> nodes.
>> If I create the VM with size 3 or 4 (max_vm_size 1 and 0 respectively)
>> everything works as expected.
>> However, when I create a VM of size 1 or 2 (max_vm_size 3 and 2
>> respectively) I get the stack trace below once I use orte-submit to start
>> something within the VM.
>>
>> [nid01280:02498] [[39239,0],0] orted:comm:process_commands() Processing
>> Command: ORTE_DAEMON_SPAWN_JOB_CMD
>> orte-dvm: ../../../../../src/ompi/opal/class/opal_list.h:547:
>> _opal_list_append: Assertion `0 == item->opal_list_item_refcount' failed.
>> [nid01280:02498] *** Process received signal ***
>> [nid01280:02498] Signal: Aborted (6)
>> [nid01280:02498] Signal code:  (-6)
>> [nid01280:02498] [ 0] /lib64/libpthread.so.0(+0xf810)[0x2ba3e274a810]
>> [nid01280:02498] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x2ba3e298b885]
>> [nid01280:02498] [ 2] /lib64/libc.so.6(abort+0x181)[0x2ba3e298ce61]
>> [nid01280:02498] [ 3] /lib64/libc.so.6(__assert_fail+0xf0)[0x2ba3e2984740]
>> [nid01280:02498] [ 4] /global/homes/m/marksant/openmpi/edison/installed/HEAD/lib/libopen-rte.so.0(+0x83f16)[0x2ba3e1687f16]
>> [nid01280:02498] [ 5] /global/homes/m/marksant/openmpi/edison/installed/HEAD/lib/libopen-rte.so.0(orte_plm_base_setup_virtual_machine+0x473)[0x2ba3e16907fe]
>> [nid01280:02498] [ 6] /global/homes/m/marksant/openmpi/edison/installed/HEAD/lib/openmpi/mca_plm_alps.so(+0x274d)[0x2ba3e666574d]
>> [nid01280:02498] [ 7] /global/homes/m/marksant/openmpi/edison/installed/HEAD/lib/libopen-pal.so.0(opal_libevent2022_event_base_loop+0xd81)[0x2ba3e198cee1]
>> [nid01280:02498] [ 8] /global/homes/m/marksant/openmpi/edison/installed/HEAD/bin/orte-dvm[0x402e20]
>> [nid01280:02498] [ 9] /lib64/libc.so.6(__libc_start_main+0xe6)[0x2ba3e2977c36]
>> [nid01280:02498] [10] /global/homes/m/marksant/openmpi/edison/installed/HEAD/bin/orte-dvm[0x401d19]
>> [nid01280:02498] *** End of error message ***
>> [nid05888:25419] [[39239,0],1]:../../../../../../src/ompi/orte/mca/errmgr/default_orted/errmgr_default_orted.c(251) updating exit status to 1
>