All,

Let me take advantage of this thread to clarify what is still missing before we 
have a perfectly MPI-agnostic BTL interface. Some of these issues are pretty 
straightforward (getting rid of RTE and OMPI vestiges); others will require some 
thinking from their developers in order to cope with a non-conformant design 
(such as using MPI_COMM_WORLD in the BTL). So, here is an exhaustive list:

- Open IB uses quite a few ORTE internals (e.g., orte_proc_is_bound).
- Open IB also makes use of some functions/defines that I can't find anywhere 
in the code base (ompi_progress_threads).

- UGNI uses MPI_COMM_WORLD for internal management; see the sketch after this 
list for the kind of replacement I have in mind.
- USNIC uses num_procs for internal management. It also directly calls 
ompi_rte_abort.
- common OFACM uses num_procs for hash table allocation.
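
To make the MPI_COMM_WORLD / num_procs items concrete, the kind of change I 
have in mind looks roughly like the sketch below. The OPAL-level field name is 
illustrative only; use whatever opal/util/proc.h actually exposes.

    #include "opal/util/proc.h"

    /* Not acceptable once the BTL lives in OPAL: reaching up into the MPI
     * layer, e.g.
     *     size_t nprocs = (size_t) ompi_comm_size(MPI_COMM_WORLD);
     * MPI-agnostic replacement: ask the process info exposed at the OPAL
     * level instead (num_procs is an illustrative field name, not
     * necessarily the final one). */
    static inline size_t btl_job_size(void)
    {
        return (size_t) opal_process_info.num_procs;
    }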

Two items are of general interest, as they affect compatibility with past 
installations/usages:
- MPOOL alloc uses MPI-level info keys … 
- most of the BTL MCA parameters have not been renamed (!!!). Personally, I 
would be in favor of creating synonyms for now and then deprecating the OMPI 
versions in 2.0, but I don't want to enforce this on everybody, so the 
discussion is open on this topic. A sketch of what such a synonym could look 
like follows below.
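
Regarding the renaming, the synonym approach could look roughly like the 
following (the mca_base_var calls are written from memory, so take the exact 
arguments and flags as a sketch, and eager_limit is only an example parameter):

    #include "opal/mca/base/mca_base_var.h"

    static int btl_tcp_eager_limit = 32768;

    static void btl_tcp_register_params(void)
    {
        /* Register the parameter under the new OPAL-level project name. */
        int idx = mca_base_var_register("opal", "btl", "tcp", "eager_limit",
                                        "Eager send limit for the TCP BTL",
                                        MCA_BASE_VAR_TYPE_INT, NULL, 0, 0,
                                        OPAL_INFO_LVL_4,
                                        MCA_BASE_VAR_SCOPE_READONLY,
                                        &btl_tcp_eager_limit);

        /* Keep the old OMPI-level name working as a deprecated synonym,
         * to be removed in 2.0. */
        (void) mca_base_var_register_synonym(idx, "ompi", "btl", "tcp",
                                             "eager_limit",
                                             MCA_BASE_VAR_SYN_FLAG_DEPRECATED);
    }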

Ralph and Jeff (I think you added the seq interface to TCP), please take a look 
at the following:
- the implementation of the TCP seq interface seems to be wrong: it used 
my_node_rank to compute the sequence number instead of my_local_rank (I have 
changed it to my_local_rank).

If you have any issues with the move, I'll be happy to help and/or support you 
through the last steps toward a completely generic BTL. To facilitate your 
work, I exposed a minimal set of OMPI information at the OPAL level. Take a 
look at opal/util/proc.h for more info, but please try not to expose more. A 
short sketch of how a BTL can consume that information follows.
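
As an illustration of what I consider the minimal set, a generic BTL should be 
able to get its local identity along these lines (again a sketch; double-check 
the field names against opal/util/proc.h):

    #include "opal/util/proc.h"
    #include "opal/util/output.h"

    static void btl_show_local_identity(void)
    {
        /* My own proc structure; me->proc_name identifies this process
         * to its peers. */
        opal_proc_t *me = opal_proc_local_get();

        opal_output(0, "local rank %u, %d local peers, node %s",
                    (unsigned) opal_process_info.my_local_rank,
                    (int) opal_process_info.num_local_peers,
                    opal_process_info.nodename);
        (void) me;
    }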

  Thanks,
    George.


On Jul 26, 2014, at 02:22 , Ralph Castain <r...@open-mpi.org> wrote:

> That's because you folks didn't completely clean up the open fabrics stuff 
> prior to the move - something that we warned about, but folks said they would 
> resolve later :-)
> 
> On Jul 25, 2014, at 11:19 PM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
> 
>> Making all in mca/common/ofacm
>> make[2]: Entering directory 
>> `/hpc/local/benchmarks/hpc-stack-gcc/src/install/ompi-master/opal/mca/common/ofacm'
>>   CC       libmca_common_ofacm_la-common_ofacm_base.lo
>>   CC       libmca_common_ofacm_la-common_ofacm_oob.lo
>>   CC       libmca_common_ofacm_la-common_ofacm_empty.lo
>>   LN_S     libmca_common_ofacm.la
>> common_ofacm_oob.c: In function 'oob_component_query':
>> common_ofacm_oob.c:178: warning: passing argument 4 of 
>> 'orte_rml.recv_buffer_nb' from incompatible pointer type
>> common_ofacm_oob.c:178: note: expected 'orte_rml_buffer_callback_fn_t' but 
>> argument is of type 'void (*)(int,  opal_process_name_t *, struct 
>> opal_buffer_t *, ompi_rml_tag_t,  void *)'
>> common_ofacm_xoob.c: In function 'xoob_context_init':
>> common_ofacm_xoob.c:354: error: request for member 'jobid' in something not 
>> a structure or union
>> common_ofacm_xoob.c: In function 'xoob_endpoint_fina
>> common_ofacm_oob.c:728: warning: passing argument 4 of 
>> 'orte_rml.send_buffer_nb' from incompatible pointer type
>> common_ofacm_oob.c:728: note: expected 'orte_rml_buffer_callback_fn_t' but 
>> argument is of type 'void (*)(int,  opal_process_name_t *, struct 
>> opal_buffer_t *, ompi_rml_tag_t,  void *)'
>> common_ofacm_xoob.c: In function 'xoob_send_connect_data':
>> common_ofacm_xoob.c:791: warning: passing argument 1 of 
>> 'orte_rml.send_buffer_nb' from incompatible pointer type
>> common_ofacm_xoob.c:791: note: expected 'struct orte_process_name_t *' but 
>> argument is of type 'opal_process_name_t *'
>> common_ofacm_xoob.c:791: warning: passing argument 4 of 
>> 'orte_rml.send_buffer_nb' from incompatible pointer type
>> common_ofacm_xoob.c:791: note: expected 'orte_rml_buffer_callback_fn_t' but 
>> argument is of type 'void (*)(int,  opal_process_name_t *, struct 
>> opal_buffer_t *, ompi_rml_tag_t,  void *)'
>> common_ofacm_xoob.c: In function 'xoob_recv_qp_create':
>> common_ofacm_xoob.c:963: warning: 'ibv_create_xrc_rcv_qp' is deprecated 
>> (declared at /usr/include/infiniband/ofa_verbs.h:126)
>> common_ofacm_xoob.c:983: warning: 'ibv_modify_xrc_rcv_qp' is deprecated 
>> (declared at /usr/include/infiniband/ofa_verbs.h:152)
>> common_ofacm_xoob.c:1011: warning: 'ibv_modify_xrc_rcv_qp' is deprecated 
>> (declared at /usr/include/infiniband/ofa_verbs.h:152)
>> common_ofacm_xoob.c: In function 'xoob_recv_qp_connect':
>> common_ofacm_xoob.c:1032: warning: 'ibv_reg_xrc_rcv_qp' is deprecated 
>> (declared at /usr/include/infiniband/ofa_verbs.h:185)
>> common_ofacm_xoob.c: In function 'xoob_component_query':
>> common_ofacm_xoob.c:1407: warning: passing argument 4 of 
>> 'orte_rml.recv_buffer_nb' from incompatible pointer type
>> common_ofacm_xoob.c:1407: note: expected 'orte_rml_buffer_callback_fn_t' but 
>> argument is of type 'void (*)(int,  opal_process_name_t *, struct 
>> opal_buffer_t *, ompi_rml_tag_t,  void *)'
>> make[2]: *** [libmca_common_ofacm_la-common_ofacm_xoob.lo] Error 1
>> make[2]: *** Waiting for unfinished jobs....
>> make[2]: Leaving directory 
>> `/hpc/local/benchmarks/hpc-stack-gcc/src/install/ompi-master/opal/mca/common/ofacm'
>> 
