Gilles,

I've made another observation about what I believe is an error in the XRC
configure probe.

If I am following the code below correctly, then *both* ConnectX and
ConnectIB depend on ibv_create_xrc_rcv_qp being defined.
However, that function is marked as deprecated (presumably in favor of
ibv_cmd_open_xrcd).
So, when a later revision of libibverbs removes the deprecated function,
*neither* the old nor the new interface will be detected as supported!

           # ibv_create_xrc_rcv_qp was added in OFED 1.3
           # ibv_cmd_open_xrcd (aka XRC Domains) was added in  OFED 3.12
           if test "$enable_connectx_xrc" = "yes"; then
               $1_have_xrc=1
               AC_CHECK_FUNCS([ibv_create_xrc_rcv_qp],
                              [], [$1_have_xrc=0])
               AC_CHECK_DECLS([IBV_SRQT_XRC],
                              [], [$1_have_xrc=0],
                              [#include <infiniband/verbs.h>])
           fi
           if test "$enable_connectx_xrc" = "yes" \
               && test $$1_have_xrc -eq 1; then
               AC_CHECK_FUNCS([ibv_cmd_open_xrcd], [$1_have_xrc_domains=1])
           fi
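
To make the failure mode concrete, here is a minimal shell sketch of the
decision logic above. It is only a model, not the configure code: the
function name old_probe and its three 0/1 arguments (standing in for the
results of the three probes) are made up for illustration, and I am assuming
that have_xrc_domains starts out as 0.

```shell
# Model of the current probe logic (hypothetical helper, not part of configure).
# Arguments are 0/1 flags standing in for the probe results:
#   $1 = ibv_create_xrc_rcv_qp found
#   $2 = IBV_SRQT_XRC declared
#   $3 = ibv_cmd_open_xrcd found
old_probe() {
    create_qp=$1
    srqt_decl=$2
    open_xrcd=$3
    have_xrc=1
    have_xrc_domains=0   # assumed default; the initialization is not shown above
    # Either failed check clears have_xrc
    test "$create_qp" -eq 1 || have_xrc=0
    test "$srqt_decl" -eq 1 || have_xrc=0
    # The XRC Domains check only runs if both checks above passed
    if test "$have_xrc" -eq 1 && test "$open_xrcd" -eq 1; then
        have_xrc_domains=1
    fi
    echo "have_xrc=$have_xrc have_xrc_domains=$have_xrc_domains"
}

# A future libibverbs that has dropped the deprecated function but keeps
# the newer interface:
old_probe 0 1 1
```

The last line prints "have_xrc=0 have_xrc_domains=0", i.e. neither interface
is reported as supported.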

While I am not certain if a probe for IBV_SRQT_XRC is really necessary, my
suggested replacement for the logic above is:

           # ibv_create_xrc_rcv_qp was added in OFED 1.3
           # ibv_cmd_open_xrcd (aka XRC Domains) was added in OFED 3.12
           if test "$enable_connectx_xrc" = "yes"; then
               # Start from 0 so that finding neither interface disables XRC
               $1_have_xrc=0
               AC_CHECK_FUNCS([ibv_cmd_open_xrcd],
                              [AC_CHECK_DECLS([IBV_SRQT_XRC],
                                              [$1_have_xrc_domains=1
                                               $1_have_xrc=1],
                                              [],
                                              [#include <infiniband/verbs.h>])])
               AC_CHECK_FUNCS([ibv_create_xrc_rcv_qp],
                              [$1_have_xrc=1])
           fi

In summary
  $1_have_xrc_domains = HAVE_IBV_CMD_OPEN_XRCD && HAVE_IBV_SRQT_XRC
  $1_have_xrc = $1_have_xrc_domains || HAVE_IBV_CREATE_XRC_RCV_QP
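
As a sanity check on the two formulas, here is a small shell sketch that
evaluates them for a given set of probe results. The function name
summary_flags and its 0/1 arguments are hypothetical stand-ins for the
HAVE_* results; this is not code from configure.

```shell
# Evaluate the summary formulas (hypothetical helper, not part of configure).
# Arguments are 0/1 flags standing in for the probe results:
#   $1 = HAVE_IBV_CMD_OPEN_XRCD
#   $2 = HAVE_IBV_SRQT_XRC
#   $3 = HAVE_IBV_CREATE_XRC_RCV_QP
summary_flags() {
    open_xrcd=$1
    srqt_decl=$2
    create_qp=$3
    have_xrc=0
    have_xrc_domains=0
    # have_xrc_domains = HAVE_IBV_CMD_OPEN_XRCD && HAVE_IBV_SRQT_XRC
    if test "$open_xrcd" -eq 1 && test "$srqt_decl" -eq 1; then
        have_xrc_domains=1
    fi
    # have_xrc = have_xrc_domains || HAVE_IBV_CREATE_XRC_RCV_QP
    if test "$have_xrc_domains" -eq 1 || test "$create_qp" -eq 1; then
        have_xrc=1
    fi
    echo "have_xrc=$have_xrc have_xrc_domains=$have_xrc_domains"
}

summary_flags 0 0 1   # old stack: only ibv_create_xrc_rcv_qp is present
summary_flags 1 1 0   # future stack: deprecated function removed
summary_flags 0 0 0   # no XRC interface at all
```

The three calls print "have_xrc=1 have_xrc_domains=0",
"have_xrc=1 have_xrc_domains=1" and "have_xrc=0 have_xrc_domains=0"
respectively.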

This worked as expected in my testing on old (ConnectX XRC support only) and
new (both ConnectX and ConnectIB XRC support) systems.

-Paul

On Thu, Jul 9, 2015 at 7:06 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Gilles,
>
> The patch didn't apply to the 1.8.7rc1 tarball.
> So, I made the change manually and ran autogen.pl.
>
> The result is that one fewer configure test runs, but "ConnectX XRC
> support" is still disabled:
>
> Diffing the configure output:
>  checking for ibv_resize_cq... yes
>  checking for struct ibv_device.transport_type... yes
>  checking for ibv_create_xrc_rcv_qp... yes
> -checking for ibv_cmd_open_xrcd... no
>  checking whether IBV_SRQT_XRC is declared... no
>  checking infiniband/complib/cl_types_osd.h usability... no
>  checking infiniband/complib/cl_types_osd.h presence... no
>
>
> You will note that "IBV_SRQT_XRC" did not appear when I grepped for XRC in
> /usr/include/infiniband/verbs.h (in a previous message).
> I am not sure, but suspect that identifier is related to "ConnectIB XRC
> support" (not ConnectX).
> If you look back at the 1.8.4 release you will find only a check for
> ibv_create_xrc_rcv_qp.
>
> -Paul
>
> On Thu, Jul 9, 2015 at 6:17 PM, Gilles Gouaillardet <gil...@rist.or.jp>
> wrote:
>
>>  Thanks Paul,
>>
>> I just found another bug ...
>> (and I should be blamed for it)
>>
>> A patch is attached.
>>
>> Basically, XRC was incorrectly disabled on "older" OFED stacks.
>>
>> Cheers,
>>
>> Gilles
>>
>>
>>
>> On 7/10/2015 10:06 AM, Paul Hargrove wrote:
>>
>> Gilles,
>>
>>  A bzip2-compressed config.log is attached.
>>
>>  I am unsure how to determine the OFED version, because the admins have
>> prevented normal users from reading the RPM database.
>> Perhaps the following helps:
>>
>>  $ nm /usr/lib64/libibverbs.a | grep -i xrc
>> 00000000000000e0 T ibv_cmd_close_xrc_domain
>> 0000000000000230 T ibv_cmd_create_xrc_rcv_qp
>> 00000000000003b0 T ibv_cmd_create_xrc_srq
>> 0000000000000a40 T ibv_cmd_modify_xrc_rcv_qp
>> 0000000000000150 T ibv_cmd_open_xrc_domain
>> 0000000000001e30 T ibv_cmd_query_xrc_rcv_qp
>> 0000000000000070 T ibv_cmd_reg_xrc_rcv_qp
>> 0000000000000000 T ibv_cmd_unreg_xrc_rcv_qp
>> 00000000000002b0 T ibv_close_xrc_domain
>> 00000000000002d0 T ibv_create_xrc_rcv_qp
>> 00000000000007a0 T ibv_create_xrc_srq
>> 0000000000000310 T ibv_modify_xrc_rcv_qp
>> 0000000000000280 T ibv_open_xrc_domain
>> 0000000000000340 T ibv_query_xrc_rcv_qp
>> 0000000000000370 T ibv_reg_xrc_rcv_qp
>> 0000000000000390 T ibv_unreg_xrc_rcv_qp
>>
>>  $ grep XRC /usr/include/infiniband/verbs.h
>>         IBV_DEVICE_XRC                  = 1 << 20
>>         IBV_XRC_QP_EVENT_FLAG = 0x80000000,
>>         IBV_QPT_XRC,
>> [matches in comments have been removed].
>>
>>  When tonight's master tarball is posted (perhaps 10 minutes from now) I
>> will test it and report what I find.
>>
>>  -Paul
>>
>>
>> On Thu, Jul 9, 2015 at 5:17 PM, Gilles Gouaillardet <gil...@rist.or.jp>
>> wrote:
>>
>>>  Paul,
>>>
>>> Can you please compress and post your config.log?
>>> What OFED version are you running?
>>>
>>> On master, that fix did the trick on the Mellanox test cluster (recent
>>> OFED version) but did not enable XRC on the LANL test clusters (my best
>>> guess is an old OFED library).
>>>
>>> Thanks
>>>
>>> Gilles
>>>
>>>
>>> On 7/10/2015 9:08 AM, Paul Hargrove wrote:
>>>
>>>  Preliminary report:
>>>
>>> 1) I find that "ConnectX XRC support" is still not detected as it was in
>>> 1.8.4 and earlier:
>>>
>>>  $ grep 'ConnectX XRC support' openmpi-1.*-icc-14/LOG/configure.log | sort -u
>>>   openmpi-1.8-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... yes
>>>   openmpi-1.8.1-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... yes
>>>   openmpi-1.8.2-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... yes
>>>   openmpi-1.8.3-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... yes
>>>   openmpi-1.8.4-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... yes
>>>   openmpi-1.8.5-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... no
>>>   openmpi-1.8.6-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... no
>>>   openmpi-1.8.7rc1-linux-x86_64-icc-14/LOG/configure.log:checking if ConnectX XRC support is enabled... no
>>>
>>>
>>>
>>>  2) I noticed a cosmetic "glitch" in the configure output:
>>>
>>>  checking for working epoll library interface... checking if epoll can build... yes
>>>
>>>   yes
>>>
>>>  This just means AC_MSG_{CHECKING,RESULT} macros are nested when they
>>> shouldn't be.
>>>  There is nothing to suggest that the results of the configure probes
>>> are incorrect.
>>>
>>>
>>>  -Paul
>>>
>>> On Thu, Jul 9, 2015 at 1:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> In the usual place:
>>>>
>>>>  http://www.open-mpi.org/software/ompi/v1.8/
>>>>
>>>>  Please test and let me know of any issues that surface. My intent is
>>>> to release this next week.
>>>>  Ralph
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2015/07/17604.php
>>>>
>>>
>>>
>>>
>>>  --
>>>   Paul H. Hargrove                          phhargr...@lbl.gov
>>> Computer Languages & Systems Software (CLaSS) Group
>>> Computer Science Department               Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
