[openib-general] [PATCH] uDAPL - include dapltest and dtest in build
This uDAPL patch adds both dapltest and dtest utilities, including manual pages, to the DAPL project build. The dapltest required some modifications to build on x86_64. James, please review. Signed-off by: Arlin Davis [EMAIL PROTECTED] diff --git a/Makefile.am b/Makefile.am index 1190f20..e2bf4dc 100644 --- a/Makefile.am +++ b/Makefile.am @@ -179,7 +179,9 @@ libdatinclude_HEADERS = dat/include/dat/dat.h \ dat/include/dat/udat.h \ dat/include/dat/udat_redirection.h \ dat/include/dat/udat_vendor_specific.h - + +man_MANS = man/dtest.1 man/dapltest.1 + EXTRA_DIST = dat/common/dat_dictionary.h \ dat/common/dat_dr.h \ dat/common/dat_init.h \ @@ -228,8 +230,10 @@ EXTRA_DIST = dat/common/dat_dictionary.h \ dat/udat/libdat.map \ doc/dat.conf \ dapl/udapl/libdaplcma.map \ -dapl/udapl/libdaplscm.map \ -libdat.spec.in +libdat.spec.in \ +$(man_MANS) dist-hook: libdat.spec cp libdat.spec $(distdir) + +SUBDIRS = . test/dtest test/dapltest diff --git a/configure.in b/configure.in index bf5ec09..324bfa1 100644 --- a/configure.in +++ b/configure.in @@ -1,11 +1,11 @@ dnl Process this file with autoconf to produce a configure script. AC_PREREQ(2.57) -AC_INIT(dapl, 1.2.0, [EMAIL PROTECTED]) +AC_INIT(dapl, 1.2.1, openib-general@openib.org) AC_CONFIG_SRCDIR([dat/udat/udat.c]) AC_CONFIG_AUX_DIR(config) AM_CONFIG_HEADER(config.h) -AM_INIT_AUTOMAKE(dapl, 1.2.0) +AM_INIT_AUTOMAKE(dapl, 1.2.1) AM_PROG_LIBTOOL @@ -60,5 +60,6 @@ AC_CACHE_CHECK(whether this is an RHEL system, ac_cv_rhel, fi) AM_CONDITIONAL(OS_RHEL, test $ac_cv_rhel = yes) -AC_CONFIG_FILES([Makefile libdat.spec]) +AC_CONFIG_FILES([Makefile test/dtest/Makefile test/dapltest/Makefile libdat.spec]) + AC_OUTPUT diff --git a/man/dapltest.1 b/man/dapltest.1 new file mode 100644 index 000..8ff4493 --- /dev/null +++ b/man/dapltest.1 @@ -0,0 +1,390 @@ +. 
Text automatically generated by txt2man +.TH dapltest 1 February 23, 2007 uDAPL 1.2 USER COMMANDS + +.SH NAME +\fB +\fBdapltest \fP- test for the Direct Access Programming Library (DAPL) +\fB +.SH DESCRIPTION + +Dapltest is a set of tests developed to exercise, characterize, +and verify the DAPL interfaces during development and porting. +At least two instantiations of the test must be run. One acts +as the server, fielding requests and spawning server-side test +threads as needed. Other client invocations connect to the server +and issue test requests. The server side of the test, once invoked, +listens continuously for client connection requests, until quit or +killed. Upon receipt of a connection request, the connection is +established, the server and client sides swap version numbers to +verify that they are able to communicate, and the client sends +the test request to the server. If the version numbers match, +and the test request is well-formed, the server spawns the threads +needed to run the test before awaiting further connections. +.SH USAGE + +dapltest [ -f script_file_name ] +[ -T S|Q|T|P|L ] [ -D device_name ] [ -d ] [ -R HT|LL|EC|PM|BE ] +.PP +With no arguments, dapltest runs as a server using default values, +and loops accepting requests from clients. + +The -f option allows all arguments to be placed in a file, to ease +test automation. 
+ +The following arguments are common to all tests: +.TP +.B +[ -T S|Q|T|P|L ] +Test function to be performed: +.RS +.TP +.B +S +- server loop +.TP +.B +Q +- quit, client requests that server +wait for any outstanding tests to +complete, then clean up and exit +.TP +.B +T +- transaction test, transfers data between +client and server +.TP +.B +P +- performance test, times DTO operations +.TP +.B +L +- limit test, exhausts various resources, +runs in client w/o server interaction +Default: S +.RE +.TP +.B +[ -D device_name ] +Specifies the interface adapter name as documented in +the /etc/dat.conf static configuration file. This name +corresponds to the provider library to open. +Default: none +.TP +.B +[ -d ] +Enables extra debug verbosity, primarily tracing +of the various DAPL operations as they progress. +Repeating this parameter increases debug spew. +Errors encountered result in the test spewing some +explanatory text and stopping; this flag provides +more detail about what lead up to the error. +Default: zero +.TP +.B +[ -R BE ] +Indicate the quality of service (QoS) desired. +Choices are: +.RS +.TP +.B +HT +- high throughput +.TP +.B +LL +- low latency +.TP +.B +EC +- economy (neither HT nor LL) +.TP +.B +PM +- premium +.TP +.B +BE +- best effort +Default: BE +.RE +.RE +.PP +.B +Usage - Quit test client +.PP +.nf +.fam C +dapltest [Common_Args] [ -s server_name ] + +Quit testing (-T Q) connects to the server to ask it to clean up and +exit (after it waits for any
Re: [openib-general] Fork issues with simple MPI program
Arlin Davis wrote:

Any insight would be greatly appreciated. It was our assumption that the parent process can continue to use IB resources after the fixes went into 2.6.16 and OFED 1.1. Is this true?

As was discussed on this list on a few occasions: in contrast to popular thought, the fork support was deployed in libibverbs 1.1, whereas OFED 1.1 contains libibverbs 1.0.

OFED 1.2 alpha (libibverbs 1.1) on 2.6.20 fails the same way. Does the following disclaimer still apply?

Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. The fork() is supported as long as the parent process does not run before the child exits or calls exec(). The former can be achieved by calling wait(childpid); the latter can be achieved by application-specific means. The POSIX system() call is supported.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
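The wait(childpid) pattern named in the disclaimer can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: run_child_and_wait() is a hypothetical helper name, it is not part of OFED or libibverbs, and it makes no claim about what the verbs library itself does across fork().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `cmd` through /bin/sh in a child process and block until that
 * child exits, so the parent does not run while the child is alive.
 * Returns the child's exit status, or -1 on error.
 * run_child_and_wait() is a hypothetical helper, not an OFED API.
 */
int run_child_and_wait(const char *cmd)
{
    int status;
    pid_t pid;

    assert(cmd != NULL);

    pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: replace the process image immediately. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: wait for this specific child before doing anything else. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls exec right away and the parent blocks in waitpid(), so both conditions from the disclaimer hold at once. The test program in this thread does neither in its parent branch, which goes straight back into MPI_Barrier().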
Re: [openib-general] Fork issues with simple MPI program
> Or you could try setting the IBV_FORK_SAFE environment variable before running your application. I guess for MPI jobs you need to make sure that environment variable is propagated to every process.

Ahh! That's what I was looking for. Thanks!

This information is scattered around in various email threads, header files, and code. Can someone please add relevant text to the OFED 1.2 release notes or a Wiki page?
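For a process that cannot rely on its launcher to export the variable, a minimal sketch of setting it from inside the program follows. The helper names request_fork_safe() and fork_safe_requested() are hypothetical, and the sketch assumes the variable only has to be present in the environment before the first verbs or MPI call; libibverbs 1.1 also offers ibv_fork_init() for the same purpose, which is not shown here.

```c
#include <stdlib.h>

/*
 * Ask for fork-safe behaviour by putting IBV_FORK_SAFE into this
 * process's environment. Call it before MPI_Init() or any verbs call.
 * Returns 0 on success, -1 on failure (as setenv does).
 * Hypothetical helper, not part of any library.
 */
int request_fork_safe(void)
{
    /* overwrite = 0: keep a value the user already exported */
    return setenv("IBV_FORK_SAFE", "1", 0);
}

/* Report whether the variable is present in the environment. */
int fork_safe_requested(void)
{
    return getenv("IBV_FORK_SAFE") != NULL;
}
```

With an MPI launcher the variable still has to reach every rank, as noted above; setting it in the program before MPI_Init() is one way to avoid depending on the launcher's environment propagation.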
[openib-general] Fork issues with simple MPI program
We are seeing some fork issues with a simple MPI program (attached) running on 2.6.16+ kernels and OFED 1.1. We have tried both Intel MPI and mvapich2 with the same results:

t_fork mpiexec -n 2 t_system_fork
parent process
[0] started child process with pid=31552
send desc error
parent process
[0] Abort: [] Got completion with error 1, vendor code=69, dest rank=1 at line 540 in file ibv_channel_manager.c
[1] I am child process with pid=25437
[1] started child process with pid=25437
[0] I am child process with pid=31552
child process
[1] finished pid=25437
child process
[0] finished pid=31552
rank 0 in job 2 svlmpicl400_32925 caused collective abort of all ranks
exit status of rank 0: return code 252

If you run mvapich2 for uDAPL, it hangs before the second MPI_Barrier() just like Intel MPI. If you use the I_MPI_RDMA_USE_EVD_FALLBACK=1 option with Intel MPI you get the following error, similar to mvapich2:

parent process
parent process
[0] I am child process with pid=9596
[0] started child process with pid=9596
[1] I am child process with pid=11477
[1] started child process with pid=11477
[0][rdma_iba.c:1007] Intel MPI fatal error: DTO operation completed with error. status=0x2. cookie=0x1
[1][rdma_iba.c:1007] Intel MPI fatal error: DTO operation completed with error. status=0x2. cookie=0x1
child process
[1] finished pid=11477
child process
[0] finished pid=9596
rank 0 in job 8 cst-19_54707 caused collective abort of all ranks
exit status of rank 0: return code 255

Any insight would be greatly appreciated. It was our assumption that the parent process can continue to use IB resources after the fixes went into 2.6.16 and OFED 1.1. Is this true?
Thanks,

-arlin

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int myid, numprocs;
        pid_t pid;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Barrier(MPI_COMM_WORLD);

        system("echo parent process");

        pid = fork();
        if (pid == 0) {
                /* child: report, run one shell command, no MPI calls */
                pid = getpid();
                printf("[%d] I am child process with pid=%d\n", myid, (int)pid);
                system("echo child process");
        } else {
                /* parent: keeps using MPI after the fork */
                printf("[%d] started child process with pid=%d\n", myid, (int)pid);
                MPI_Barrier(MPI_COMM_WORLD);
                MPI_Finalize();
                pid = getpid();
        }
        printf("[%d] finished pid=%d\n", myid, (int)pid);
        return 0;
}
Re: [openib-general] dapltest?
Steve Wise wrote:
> Hey Arlin,
>
> Shouldn't dapl/test be shipped with OFED? It appears not to be...

Yes, I will try to get to this by next week at the latest. Can you add a bugzilla report to track against?

-arlin
Re: [openib-general] dapl broken for iWARP
Steve Wise wrote:
> On Wed, 2007-02-07 at 14:02 -0600, Steve Wise wrote:
>> Arlin,
>>
>> The OFED dapl code is assuming the responder_resources and initiator_depth passed up on a connection request event are from the remote peer. This doesn't happen for iWARP. In the current iWARP specifications, it's up to the application to exchange this information somehow. So these are defaulting to 0 on the server side of any dapl connection over iWARP. This is a fairly recent change, I think. We need to come up with some way to deal with this for OFED 1.2 IMO.

Yes, this was changed recently to sync up with the rdma_cm changes that exposed the values.

> The IWCM could set these to the device max values for instance.

That would work fine as long as you know the remote settings will be equal or better. The provider just sets the min of local device max values and the remote values provided with the request.

-arlin
Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today
> *Sources developed in OFA:*
> 1. Each git owner will open a branch with the name ofed_1_2. This branch should be opened on 31-Jan (based on code readiness we will review today).

ofed_1_2 branch created for dapl.git

-arlin
Re: [openib-general] librdmacm and udapl: Which git branch to use in ofed_1_2 build
> ~ardavis/dapl.git: rdma_ucm* master
>
> Can you reply which branch to use in our daily ofed 1.2 builds.

for dapl use rdma_ucm branch
Re: [openib-general] librdmacm and udapl: Which git branch to use in ofed_1_2 build
>> OK, we can switch to master. Then DAPL would need to be updated, right? Arlin?
> I think DAPL stays with rdma_ucm, but Arlin can confirm.

I created a 1.2 uDAPL branch to use with ofed_1_2 builds.
[openib-general] [PATCH] uDAPL - rdma_ucm branch: add changes to support rr/init exchange
Some uDAPL changes to support exchanging and validation of the device responder_resources and the initiator_depth during connection establishment. Signed-off by: Arlin Davis [EMAIL PROTECTED] diff --git a/dapl/openib_cma/dapl_ib_cm.c b/dapl/openib_cma/dapl_ib_cm.c old mode 100644 new mode 100755 index 0f24244..8bdd0eb --- a/dapl/openib_cma/dapl_ib_cm.c +++ b/dapl/openib_cma/dapl_ib_cm.c @@ -259,6 +259,18 @@ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, new_conn-sp = conn-sp; new_conn-hca = conn-hca; + /* Get requesters connect data, setup for accept */ + new_conn-params.responder_resources = + DAPL_MIN(event-param.conn.initiator_depth, +conn-hca-ib_trans.max_rdma_rd_in); + new_conn-params.initiator_depth = + DAPL_MIN(event-param.conn.responder_resources, +conn-hca-ib_trans.max_rdma_rd_out); + + new_conn-params.flow_control = event-param.conn.flow_control; + new_conn-params.rnr_retry_count = event-param.conn.rnr_retry_count; + new_conn-params.retry_count = event-param.conn.retry_count; + /* save private data */ if (event-param.conn.private_data_len) { dapl_os_memcpy(new_conn-p_data, @@ -279,7 +291,8 @@ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, event-param.conn.private_data, event-param.conn.private_data_len); dapl_dbg_log(DAPL_DBG_TYPE_CM, passive_cb: -REQ: IP SRC %x PORT %d DST %x PORT %d\n, +REQ: IP SRC %x PORT %d DST %x PORT %d +rr %d init %d\n, ntohl(((struct sockaddr_in *) ipaddr-src_addr)-sin_addr.s_addr), ntohs(((struct sockaddr_in *) @@ -287,7 +300,9 @@ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, ntohl(((struct sockaddr_in *) ipaddr-dst_addr)-sin_addr.s_addr), ntohs(((struct sockaddr_in *) - ipaddr-dst_addr)-sin_port)); + ipaddr-dst_addr)-sin_port), +new_conn-params.responder_resources, +new_conn-params.initiator_depth); } return new_conn; } @@ -556,8 +571,8 @@ DAT_RETURN dapls_ib_connect(IN DAT_EP_HANDLE ep_handle, /* Setup QP/CM parameters and private data in cm_id */ 
(void)dapl_os_memzero(conn-params, sizeof(conn-params)); - conn-params.responder_resources = IB_TARGET_MAX; - conn-params.initiator_depth = IB_INITIATOR_DEPTH; + conn-params.responder_resources = conn-hca-ib_trans.max_rdma_rd_in; + conn-params.initiator_depth = conn-hca-ib_trans.max_rdma_rd_out; conn-params.flow_control = 1; conn-params.rnr_retry_count = IB_RNR_RETRY_COUNT; conn-params.retry_count = IB_RC_RETRY_COUNT; @@ -814,7 +829,6 @@ dapls_ib_accept_connection(IN DAT_CR_HANDLE cr_handle, struct dapl_cm_id *cr_conn = cr_ptr-ib_cm_handle; int ret; DAT_RETURN dat_status; - struct rdma_conn_param conn_params; dapl_dbg_log(DAPL_DBG_TYPE_CM, accept(cr %p conn %p, id %p, p_data %p, p_sz=%d)\n, @@ -867,16 +881,10 @@ dapls_ib_accept_connection(IN DAT_CR_HANDLE cr_handle, ep_ptr-qp_handle = cr_conn; ep_ptr-cm_handle = cr_conn; cr_conn-ep = ep_ptr; + cr_conn-params.private_data = p_data; + cr_conn-params.private_data_len = p_size; - memset(conn_params, 0, sizeof(conn_params)); - conn_params.private_data = p_data; - conn_params.private_data_len = p_size; - conn_params.responder_resources = IB_TARGET_MAX; - conn_params.initiator_depth = IB_INITIATOR_DEPTH; - conn_params.flow_control = 1; - conn_params.rnr_retry_count = IB_RNR_RETRY_COUNT; - - ret = rdma_accept(cr_conn-cm_id, conn_params); + ret = rdma_accept(cr_conn-cm_id, cr_conn-params); if (ret) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, accept: ERROR %d\n, ret); dat_status = dapl_convert_errno(ret, accept); diff --git a/dapl/openib_cma/dapl_ib_util.c b/dapl/openib_cma/dapl_ib_util.c old mode 100644 new mode 100755 index 6bb35f6..0606312 --- a/dapl/openib_cma/dapl_ib_util.c +++ b/dapl/openib_cma/dapl_ib_util.c @@ -469,6 +469,9 @@ DAT_RETURN dapls_ib_query_hca(IN DAPL_HCA *hca_ptr, ia_attr-num_vendor_attr = 0; ia_attr-vendor_attr = NULL; ia_attr-max_iov_segments_per_rdma_read = dev_attr.max_sge; + /* save rd_atom for peer
Re: [openib-general] building dapl for ofed
-----Original Message-----
From: Steve Wise [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 16, 2007 8:56 AM
To: Davis, Arlin R; Hefty, Sean
Cc: openib-general
Subject: building dapl for ofed

I'm having problems building dapl for ofed 1.2. I'm using the dapl rdma_ucm branch and still getting compile problems. What librdmacm branch should I be using?

Did you use the rdma_ucm branch for both dapl and librdmacm?

Thanks,

Steve.
Re: [openib-general] [PATCH 3/3] uDAPL cma: add support for address and route retries, call disconnect when recving dreq
All 3 patches committed in OFA(r10074) and SourceForge (r1414)
[openib-general] [PATCH 1/3] uDAPL cma: add support for new client register event
New series of patches with -x -up

Added support for new ib verbs client register event. No extra processing required at the uDAPL level. Shows up if opensm bounces.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib_cma/dapl_ib_util.c
===================================================================
--- dapl/openib_cma/dapl_ib_util.c (revision 10032)
+++ dapl/openib_cma/dapl_ib_util.c (working copy)
@@ -744,9 +744,16 @@ void dapli_async_event_cb(struct _ib_hca
                                     hca->async_un_ctx);
                         break;
                 }
+                case IBV_EVENT_CLIENT_REREGISTER:
+                        /* no need to report this event this time */
+                        dapl_dbg_log (DAPL_DBG_TYPE_WARN,
+                                      "async_event: IBV_EVENT_CLIENT_REREGISTER\n");
+                        break;
+
                 default:
                         dapl_dbg_log (DAPL_DBG_TYPE_WARN,
-                                      "async_event: UNKNOWN\n");
+                                      "async_event: %d UNKNOWN\n",
+                                      event.event_type);
                         break;
                 }
[openib-general] [PATCH 2/3] uDAPL cma: fix issues with creating qp without rcv resources
Fix some issues supporting create qp without recv cq handle or recv qp resources. IB verbs assume a recv_cq handle and uDAPL dapl_ep_create assumes there is always recv_sge resources specified.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/common/dapl_ep_create.c
===================================================================
--- dapl/common/dapl_ep_create.c (revision 10032)
+++ dapl/common/dapl_ep_create.c (working copy)
@@ -166,7 +166,7 @@ dapl_ep_create (
          (recv_evd_handle != DAT_HANDLE_NULL && ep_attr->max_recv_dtos == 0) ||
          (request_evd_handle == DAT_HANDLE_NULL && ep_attr->max_request_dtos != 0) ||
          (request_evd_handle != DAT_HANDLE_NULL && ep_attr->max_request_dtos == 0) ||
-         ep_attr->max_recv_iov == 0 ||
+         (recv_evd_handle != DAT_HANDLE_NULL && ep_attr->max_recv_iov == 0) ||
          ep_attr->max_request_iov == 0 ||
          (DAT_SUCCESS != dapl_ep_check_recv_completion_flags (
              ep_attr->recv_completion_flags)) ))
Index: dapl/openib_cma/dapl_ib_qp.c
===================================================================
--- dapl/openib_cma/dapl_ib_qp.c (revision 10032)
+++ dapl/openib_cma/dapl_ib_qp.c (working copy)
@@ -143,13 +143,21 @@ DAT_RETURN dapls_ib_qp_alloc(IN DAPL_IA
         /* Setup attributes and create qp */
         dapl_os_memzero((void*)&qp_create, sizeof(qp_create));
         qp_create.cap.max_send_wr = attr->max_request_dtos;
-        qp_create.cap.max_recv_wr = attr->max_recv_dtos;
         qp_create.cap.max_send_sge = attr->max_request_iov;
-        qp_create.cap.max_recv_sge = attr->max_recv_iov;
         qp_create.cap.max_inline_data =
                 ia_ptr->hca_ptr->ib_trans.max_inline_send;
         qp_create.send_cq = req_cq;
-        qp_create.recv_cq = rcv_cq;
+
+        /* ibv assumes rcv_cq is never NULL, set to req_cq */
+        if (rcv_cq == NULL) {
+                qp_create.recv_cq = req_cq;
+                qp_create.cap.max_recv_wr = 0;
+                qp_create.cap.max_recv_sge = 0;
+        } else {
+                qp_create.recv_cq = rcv_cq;
+                qp_create.cap.max_recv_wr = attr->max_recv_dtos;
+                qp_create.cap.max_recv_sge = attr->max_recv_iov;
+        }
         qp_create.qp_type = IBV_QPT_RC;
         qp_create.qp_context = (void*)ep_ptr;
[openib-general] [PATCH 3/3] uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Fix some timeout and long disconnect delay issues discovered during scale-out testing. Added support to retry rdma_cm address and route resolution with configuration options. Provide a disconnect call when receiving the disconnect request to guarantee a disconnect reply and event on the remote side. The rdma_disconnect was not being called from dat_ep_disconnect() as a result of the state changing to DISCONNECTED in the event callback. Here are the new options (environment variables) with the default setting: DAPL_CM_ARP_TIMEOUT_MS 4000 DAPL_CM_ARP_RETRY_COUNT 15 DAPL_CM_ROUTE_TIMEOUT_MS 4000 DAPL_CM_ROUTE_RETRY_COUNT 15 Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 10032) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -58,6 +58,9 @@ #include dapl_ib_util.h #include sys/poll.h #include signal.h +#include sys/socket.h +#include netinet/in.h +#include arpa/inet.h #include rdma/rdma_cma_ib.h extern struct rdma_event_channel *g_cm_events; @@ -99,8 +102,8 @@ static void dapli_addr_resolve(struct da ipaddr-src_addr)-sin_addr.s_addr), ntohl(((struct sockaddr_in *) ipaddr-dst_addr)-sin_addr.s_addr)); - - ret = rdma_resolve_route(conn-cm_id, 2000); + + ret = rdma_resolve_route(conn-cm_id, conn-route_timeout); if (ret) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, rdma_connect failed: %s\n,strerror(errno)); @@ -120,6 +123,7 @@ static void dapli_route_resolve(struct d struct rdma_addr *ipaddr = conn-cm_id-route.addr; struct ib_addr *ibaddr = conn-cm_id-route.addr.addr.ibaddr; #endif + dapl_dbg_log(DAPL_DBG_TYPE_CM, route_resolve: cm_id %p SRC %x DST %x PORT %d\n, conn-cm_id, @@ -331,21 +335,17 @@ static void dapli_cm_active_cb(struct da case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_CONNECT_ERROR: { - ib_cm_events_t cm_event; -dapl_dbg_log( + dapl_dbg_log( DAPL_DBG_TYPE_WARN, dapli_cm_active_handler: CONN_ERR event=0x%x status=%d %s\n, event-event, event-status, (event-status == -ETIMEDOUT)?TIMEOUT: 
); - /* no device type specified so assume IB for now */ - if (event-status == -ETIMEDOUT) /* IB timeout */ - cm_event = IB_CME_TIMEOUT; - else - cm_event = IB_CME_DESTINATION_UNREACHABLE; - - dapl_evd_connection_callback(conn, cm_event, NULL, conn-ep); + /* per DAT SPEC provider always returns UNREACHABLE */ + dapl_evd_connection_callback(conn, +IB_CME_DESTINATION_UNREACHABLE, +NULL, conn-ep); break; } case RDMA_CM_EVENT_REJECTED: @@ -381,6 +381,7 @@ static void dapli_cm_active_cb(struct da break; case RDMA_CM_EVENT_DISCONNECTED: + rdma_disconnect(conn-cm_id); /* force the DREP */ /* validate EP handle */ if (!DAPL_BAD_HANDLE(conn-ep, DAPL_MAGIC_EP)) dapl_evd_connection_callback(conn, @@ -494,6 +495,7 @@ static void dapli_cm_passive_cb(struct d break; case RDMA_CM_EVENT_DISCONNECTED: + rdma_disconnect(conn-cm_id); /* force the DREP */ /* validate SP handle context */ if (!DAPL_BAD_HANDLE(conn-sp, DAPL_MAGIC_PSP) || !DAPL_BAD_HANDLE(conn-sp, DAPL_MAGIC_RSP)) @@ -543,7 +545,8 @@ DAT_RETURN dapls_ib_connect(IN DAT_EP_HA IN void *p_data) { struct dapl_ep *ep_ptr = ep_handle; - + struct dapl_cm_id *conn; + /* Sanity check */ if (NULL == ep_ptr) return DAT_SUCCESS; @@ -552,36 +555,38 @@ DAT_RETURN dapls_ib_connect(IN DAT_EP_HA r_qual,p_data,p_size); /* rdma conn and cm_id pre-bound; reference via qp_handle */ - ep_ptr-cm_handle = ep_ptr-qp_handle; + conn = ep_ptr-cm_handle = ep_ptr-qp_handle; /* Setup QP/CM parameters and private data in cm_id */ - (void)dapl_os_memzero(ep_ptr-cm_handle-params, - sizeof(ep_ptr-cm_handle-params)); - ep_ptr-cm_handle-params.responder_resources = IB_TARGET_MAX; - ep_ptr-cm_handle-params.initiator_depth = IB_INITIATOR_DEPTH; - ep_ptr-cm_handle-params.flow_control = 1; - ep_ptr-cm_handle-params.rnr_retry_count = IB_RNR_RETRY_COUNT
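The four options above are environment variables with integer defaults. A minimal sketch of reading such a setting with a fallback follows; env_get_int() is a hypothetical helper written for this note and is not claimed to be how uDAPL parses these variables.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Read a non-negative integer setting from the environment, falling
 * back to `def` when the variable is unset, empty, or not a clean
 * decimal number. Hypothetical helper, not the uDAPL implementation.
 */
int env_get_int(const char *name, int def)
{
    const char *s;
    char *end;
    long v;

    assert(name != NULL);

    s = getenv(name);
    if (s == NULL || *s == '\0')
        return def;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v < 0 || v > INT_MAX)
        return def;

    return (int)v;
}
```

A caller would then write, for example, env_get_int("DAPL_CM_ARP_TIMEOUT_MS", 4000) and env_get_int("DAPL_CM_ARP_RETRY_COUNT", 15), using the defaults listed in the patch description.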
Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Sean Hefty wrote:
> One option is having the SA (or ib_umad?) return a busy status in response to a MAD, but we'd still have to be able to send this response as quickly as requests are being received. We could then limit the number of requests that would be queued in the kernel for a user.

Another great option would be to have path record caching. Unfortunately OFED 1.1 did not include ib_local_sa in the release.
Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Michael S. Tsirkin wrote:
>> Another great option would be to have path record caching. Unfortunately OFED 1.1 did not include ib_local_sa in the release.
> This won't help you much. With 256 nodes all to all already gives you 65000 requests which is the same order of magnitude as the reported 13. Am I missing something here?

65,000 requests every 15 minutes (current default) for the entire cluster versus 100-13 every time I start an application is a big help. Especially on a very large cluster that is batching up smaller independent jobs sharing a single SA and fabric. We either need caching or SA capabilities that can scale up with large clusters. A single service running at 6000 requests/second will not succeed.

-arlin.
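The 65,000 figure follows from simple counting, sketched below. The assumption is that "all to all" means one query per ordered pair of distinct nodes; counting 256 x 256 instead gives 65,536, which rounds to the same number. The 6000 requests/second value is taken from the message above purely as an illustrative service rate, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Queries needed if every node resolves a path to every other node. */
uint64_t all_to_all_queries(uint64_t nodes)
{
    return nodes == 0 ? 0 : nodes * (nodes - 1);
}

/* Whole seconds needed to serve `queries` at `rate` per second, rounded up. */
uint64_t seconds_to_serve(uint64_t queries, uint64_t rate)
{
    assert(rate > 0);
    return (queries + rate - 1) / rate;
}
```

For 256 nodes this gives 256 x 255 = 65,280 queries, which at 6000 per second takes about 11 seconds of a single service's time. The count grows quadratically, so 1024 nodes already need 1,047,552 queries.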
Re: [openib-general] Verbs QP create with RQ=0?
Roland Dreier wrote:
> As was already suggested, you should be able to use the same CQ for receives and for sends. If you never post any receives on the QP, you don't have to allocate any extra space on your send CQ. And it should work to have 0 receive work queue entries. Have you tried it?

Yes, these settings work fine (recv_cq = req_cq, max_recv_wr = 0, max_recv_sge = 0).

thanks,

-arlin
[openib-general] [PATCH 2/3] uDAPL cma: fix issues with creating qp without rcv resources
Fix some issues supporting create qp without recv cq handle or recv qp resources. IB verbs assume a recv_cq handle and uDAPL dapl_ep_create assumes there is always recv_sge resources specified.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/common/dapl_ep_create.c
===================================================================
--- dapl/common/dapl_ep_create.c (revision 9916)
+++ dapl/common/dapl_ep_create.c (working copy)
@@ -166,7 +166,7 @@
          (recv_evd_handle != DAT_HANDLE_NULL && ep_attr->max_recv_dtos == 0) ||
          (request_evd_handle == DAT_HANDLE_NULL && ep_attr->max_request_dtos != 0) ||
          (request_evd_handle != DAT_HANDLE_NULL && ep_attr->max_request_dtos == 0) ||
-         ep_attr->max_recv_iov == 0 ||
+         (recv_evd_handle != DAT_HANDLE_NULL && ep_attr->max_recv_iov == 0) ||
          ep_attr->max_request_iov == 0 ||
          (DAT_SUCCESS != dapl_ep_check_recv_completion_flags (
              ep_attr->recv_completion_flags)) ))
Index: dapl/openib_cma/dapl_ib_qp.c
===================================================================
--- dapl/openib_cma/dapl_ib_qp.c (revision 10032)
+++ dapl/openib_cma/dapl_ib_qp.c (working copy)
@@ -143,13 +143,21 @@
         /* Setup attributes and create qp */
         dapl_os_memzero((void*)&qp_create, sizeof(qp_create));
         qp_create.cap.max_send_wr = attr->max_request_dtos;
-        qp_create.cap.max_recv_wr = attr->max_recv_dtos;
         qp_create.cap.max_send_sge = attr->max_request_iov;
-        qp_create.cap.max_recv_sge = attr->max_recv_iov;
         qp_create.cap.max_inline_data =
                 ia_ptr->hca_ptr->ib_trans.max_inline_send;
         qp_create.send_cq = req_cq;
-        qp_create.recv_cq = rcv_cq;
+
+        /* ibv assumes rcv_cq is never NULL, set to req_cq */
+        if (rcv_cq == NULL) {
+                qp_create.recv_cq = req_cq;
+                qp_create.cap.max_recv_wr = 0;
+                qp_create.cap.max_recv_sge = 0;
+        } else {
+                qp_create.recv_cq = rcv_cq;
+                qp_create.cap.max_recv_wr = attr->max_recv_dtos;
+                qp_create.cap.max_recv_sge = attr->max_recv_iov;
+        }
         qp_create.qp_type = IBV_QPT_RC;
         qp_create.qp_context = (void*)ep_ptr;
[openib-general] [PATCH 3/3] uDAPL cma: add support for address and route retries, call disconnect when recving dreq
Fix some timeout and long disconnect delay issues discovered during scale-out testing. Added support to retry rdma_cm address and route resolution with configuration options and provide a disconnect call when receiving the disconnect request to force an immediate disconnect reply to the remote side. Here are the new options (environment variables) with the default setting DAPL_CM_ARP_TIMEOUT_MS 4000 DAPL_CM_ARP_RETRY_COUNT 15 DAPL_CM_ROUTE_TIMEOUT_MS 4000 DAPL_CM_ROUTE_RETRY_COUNT 15 Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c (revision 9916) +++ dapl/openib_cma/dapl_ib_cm.c (working copy) @@ -58,6 +58,9 @@ #include dapl_ib_util.h #include sys/poll.h #include signal.h +#include sys/socket.h +#include netinet/in.h +#include arpa/inet.h #include rdma/rdma_cma_ib.h extern struct rdma_event_channel *g_cm_events; @@ -99,8 +102,8 @@ ipaddr-src_addr)-sin_addr.s_addr), ntohl(((struct sockaddr_in *) ipaddr-dst_addr)-sin_addr.s_addr)); - - ret = rdma_resolve_route(conn-cm_id, 2000); + + ret = rdma_resolve_route(conn-cm_id, conn-route_timeout); if (ret) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, rdma_connect failed: %s\n,strerror(errno)); @@ -120,6 +123,7 @@ struct rdma_addr *ipaddr = conn-cm_id-route.addr; struct ib_addr *ibaddr = conn-cm_id-route.addr.addr.ibaddr; #endif + dapl_dbg_log(DAPL_DBG_TYPE_CM, route_resolve: cm_id %p SRC %x DST %x PORT %d\n, conn-cm_id, @@ -381,6 +385,7 @@ break; case RDMA_CM_EVENT_DISCONNECTED: + rdma_disconnect(conn-cm_id); /* force the DREP */ /* validate EP handle */ if (!DAPL_BAD_HANDLE(conn-ep, DAPL_MAGIC_EP)) dapl_evd_connection_callback(conn, @@ -494,6 +499,7 @@ break; case RDMA_CM_EVENT_DISCONNECTED: + rdma_disconnect(conn-cm_id); /* force the DREP */ /* validate SP handle context */ if (!DAPL_BAD_HANDLE(conn-sp, DAPL_MAGIC_PSP) || !DAPL_BAD_HANDLE(conn-sp, DAPL_MAGIC_RSP)) @@ -543,7 +549,8 @@ IN void *p_data) { struct dapl_ep *ep_ptr = ep_handle; - + struct dapl_cm_id 
*conn;
+
     /* Sanity check */
     if (NULL == ep_ptr)
         return DAT_SUCCESS;
@@ -552,36 +559,38 @@
         r_qual,p_data,p_size);
     /* rdma conn and cm_id pre-bound; reference via qp_handle */
-    ep_ptr->cm_handle = ep_ptr->qp_handle;
+    conn = ep_ptr->cm_handle = ep_ptr->qp_handle;
     /* Setup QP/CM parameters and private data in cm_id */
-    (void)dapl_os_memzero(&ep_ptr->cm_handle->params,
-                          sizeof(ep_ptr->cm_handle->params));
-    ep_ptr->cm_handle->params.responder_resources = IB_TARGET_MAX;
-    ep_ptr->cm_handle->params.initiator_depth = IB_INITIATOR_DEPTH;
-    ep_ptr->cm_handle->params.flow_control = 1;
-    ep_ptr->cm_handle->params.rnr_retry_count = IB_RNR_RETRY_COUNT;
-    ep_ptr->cm_handle->params.retry_count = IB_RC_RETRY_COUNT;
+    (void)dapl_os_memzero(&conn->params, sizeof(conn->params));
+    conn->params.responder_resources = IB_TARGET_MAX;
+    conn->params.initiator_depth = IB_INITIATOR_DEPTH;
+    conn->params.flow_control = 1;
+    conn->params.rnr_retry_count = IB_RNR_RETRY_COUNT;
+    conn->params.retry_count = IB_RC_RETRY_COUNT;
     if (p_size) {
-        dapl_os_memcpy(ep_ptr->cm_handle->p_data, p_data, p_size);
-        ep_ptr->cm_handle->params.private_data =
-            ep_ptr->cm_handle->p_data;
-        ep_ptr->cm_handle->params.private_data_len = p_size;
+        dapl_os_memcpy(conn->p_data, p_data, p_size);
+        conn->params.private_data = conn->p_data;
+        conn->params.private_data_len = p_size;
     }
+    /* copy in remote address, need a copy for retry attempts */
+    dapl_os_memcpy(&conn->r_addr, r_addr, sizeof(*r_addr));
+
     /* Resolve remote address, src already bound during QP create */
-    ((struct sockaddr_in*)r_addr)->sin_port = htons(MAKE_PORT(r_qual));
-    if (rdma_resolve_addr(ep_ptr->cm_handle->cm_id,
-                          NULL, (struct sockaddr *)r_addr, 2000))
+    ((struct sockaddr_in*)&conn->r_addr)->sin_port = htons(MAKE_PORT(r_qual));
+    ((struct sockaddr_in*)&conn->r_addr)->sin_family = AF_INET;
+
+    if (rdma_resolve_addr(conn->cm_id, NULL,
+                          (struct sockaddr *)&conn->r_addr,
+                          conn->arp_timeout))
         return dapl_convert_errno(errno,"ib_connect");
     dapl_dbg_log(DAPL_DBG_TYPE_CM,
-        "connect: resolve_addr: cm_id %p SRC %x DST %x port %d\n",
-        ep_ptr->cm_handle->cm_id,
-        ntohl(((struct sockaddr_in *)
-            &ep_ptr->cm_handle->hca->hca_address)->sin_addr.s_addr),
-        ntohl(((struct sockaddr_in *)r_addr)->sin_addr.s_addr),
-        MAKE_PORT(r_qual) );
+        "connect: resolve_addr: cm_id %p - %s port %d\n",
+        conn->cm_id,
+        inet_ntoa(((struct sockaddr_in *)&conn->r_addr)->sin_addr),
+        ((struct sockaddr_in*)&conn->r_addr)->sin_port );
     return DAT_SUCCESS;
 }
@@ -1163,15 +1172,58 @@
     case RDMA_CM_EVENT_ADDR_RESOLVED:
         dapli_addr_resolve(conn);
         break;
+
     case RDMA_CM_EVENT_ROUTE_RESOLVED:
         dapli_route_resolve(conn);
         break;
+    case RDMA_CM_EVENT_ADDR_ERROR:
+        dapl_dbg_log(DAPL_DBG_TYPE_WARN,
+            "CM ADDR ERROR: - %s retry (%d)..\n",
+            inet_ntoa(((struct sockaddr_in *)
+                &conn->r_addr)->sin_addr),
+            conn->arp_retries);
+
+        /* retry address resolution */
+        if (--conn->arp_retries
Re: [openib-general] uDAPL problem
Stephen Smaldone wrote:

Arlin Davis wrote:

Steve Smaldone wrote:

Hi,

Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm device appears. However, it now fails with the following:

$ ./dapltest -T S -D IB1
...
DAT Registry: dat_ia_openv (IB1,1:2,0) called
DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
DAT Registry: dat_registry_add_provider (IB1,1:2,0)
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
DT_cs_Server (IB1): Exiting.
DAT Registry: Stopped (dat_fini)

The configuration remains the same otherwise. My dat.conf:

IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 hora-1-ib0 0

Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135? There seem to be problems resolving hora-1-ib0.

-arlin

Yes. There is an entry as follows:

10.2.2.135 hora-1-ib0

Could you change the hora-1-ib0 0 to just ib0 0 in your dat.conf and retry? There may be an issue parsing a hostname instead of a netdev name.

Thanks,
Steve

___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] OFED 1.1 release schedule
Tziporet Koren wrote:

This is the plan to do the 1.1 release this week: We will publish the 1.1-pre1 package tomorrow (Tue. 17-Oct). Only blocker issues from RC7 will be updated:

1. SRP fix for Cisco FC gateway
2. Small updates for the install
3. Fix in diagnet to support SM on a switch
4. Activate scaling code of ehca as default in the install
5. Documentation update

Can someone double-check the ib_cm kernel patch (sean_cm_drep_on_not_found.patch) again and verify the build process? I don't see the cm_issue_drep symbol in an RC7 build. From the build logs it appears that the patch is applied, but I do not see the symbol in the installed ib_cm.ko after the build is complete.

System with OFED RC7:

nm ib_cm.ko | grep issue
1689 t cm_issue_rej

System with latest svn pull:

nm ib_cm.ko | grep issue
29f7 t cm_issue_drep
1486 t cm_issue_rej

-arlin
Re: [openib-general] uDAPL problem
Steve Smaldone wrote:

Hi,

Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm device appears. However, it now fails with the following:

$ ./dapltest -T S -D IB1
...
DAT Registry: dat_ia_openv (IB1,1:2,0) called
DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
DAT Registry: dat_registry_add_provider (IB1,1:2,0)
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
DT_cs_Server (IB1): Exiting.
DAT Registry: Stopped (dat_fini)

The configuration remains the same otherwise. My dat.conf:

IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 hora-1-ib0 0

Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135? There seem to be problems resolving hora-1-ib0.

-arlin
[openib-general] [PATCH] remove scm provider from uDAPL build.
Here is a patch to remove uDAPL scm provider from the build since it is no longer needed nor supported. This provider was merely a stop gap until uCMA was pushed into kernel.

Tziporet, can you get this change into OFED 1.1?

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: doc/dat.conf
===
--- doc/dat.conf (revision 9781)
+++ doc/dat.conf (working copy)
@@ -6,19 +6,10 @@
 # ia_name api_version threadsafety default lib_path \
 # provider_version ia_params platform_params
 #
-# Example for openib_cma and openib_scm
-#
-# For cma version you specify ia_params as:
+# For the uDAPL cma provder, specify ia_params as one of the following:
 # network address, network hostname, or netdev name and 0 for port
 #
-# For scm version you specify ia_params as actual device name and port
-#
 # Simple (OpenIB-cma) default with netdev name provided first on list
 # to enable use of same dat.conf version on all nodes
 #
 OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 ib0 0
-OpenIB-cma-ip u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 192.168.0.22 0
-OpenIB-cma-name u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 svr1-ib0 0
-OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 ib0 0
-OpenIB-scm1 u1.2 nonthreadsafe default /usr/lib/libdaplscm.so mv_dapl.1.2 mthca0 1
-OpenIB-scm2 u1.2 nonthreadsafe default /usr/lib/libdaplscm.so mv_dapl.1.2 mthca0 2
Index: Makefile.am
===
--- Makefile.am (revision 9781)
+++ Makefile.am (working copy)
@@ -18,11 +18,9 @@
 datlibdir = $(libdir)
 dapllibcmadir = $(libdir)
-dapllibscmdir = $(libdir)
 datlib_LTLIBRARIES = dat/udat/libdat.la
 dapllibcma_LTLIBRARIES = dapl/udapl/libdaplcma.la
-dapllibscm_LTLIBRARIES = dapl/udapl/libdaplscm.la
 dat_udat_libdat_la_CFLAGS = -Wall $(DBGFLAGS) -D_GNU_SOURCE $(OSFLAGS) \
     -I$(srcdir)/dat/include/ -I$(srcdir)/dat/udat/ \
@@ -34,21 +32,13 @@
     -I$(srcdir)/dapl/common -I$(srcdir)/dapl/udapl/linux \
     -I$(srcdir)/dapl/openib_cma
-dapl_udapl_libdaplscm_la_CFLAGS = -Wall $(DBGFLAGS) -D_GNU_SOURCE $(OSFLAGS) \
-    -DOPENIB -DCQ_WAIT_OBJECT \
-    -I$(srcdir)/dat/include/ -I$(srcdir)/dapl/include/ \
-    -I$(srcdir)/dapl/common -I$(srcdir)/dapl/udapl/linux \
-    -I$(srcdir)/dapl/openib_scm
-
 if HAVE_LD_VERSION_SCRIPT
 dat_version_script = -Wl,--version-script=$(srcdir)/dat/udat/libdat.map
 daplcma_version_script = -Wl,--version-script=$(srcdir)/dapl/udapl/libdaplcma.map
-daplscm_version_script = -Wl,--version-script=$(srcdir)/dapl/udapl/libdaplscm.map
 else
 dat_version_script =
 daplcma_version_script =
-daplscm_version_script =
 endif
@@ -177,116 +167,6 @@
     -Wl,-init,dapl_init -Wl,-fini,dapl_fini \
     -lpthread -libverbs -lrdmacm
-
-#
-# uDAPL OpenIB Socket CM version: libdaplscm.so
-#
-dapl_udapl_libdaplscm_la_SOURCES = dapl/udapl/dapl_init.c \
-dapl/udapl/dapl_evd_create.c \
-dapl/udapl/dapl_evd_query.c \
-dapl/udapl/dapl_cno_create.c \
-dapl/udapl/dapl_cno_modify_agent.c \
-dapl/udapl/dapl_cno_free.c \
-dapl/udapl/dapl_cno_wait.c \
-dapl/udapl/dapl_cno_query.c \
-dapl/udapl/dapl_lmr_create.c \
-dapl/udapl/dapl_evd_wait.c \
-dapl/udapl/dapl_evd_disable.c \
-dapl/udapl/dapl_evd_enable.c \
-dapl/udapl/dapl_evd_modify_cno.c \
-dapl/udapl/dapl_evd_set_unwaitable.c \
-dapl/udapl/dapl_evd_clear_unwaitable.c \
-dapl/udapl/linux/dapl_osd.c \
-dapl/common/dapl_cookie.c \
-dapl/common/dapl_cr_accept.c \
-dapl/common/dapl_cr_query.c \
-dapl/common/dapl_cr_reject.c \
-dapl/common/dapl_cr_util.c \
-dapl/common/dapl_cr_callback.c \
-dapl/common/dapl_cr_handoff.c \
-dapl/common/dapl_ep_connect.c \
-dapl/common/dapl_ep_create.c \
-dapl/common/dapl_ep_disconnect.c \
-dapl/common/dapl_ep_dup_connect.c \
-dapl/common/dapl_ep_free.c \
-dapl/common/dapl_ep_reset.c \
-dapl/common/dapl_ep_get_status.c \
-dapl/common
Re: [openib-general] OFED 1.1 RC7
Aviram Gutman wrote:

OFED-1.1-rc7 is available on https://openib.org/svn/gen2/branches/1.1/ofed/releases/
File: OFED-1.1-rc7.tgz
Please report any issues in bugzilla http://openib.org/bugzilla/

Aviram,

Can you verify that the sean_cm_drep_on_not_found.patch is actually applied in RC7? Our delayed disconnect problems still exist. I don't see the new symbol cm_issue_drep in ib_cm.ko on our RC7 installed systems, so I don't think the patch was applied.

Thanks,
-arlin
Re: [openib-general] [openfabrics-ewg] OFED Status
Woodruff, Robert J wrote:

Aviram wrote,

Once the IPoIB HA issue is solved, we would like to issue RC7, which is supposed to be final. Is everyone OK with this approach?

Aviram

Sounds good. What is the target date for RC7?

Do we have a new target date?
Re: [openib-general] DAPL setup/config help
Troy Telford wrote:

I've never set up dapl before, however I now have a reason to try... The problem is, I can't seem to find any documentation on how to set it up. I've tried the sample /etc/dat.conf (modified for the IPoIB address on the system), but I'm not sure I've been successful.

I've:
* compiled from OFED 1.0
* verified the library paths listed in /etc/dat.conf are correct
* I do know that things like IP over IB, MVAPICH, Open MPI, etc. work fine; but they're not using DAPL
* tried the 'dapltest' and 'dtest' programs. In both cases, I receive an error to the effect of: DAT_PROVIDER_NOT_FOUND DAT_NAME_NOT_REGISTERED

The dapl provider name that your application uses for the open must match the ia_name entry in dat.conf.

Sample dat.conf:

# Each entry should have the following fields:
#
# ia_name api_version threadsafety default lib_path \
# provider_version ia_params platform_params
OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 ib0 0

The dtest makefile with OFED 1.0 should use OpenIB-cma as the provider name instead of OpenIB-cma-ip. The default configuration was fixed in OFED 1.1. For dapltest you must pass this dat.conf name as an argument to all scripts. For example:

./srv.sh OpenIB-cma

-arlin

Can anybody point me in the right direction (so I can RTFM and get on with life?)
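The name-matching rule above can be sketched in a few lines of C. This is only an illustration, not the DAT registry's parser: `struct conf_entry`, `parse_conf_line` and `line_matches_provider` are hypothetical names, and the sketch reads just the six leading whitespace-separated fields of an entry, ignoring ia_params and platform_params.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for the leading fields of one dat.conf entry. */
struct conf_entry {
    char ia_name[64];
    char api_version[16];
    char threadsafety[32];
    char is_default[16];
    char lib_path[256];
    char provider_version[32];
};

/* Parse one line. Returns 1 on success, 0 for comments, blank lines,
 * or lines with fewer than six fields. */
int parse_conf_line(const char *line, struct conf_entry *e)
{
    assert(line != NULL && e != NULL);
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 0;
    return sscanf(line, "%63s %15s %31s %15s %255s %31s",
                  e->ia_name, e->api_version, e->threadsafety,
                  e->is_default, e->lib_path, e->provider_version) == 6;
}

/* Returns 1 only if the entry's ia_name equals the name the application
 * passes when it opens the interface adapter. */
int line_matches_provider(const char *line, const char *wanted)
{
    struct conf_entry e;
    return parse_conf_line(line, &e) && strcmp(e.ia_name, wanted) == 0;
}
```

With the sample entry above, asking for OpenIB-cma matches, while asking for OpenIB-cma-ip (the name the OFED 1.0 dtest makefile used) does not, which is consistent with the DAT_PROVIDER_NOT_FOUND error reported.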
Re: [openib-general] [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ
Arlin Davis wrote:

Sean Hefty wrote:

Currently a DREP is only sent in response to a DREQ if a connection has been found matching the DREQ, and it is in the proper state. Once a DREP is sent, the local connection moves into timewait. Duplicate DREQs received while in this state result in re-sending the DREP. However, it's likely that the local connection will enter and exit timewait before the remote side times out a lost DREP and resends a DREQ.

There are a couple possible solutions to this. One is to increase how long a connection remains in timewait, by multiplying its wait time by max_cm_retries. This can greatly increase the timewait state before a QP can be re-used when CM messages are not lost. An alternative is to send a DREP in response to a DREQ, even if a local connection is not found, which is what this patch does.

Would it be possible to get this fix in rc7? I am consistently seeing this problem with Intel MPI on a 64 node cluster.

-arlin

Aviram? Is there an rc7 and could this get in?
Re: [openib-general] [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ
Sean Hefty wrote:

Currently a DREP is only sent in response to a DREQ if a connection has been found matching the DREQ, and it is in the proper state. Once a DREP is sent, the local connection moves into timewait. Duplicate DREQs received while in this state result in re-sending the DREP. However, it's likely that the local connection will enter and exit timewait before the remote side times out a lost DREP and resends a DREQ.

There are a couple possible solutions to this. One is to increase how long a connection remains in timewait, by multiplying its wait time by max_cm_retries. This can greatly increase the timewait state before a QP can be re-used when CM messages are not lost. An alternative is to send a DREP in response to a DREQ, even if a local connection is not found, which is what this patch does.

Would it be possible to get this fix in rc7? I am consistently seeing this problem with Intel MPI on a 64 node cluster.

-arlin
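The DREQ handling described in this thread can be modeled as a small decision function. The sketch below is a toy model under stated assumptions, not the ib_cm code: the connection table, `find_conn`, `handle_dreq` and the `reply_when_unmatched` switch are all hypothetical, and "not found" stands in for a connection that has already left timewait.

```c
#include <assert.h>
#include <stddef.h>

enum conn_state { CONN_ESTABLISHED, CONN_TIMEWAIT };
enum dreq_action { DREQ_DROP, DREQ_SEND_DREP, DREQ_RESEND_DREP };

/* Hypothetical connection record, keyed by a communication id. */
struct conn {
    unsigned id;
    enum conn_state state;
};

/* Linear lookup; returns NULL when no connection matches the DREQ. */
struct conn *find_conn(struct conn *table, size_t n, unsigned id)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (table[i].id == id)
            return &table[i];
    return NULL;
}

/* Decide what to do with an incoming DREQ.
 * reply_when_unmatched == 0 models the behavior described as current;
 * reply_when_unmatched == 1 models the behavior the patch proposes. */
enum dreq_action handle_dreq(struct conn *table, size_t n, unsigned id,
                             int reply_when_unmatched)
{
    struct conn *c = find_conn(table, n, id);

    if (c == NULL)
        return reply_when_unmatched ? DREQ_SEND_DREP : DREQ_DROP;
    if (c->state == CONN_TIMEWAIT)
        return DREQ_RESEND_DREP;   /* duplicate DREQ while in timewait */
    c->state = CONN_TIMEWAIT;      /* first DREQ: reply, enter timewait */
    return DREQ_SEND_DREP;
}
```

In this model a retried DREQ that arrives after the entry has left the table is dropped when `reply_when_unmatched` is 0, so the remote side keeps retrying; with the flag set it gets a DREP immediately.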
Re: [openib-general] [PATCH] OFED 1.1-rc3 is ready
Robert,

Here is a slightly modified patch for your attributes issue. Can you give it a try?

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib/dapl_ib_util.c
===
--- dapl/openib/dapl_ib_util.c (revision 9106)
+++ dapl/openib/dapl_ib_util.c (working copy)
@@ -446,6 +446,7 @@
         return(dapl_convert_errno(errno,"ib_query_hca"));
     if (ia_attr != NULL) {
+        (void) dapl_os_memzero(ia_attr, sizeof(*ia_attr));
         ia_attr->adapter_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->vendor_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->ia_address_ptr =
@@ -470,7 +471,12 @@
         /* ia_attr->hardware_version_minor = dev_attr.fw_ver; */
         ia_attr->max_eps = dev_attr.max_qp;
         ia_attr->max_dto_per_ep = dev_attr.max_qp_wr;
-        ia_attr->max_rdma_read_per_ep = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE;
+        ia_attr->max_rdma_read_per_ep_out_guaranteed = DAT_TRUE;
         ia_attr->max_evds = dev_attr.max_cq;
         ia_attr->max_evd_qlen = dev_attr.max_cqe;
         ia_attr->max_iov_segments_per_dto = dev_attr.max_sge;
@@ -501,6 +507,7 @@
     }
     if (ep_attr != NULL) {
+        (void) dapl_os_memzero(ep_attr, sizeof(*ep_attr));
         ep_attr->max_mtu_size = port_attr.max_msg_sz;
         ep_attr->max_rdma_size= port_attr.max_msg_sz;
         ep_attr->max_recv_dtos= dev_attr.max_qp_wr;
Index: dapl/openib_cma/dapl_ib_util.c
===
--- dapl/openib_cma/dapl_ib_util.c (revision 9106)
+++ dapl/openib_cma/dapl_ib_util.c (working copy)
@@ -424,6 +424,7 @@
         return(dapl_convert_errno(errno,"ib_query_hca"));
     if (ia_attr != NULL) {
+        (void) dapl_os_memzero(ia_attr, sizeof(*ia_attr));
         ia_attr->adapter_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->vendor_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->ia_address_ptr =
@@ -446,6 +447,8 @@
         ia_attr->hardware_version_major = dev_attr.hw_ver;
         ia_attr->max_eps = dev_attr.max_qp;
         ia_attr->max_dto_per_ep = dev_attr.max_qp_wr;
+        ia_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
         ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom;
         ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom;
         ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE;
@@ -481,6 +484,7 @@
     }
     if (ep_attr != NULL) {
+        (void) dapl_os_memzero(ep_attr, sizeof(*ep_attr));
         ep_attr->max_mtu_size = port_attr.max_msg_sz;
         ep_attr->max_rdma_size= port_attr.max_msg_sz;
         ep_attr->max_recv_dtos= dev_attr.max_qp_wr;
Index: dapl/openib_scm/dapl_ib_util.c
===
--- dapl/openib_scm/dapl_ib_util.c (revision 9106)
+++ dapl/openib_scm/dapl_ib_util.c (working copy)
@@ -373,6 +373,7 @@
         return(dapl_convert_errno(errno,"ib_query_hca"));
     if (ia_attr != NULL) {
+        (void) dapl_os_memzero(ia_attr, sizeof(*ia_attr));
         ia_attr->adapter_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->vendor_name[DAT_NAME_MAX_LENGTH - 1] = '\0';
         ia_attr->ia_address_ptr = (DAT_IA_ADDRESS_PTR)&hca_ptr->hca_address;
@@ -390,7 +391,12 @@
         /* ia_attr->hardware_version_minor = dev_attr.fw_ver; */
         ia_attr->max_eps = dev_attr.max_qp;
         ia_attr->max_dto_per_ep = dev_attr.max_qp_wr;
-        ia_attr->max_rdma_read_per_ep = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom;
+        ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE;
+        ia_attr->max_rdma_read_per_ep_out_guaranteed = DAT_TRUE;
         ia_attr->max_evds = dev_attr.max_cq;
         ia_attr->max_evd_qlen
Re: [openib-general] [PATCH] OFED 1.1-rc3 is ready
Robert Walsh wrote:

Arlin Davis wrote:

Robert, Here is a slightly modified patch for your attributes issue. Can you give it a try?

I'll give it a spin this afternoon: it looks quite a bit more comprehensive than the small patch I did.

Regards,
Robert.

Just added all appropriate RDMA in/out fields and some code to zero out the structure to avoid uninitialized data fields.

-arlin
Re: [openib-general] [PATCH] OFED 1.1-rc3 is ready
Oddly enough, I'm back to the same problem with your new patch as I saw with the unpatched version:

Hmmm. We ran this with OFED 1.1 RC3 and MPI 3.0b on an EM64T server with your adapter and it worked. Did you ever pick up the Intel MPI 3.0 beta?

$ mpiexec -n 2 ./a.out
I_MPI: [1] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [0] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [0] MPIDI_CH3_Init(): I_MPI: [1] MPIDI_CH3_Init(): will use rdma configuration will use rdma configuration
[1:ib-idev-06][rdma_iba_init_d.c:154] error(0x60029): OpenIB-cma: could not create DAPL endpoint: DAT_INVALID_PARAMETER(DAT_INVALID_ARG6)
Hello world: rank 0 of 2 running on ib-idev-05
rank 1 in job 1 ib-idev-05_51891 caused collective abort of all ranks
exit status of rank 1: killed by signal 9

Still tracking this one down. I noticed in the patch you removed a couple of lines, too:

- ia_attr->max_rdma_read_per_ep = dev_attr.max_qp_rd_atom;

Any particular reason why you did this?

max_rdma_read_per_ep is the same as max_rdma_read_per_ep_in. Look at dat.h line #369:

/* To support backwards compatibility for DAPL-1.0 */
#define max_rdma_read_per_ep max_rdma_read_per_ep_in
#define DAT_IA_FIELD_IA_MAX_DTO_PER_OP DAT_IA_FIELD_IA_MAX_DTO_PER_EP_IN
/* To support backwards compatibility for DAPL-1.0 & DAPL-1.1 */
#define max_mtu_size max_message_size

-arlin
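The effect of that compatibility define can be shown with a stand-alone sketch. The structure below is a hypothetical two-field stand-in, not the real DAT IA attribute structure; only the `#define` mirrors the line quoted from dat.h.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the IA attribute structure. */
struct ia_attr_sketch {
    int max_rdma_read_per_ep_in;
    int max_rdma_read_per_ep_out;
};

/* Same pattern as the dat.h compatibility define quoted above: the old
 * field name is rewritten by the preprocessor to the new "_in" name. */
#define max_rdma_read_per_ep max_rdma_read_per_ep_in

/* Fill the attributes the way the pre-patch line did: zero everything,
 * then assign through the old name only. */
void fill_attr_old_style(struct ia_attr_sketch *attr, int max_qp_rd_atom)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->max_rdma_read_per_ep = max_qp_rd_atom; /* expands to ..._in */
}
```

Because the old name is only an alias for the `_in` field, assigning through it never touches the `_out` field. Without the zeroing step that field would keep whatever was in memory, which is why the patch sets the in and out fields explicitly and clears the structure first.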
Re: [openib-general] [openfabrics-ewg] Rollup patch for ipath and OFED
Michael S. Tsirkin wrote:

Quoting r. Woodruff, Robert J [EMAIL PROTECTED]:

We should have one git tree somewhere that has all the latest code that we can pull from.

I just don't think latest code is a well defined entity in a distributed development environment.

kernel.org is well defined. How are we any different?
[openib-general] Question about QP's in timewait state and CM stale conn rejects
We are running into connection reject issues (IB_CM_REJ_STALE_CONN) with our application under heavy load and lots of connections. We occasionally get a reject based on the QP being in timewait state, left over from a prior connection. It appears that the CM keeps track of the QPs in timewait state on both sides of the connection, independently of the verbs layer, even after the QP has been destroyed at the verbs level.

I can actually create a new QP via verbs and it could still be on the CM timewait queue waiting for the timer to pop and be removed. If this is the case, my attempts to connect using this QP will fail with a reject. How can a consumer know for sure that the new QP will not be in a timewait state according to the CM?

Does it make sense to push the timewait functionality down into verbs? If not, is there a way for the CM to hold a reference to the QP until the timewait expires?

-arlin
Re: [openib-general] DAPL and local_iov in RDMA RR/RW mode
Ryszard Jurga wrote:

Hi Arlin,

Thank you for your quick reply. Both dat_ep_post_rdma_read and dat_ep_post_rdma_write return DAT_SUCCESS. When I read the field 'transfered_length' from DAT_DTO_COMPLETION_EVENT_DATA after calling a post function, I receive the correct value, which equals num_segs*seg_size. Unfortunately, when I read the content of the local buffer, only the first segment is filled with the appropriate data. I have tried to set the debug switch (by export DAPL_DBG_TYPE=0x before running my application), but unfortunately this does not produce any additional output for the post functions. Do you have any other ideas? I did not mention it before, but the case with num_segments > 1 works fine with a send/recv type of transmission.

You have to configure --enable-debug to get the debug information.

You may want to pick up the latest dapl/test/dtest/dtest.c and take a look at the rdma write section for a simple multi-segment uDAPL example. I recently made a few modifications to include multiple segments in the test. You can also use dapltest to verify that RDMA with multiple segments is working properly. Look at cl.sh for an example script and dapltest -TT --help for usage.

-arlin
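One way to narrow down a "length is right but only the first segment is filled" report is to poison every local segment before posting and count the segments that are still untouched after the completion. The helper below is a generic sketch under stated assumptions: `struct seg` is a hypothetical stand-in for a local iov entry and none of these functions are uDAPL calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one local iov entry (address plus length). */
struct seg {
    unsigned char *addr;
    size_t len;
};

/* Expected transfer length: the sum of all segment lengths. */
size_t total_length(const struct seg *iov, size_t n)
{
    size_t total = 0;
    assert(iov != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += iov[i].len;
    return total;
}

/* Fill every segment with a sentinel byte before posting the transfer. */
void poison_segments(const struct seg *iov, size_t n, unsigned char fill)
{
    assert(iov != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        memset(iov[i].addr, fill, iov[i].len);
}

/* Count non-empty segments that still hold only the sentinel byte.
 * A segment whose real payload consists solely of the sentinel value
 * would be miscounted, so pick a byte the payload is unlikely to use. */
size_t untouched_segments(const struct seg *iov, size_t n, unsigned char fill)
{
    size_t untouched = 0;
    assert(iov != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < iov[i].len && iov[i].addr[j] == fill)
            j++;
        if (iov[i].len > 0 && j == iov[i].len)
            untouched++;
    }
    return untouched;
}
```

For the case in this thread (two segments, 10 bytes in total), the reported length would be compared against `total_length`, and a nonzero `untouched_segments` result after the completion would confirm that data landed in only part of the iov.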
Re: [openib-general] DAPL and local_iov in RDMA RR/RW mode
Ryszard Jurga wrote:

Hi everybody,

I have one question about the number of segments in local_iov when using RDMA Write and Read mode. Is it possible to have num_segments > 1? I am asking because when I try to set num_segments to a value > 1, I can still only read/write one segment, even though I have an appropriate remote buffer already reserved. The size of the transferred buffer is 10 bytes, num_segs=2. The information printed below was obtained from the network devices, with one remark: I have manually set max_rdma_read_iov=10 and max_rdma_write_iov=10. Thank you in advance for your help.

Yes, uDAPL will support num_segments up to the max counts returned on the ep_attr. Can you be more specific? Does the post return immediate errors or are you simply missing data on the remote node? Can you turn up the uDAPL debug switch (export DAPL_DBG_TYPE=0x) and send output of the post call?

-arlin

Best regards,
Ryszard.

EP_ATTR: the same for both nodes:
--
max_message_size=2147483648 max_rdma_size=2147483648 max_recv_dtos=16 max_request_dtos=16 max_recv_iov=4 max_request_iov=4 max_rdma_read_in=4 max_rdma_read_out=4 srq_soft_hw=0 max_rdma_read_iov=10 max_rdma_write_iov=10 ep_transport_specific_count=0 ep_provider_specific_count=0
--

IA_ATTR: different for nodes
--
IA Info: max_eps=64512 max_dto_per_ep=65535 max_rdma_read_per_ep_in=4 max_rdma_read_per_ep_out=1610616831 max_evds=65408 max_evd_qlen=131071 max_iov_segments_per_dto=28 max_lmrs=131056 max_lmr_block_size=18446744073709551615 max_pzs=32768 max_message_size=2147483648 max_rdma_size=2147483648 max_rmrs=0 max_srqs=0 max_ep_per_srq=0 max_recv_per_srq=143263 max_iov_segments_per_rdma_read=1073741824 max_iov_segments_per_rdma_write=0 max_rdma_read_in=0 max_rdma_read_out=65535 max_rdma_read_per_ep_in_guaranteed=7286 max_rdma_read_per_ep_out_guaranteed=0

IA Info: max_eps=64512 max_dto_per_ep=65535 max_rdma_read_per_ep_in=4 max_rdma_read_per_ep_out=0 max_evds=65408 max_evd_qlen=131071 max_iov_segments_per_dto=28 max_lmrs=131056 max_lmr_block_size=18446744073709551615 max_pzs=32768 max_message_size=2147483648 max_rdma_size=2147483648 max_rmrs=0 max_srqs=0 max_ep_per_srq=0 max_recv_per_srq=142247 max_iov_segments_per_rdma_read=1073741824 max_iov_segments_per_rdma_write=0 max_rdma_read_in=0 max_rdma_read_out=65535 max_rdma_read_per_ep_in_guaranteed=7286 max_rdma_read_per_ep_out_guaranteed=28
Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary
Tziporet Koren wrote:

You are correct - we forgot about it. Will be fixed in rc2. Can you open a bug in bugzilla for the installer package so we will not miss it this time?

Done. Bug 195.
Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary
Can we include librdmacm and dapl in the basic installation option? Also, it would be nice to have rdma_ucm and rdma_cm load on boot by default.

Thanks,
-arlin

This is a small change in the OFED scripts. I suggest that if we go for this change we will do it for the HPC install and not for the basic install (which includes only the verbs and IPoIB). If there is no objection from anyone we will go for this change.

Tziporet

I don't see this change in OFED 1.1 RC1. Please add librdmacm and dapl into the HPC install and make sure rdma_ucm and rdma_cm get loaded during boot by default in RC2.

Thanks,
-arlin
Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary
Tziporet Koren wrote:

Hi all,

This is the outcome of the meeting we had today regarding OFED 1.1 schedule and features.

Can we include librdmacm and dapl in the basic installation option? Also, it would be nice to have rdma_ucm and rdma_cm load on boot by default.

Thanks,
-arlin
Re: [openib-general] [PATCH] uDAPL - OpenIB-cma: added consumer wakeup mechanism for cq wait objects
Arlin Davis wrote:

Fix for Bug 158. Add support for dat_evd_set_unwaitable on a DTO EVD.

Committed revision 8592.
[openib-general] [PATCH] uDAPL cma provider, errno reporting on create thread during open
Added errno reporting (message and return codes) during open to help diagnose create thread issues.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: openib_cma/dapl_ib_util.c
===
--- openib_cma/dapl_ib_util.c (revision 8559)
+++ openib_cma/dapl_ib_util.c (working copy)
@@ -212,6 +212,7 @@ DAT_RETURN dapls_ib_open_hca(IN IB_HCA_N
     struct rdma_cm_id *cm_id;
     union ibv_gid *gid;
     int ret;
+    DAT_RETURN dat_status;
     dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
         "open_hca: %s - %p\n", hca_name, hca_ptr);
@@ -225,8 +226,9 @@ DAT_RETURN dapls_ib_open_hca(IN IB_HCA_N
     }
     dapl_os_unlock(&g_hca_lock);
-    if (dapli_ib_thread_init())
-        return DAT_INTERNAL_ERROR;
+    dat_status = dapli_ib_thread_init();
+    if (dat_status != DAT_SUCCESS)
+        return dat_status;
     /* HCA name will be hostname or IP address */
     if (getipaddr((char*)hca_name,
@@ -557,10 +559,10 @@ DAT_RETURN dapls_ib_setup_async_callback
     return DAT_SUCCESS;
 }
-int dapli_ib_thread_init(void)
+DAT_RETURN dapli_ib_thread_init(void)
 {
     long opts;
-    DAT_RETURN ret;
+    DAT_RETURN dat_status;
     dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
         "ib_thread_init(%d)\n", getpid());
@@ -568,31 +570,27 @@ int dapli_ib_thread_init(void)
     dapl_os_lock(&g_hca_lock);
     if (g_ib_thread_state != IB_THREAD_INIT) {
         dapl_os_unlock(&g_hca_lock);
-        return 0;
+        return DAT_SUCCESS;
     }
     /* uCMA events non-blocking */
     opts = fcntl(g_cm_events->fd, F_GETFL); /* uCMA */
     if (opts < 0 || fcntl(g_cm_events->fd,
                           F_SETFL, opts | O_NONBLOCK) < 0) {
-        dapl_dbg_log (DAPL_DBG_TYPE_ERR,
-            "dapl_ib_init: ERR with uCMA FD\n" );
         dapl_os_unlock(&g_hca_lock);
-        return 1;
+        return(dapl_convert_errno(errno, "create_thread ERR: cm_fd"));
     }
     g_ib_thread_state = IB_THREAD_CREATE;
     dapl_os_unlock(&g_hca_lock);
     /* create thread to process inbound connect request */
-    ret = dapl_os_thread_create(dapli_thread, NULL, &g_ib_thread);
-    if (ret != DAT_SUCCESS)
-    {
-        dapl_dbg_log(DAPL_DBG_TYPE_ERR,
-            "ib_thread_init: failed to create thread\n");
-        return 1;
-    }
-
+    dat_status = dapl_os_thread_create(dapli_thread, NULL, &g_ib_thread);
+    if (dat_status != DAT_SUCCESS)
+        return(dapl_convert_errno(errno,
+            "create_thread ERR:"
+            " check resource limits"));
+
     /* wait for thread to start */
     dapl_os_lock(&g_hca_lock);
     while (g_ib_thread_state != IB_THREAD_RUN) {
@@ -609,7 +607,8 @@ int dapli_ib_thread_init(void)
     dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
         "ib_thread_init(%d) exit\n",getpid());
-    return 0;
+
+    return DAT_SUCCESS;
 }
 void dapli_ib_thread_destroy(void)
Index: openib_cma/dapl_ib_util.h
===
--- openib_cma/dapl_ib_util.h (revision 8559)
+++ openib_cma/dapl_ib_util.h (working copy)
@@ -265,7 +265,7 @@ typedef uint32_t ib_shm_transport_t;
 int32_t dapls_ib_init (void);
 int32_t dapls_ib_release (void);
 void dapli_thread(void *arg);
-int dapli_ib_thread_init(void);
+DAT_RETURN dapli_ib_thread_init(void);
 void dapli_ib_thread_destroy(void);
 void dapli_cma_event_cb(void);
 void dapli_cq_event_cb(struct _ib_hca_transport *hca);
Re: [openib-general] [PATCH] uDAPL cma provider, errno reporting on create thread during open
Arlin Davis wrote:

Added errno reporting (message and return codes) during open to help diagnose create thread issues.

Committed revision 8565.
Re: [openib-general] OFED 1.1 release - schedule and features
Tziporet Koren wrote:

• Core:
  – Set options in CMA & uCMA (needed for Intel MPI)
  – HCA fatal - full flow support
  – Huge pages support
• uDAPL:
  – Scalability features needed for Intel MPI – take from trunk
• Arlin & James – please reply if there are more features needed.

The latest uDAPL from the trunk and uCMA set option support is sufficient.

Thanks,
-arlin
Re: [openib-general] OFED 1.1 release - schedule and features
Michael S. Tsirkin wrote:
> Quoting r. Arlin Davis [EMAIL PROTECTED]:
>> The latest uDAPL from the trunk and uCMA set option support is sufficient.
>
> Which options do you set? Retry/timeout or path as well?

Just retry/timeout.
Re: [openib-general] [Bug 146] OFED-1.0 DAPL fails to build on SLES10 on IA64 with IA64_FETCHADD error
John Partridge wrote:
> I installed the dapl rpm. I do have libdat.so.1 but I also expect a symlink to libdat.so, which does not exist (Intel MPI appears to need it). I also noticed that the dat.conf points to /usr/local/ofed/lib/libdaplcma.so but there is no symlink in the /usr/local/ofed/lib directory for it. I do have libdaplcma.so.1. Am I missing something here?

The links should be built during the RPM install. What RPMs are you using to install? Did you modify the dapl rpm?

-arlin
[openib-general] uCMA kernel slab corruption and oops
Sean,

I am running a couple of iMPI/uDAPL benchmarks at the same time and ran into this (2.6.17 kernel and svn8112):

Jun 22 10:46:51 localhost kernel: Slab corruption: start=8100202458f8, len=512
Jun 22 10:46:51 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:51 localhost kernel: Last user: [8807fc41](rdma_destroy_id+0x188/0x193 [rdma_cm])
Jun 22 10:46:51 localhost kernel: 0f0: 6b 6b 6b 6b 6b 6b 6b 6b 18 be 2d 37 00 81 ff ff
Jun 22 10:46:51 localhost kernel: Prev obj: start=8100202456e0, len=512
Jun 22 10:46:51 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:51 localhost kernel: Last user: [88086599](ucma_get_event+0x202/0x21f [rdma_ucm])
Jun 22 10:46:51 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:51 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:51 localhost kernel: Next obj: start=810020245b10, len=512
Jun 22 10:46:51 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:51 localhost kernel: Last user: [804762c2](skb_release_data+0x92/0x97)
Jun 22 10:46:51 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:51 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:53 localhost kernel: Slab corruption: start=8100202458f8, len=512
Jun 22 10:46:53 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:53 localhost kernel: Last user: [804762c2](skb_release_data+0x92/0x97)
Jun 22 10:46:53 localhost kernel: 0f0: 40 5c 3c 18 00 81 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:53 localhost kernel: Prev obj: start=8100202456e0, len=512
Jun 22 10:46:53 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:53 localhost kernel: Last user: [88086599](ucma_get_event+0x202/0x21f [rdma_ucm])
Jun 22 10:46:53 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:53 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:53 localhost kernel: Next obj: start=810020245b10, len=512
Jun 22 10:46:53 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 10:46:53 localhost kernel: Last user: [804762c2](skb_release_data+0x92/0x97)
Jun 22 10:46:53 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 10:46:53 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:01:01 localhost kernel: Slab corruption: start=8100202458f8, len=512
Jun 22 11:01:01 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:01:01 localhost kernel: Last user: [88069831](ib_destroy_cm_id+0x23b/0x246 [ib_cm])
Jun 22 11:01:01 localhost kernel: 0f0: d0 79 4c 2d 00 81 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:01:01 localhost kernel: Prev obj: start=8100202456e0, len=512
Jun 22 11:01:01 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:01:01 localhost kernel: Last user: [802a1a8e](load_elf_interp+0x411/0x423)
Jun 22 11:01:01 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:01:01 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:01:01 localhost kernel: Next obj: start=810020245b10, len=512
Jun 22 11:01:01 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:01:01 localhost kernel: Last user: [804762c2](skb_release_data+0x92/0x97)
Jun 22 11:01:01 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:01:01 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:22:33 localhost kernel: Slab corruption: start=8100202458f8, len=512
Jun 22 11:22:33 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:22:33 localhost kernel: Last user: [802a1a8e](load_elf_interp+0x411/0x423)
Jun 22 11:22:33 localhost kernel: 0f0: a0 83 9e 21 00 81 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:22:33 localhost kernel: Prev obj: start=8100202456e0, len=512
Jun 22 11:22:33 localhost kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Jun 22 11:22:33 localhost kernel: Last user: [880346bb](mthca_create_qp+0x48/0x275 [ib_mthca])
Jun 22 11:22:33 localhost kernel: 000: 00 40 6a 3d 00 81 ff ff 38 96 d4 3a 00 81 ff ff
Jun 22 11:22:33 localhost kernel: 010: 48 15 64 29 00 81 ff ff 48 15 64 29 00 81 ff ff
Jun 22 11:22:33 localhost kernel: Next obj: start=810020245b10, len=512
Jun 22 11:22:33 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:22:33 localhost kernel: Last user: [802a1a8e](load_elf_interp+0x411/0x423)
Jun 22 11:22:33 localhost kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:22:33 localhost kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Jun 22 11:22:43 localhost kernel: Slab corruption: start=8100202458f8, len=512
Jun 22 11:22:43 localhost kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Jun 22 11:22:43 localhost kernel: Last user:
Re: [openib-general] [PATCH] uDAPL dapl_evd_connection_callback does not support TIMED_OUT event
James,

Added support for active side TIMED_OUT event from a provider.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/common/dapl_evd_connection_callb.c
===
--- dapl/common/dapl_evd_connection_callb.c (revision 8166)
+++ dapl/common/dapl_evd_connection_callb.c (working copy)
@@ -162,48 +162,15 @@ dapl_evd_connection_callback (
             break;
         }
         case DAT_CONNECTION_EVENT_DISCONNECTED:
-        {
-            /*
-             * EP is now fully disconnected; initiate any post processing
-             * to reset the underlying QP and get the EP ready for
-             * another connection
-             */
-            ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
-            dapls_ib_disconnect_clean (ep_ptr, DAT_TRUE, ib_cm_event);
-            dapl_os_unlock (&ep_ptr->header.lock);
-
-            break;
-        }
         case DAT_CONNECTION_EVENT_PEER_REJECTED:
-        {
-            ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
-            dapls_ib_disconnect_clean (ep_ptr, DAT_TRUE, ib_cm_event);
-            dapl_os_unlock (&ep_ptr->header.lock);
-
-            break;
-        }
         case DAT_CONNECTION_EVENT_UNREACHABLE:
-        {
-            ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
-            dapls_ib_disconnect_clean (ep_ptr, DAT_TRUE, ib_cm_event);
-            dapl_os_unlock (&ep_ptr->header.lock);
-
-            break;
-        }
         case DAT_CONNECTION_EVENT_NON_PEER_REJECTED:
-        {
-            ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
-            dapls_ib_disconnect_clean (ep_ptr, DAT_TRUE, ib_cm_event);
-            dapl_os_unlock (&ep_ptr->header.lock);
-
-            break;
-        }
         case DAT_CONNECTION_EVENT_BROKEN:
+        case DAT_CONNECTION_EVENT_TIMED_OUT:
         {
             ep_ptr->param.ep_state = DAT_EP_STATE_DISCONNECTED;
             dapls_ib_disconnect_clean (ep_ptr, DAT_FALSE, ib_cm_event);
             dapl_os_unlock ( &ep_ptr->header.lock );
-
             break;
         }
         case DAT_CONNECTION_REQUEST_EVENT:
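The patch above collapses several identical case bodies into one by letting the case labels fall through to a shared block, and adds the TIMED_OUT event to that group. A minimal sketch of the same pattern follows; the `EV_*` and `EP_*` names are hypothetical placeholders for this example, not the DAT constants.

```c
/* Hypothetical event and endpoint-state names, for illustration only. */
enum conn_event {
    EV_ESTABLISHED,
    EV_DISCONNECTED,
    EV_PEER_REJECTED,
    EV_UNREACHABLE,
    EV_NON_PEER_REJECTED,
    EV_BROKEN,
    EV_TIMED_OUT
};

enum ep_state { EP_CONNECTED, EP_DISCONNECTED };

/* Return the endpoint state that follows a connection event. */
enum ep_state next_state(enum conn_event ev, enum ep_state cur)
{
    switch (ev) {
    case EV_ESTABLISHED:
        return EP_CONNECTED;

    /* Every teardown-style event shares one body via case fallthrough,
     * so adding a new event to the group is a one-line change. */
    case EV_DISCONNECTED:
    case EV_PEER_REJECTED:
    case EV_UNREACHABLE:
    case EV_NON_PEER_REJECTED:
    case EV_BROKEN:
    case EV_TIMED_OUT:
        return EP_DISCONNECTED;
    }
    return cur; /* unknown event: leave the state unchanged */
}
```

The trade-off of grouping is that every event in the group now gets exactly the same handling, so any per-event argument differences have to be reintroduced explicitly if they matter.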
[openib-general] [PATCH] uDAPL cma: lower debug level on consumer rejects
James,

Lower the reject debug message level so we don't see warnings when consumers reject.

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib_cma/dapl_ib_cm.c
===
--- dapl/openib_cma/dapl_ib_cm.c (revision 8166)
+++ dapl/openib_cma/dapl_ib_cm.c (working copy)
@@ -359,7 +359,7 @@ static void dapli_cm_active_cb(struct da
                         cm_event = IB_CME_DESTINATION_REJECT;
 
                 dapl_dbg_log(
-                        DAPL_DBG_TYPE_WARN,
+                        DAPL_DBG_TYPE_CM,
                         "dapli_cm_active_handler: REJECTED reason=%d\n",
                         event->status);
[openib-general] [PATCH] uDAPL cma - event processing bug
James,

Fix bug in dapls_ib_get_dat_event() call after adding new unreachable event.

-arlin

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib_cma/dapl_ib_cm.c
===
--- dapl/openib_cma/dapl_ib_cm.c (revision 8166)
+++ dapl/openib_cma/dapl_ib_cm.c (working copy)
@@ -1092,9 +1092,6 @@ dapls_ib_get_dat_event(IN const ib_cm_ev
         active = active;
 
-        if (ib_cm_event > IB_CME_BROKEN)
-                return (DAT_EVENT_NUMBER) 0;
-
         dat_event_num = 0;
         for(i = 0; i < DAPL_IB_EVENT_CNT; i++) {
                 if (ib_cm_event == ib_cm_event_map[i].ib_cm_event) {
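The removed range check appears to reject any event value declared after IB_CME_BROKEN, which would break as soon as a new event is appended to the enum. One way to avoid that class of bug, sketched below with hypothetical names (`CME_*`, `DAT_EV_*`, `map_cm_event`), is to let the table itself be the only source of truth: derive the entry count from the array and return a "no event" value when the lookup misses.

```c
#include <stddef.h>

/* Hypothetical provider-side and consumer-side event codes. */
enum cm_event { CME_CONNECTED, CME_DISCONNECTED, CME_BROKEN, CME_TIMEOUT };
enum dat_event {
    DAT_EV_NONE = 0,
    DAT_EV_ESTABLISHED,
    DAT_EV_DISCONNECTED,
    DAT_EV_BROKEN,
    DAT_EV_TIMED_OUT
};

static const struct {
    enum cm_event  cm;
    enum dat_event dat;
} event_map[] = {
    { CME_CONNECTED,    DAT_EV_ESTABLISHED  },
    { CME_DISCONNECTED, DAT_EV_DISCONNECTED },
    { CME_BROKEN,       DAT_EV_BROKEN       },
    { CME_TIMEOUT,      DAT_EV_TIMED_OUT    },
};

/* Count is computed from the table, so it cannot drift out of date. */
#define EVENT_MAP_CNT (sizeof(event_map) / sizeof(event_map[0]))

/* Linear lookup; unknown events map to DAT_EV_NONE without a range check. */
enum dat_event map_cm_event(enum cm_event ev)
{
    size_t i;

    for (i = 0; i < EVENT_MAP_CNT; i++) {
        if (event_map[i].cm == ev)
            return event_map[i].dat;
    }
    return DAT_EV_NONE;
}
```

With this layout, appending a row to the table is the only change needed when a new event is introduced.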
Re: [openib-general] dapltest gets segfaulted in librdmacm init
Or Gerlitz wrote:
> After fixing the ucma/port space issue with the calls to rdma_create_id I am now trying to run
>
> $ ./Target/dapltest -T S -D OpenIB-cma
>
> and getting an immediate segfault with the below trace, any idea?

Hmm, no idea. I just updated to 8112 and everything runs fine for me (2.6.17).

> Or.
>
> #0 0x2af6d3a97685 in ibv_open_device (device=0x537440) at device.c:128
> 128 context = device->ops.alloc_context(device, cmd_fd);
> (gdb) where
> #0 0x2af6d3a97685 in ibv_open_device (device=0x537440) at device.c:128
> #1 0x2af6d3cc4076 in ucma_init () at cma.c:220
> #2 0x2af6d3cc4182 in rdma_create_event_channel () at cma.c:257
> #3 0x2af6d3bb20e3 in dapls_ib_open_hca (hca_name=0x534430 "ib0", hca_ptr=0x532870) at dapl_ib_util.c:222
> #4 0x2af6d3bab454 in dapl_ia_open (name=0x530028 "OpenIB-cma", async_evd_qlen=8, async_evd_handle_ptr=0x52e690, ia_handle_ptr=0x52e660) at dapl_ia_open.c:145
> #5 0x2af6d352e422 in dat_ia_openv (name=0x530028 "OpenIB-cma", async_event_qlen=8, async_event_handle=0x52e690, ia_handle=0x52e660, dapl_major=1, dapl_minor=2, thread_safety=DAT_FALSE) at udat.c:229
> #6 0x0041461f in DT_cs_Server (params_ptr=0x530020) at dapl_server.c:105
> #7 0x00407aa2 in DT_Execute_Test (params_ptr=0x530020) at dapl_execute.c:55
> #8 0x0041e9d9 in DT_Tdep_Execute_Test (params_ptr=0x530020) at udapl_tdep.c:48
> #9 0x00403669 in dapltest (argc=5, argv=0x7fffd7693748) at dapl_main.c:95
> #10 0x004035bb in main (argc=5, argv=0x7fffd7693748) at dapl_main.c:37
> (gdb) info sharedlibrary
> From            To              Syms Read  Shared Object Library
> 0x2af6d352e0e0  0x2af6d3533e38  Yes        /usr/local/ib/lib/libdat.so.1
> 0x2af6d365d470  0x2af6d3664d48  Yes        /lib64/tls/libpthread.so.0
> 0x2af6d37888b0  0x2af6d3852ce0  Yes        /lib64/tls/libc.so.6
> 0x2af6d398f450  0x2af6d3990128  Yes        /lib64/libdl.so.2
> 0x2af6d3a94690  0x2af6d3a99aa8  Yes        /usr/local/ib/lib/libibverbs.so.2
> 0x2af6d3415cf0  0x2af6d3426ab7  Yes        /lib64/ld-linux-x86-64.so.2
> 0x2af6d3b9ffc0  0x2af6d3bb7028  Yes        /usr/local/ib/lib/libdaplcma.so
> 0x2af6d3cc3ca0  0x2af6d3cc6d18  Yes        /usr/local/ib/lib/librdmacm.so
> 0x2af6d3deb200  0x2af6d3df2348  Yes        /usr/local/lib/libsysfs.so.1
> 0x2af6d3ef5b50  0x2af6d3efc138  Yes        /usr/local/ib/lib/infiniband/mthca.so
> 0x2af6d40006c0  0x2af6d4005838  Yes        /usr/local/ib/lib/libibverbs.so.1
Re: [openib-general] Processes not exiting on SVN7946
Woodruff, Robert J wrote:
> It appears that processes are not exiting cleanly on SVN7946 trunk backported to 2.6.9-34 EL. They seem to be stuck in a state of DL and I cannot even attach to them with gdb or kill them with a kill -9.
>
> [EMAIL PROTECTED] core]# ps -uax | grep IMB
> woody 4087 0.0 0.0 58500 3172 pts/3 T  14:45 0:00 gdb ./IMB-MPI1 -p 4067
> woody 4067 2.3 0.0 33108 2708 ?     DL 14:44 0:12 ./IMB-MPI1
> woody 4109 3.1 0.0 40148 2572 ?     DL 14:47 0:12 ./IMB-MPI1
> root  4156 0.0 0.0 51080  732 pts/3 S+ 14:53 0:00 grep IMB
>
> The last code I pulled, SVN7843, did not have this problem. Any ideas on what might be causing this?

I see the same thing running the uDAPL test (dapl/test/dtest). I am running a 2.6.16 kernel and svn8805 and it appears to be deadlocked (uninterruptible sleep) in the ibv_destroy_cq() call. This all worked fine on svn7843, my last update on these systems.

-arlin

> woody
Re: [openib-general] Processes not exiting on SVN7946
Roland Dreier wrote:
> Hmm, any further clue where in ibv_destroy_cq() it's stuck? Is it doing down_write() or something? Can you send me full sysrq-t output when it gets stuck? Thanks...

I just added ibv_destroy_cq() to ibv_rc_pingpong test. Here's the output:

open("/sys/class/infiniband_verbs/abi_version", O_RDONLY) = 3
read(3, "6\n", 8) = 2
close(3) = 0
open("/sys/class/infiniband_verbs", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, /* 4 entries */, 4096) = 112
open("/sys/class/infiniband_verbs/uverbs0/abi_version", O_RDONLY) = 4
read(4, "1\n", 8) = 2
close(4) = 0
open("/sys/class/infiniband_verbs/uverbs0/ibdev", O_RDONLY) = 4
read(4, "mthca0\n", 64) = 7
close(4) = 0
open("/sys/class/infiniband_verbs/uverbs0/device/vendor", O_RDONLY) = 4
read(4, "0x15b3\n", 8) = 7
close(4) = 0
open("/sys/class/infiniband_verbs/uverbs0/device/device", O_RDONLY) = 4
read(4, "0x6278\n", 8) = 7
close(4) = 0
getdents64(3, /* 0 entries */, 4096) = 0
close(3) = 0
open("/dev/infiniband/uverbs0", O_RDWR) = 3
write(3, "\0\0\0\0\4\0\4\0\300\227\221\377\377\177\0\0", 16) = 16
mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, 3, 0) = 0x2b318fa6f000
write(3, "\3\0\0\0\4\0\3\0\200\227\221\377\377\177\0\0", 16) = 16
write(3, "\3\0\0\0\4\0\3\0\320\227\221\377\377\177\0\0", 16) = 16
write(3, "\t\0\0\0\f\0\3\0`\227\221\377\377\177\0\0\0pP\0\0\0\0\0"..., 48) = 48
write(3, "\t\0\0\0\f\0\3\0\240\226\221\377\377\177\0\0\0\240P\0\0"..., 48) = 48
write(3, "\22\0\0\0\22\0\4\0p\227\221\377\377\177\0\0\320nP\0\0\0"..., 72) = 72
write(3, "\t\0\0\0\f\0\3\0\240\226\221\377\377\177\0\0\0\360P\0\0"..., 48) = 48
write(3, "\30\0\0\0\30\0\10\0`\227\221\377\377\177\0\0p\221P\0\0"..., 96) = 96
write(3, "\32\0\0\0\36\0\0\0\250Y\1a9\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 120) = 120
write(3, "\2\0\0\0\6\0\n\0`\227\221\377\377\177\0\0\1lQ\0\0\0\0\0"..., 24) = 24
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 7), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b318fa7
write(1, "local address: LID 0x0004, QP"..., 57
local address: LID 0x0004, QPN 0x040407, PSN 0xce99bd
) = 57
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET6, sin6_port=htons(18515), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(5, {sa_family=AF_INET6, sin6_port=htons(32770), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [22635233564164124]) = 0
close(5) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(18515), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(32770), sin_addr=inet_addr("127.0.0.1")}, [22635233564164112]) = 0
close(5) = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 5
setsockopt(5, SOL_SOCKET, SO_REUSEADDR, [22635233564164097], 4) = 0
bind(5, {sa_family=AF_INET6, sin6_port=htons(18515), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(5, 1) = 0
accept(5, 0, NULL) = 6
close(5) = 0
read(6, "0005:040407:abb228\0", 19) = 19
write(3, "\32\0\0\0\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 120) = 120
write(3, "\32\0\0\0\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 120) = 120
write(6, "0004:040407:ce99bd\0", 19) = 19
read(6, "done\0", 19) = 5
close(6) = 0
write(1, "remote address: LID 0x0005, QP"..., 57
remote address: LID 0x0005, QPN 0x040407, PSN 0xabb228
) = 57
write(1, "calling destroy_cq\n", 20
calling destroy_cq
) = 20
write(3, "\24\0\0\0\6\0\2\0\250\227\221\377\377\177\0\0\7\0\0\0\0"..., 24
Re: [openib-general] Processes not exiting on SVN7946
Roland Dreier wrote:
> OK, just a dumb oversight on my part. The change below (already checked in) fixes it for me:
>
> --- infiniband/core/uverbs_cmd.c (revision 8055)
> +++ infiniband/core/uverbs_cmd.c (working copy)
> @@ -1123,6 +1123,12 @@ ssize_t ib_uverbs_create_qp(struct ib_uv
>                  goto err_copy;
>          }
>
> +        put_pd_read(pd);
> +        put_cq_read(scq);
> +        put_cq_read(rcq);
> +        if (srq)
> +                put_srq_read(srq);
> +
>          mutex_lock(&file->mutex);
>          list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
>          mutex_unlock(&file->mutex);

Works for me too. Thanks!

-arlin
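The fix adds put_*_read() calls on the success path of QP creation. One plausible reading, which the thread does not spell out, is that the create path kept read references on the PD and CQs, so a later destroy needing exclusive access could never proceed. The sketch below models that with a plain counter; `struct sketch_obj`, `get_read`, `put_read`, and `try_write_lock` are hypothetical names for this example and are not the kernel's uverbs helpers.

```c
/* Hypothetical object guarded by reader/writer style reference counting. */
struct sketch_obj {
    int readers;       /* outstanding read references */
    int write_locked;  /* nonzero once exclusive access is held */
};

void get_read(struct sketch_obj *o) { o->readers++; }
void put_read(struct sketch_obj *o) { o->readers--; }

/* Returns 1 if exclusive access was taken, 0 if a reader still holds it.
 * A blocking version would sleep here instead of returning 0. */
int try_write_lock(struct sketch_obj *o)
{
    if (o->readers > 0 || o->write_locked)
        return 0;
    o->write_locked = 1;
    return 1;
}

/* Create-style path that forgets to drop its read reference. */
void create_leaky(struct sketch_obj *cq)
{
    get_read(cq);
    /* ... use cq while building the new object ... */
}

/* Create-style path that pairs every get with a put before returning. */
void create_fixed(struct sketch_obj *cq)
{
    get_read(cq);
    /* ... use cq while building the new object ... */
    put_read(cq);
}
```

After `create_leaky()`, a destroy that needs the write lock can never succeed, which in a blocking implementation would look like the uninterruptible sleep reported in this thread; after `create_fixed()`, the write lock is available again.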
Re: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
Tziporet Koren wrote:
> Jack put the bug fix to OFED 1.0.
>
> Tziporet

Great. Did the CMA module (SVN 7742) changes also get in? If not, uDAPL is out of sync with CMA and will not work.

-arlin
Re: [openib-general] OFED 1.0 release schedule
Woodruff, Robert J wrote:
> Tziporet wrote,
>> We upload OFED-1.0-pre1.tgz to https://openib.org/svn/gen2/branches/1.0/ofed/releases/
>
> I tried the new tar ball and the pathscale driver now compiles (on Redhat EL4 - U3) and IPoIB and OpenSM appear to work OK, but Intel MPI/uDAPL and NetPipe/uDAPL are broken. It appears to be a problem with rdma operations. I also tried SDP/pathscale and it does not work either.
>
> Finally, the rdma_cm is missing the changes that match the uDAPL fix that was put in for the new setops for the CM timeouts. Arlin will provide specifics.

We'd really like the rdma_cm fix in the release. Here is a pointer to Sean's email/patches with the details:

http://openib.org/pipermail/openib-general/2006-June/022654.html
http://openib.org/pipermail/openib-general/2006-June/022655.html

-arlin

> woody
>
> -----Original Message-----
> From: Betsy Zeller [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 13, 2006 1:44 PM
> To: Tziporet Koren
> Cc: Matt L. Leininger; Scott Weitzenkamp (sweitzen); Matters, Todd; Moni Levy; Woodruff, Robert J; openib; OpenFabricsEWG
> Subject: Re: OFED 1.0 release schedule
>
> Tziporet - this plan makes sense. We'll let you know how the testing goes. BTW, for some reason, if you click on the URL you sent out, it just hangs but if you type it in, it works. Not sure why.
>
> Thanks, Betsy
>
> On Tue, 2006-06-13 at 16:07 +0300, Tziporet Koren wrote:
>> Hi All,
>>
>> After reading the mail thread regarding OFED release I have decided this:
>>
>> We upload OFED-1.0-pre1.tgz to https://openib.org/svn/gen2/branches/1.0/ofed/releases/
>>
>> We checked that all modules compile and loaded on this build (including ipath and uDAPL).
>>
>> The only missing parts of this release from the final release are the documents, and the scripts rpm that Scott requested.
>>
>> I think testing this version 3 days (Tuesday, Wednesday and Thursday) should be enough as Scott wrote. So - we can do the official OFED 1.0 release on Friday 16-June.
>>
>> Matt - please check with Novell if this date is acceptable by them. If not then the earliest we can do the release is Thursday 15-June.
>>
>> Tziporet Koren
>> Software Director
>> Mellanox Technologies
>> mailto: [EMAIL PROTECTED]
>> Tel +972-4-9097200, ext 380
[openib-general] [PATCH] uDAPL cma provider - add missing ia_attributes for the ia_query
James,

Here are some changes to include some missing IA attributes during a query.

-arlin

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib_cma/dapl_ib_util.c
===
--- dapl/openib_cma/dapl_ib_util.c (revision 7935)
+++ dapl/openib_cma/dapl_ib_util.c (working copy)
@@ -444,7 +444,10 @@ DAT_RETURN dapls_ib_query_hca(IN DAPL_HC
                 ia_attr->hardware_version_major = dev_attr.hw_ver;
                 ia_attr->max_eps = dev_attr.max_qp;
                 ia_attr->max_dto_per_ep = dev_attr.max_qp_wr;
-                ia_attr->max_rdma_read_per_ep = dev_attr.max_qp_rd_atom;
+                ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom;
+                ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom;
+                ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE;
+                ia_attr->max_rdma_read_per_ep_out_guaranteed = DAT_TRUE;
                 ia_attr->max_evds = dev_attr.max_cq;
                 ia_attr->max_evd_qlen = dev_attr.max_cqe;
                 ia_attr->max_iov_segments_per_dto = dev_attr.max_sge;
@@ -468,10 +471,11 @@ DAT_RETURN dapls_ib_query_hca(IN DAPL_HC
                         ia_attr->max_eps, ia_attr->max_dto_per_ep,
                         ia_attr->max_evds, ia_attr->max_evd_qlen );
                 dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
-                        "query_hca: msg %llu rdma %llu iov %d lmr %d rmr %d\n",
+                        "query_hca: msg %llu rdma %llu iov %d lmr %d rmr %d "
+                        "rd_io %d\n",
                         ia_attr->max_mtu_size, ia_attr->max_rdma_size,
                         ia_attr->max_iov_segments_per_dto, ia_attr->max_lmrs,
-                        ia_attr->max_rmrs );
+                        ia_attr->max_rmrs, ia_attr->max_rdma_read_per_ep_in );
         }
 
         if (ep_attr != NULL) {
Re: [openib-general] [Bug 126] RDMA_CM and UCM not loaded on boot
[EMAIL PROTECTED] wrote:
> http://openib.org/bugzilla/show_bug.cgi?id=126
>
> [EMAIL PROTECTED] changed:
>
>            What    |Removed                     |Added
>              Status|NEW                         |RESOLVED
>          Resolution|                            |WONTFIX
>
> --- Comment #1 from [EMAIL PROTECTED] 2006-06-10 23:23 ---
> RDMA_CM and RDMA_UCM are not loaded by default. In order to load them upon boot, edit the /etc/infiniband/openib.conf file and set RDMA_CM_LOAD=yes and RDMA_UCM_LOAD=yes:
>
> # Start HCA driver upon boot
> ONBOOT=yes
> # Load UCM module
> UCM_LOAD=no
> # Load RDMA_CM module
> RDMA_CM_LOAD=no
> # Load RDMA_UCM module
> RDMA_UCM_LOAD=no

Did the default openib.conf script get updated with:

RDMA_CM_LOAD=yes
RDMA_UCM_LOAD=yes

-arlin
Re: [openib-general] IB MTU tunable for uDAPL and/or Intel MPI?
Scott Weitzenkamp (sweitzen) wrote:
> This didn't help. Osu_bibw.c still reports max bi bandwidth in the 1600s, should be in the 1900s.
>
> I looked back at my notes, and OFED 1.0 rc4 had the desired max bi bandwidth. Did the uDAPL IB MTU change?

uDAPL does not have any control over IB MTU using OpenIB-cma. We just use the path record that is supplied from Open SM. Not sure where or when the change occurred but it is not in uDAPL.

> $ mpiexec -genv I_MPI_DAPL_PROVIDER OpenIB-scm -genv I_MPI_DEBUG 3 -genv I_MPI_DEVICE rdssm -genv LD_LIBRARY_PATH .../lib -n 2 ../osu_bibw.x
> I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
> I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
> I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
> I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma

It picked up the OpenIB-cma device instead of OpenIB-scm.

-arlin
Re: [openib-general] IB MTU tunable for uDAPL and/or Intel MPI?
Scott Weitzenkamp (sweitzen) wrote:
> While we're talking about MTUs, is the IB MTU tunable in uDAPL and/or Intel MPI via env var or config file? Looks like Intel MPI 2.0.1 uses 2K for IB MTU like MVAPICH does in OFED 1.0 rc4 and rc6, I'd like to try 1K with Intel MPI.
>
> Scott

There is no mechanism for me to modify the MTU using rdma_cm, so whatever is returned in the path record is what you get with the OpenIB-cma provider. However, you could use the OpenIB-scm provider, which is hard coded for 1K MTU, as a comparison. Can you run with -genv I_MPI_DAPL_PROVIDER OpenIB-scm on your cluster?

-arlin

> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Scott Weitzenkamp (sweitzen)
> *Sent:* Thursday, June 08, 2006 4:38 PM
> *To:* Tziporet Koren; [EMAIL PROTECTED]
> *Cc:* openib-general
> *Subject:* RE: [openib-general] OFED-1.0-rc6 is available
>
> The MTU change undoes the changes for bug 81, so I have reopened bug 81 (http://openib.org/bugzilla/show_bug.cgi?id=81).
>
> With rc6, PCI-X osu_bw and osu_bibw performance is bad, and PCI-E osu_bibw performance is bad. I've enclosed some performance data, look at rc4 vs rc5 vs rc6 for Cougar/Cheetah/LionMini.
>
> Are there other benchmarks driving the changes in rc6 (and rc4)?
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
> *OSU MPI:*
> · Added mpi_alltoall fine tuning parameters
> · Added default configuration/documentation file $MPIHOME/etc/mvapich.conf
> · Added shell configuration files $MPIHOME/etc/mvapich.csh , $MPIHOME/etc/mvapich.csh
> · Default MTU was changed back to 2K for InfiniHost III Ex and InfiniHost III Lx HCAs. For InfiniHost card recommended value is: VIADEV_DEFAULT_MTU=MTU1024
Re: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
James Lentini wrote:
> On Thu, 8 Jun 2006, Jack Morgenstein wrote:
>> On Wednesday 07 June 2006 18:26, James Lentini wrote:
>>> On Wed, 7 Jun 2006, Jack Morgenstein wrote:
>>>> This (bug fix) can still be included in next-week's release, if you think it is important (I have extracted it from the changes checked in at svn 7755)
>>>
>>> If you are going to make another release anyway, then I would include it.
>>
>> Do you mean -- include the fix in next week's release -- or -- wait with the fix for the following release?
>
> I'd include the fix in the next release, but I wouldn't create a special release just for this fix.

So are we getting this in next week's release or not? I think we need it.
[openib-general] [PATCH] uDAPL openib_cma, cleanup reported CM error events, add TIMEOUT
James,

I cleaned up the connection error events to report the proper events during address resolution errors and timeouts. It was returning incorrect DAT event codes.

-arlin

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl_ib_cm.c
===
--- dapl_ib_cm.c (revision 7839)
+++ dapl_ib_cm.c (working copy)
@@ -330,6 +330,8 @@ static void dapli_cm_active_cb(struct da
         switch (event->event) {
         case RDMA_CM_EVENT_UNREACHABLE:
         case RDMA_CM_EVENT_CONNECT_ERROR:
+        {
+                ib_cm_events_t cm_event;
                 dapl_dbg_log(
                         DAPL_DBG_TYPE_WARN,
                         " dapli_cm_active_handler: CONN_ERR "
@@ -337,10 +339,15 @@ static void dapli_cm_active_cb(struct da
                         event->event, event->status,
                         (event->status == -110)?"TIMEOUT":"" );
 
-                dapl_evd_connection_callback(conn,
-                                             IB_CME_DESTINATION_UNREACHABLE,
-                                             NULL, conn->ep);
+                /* no device type specified so assume IB for now */
+                if (event->status == -110) /* IB timeout */
+                        cm_event = IB_CME_TIMEOUT;
+                else
+                        cm_event = IB_CME_DESTINATION_UNREACHABLE;
+
+                dapl_evd_connection_callback(conn, cm_event, NULL, conn->ep);
                 break;
+        }
         case RDMA_CM_EVENT_REJECTED:
         {
                 ib_cm_events_t cm_event;
@@ -357,7 +364,6 @@ static void dapli_cm_active_cb(struct da
                         event->status);
 
                 dapl_evd_connection_callback(conn, cm_event, NULL, conn->ep);
-
                 break;
         }
         case RDMA_CM_EVENT_ESTABLISHED:
@@ -1028,7 +1034,7 @@ int dapls_ib_private_data_size(IN DAPL_P
 /*
  * Map all socket CM event codes to the DAT equivelent.
  */
-#define DAPL_IB_EVENT_CNT 12
+#define DAPL_IB_EVENT_CNT 13
 
 static struct ib_cm_event_map
 {
@@ -1058,7 +1064,9 @@ static struct ib_cm_event_map
         /* 10 */ { IB_CME_LOCAL_FAILURE,
                                         DAT_CONNECTION_EVENT_BROKEN},
         /* 11 */ { IB_CME_BROKEN,
-                                        DAT_CONNECTION_EVENT_BROKEN}
+                                        DAT_CONNECTION_EVENT_BROKEN},
+        /* 12 */ { IB_CME_TIMEOUT,
+                                        DAT_CONNECTION_EVENT_TIMED_OUT},
 };
 
 /*
@@ -1164,7 +1172,7 @@ void dapli_cma_event_cb(void)
         case RDMA_CM_EVENT_ADDR_ERROR:
         case RDMA_CM_EVENT_ROUTE_ERROR:
                 dapl_evd_connection_callback(conn,
-                                             IB_CME_LOCAL_FAILURE,
+                                             IB_CME_DESTINATION_UNREACHABLE,
                                              NULL, conn->ep);
                 break;
         case RDMA_CM_EVENT_DEVICE_REMOVAL:
Index: dapl_ib_util.h
===
--- dapl_ib_util.h (revision 7839)
+++ dapl_ib_util.h (working copy)
@@ -86,7 +86,8 @@ typedef enum {
         IB_CME_DESTINATION_UNREACHABLE,
         IB_CME_TOO_MANY_CONNECTION_REQUESTS,
         IB_CME_LOCAL_FAILURE,
-        IB_CME_BROKEN
+        IB_CME_BROKEN,
+        IB_CME_TIMEOUT
 } ib_cm_events_t;
 
 /* CQ notifications */
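The patch distinguishes a timeout from a generic unreachable error by comparing the event status with -110. On Linux x86 that number is the value of ETIMEDOUT, so the same test can be written against the symbolic constant. The sketch below shows that variant; `classify_connect_error` and the `SKETCH_CME_*` names are hypothetical and only stand in for the provider's own types.

```c
#include <errno.h>

/* Hypothetical event codes standing in for the provider's CM events. */
enum sketch_cm_event {
    SKETCH_CME_UNREACHABLE,
    SKETCH_CME_TIMEOUT
};

/*
 * status is expected to be a negative errno value, matching the
 * convention the patch relies on when it tests for -110.
 */
enum sketch_cm_event classify_connect_error(int status)
{
    if (status == -ETIMEDOUT)
        return SKETCH_CME_TIMEOUT;
    return SKETCH_CME_UNREACHABLE;
}
```

Using the symbolic name keeps the comparison correct on platforms where ETIMEDOUT has a different numeric value, at the cost of assuming the status really is an errno and not a transport-specific code.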
Re: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
Scott Weitzenkamp (sweitzen) wrote:

> Yes, the modules were loaded. Each of the 32 hosts had 3 IB ports up. Does Intel MPI or uDAPL use multiple ports and/or multiple HCAs?
>
> I shut down all but one port on each host, and now Pallas is running better on the 32 nodes using Intel MPI 2.0.1. HP MPI 2.2 started working with Pallas over uDAPL too, so maybe this is a uDAPL issue?

Can you tell me what adapters are installed (ibstat), how they are configured (ifconfig), and what your dat.conf looks like? It sounds like a device mapping issue during the dat_ia_open() processing. Multiple ports and HCAs should work fine, but some care is required in configuring dat.conf so that you consistently pick up the correct device across the cluster.

Intel MPI will simply open a device based on the provider/device name (example: setenv I_MPI_DAPL_PROVIDER OpenIB-cma) defined in the dat.conf and query dapl for the address to be used for connections. This line in the dat.conf determines which library to load and which IB device to open and bind to. If you have exactly the same configuration on each node and know that ib0, ib1, ib2, etc. will always come up in the same order, then you can simply use the same netdev names across the cluster and the same copy of dat.conf on each node.

Here are the dat.conf options for OpenIB-cma configurations.

# For cma version you specify ia_params as:
# network address, network hostname, or netdev name and 0 for port
#
# Simple (OpenIB-cma) default with netdev name provided first on list
# to enable use of same dat.conf version on all nodes
#
OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-cma-ip u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "192.168.0.22 0" ""
OpenIB-cma-name u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "svr1-ib0 0" ""
OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""

Which type are you using: address, hostname, or netdev names?

Also, Intel MPI is sometimes too smart for its own good when opening rdma devices via uDAPL. If the open fails with the first rdma device specified in the dat.conf, it will continue on to the next line until one is successful. If all rdma devices fail, it will then go on to the static device automatically. This sometimes does more harm than good, since one node could be failing over to the second device in your configuration while the other nodes are all on the first device. If they are all on the same subnet then it would work fine, but if they are on different subnets then we would not be able to connect.

If you send me your configuration, we can set it up here and hopefully duplicate your error case.

-arlin
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
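The subnet condition described above can be checked by hand from the ifconfig output of two nodes. A minimal sketch, assuming IPv4 addresses and a shared netmask; same_subnet() is a hypothetical helper written for illustration and is not part of uDAPL or Intel MPI:

```c
#include <sys/socket.h>
#include <arpa/inet.h>
#include <stdint.h>

/* Returns 1 if both dotted-quad addresses fall in the same IPv4 subnet
 * under the given netmask, 0 if they do not, and -1 if any of the three
 * strings fails to parse. Hypothetical helper, not a uDAPL call. */
int same_subnet(const char *addr_a, const char *addr_b, const char *netmask)
{
    struct in_addr a, b, m;

    if (inet_pton(AF_INET, addr_a, &a) != 1 ||
        inet_pton(AF_INET, addr_b, &b) != 1 ||
        inet_pton(AF_INET, netmask, &m) != 1)
        return -1;

    /* All three values are in network byte order, so the mask can be
     * applied directly without converting to host order. */
    return (a.s_addr & m.s_addr) == (b.s_addr & m.s_addr);
}
```

If one node has silently failed over to its second device, comparing the address it ended up on against a peer's first-device address this way would show whether the two can be expected to reach each other.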
Re: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
Scott Weitzenkamp (sweitzen) wrote:

> Arlin,
>
> I'm having trouble running Intel MPI 2.0.1 and OFED 1.0 rc5 with Intel MPI Benchmark 2.3 on a 32-node PCI-X RHEL4 U3 i686 cluster. This thread caught my eye, can you look at my output and tell me if this is the same issue? If not, are there other things I can tune, or should I file a bug somewhere?

This looks like a configuration issue and not the timeout. The CR timeouts occurred with the rdma device and not the rdssm. Is IPoIB running on the ib0 interfaces across the fabric?

> $ .../intelmpi-2.0.1-`uname -m`/bin/mpiexec -genv I_MPI_DEBUG 3 -genv I_MPI_DEVICE rdssm -genv LD_LIBRARY_PATH .../intelmpi-2.0.1-`uname -m`/lib -n 32 .../IMB_2.3/src/IMB-MPI1 PingPong
> I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
> I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
> I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
> I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
> I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(531): Initialization failed
> MPID_Init(146): channel initialization failed
> MPIDI_CH3_Init(937):
> MPIDI_CH3_Progress(328): MPIDI_CH3I_RDMA_wait_connect failed in VC_post_connect
> (unknown)(): (null)
> aborting job:
Re: [openib-general] Re: which dapl/udapl changes in trunk should be imported into OFED branch? (patch enclosed)
Jack Morgenstein wrote:

> I'm not familiar with what is needed by Intel MPI. My understanding is that Intel MPI works with revision 7141, but you should confirm that with Arlin.
>
> Arlin, could you please indicate which is the earliest revision that Intel MPI works with?

Yes, 7141 works with Intel MPI.

> - Jack
[openib-general] [PATCH] uDAPL: fix uCMA provider event types and dapl_ep_create segv bug
James, Fix for uCMA provider to return the correct event as a result of rejects. Also, ran into a segv bug with dapl_ep_create when creating without a conn_evd. Thanks, -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/common/dapl_ep_create.c === --- dapl/common/dapl_ep_create.c(revision 7140) +++ dapl/common/dapl_ep_create.c(working copy) @@ -310,7 +310,10 @@ dapl_ep_create ( * * N.B. This should really be done by a util routine. */ -dapl_os_atomic_inc ( ((DAPL_EVD *)connect_evd_handle)-evd_ref_count); +if (connect_evd_handle != DAT_HANDLE_NULL) +{ + dapl_os_atomic_inc ( ((DAPL_EVD *)connect_evd_handle)-evd_ref_count); +} /* Optional handles */ if (recv_evd_handle != DAT_HANDLE_NULL) { Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 7140) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -285,14 +285,24 @@ static void dapli_cm_active_cb(struct da NULL, conn-ep); break; case RDMA_CM_EVENT_REJECTED: + { + ib_cm_events_t cm_event; + + /* no device type specified so assume IB for now */ + if (event-status == 28) /* IB_CM_REJ_CONSUMER_DEFINED */ + cm_event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; + else + cm_event = IB_CME_DESTINATION_REJECT; + dapl_dbg_log( DAPL_DBG_TYPE_WARN, dapli_cm_active_handler: REJECTED reason=%d\n, event-status); - dapl_evd_connection_callback(conn, IB_CME_DESTINATION_REJECT, -NULL, conn-ep); + + dapl_evd_connection_callback(conn, cm_event, NULL, conn-ep); + break; - + } case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, @@ -381,6 +391,14 @@ static void dapli_cm_passive_cb(struct d break; case RDMA_CM_EVENT_REJECTED: + { + ib_cm_events_t cm_event; + + /* no device type specified so assume IB for now */ + if (event-status == 28) /* IB_CM_REJ_CONSUMER_DEFINED */ + cm_event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; + else + cm_event = IB_CME_DESTINATION_REJECT; dapl_dbg_log( DAPL_DBG_TYPE_WARN, @@ -395,10 +413,11 @@ static void dapli_cm_passive_cb(struct d 
ipaddr-dst_addr)-sin_addr.s_addr), ntohs(((struct sockaddr_in *) ipaddr-dst_addr)-sin_port)); - - dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, - NULL, conn-sp); + + dapl_cr_callback(conn, cm_event, NULL, conn-sp); + break; + } case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] RE: [PATCH2] uDAPL: fix uCMA provider event types and dapl_ep_create segv bug
-Original Message- From: Arlin Davis [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 17, 2006 12:17 PM To: 'James Lentini' Cc: openib-general Subject: [PATCH] uDAPL: fix uCMA provider event types and dapl_ep_create segv bug James, Fix for uCMA provider to return the correct event as a result of rejects. Also, ran into a segv bug with dapl_ep_create when creating without a conn_evd. Thanks, -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Sorry, the last patch was wrong. Try again... -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/common/dapl_ep_create.c === --- dapl/common/dapl_ep_create.c(revision 7299) +++ dapl/common/dapl_ep_create.c(working copy) @@ -310,7 +310,10 @@ dapl_ep_create ( * * N.B. This should really be done by a util routine. */ -dapl_os_atomic_inc ( ((DAPL_EVD *)connect_evd_handle)-evd_ref_count); +if (connect_evd_handle != DAT_HANDLE_NULL) +{ + dapl_os_atomic_inc ( ((DAPL_EVD *)connect_evd_handle)-evd_ref_count); +} /* Optional handles */ if (recv_evd_handle != DAT_HANDLE_NULL) { Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 7299) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -287,14 +287,24 @@ static void dapli_cm_active_cb(struct da NULL, conn-ep); break; case RDMA_CM_EVENT_REJECTED: + { + ib_cm_events_t cm_event; + + /* no device type specified so assume IB for now */ + if (event-status == 28) /* IB_CM_REJ_CONSUMER_DEFINED */ + cm_event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; + else + cm_event = IB_CME_DESTINATION_REJECT; + dapl_dbg_log( DAPL_DBG_TYPE_WARN, dapli_cm_active_handler: REJECTED reason=%d\n, event-status); - dapl_evd_connection_callback(conn, IB_CME_DESTINATION_REJECT, -NULL, conn-ep); + + dapl_evd_connection_callback(conn, cm_event, NULL, conn-ep); + break; - + } case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, @@ -383,6 +393,14 @@ static void dapli_cm_passive_cb(struct d break; case RDMA_CM_EVENT_REJECTED: + { + ib_cm_events_t cm_event; + + /* no device type 
specified so assume IB for now */ + if (event-status == 28) /* IB_CM_REJ_CONSUMER_DEFINED */ + cm_event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; + else + cm_event = IB_CME_DESTINATION_REJECT; dapl_dbg_log( DAPL_DBG_TYPE_WARN, @@ -397,10 +415,11 @@ static void dapli_cm_passive_cb(struct d ipaddr-dst_addr)-sin_addr.s_addr), ntohs(((struct sockaddr_in *) ipaddr-dst_addr)-sin_port)); - - dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, - NULL, conn-sp); + + dapls_cr_callback(conn, cm_event, NULL, conn-sp); + break; + } case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
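For readers skimming the flattened diff, the two behavioral changes in this patch can be sketched in isolation. This is only an illustration under stated assumptions: the enum, the struct, and both function names below are stand-ins invented for the sketch, and only the value 28 (commented as IB_CM_REJ_CONSUMER_DEFINED in the patch) comes from the diff itself.

```c
#include <stddef.h>

/* Stand-ins for the provider's event codes; the real definitions live in
 * the uDAPL headers and are not reproduced here. */
enum cm_event_sketch {
    SKETCH_DESTINATION_REJECT,
    SKETCH_DESTINATION_REJECT_PRIVATE_DATA
};

/* Value the patch compares event->status against. */
#define SKETCH_REJ_CONSUMER_DEFINED 28

/* A consumer-defined reject maps to the private-data flavor of the reject
 * event; any other reject reason maps to the plain reject event. */
enum cm_event_sketch map_reject_status(int status)
{
    return status == SKETCH_REJ_CONSUMER_DEFINED
        ? SKETCH_DESTINATION_REJECT_PRIVATE_DATA
        : SKETCH_DESTINATION_REJECT;
}

/* Stand-in for an EVD with a reference count. */
struct evd_sketch {
    int ref_count;
};

/* The connect EVD is optional, so only take a reference when a handle was
 * actually supplied. Dereferencing a null handle here is the kind of
 * segfault the patch description mentions. */
void take_ref_if_present(struct evd_sketch *evd)
{
    if (evd != NULL)
        evd->ref_count++;
}
```

The patch comment "no device type specified so assume IB for now" applies to the sketch as well: the constant 28 is an IB CM reject reason, and the mapping would presumably need another branch for other transports.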
Re: [openib-general] [PATCH 15 of 53] ipath - make some maximum values more sane
Bryan O'Sullivan wrote:

> Increase the limits on some maximum values.

I noticed an rdma/message max size limitation of 4096 the last time I ran some dapl tests. Are there plans to increase it, or did I miss it somewhere in all the patches?

Here are the max values returned from the ipath ibv_query_device:

query_hca: (ver=20401) ep 65535 ep_q 65535 evd 65535 evd_q 65535
query_hca: msg 4096 rdma 4096 iov 255 lmr 65535 rmr 0
query_hca: dto 65535 iov 255 rdma i1,o1

Thanks,

-arlin
Re: [openib-general] Quick RDMA Write with Immediate Data Item question
Roland Dreier wrote:

>     Steven> With an RDMA Write with Immediate Data Item transfer, in
>     Steven> the CQE at the destination (the thing that has the
>     Steven> Immediate Data in it), does the CQE also contain the memory
>     Steven> location where the message just got written to? i.e. does
>     Steven> the scatter/gather buffer member of the work completion
>     Steven> structure get filled in at all? Or do you just get the
>     Steven> ImmdDataItem?
>
> It only has the immediate data. A completion queue entry never has information about the address where data was written, so it definitely doesn't in this case.

The work completion will also include the length of the RDMA write.

-arlin

> - R.
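Since the completion carries only the immediate data and the length, a common way to tell the receiver where the data landed is to encode a buffer index in the immediate value. A minimal sketch of that idea, assuming a receive region divided into fixed-size slots; struct wc_sketch is a mock holding just the two fields discussed in this thread, and the slot layout, names, and sizes are all hypothetical:

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE  4096u
#define SLOT_COUNT 16u

/* Mock completion with only the two fields the thread says are available.
 * This is not the verbs work completion structure. */
struct wc_sketch {
    uint32_t imm_data;  /* immediate data, assumed carried in network byte order */
    uint32_t byte_len;  /* length of the RDMA write */
};

/* Sender side: encode the slot index it is about to write into. */
uint32_t encode_slot(uint32_t slot)
{
    return htonl(slot);
}

/* Receiver side: recover the landing address from the immediate data.
 * Returns NULL if the index or the length is out of range. */
unsigned char *locate_write(unsigned char *region, const struct wc_sketch *wc)
{
    uint32_t slot = ntohl(wc->imm_data);

    if (slot >= SLOT_COUNT || wc->byte_len > SLOT_SIZE)
        return NULL;
    return region + (size_t)slot * SLOT_SIZE;
}
```

The 32-bit immediate could equally carry an offset or a sequence number; the index form is used here only because it keeps the bounds check simple.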
[openib-general] [PATCH] update uDAPL openib_cma provider to work with new uCMA event channels
James, Update the uDAPL openib_cma provider to work with the new uCMA event channel interface. I ran a full set of Intel-MPI test suites with these latest changes and it looks fine. Sync up with Sean on commits. Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_util.c === --- dapl/openib_cma/dapl_ib_util.c (revision 6942) +++ dapl/openib_cma/dapl_ib_util.c (working copy) @@ -67,6 +67,7 @@ static const char rcsid[] = $Id: $; int g_dapl_loopback_connection = 0; int g_ib_pipe[2]; +struct rdma_event_channel *g_cm_events = NULL; ib_thread_state_t g_ib_thread_state = 0; DAPL_OS_THREAD g_ib_thread; DAPL_OS_LOCK g_hca_lock; @@ -184,6 +185,7 @@ int32_t dapls_ib_release(void) { dapl_dbg_log(DAPL_DBG_TYPE_UTIL, dapl_ib_release: \n); dapli_ib_thread_destroy(); + rdma_destroy_event_channel(g_cm_events); return 0; } @@ -214,9 +216,17 @@ DAT_RETURN dapls_ib_open_hca(IN IB_HCA_N dapl_dbg_log(DAPL_DBG_TYPE_UTIL, open_hca: %s - %p\n, hca_name, hca_ptr); + /* Setup the global cm event channel */ + dapl_os_lock(g_hca_lock); + if (g_cm_events == NULL) { + g_cm_events = rdma_create_event_channel(); + if (g_cm_events == NULL) + return DAT_INTERNAL_ERROR; + } + dapl_os_unlock(g_hca_lock); + if (dapli_ib_thread_init()) return DAT_INTERNAL_ERROR; - /* HCA name will be hostname or IP address */ if (getipaddr((char*)hca_name, @@ -224,14 +234,13 @@ DAT_RETURN dapls_ib_open_hca(IN IB_HCA_N sizeof(DAT_SOCK_ADDR6))) return DAT_INVALID_ADDRESS; - /* cm_id will bind local device/GID based on IP address */ - if (rdma_create_id(cm_id, (void*)hca_ptr)) + if (rdma_create_id(g_cm_events, cm_id, (void*)hca_ptr)) return DAT_INTERNAL_ERROR; ret = rdma_bind_addr(cm_id, (struct sockaddr *)hca_ptr-hca_address); - if (ret) { + if ((ret) || (cm_id-verbs == NULL)) { rdma_destroy_id(cm_id); dapl_dbg_log(DAPL_DBG_TYPE_UTIL, open_hca: ERR bind (%d) %s \n, @@ -551,8 +560,8 @@ int dapli_ib_thread_init(void) } /* uCMA events non-blocking */ - opts = fcntl(rdma_get_fd(), F_GETFL); /* uCMA */ - 
if (opts 0 || fcntl(rdma_get_fd(), + opts = fcntl(g_cm_events-fd, F_GETFL); /* uCMA */ + if (opts 0 || fcntl(g_cm_events-fd, F_SETFL, opts | O_NONBLOCK) 0) { dapl_dbg_log (DAPL_DBG_TYPE_ERR, dapl_ib_init: ERR with uCMA FD\n ); @@ -741,7 +750,7 @@ void dapli_thread(void *arg) dapl_dbg_log (DAPL_DBG_TYPE_UTIL, ib_thread(%d,0x%x): ENTER: pipe %d ucma %d\n, - getpid(), g_ib_thread, g_ib_pipe[0], rdma_get_fd()); + getpid(), g_ib_thread, g_ib_pipe[0], g_cm_events-fd); /* Poll across pipe, CM, AT never changes */ dapl_os_lock( g_hca_lock ); @@ -749,7 +758,7 @@ void dapli_thread(void *arg) ufds[0].fd = g_ib_pipe[0]; /* pipe */ ufds[0].events = POLLIN; - ufds[1].fd = rdma_get_fd(); /* uCMA */ + ufds[1].fd = g_cm_events-fd; /* uCMA */ ufds[1].events = POLLIN; while (g_ib_thread_state == IB_THREAD_RUN) { Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 6942) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -59,6 +59,8 @@ #include sys/poll.h #include signal.h +extern struct rdma_event_channel *g_cm_events; + /* local prototypes */ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, struct rdma_cm_event *event); @@ -614,7 +616,7 @@ dapls_ib_setup_conn_listener(IN DAPL_IA dapl_os_lock_init(conn-lock); /* create CM_ID, bind to local device, create QP */ - if (rdma_create_id(conn-cm_id, (void*)conn)) { + if (rdma_create_id(g_cm_events, conn-cm_id, (void*)conn)) { dapl_os_free(conn, sizeof(*conn)); return(dapl_convert_errno(errno,setup_listener)); } @@ -1067,7 +1069,7 @@ void dapli_cma_event_cb(void) dapl_dbg_log(DAPL_DBG_TYPE_UTIL, cm_event()\n); /* process one CM event, fairness */ - if(!rdma_get_cm_event(event)) { + if(!rdma_get_cm_event(g_cm_events, event)) { struct dapl_cm_id *conn; /* set proper conn from cm_id context*/ Index: dapl/openib_cma/dapl_ib_qp.c
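One detail that may be worth checking when reading the open_hca hunk: as it appears in the archive, the error return after a failed channel creation sits between the lock and unlock calls, so the lock would still be held on that path. The flattening may be hiding something, so take this as an observation rather than a confirmed bug. A minimal sketch of create-on-first-use that releases the lock on every path, using pthreads and invented names (channel_sketch, channel_create, channel_get) in place of the uDAPL and uCMA calls:

```c
#include <pthread.h>
#include <stdlib.h>

/* Stand-in for the shared event channel. */
struct channel_sketch {
    int fd;
};

pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
struct channel_sketch *g_channel = NULL;

/* Hypothetical factory standing in for the real channel constructor.
 * A non-zero 'fail' simulates the constructor returning NULL. */
static struct channel_sketch *channel_create(int fail)
{
    return fail ? NULL : calloc(1, sizeof(struct channel_sketch));
}

/* Create the shared channel on first use. Returns 0 on success and -1 on
 * failure. The result is recorded in rc so that the unlock runs on the
 * error path as well as the success path. */
int channel_get(int fail)
{
    int rc = 0;

    pthread_mutex_lock(&g_lock);
    if (g_channel == NULL) {
        g_channel = channel_create(fail);
        if (g_channel == NULL)
            rc = -1;
    }
    pthread_mutex_unlock(&g_lock);
    return rc;
}
```

Once the channel exists, later calls return success without creating anything, which matches the intent of the NULL check in the patch.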
[openib-general] RE: [PATCH2] uDAPL openib_cma: fixed address bindings, getaddrinfo, and added debug messages for rejects
James, Here is a new patch with your recommended changes. -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_util.c === --- dapl/openib_cma/dapl_ib_util.c (revision 6672) +++ dapl/openib_cma/dapl_ib_util.c (working copy) @@ -121,11 +121,12 @@ static int getipaddr(char *name, char *a if (getaddrinfo(name, NULL, NULL, res)) { /* retry using network device name */ ret = getipaddr_netdev(name,addr,len); - if (ret) + if (ret) { dapl_dbg_log(DAPL_DBG_TYPE_WARN, getipaddr: invalid name, addr, or netdev(%s)\n, name); - return ret; + return ret; + } } else { if (len = res-ai_addrlen) memcpy(addr, res-ai_addr, res-ai_addrlen); Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 6672) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -274,11 +274,21 @@ static void dapli_cm_active_cb(struct da switch (event-event) { case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_CONNECT_ERROR: + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_active_handler: CONN_ERR +event=0x%x status=%d\n, + event-event, event-status); + dapl_evd_connection_callback(conn, IB_CME_DESTINATION_UNREACHABLE, NULL, conn-ep); break; case RDMA_CM_EVENT_REJECTED: + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_active_handler: REJECTED reason=%d\n, + event-status); dapl_evd_connection_callback(conn, IB_CME_DESTINATION_REJECT, NULL, conn-ep); break; @@ -320,6 +330,9 @@ static void dapli_cm_passive_cb(struct d struct rdma_cm_event *event) { struct dapl_cm_id *new_conn; +#ifdef DAPL_DBG + struct rdma_addr *ipaddr = conn-cm_id-route.addr; +#endif dapl_dbg_log(DAPL_DBG_TYPE_CM, passive_cb: conn %p id %d event %d\n, @@ -343,13 +356,48 @@ static void dapli_cm_passive_cb(struct d event-private_data, new_conn-sp); break; case RDMA_CM_EVENT_UNREACHABLE: + dapls_cr_callback(conn, IB_CME_DESTINATION_UNREACHABLE, +NULL, conn-sp); + case RDMA_CM_EVENT_CONNECT_ERROR: + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive: CONN_ERR +event=0x%x status=%d, +on SRC 
0x%x,0x%x DST 0x%x,0x%x\n, + event-event, event-status, + ntohl(((struct sockaddr_in *) + ipaddr-src_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-src_addr)-sin_port), + ntohl(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_port)); + dapls_cr_callback(conn, IB_CME_DESTINATION_UNREACHABLE, NULL, conn-sp); break; + case RDMA_CM_EVENT_REJECTED: - dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, NULL, -conn-sp); + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive: REJECTED reason=%d +on SRC 0x%x,0x%x DST 0x%x,0x%x\n, + event-status, + ntohl(((struct sockaddr_in *) + ipaddr-src_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-src_addr)-sin_port), + ntohl(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_port)); + + dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, + NULL, conn-sp); break; case RDMA_CM_EVENT_ESTABLISHED: @@ -556,6 +604,7 @@ dapls_ib_setup_conn_listener(IN DAPL_IA { DAT_RETURN dat_status
[openib-general] [PATCH] uDAPL openib_cma: fixed address bindings, getaddrinfo, and added debug messages for rejects
James, Sean's port checking in the uCMA exposed a address binding issue in the openib_cma provider. Here is a patch to fix the port issue and a fix for getaddrinfo when running with a debug build. I also added some additional debug messages during connect errors and rejects. -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_util.c === --- dapl/openib_cma/dapl_ib_util.c (revision 6672) +++ dapl/openib_cma/dapl_ib_util.c (working copy) @@ -121,11 +121,12 @@ static int getipaddr(char *name, char *a if (getaddrinfo(name, NULL, NULL, res)) { /* retry using network device name */ ret = getipaddr_netdev(name,addr,len); - if (ret) + if (ret) { dapl_dbg_log(DAPL_DBG_TYPE_WARN, getipaddr: invalid name, addr, or netdev(%s)\n, name); - return ret; + return ret; + } } else { if (len = res-ai_addrlen) memcpy(addr, res-ai_addr, res-ai_addrlen); Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 6672) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -274,11 +274,21 @@ static void dapli_cm_active_cb(struct da switch (event-event) { case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_CONNECT_ERROR: + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_active_handler: CONN_ERR +event=0x%x status=%d\n, + event-event, event-status); + dapl_evd_connection_callback(conn, IB_CME_DESTINATION_UNREACHABLE, NULL, conn-ep); break; case RDMA_CM_EVENT_REJECTED: + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_active_handler: REJECTED reason=%d\n, + event-status); dapl_evd_connection_callback(conn, IB_CME_DESTINATION_REJECT, NULL, conn-ep); break; @@ -320,6 +330,9 @@ static void dapli_cm_passive_cb(struct d struct rdma_cm_event *event) { struct dapl_cm_id *new_conn; +#ifdef DAPL_DBG + struct rdma_addr *ipaddr = conn-cm_id-route.addr; +#endif dapl_dbg_log(DAPL_DBG_TYPE_CM, passive_cb: conn %p id %d event %d\n, @@ -343,13 +356,58 @@ static void dapli_cm_passive_cb(struct d event-private_data, new_conn-sp); break; case 
RDMA_CM_EVENT_UNREACHABLE: + dapls_cr_callback(conn, IB_CME_DESTINATION_UNREACHABLE, +NULL, conn-sp); + case RDMA_CM_EVENT_CONNECT_ERROR: + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive_handler: CONN_ERR +event=0x%x status=%d\n, + event-event, event-status ); + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive_handler: CONN_ERR +on SRC 0x%x,0x%x DST 0x%x,0x%x \n, + ntohl(((struct sockaddr_in *) + ipaddr-src_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-src_addr)-sin_port), + ntohl(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr-dst_addr)-sin_port) +); + dapls_cr_callback(conn, IB_CME_DESTINATION_UNREACHABLE, NULL, conn-sp); break; + case RDMA_CM_EVENT_REJECTED: - dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, NULL, -conn-sp); + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive_handler: REJECTED reason=%d\n, + event-status); + + dapl_dbg_log( + DAPL_DBG_TYPE_WARN, +dapli_cm_passive_handler: REJECTED +on SRC 0x%x,0x%x DST 0x%x,0x%x \n, + ntohl(((struct sockaddr_in *) + ipaddr-src_addr)-sin_addr.s_addr), + ntohs(((struct sockaddr_in *) + ipaddr
Re: [openib-general] Re: [uDAPL] dat.conf generator
James Lentini wrote:

> On Tue, 18 Apr 2006, Dotan Barak wrote:
> On Monday 17 April 2006 23:46, James Lentini wrote:
> On Sun, 16 Apr 2006, Dotan Barak wrote:
> On Wednesday 12 April 2006 17:50, James Lentini wrote:
>
> OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "mthca0 1" ""
> OpenIB-cma0-1 u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "mthca0 1" ""
> OpenIB-cma0-2 u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "mthca0 2" ""
> OpenIB-cma1-1 u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "mthca1 1" ""
> OpenIB-cma1-2 u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "mthca1 2" ""

These entries are wrong. The cma version will only work with an IP address, network hostname, or netdev name. The port value is meaningless, since the name gives you the device and port reference all in one. For cma the best flavor is the netdev name, as follows:

OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""

because it allows you to have identical dat.conf setups across your cluster if you intend to use the first IB interface on each node. Look at dapl/doc/dat.conf for the correct examples.

-arlin

#
# DAT 1.2 configuration file
#
# Each entry should have the following fields:
#
# ia_name api_version threadsafety default lib_path \
#         provider_version ia_params platform_params
#
# Example for openib_cma and openib_scm
#
# For cma version you specify ia_params as:
#       network address, network hostname, or netdev name and 0 for port
#
# For scm version you specify ia_params as actual device name and port
#
# Simple (OpenIB-cma) default with netdev name provided first on list
# to enable use of same dat.conf version on all nodes
#
OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-cma-ip u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "192.168.0.22 0" ""
OpenIB-cma-name u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "svr1-ib0 0" ""
OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-scm1 u1.2 nonthreadsafe default /usr/lib/libdaplscm.so mv_dapl.1.2 "mthca0 1" ""
OpenIB-scm2 u1.2 nonthreadsafe default /usr/lib/libdaplscm.so mv_dapl.1.2 "mthca0 2" ""
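To make the eight-field layout described in the file header concrete, here is a minimal sketch of a line parser. It assumes that ia_params and platform_params are double-quoted strings (for example "ib0 0" and ""), with the quote characters having been lost in the archived copy; that is an assumption about the file format, and the struct and function names are invented for the sketch rather than taken from the DAT registry code.

```c
#include <ctype.h>
#include <string.h>

#define DAT_FIELDS 8
#define FIELD_MAX  128

/* field[0] = ia_name ... field[6] = ia_params, field[7] = platform_params */
struct dat_entry_sketch {
    char field[DAT_FIELDS][FIELD_MAX];
};

/* Parse one dat.conf line into its eight fields. A field is either a run
 * of non-blank characters or a double-quoted string that may be empty.
 * Returns 1 for an entry, 0 for a blank or comment line, -1 if malformed. */
int parse_dat_line(const char *line, struct dat_entry_sketch *out)
{
    const char *p = line;
    int n = 0;

    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0' || *p == '#')
        return 0;

    while (*p != '\0') {
        size_t len = 0;
        int quoted = (*p == '"');

        if (n == DAT_FIELDS)
            return -1;              /* more than eight fields */
        if (quoted)
            p++;                    /* skip the opening quote */
        while (*p != '\0' &&
               (quoted ? *p != '"' : !isspace((unsigned char)*p))) {
            if (len + 1 >= FIELD_MAX)
                return -1;          /* field too long for the buffer */
            out->field[n][len++] = *p++;
        }
        if (quoted) {
            if (*p != '"')
                return -1;          /* unterminated quote */
            p++;                    /* skip the closing quote */
        }
        out->field[n][len] = '\0';
        n++;
        while (isspace((unsigned char)*p))
            p++;
    }
    return n == DAT_FIELDS ? 1 : -1;
}

/* True if this entry is the one a consumer would select by ia_name. */
int is_provider(const struct dat_entry_sketch *e, const char *ia_name)
{
    return strcmp(e->field[0], ia_name) == 0;
}
```

With entries parsed this way, selecting a provider by name such as OpenIB-cma is a lookup on the first field, and the device reference for the cma flavor is whatever precedes the port number inside the seventh field.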
Re: [openib-general] Re: [uDAPL] dtest server never ends when using the dapl provider OpenIB-scm1
Dotan Barak wrote:

> Can you attach to the server process with gdb and get me a back trace from each of the threads? What does driver IBED-1.0-rc3 consist of?
>
> Thanks,
>
> -arlin
>
> Here is a back trace of the hung process:
>
> (gdb) bt
> #0 0x2b31c86a in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/tls/libpthread.so.0
> #1 0x2b42ef5b in dapl_os_wait_object_wait (wait_obj=0x516650, timeout_val=<value optimized out>) at dapl_osd.c:276
> #2 0x2b42e9ab in dapl_evd_wait (evd_handle=0x516560, time_out=4294967295, threshold=1, event=0x7fdd7bf0, nmore=0x7fdd7c2c) at dapl_evd_wait.c:233
> #3 0x004021ab in disconnect_ep () at dtest.c:894
> #4 0x00404cad in main (argc=4, argv=<value optimized out>) at

Yes, looks like the disconnect event was dropped. A couple of questions:

Does this only happen with the scm provider?

Can you reproduce on the OpenIB trunk or 1.0 branch?

Thanks,

-arlin
Re: [openib-general] Re: [uDAPL] dtest server never ends when using the dapl provider OpenIB-scm1
Dotan Barak wrote:

> Hi.
>
> Thanks for the quick response. I executed dtest with the -v parameter, and here is the output of both sides. I added a '-l' parameter to the test so the dapl provider can be changed on the command line (if you wish, I can post you a patch).
>
> full server output:
> ---
> sw043:/tmp/tsscr/svn.mlx_tp/gen2/userspace/ulps/udapl/dtest # ./dtest -l OpenIB-scm2 -v
> 23996 DAPL_PROVIDER is OpenIB-scm2
> 23996 Verbose
> 23996 Running as server
> 23996 Allocated RDMA buffers (r:0x8052390,s:0x8052618) len 64
> 23996 Opened Interface Adaptor
> ...
> 23996 waiting for message receive event
> 23996 inbound message; message arrived!
> 23996 SERVER: RCV buffer 0x80525d0 contains: 0x55 len=64
> 23996 SERVER: SND buffer 0x8052858 contains: 0xffaa len=64
> 23996 calling post_send
> 23996 send_msg completed
> 23996 do_ping_pong_msg complete
> 23996 Disconnect and Free EP 0x805f518

Hmm, not sure what this thread is waiting on. I would expect to see the dat_ep_disconnect messages before the wait completes, or at least the dat_ep_disconnect message indicating a blocking disconnect call. The next 3 messages expected are as follows:

dat_ep_disconnect
dat_ep_disconnect completed
dat_evd_wait for h_conn_evd completed

Can you attach to the server process with gdb and get me a back trace from each of the threads? What does driver IBED-1.0-rc3 consist of?

Thanks,

-arlin
Re: [openib-general] Re: [uDAPL] dtest server never ends when using the dapl provider OpenIB-scm1
James Lentini wrote:

> It sounds like the disconnect is being lost. Let me see if I can reproduce this.
>
> Arlin, have you ever seen this?

No, it runs fine on my systems. It looks like the ping pong test on the server side did not finish. Can Dotan add a -v switch to the dtest to help isolate? What svn version are we running? Do you have the latest uDAPL fixes committed in 6393?

-arlin
Re: [openib-general] Re: [PATCH] uDAPL cma; dat_ep_free can return without freeing cm_id
Steve Wise wrote:

> Steve, can you test this version and see if it works for your iWARP device.
>
> I think the patch is good. I ran dapltest/regress.sh over the chelsio iwarp device using this new patch instead of my original patch, and things seem as stable as they were before (I'm fighting some intermittent connection setup failures that I think are in the cxgb3 provider, not dapl).

James, can you go ahead and commit this patch?

-arlin
Re: [openib-general] [uDAPLl] question about dapl_ib_cq_resize
Dotan Barak wrote:

> Hi.
>
> I looked at the function dapl_ib_cq_resize in the file src/userspace/dapl/dapl/openib/dapl_ib_cq.c. When one wants to resize a CQ, dapl destroys the old CQ and creates a new one instead of calling the resize CQ verb (which was added ~3 months ago). Is there a reason for this?

When this was coded the resize CQ verb was not available. I will take a look and update the code.

-arlin
[openib-general] Re: [PATCH] [RFC] - dapl - dat_ep_free() can return without
James Lentini wrote:

> On Tue, 4 Apr 2006, Steve Wise wrote:
>
> What happens if a consumer attempts to free the EP from a callback?
>
> There are no direct consumer callbacks in usermode, are there? Consumers call dat_evd_wait() or whatever and get scheduled. Not like kernel mode... Or am I confused?
>
> You're right. The DAT consumer thread calling dat_ep_free() will never be a provider (or verbs) thread. It looks like there needs to be some synchronization around destroying the cm_id with the dapli_thread(), though. Could we only delete the QP in dat_ep_free as Sean suggested and leave the cm_id cleanup for later, as is being done now?

I think we should destroy all resources before returning, including the cm_id. This will ensure that no events will fire with context associated with an EP (qp, cm_id, etc.) that was just freed. I will take a look at Steve's patch and get back to you.
[openib-general] [PATCH] uDAPL cma; dat_ep_free can return without freeing cm_id
James and Steve, Here is revised patch that was tested (free and debug build versions) with dtest, dapltest, and Intel MPI test suites. The rdma_destroy_id will block until we acknowledge the event so there was no need to add our own wait objects for synchronization. This will never be called from the async event thread so there is no chance of deadlock conditions. I also made some changes to build with configure enable-debug. Some unused variables that were deleted are actually used in the debug messages. Please review the changes. Steve, can you test this version and see if it works for your iWARP device. Thanks, -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 6305) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -62,9 +62,9 @@ /* local prototypes */ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, struct rdma_cm_event *event); -static int dapli_cm_active_cb(struct dapl_cm_id *conn, +static void dapli_cm_active_cb(struct dapl_cm_id *conn, struct rdma_cm_event *event); -static int dapli_cm_passive_cb(struct dapl_cm_id *conn, +static void dapli_cm_passive_cb(struct dapl_cm_id *conn, struct rdma_cm_event *event); static void dapli_addr_resolve(struct dapl_cm_id *conn); static void dapli_route_resolve(struct dapl_cm_id *conn); @@ -87,7 +87,9 @@ static inline uint64_t cpu_to_be64(uint6 static void dapli_addr_resolve(struct dapl_cm_id *conn) { int ret; - +#ifdef DAPL_DBG + struct rdma_addr *ipaddr = conn-cm_id-route.addr; +#endif dapl_dbg_log(DAPL_DBG_TYPE_CM, addr_resolve: cm_id %p SRC %x DST %x\n, conn-cm_id, @@ -110,8 +112,10 @@ static void dapli_addr_resolve(struct da static void dapli_route_resolve(struct dapl_cm_id *conn) { int ret; - struct rdma_cm_id *cm_id = conn-cm_id; - +#ifdef DAPL_DBG + struct rdma_addr *ipaddr = conn-cm_id-route.addr; + struct ib_addr *ibaddr = conn-cm_id-route.addr.addr.ibaddr; +#endif dapl_dbg_log(DAPL_DBG_TYPE_CM, route_resolve: 
cm_id %p SRC %x DST %x PORT %d\n, conn-cm_id, @@ -158,37 +162,51 @@ bail: NULL, conn-ep); } +/* + * Called from consumer thread via dat_ep_free(). + * CANNOT be called from the async event processing thread + * dapli_cma_event_cb() since a cm_id reference is held and + * a deadlock will occur. + */ void dapli_destroy_conn(struct dapl_cm_id *conn) { - int in_callback; + struct rdma_cm_id *cm_id; dapl_dbg_log(DAPL_DBG_TYPE_CM, destroy_conn: conn %p id %d\n, conn,conn-cm_id); - + dapl_os_lock(conn-lock); conn-destroy = 1; - in_callback = conn-in_callback; + + if (conn-ep) + conn-ep-cm_handle = IB_INVALID_HANDLE; + + cm_id = conn-cm_id; + conn-cm_id = NULL; dapl_os_unlock(conn-lock); - if (!in_callback) { - if (conn-ep) - conn-ep-cm_handle = IB_INVALID_HANDLE; - if (conn-cm_id) { - if (conn-cm_id-qp) - rdma_destroy_qp(conn-cm_id); - rdma_destroy_id(conn-cm_id); - } - - conn-cm_id = NULL; - dapl_os_free(conn, sizeof(*conn)); + /* +* rdma_destroy_id will force synchronization with async CM event +* thread since it blocks until the in-process event reference +* is cleared during our event processing call exit. +*/ + if (cm_id) { + if (cm_id-qp) + rdma_destroy_qp(cm_id); + + rdma_destroy_id(cm_id); } + dapl_os_free(conn, sizeof(*conn)); } static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, struct rdma_cm_event *event) { struct dapl_cm_id *new_conn; +#ifdef DAPL_DBG + struct rdma_addr *ipaddr = event-id-route.addr; +#endif if (conn-sp == NULL) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, @@ -239,11 +257,9 @@ static struct dapl_cm_id * dapli_req_rec return new_conn; } -static int dapli_cm_active_cb(struct dapl_cm_id *conn, +static void dapli_cm_active_cb(struct dapl_cm_id *conn, struct rdma_cm_event *event) { - int destroy; - dapl_dbg_log(DAPL_DBG_TYPE_CM, active_cb: conn %p id %d event %d\n, conn, conn-cm_id, event-event ); @@ -251,9 +267,8 @@ static int dapli_cm_active_cb(struct dap dapl_os_lock(conn-lock); if (conn
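The locking pattern in the revised dapli_destroy_conn() can be sketched on its own: take ownership of the id while holding the lock, then make the potentially blocking destroy calls after the lock is dropped. This is a minimal model only. Every demo_* name is hypothetical, a pthread mutex stands in for dapl_os_lock(), and demo_destroy_id() just frees memory where rdma_destroy_id() would block until the in-process event is acknowledged.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct rdma_cm_id and struct dapl_cm_id. */
struct demo_id {
    int has_qp;
};

struct demo_conn {
    pthread_mutex_t lock;
    int destroy;
    struct demo_id *id;
};

/* Counters so a caller can observe what was torn down. */
static int demo_qps_destroyed;
static int demo_ids_destroyed;

static void demo_destroy_qp(struct demo_id *id)
{
    id->has_qp = 0;
    demo_qps_destroyed++;
}

/* Stand-in for the blocking destroy call; here it only frees memory. */
static void demo_destroy_id(struct demo_id *id)
{
    demo_ids_destroyed++;
    free(id);
}

static struct demo_conn *demo_conn_new(int with_qp)
{
    struct demo_conn *conn = calloc(1, sizeof(*conn));

    if (!conn)
        return NULL;
    conn->id = calloc(1, sizeof(*conn->id));
    if (!conn->id) {
        free(conn);
        return NULL;
    }
    conn->id->has_qp = with_qp;
    pthread_mutex_init(&conn->lock, NULL);
    return conn;
}

static void demo_destroy_conn(struct demo_conn *conn)
{
    struct demo_id *id;

    assert(conn != NULL);

    /* Under the lock: mark the connection dead and detach the id. */
    pthread_mutex_lock(&conn->lock);
    conn->destroy = 1;
    id = conn->id;
    conn->id = NULL;
    pthread_mutex_unlock(&conn->lock);

    /* Outside the lock: calls that may block in the real provider. */
    if (id) {
        if (id->has_qp)
            demo_destroy_qp(id);
        demo_destroy_id(id);
    }

    pthread_mutex_destroy(&conn->lock);
    free(conn);
}
```

Because the id pointer is cleared before the lock is released, any other thread that later takes the lock sees NULL and cannot destroy the same id a second time.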
Re: [openib-general] how to execute the dtest?
Dotan Barak wrote:

> Some more info: when I changed the dat.conf to be:
>
> OpenIB-cma u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
> OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "192.168.0.22 0" ""
> OpenIB-cma-name u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "svr1-ib0 0" ""
> OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
> OpenIB-scm1 u1.2 nonthreadsafe default /usr/local/lib/libdaplscm.so mv_dapl.1.2 "mthca0 1" ""
> OpenIB-scm2 u1.2 nonthreadsafe default /usr/local/lib/libdaplscm.so mv_dapl.1.2 "mthca0 2" ""

The dtest makefile builds with DAPL_PROVIDER == OpenIB-cma-ip by default, so it will use the second line of the configuration file. This requires the IP address of the IB device on your system to be supplied in the dat.conf. Change the default IP address (192.168.0.22) to match the network address that you assigned to your IB device with ifconfig.

-arlin

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
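To make the "second line" remark concrete, here is a rough sketch of how a provider name such as OpenIB-cma-ip selects one dat.conf entry and its device parameters. It is not the libdat registry parser. The helper names are hypothetical, and it assumes the usual eight-field entry layout in which the seventh field (the IA parameters, e.g. "192.168.0.22 0") is a double-quoted string; the archive appears to have dropped those quotes from the listing above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the next whitespace-separated field of *p into out. A field may be
 * wrapped in double quotes. Returns 1 on success, 0 at end of line.
 */
static int next_field(const char **p, char *out, size_t outsz)
{
    const char *s = *p;
    size_t n = 0;

    assert(out != NULL && outsz > 0);

    while (*s == ' ' || *s == '\t')
        s++;
    if (*s == '\0' || *s == '\n')
        return 0;

    if (*s == '"') {
        s++;
        while (*s && *s != '"' && *s != '\n') {
            if (n + 1 < outsz)
                out[n++] = *s;
            s++;
        }
        if (*s == '"')
            s++;
    } else {
        while (*s && *s != ' ' && *s != '\t' && *s != '\n') {
            if (n + 1 < outsz)
                out[n++] = *s;
            s++;
        }
    }
    out[n] = '\0';
    *p = s;
    return 1;
}

/*
 * If the entry on this line is named ia_name, copy its IA parameters
 * (assumed to be field 7) into params and return 1; otherwise return 0.
 */
static int dat_conf_ia_params(const char *line, const char *ia_name,
                              char *params, size_t sz)
{
    char field[256];
    const char *p = line;
    int i;

    if (!next_field(&p, field, sizeof field) || strcmp(field, ia_name) != 0)
        return 0;
    for (i = 2; i <= 7; i++)
        if (!next_field(&p, field, sizeof field))
            return 0;
    snprintf(params, sz, "%s", field);
    return 1;
}
```

Running the lookup for OpenIB-cma-ip over each line of the file would match only the second entry, which is why the address on that line has to be the one configured on the local IB interface.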
Re: [openib-general] Re: SPAM: [PATCH] [RFC] - dapl - dat_ep_free() can return without freeing the endpoint
Sean Hefty wrote:

> James Lentini wrote:
>
>>  void dapli_destroy_conn(struct dapl_cm_id *conn)
>>  {
>>  	int in_callback;
>> +	struct rdma_cm_id *cm_id;
>>
>>  	dapl_dbg_log(DAPL_DBG_TYPE_CM,
>>  		     "destroy_conn: conn %p id %d\n",
>>  		     conn, conn->cm_id);
>> -
>>  	dapl_os_lock(&conn->lock);
>>  	conn->destroy = 1;
>>  	in_callback = conn->in_callback;
>> -	dapl_os_unlock(&conn->lock);
>> -
>> -	if (!in_callback) {
>> -		if (conn->ep)
>> -			conn->ep->cm_handle = IB_INVALID_HANDLE;
>> -		if (conn->cm_id) {
>> -			if (conn->cm_id->qp)
>> -				rdma_destroy_qp(conn->cm_id);
>> -			rdma_destroy_id(conn->cm_id);
>> +	do {
>> +		if (in_callback) {
>> +			dapl_os_unlock(&conn->lock);
>> +			usleep(10);
>> +			dapl_os_lock(&conn->lock);
>
> In general this doesn't work. The calling thread may be the callback thread, which would lead to deadlock. This is why we don't just call rdma_destroy_id() directly, and let it wait for the callback to complete.

Sorry, the callback names should be changed, since it is really an async event processing thread and not a direct callback from the CMA. The async thread can destroy the cm_id if we no longer hold any cm_id event references, we destroy the associated QP, and we are synchronized with any other thread that could be destroying at the same time. This is how the code currently works.

I did not see the original thread/patch from Steve, so I don't have the entire context of this issue, but it sounds like we need to fix the code so that the destroy QP (dat_ep_free) blocks until the event processing is complete, always destroy the QP and cm_id from this call, and remove cleanup from any async event processing threads. Is this what Steve was attempting to do with his patch?

I seem to have missed the posting of the patch, so could someone point me to the original patch so I can review and test any changes?

thanks

-arlin
[openib-general] Re: [DAPL] Provider initialialization
James Lentini wrote:

> Arlin,
>
> As part of the uDAPL autotools patch, we changed the mechanism by which the uDAPL provider library's init and fini functions were specified. I've seen (and received reports of) systems on which the init and fini functions are not being called. I'd like to move back to the old mechanism (see patch below). Do you see any problems with this?

No problem.
Re: [openib-general] Re: [PATCH3] uDAPL/uDAT autotools - Package for udat, udaplcma, udaplscm
James Lentini wrote:

> On Fri, 17 Mar 2006, Davis, Arlin R wrote:
>
>> Here is a patch with all the latest changes, including an updated dat.conf.
>
> Committed in the trunk and 1.0 branch in revision 5880.

James,

The spec file name is wrong; libdat.spec should be /libdat.spec.in

-arlin
Re: [openib-general] RE: [PATCH2] uDAPL/uDAT autotools - Package for udat, udaplcma, udaplscm
James Lentini wrote: On Thu, 16 Mar 2006, Arlin Davis wrote: This looks excellent Arlin. I think it is essentailly ready to go. I have a few minor questions/commnets: I'll add an empty directory called config to the root of the dapl tree. perfect. Index: AUTHORS === --- AUTHORS (revision 0) +++ AUTHORS (revision 0) @@ -0,0 +1,2 @@ +James Lentini [EMAIL PROTECTED] +Arlin Davis[EMAIL PROTECTED] I'll add several more names to this. Yes, this was just a start. Please check the COPYING file also for completeness. +dnl Checks for typedefs, structures, and compiler characteristics. +AC_C_CONST +AC_CHECK_SIZEOF(long) If the code does not include config.h, do these have any effect? no, we can remove. + +dnl Checks for libraries +if test $disable_libcheck != yes +then +AC_CHECK_LIB(ibverbs, ibv_get_device_list, [], +AC_MSG_ERROR([ibv_get_device_list() not found. libdapl requires libibverbs.])) +fi Should we throw in a check for librdmacm? While there is dependency for libdaplcma there is no dependency for libdaplscm (socket CM). Not sure how to deal with different dependencies across multiple libraries in the package. Anyone have suggestions? +AC_HEADER_STDC Again, if the code doesn't include config.h, does this have any effect? no, we can remove. + dat_srq_resize; + dat_srq_set_lw; We can make this local, right? + dats_get_ia_handle; I would say yes but the current version of Intel MPI is designed to support both 1.1 and 1.2 udat/udapl providers on the fly via some tricks and actually does a dlsym lookup on this function to determine a 1.2 uDAT version. I think the right thing to do is to add something to the configure.in script. Here's what I propose (patch to your patch). 
I'm not autotools expert, so let me know if you see anything wrong with this: +AC_CACHE_CHECK(whether this is an RHEL system, ac_cv_rhel, +if test -f /etc/redhat-release + test -n `grep -v Fedora /etc/redhat-release`; then +ac_cv_rhel=yes +else +ac_cv_rhel=no +fi) + +AM_CONDITIONAL(OS_RHEL, test $ac_cv_rhel = yes) + -# TODO...Need check to set properly -OSVENDOR=REDHAT_EL4 +if OS_RHEL +OSFLAGS=-DREDHAT_EL4 +else +OSFLAGS= +endif if DEBUG DBGFLAGS = -ggdb -DDAPL_DBG @@ -19,17 +22,17 @@ datlib_LTLIBRARIES = dat/udat/libdat.la dapllibcma_LTLIBRARIES = dapl/udapl/libdaplcma.la dapllibscm_LTLIBRARIES = dapl/udapl/libdaplscm.la -dat_udat_libdat_la_CFLAGS = -Wall $(DBGFLAGS) -D_GNU_SOURCE -D$(OSVENDOR) This looks good. Just change the Makefile.am _CFLAGS lines to get rid of the extra -D I will package up a version 3 patch that will include these changes and a new dat.conf that works with the RPM. -arlin ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH] uDAPL cma, missing cma event ack
James,

Fixes a corner case where a CMA event was not acknowledged during disconnect processing.

-arlin

Signed-off by: Arlin Davis [EMAIL PROTECTED]

Index: dapl/openib_cma/dapl_ib_cm.c
===================================================================
--- dapl/openib_cma/dapl_ib_cm.c	(revision 5854)
+++ dapl/openib_cma/dapl_ib_cm.c	(working copy)
@@ -1074,8 +1074,10 @@ void dapli_cma_event_cb(void)
 				if (conn->cm_id->qp)
 					rdma_destroy_qp(conn->cm_id);
 
+				rdma_ack_cm_event(event);
 				rdma_destroy_id(conn->cm_id);
 				dapl_os_free(conn, sizeof(*conn));
+				return;
 			}
 			break;
 		case RDMA_CM_EVENT_CONNECT_RESPONSE:
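The ordering this patch enforces can be shown with a small stand-alone model. It is only a sketch: the fake_* types and functions are hypothetical and merely imitate the rule that an id cannot be destroyed while one of its events is still unacknowledged. Where the real rdma_destroy_id() would block, the model returns -1 so the condition can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a CM id that counts unacknowledged events. */
struct fake_id {
    int pending_events;
    int destroyed;
};

struct fake_event {
    struct fake_id *id;
    int acked;
};

/* Acknowledge an event exactly once and drop its reference on the id. */
static void fake_ack_event(struct fake_event *ev)
{
    assert(ev != NULL && !ev->acked);
    ev->acked = 1;
    ev->id->pending_events--;
}

/* The real call would block here; the model reports the condition instead. */
static int fake_destroy_id(struct fake_id *id)
{
    if (id->pending_events > 0)
        return -1;
    id->destroyed = 1;
    return 0;
}

/*
 * Event handler in the shape of the patched code: on the destroy path,
 * acknowledge first, destroy second, and return early so that the common
 * acknowledgement at the bottom does not run a second time.
 */
static int handle_event(struct fake_event *ev, int destroy)
{
    if (destroy) {
        struct fake_id *id = ev->id;

        fake_ack_event(ev);
        return fake_destroy_id(id);
    }

    /* Common path: the event is acknowledged after normal processing. */
    fake_ack_event(ev);
    return 0;
}
```

The early return matters as much as the added acknowledgement: without it, the shared acknowledgement at the end of the handler would touch an event and a connection that have already been released.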
Re: [openib-general] Re: [PATCH] uDAPL/uDAT autotools - Package for udat, udaplcma, udaplscm
Bryan O'Sullivan wrote:

> Thanks, Arlin. I think we'll use this stuff instead of Moni's patch, as it's smaller and cleaner.
>
>> I do have a hack regarding the disti switch (OSVENDOR=REDHAT_EL4) required for the uDAPL build. We need to clean this up with a real disti check. I wasn't sure how to check and set the disti accordingly in the Makefile.am.
>
> Um, what's a disti?

distributors... redhat, suse, etc

>> +%define prefix /usr
>
> This should go.

ok. You should check the other packages since most have this included.

I will incorporate all your other changes.
[openib-general] [PATCH 1.0] uDAPL - QP destroy and HCA close problems fixed
James, Here is a small uDAPL patch that should go into 1.0 that fixes some issues that we just found with MPI scale out testing on OpenIB. QP was not being destroyed in some cases and hca_close issues with async work thread. I am still working one other elusive disconnect problem that may require another small patch. Thanks, -arlin Signed-off by: Arlin Davis [EMAIL PROTECTED] Index: dapl/openib_cma/dapl_ib_util.c === --- dapl/openib_cma/dapl_ib_util.c (revision 5489) +++ dapl/openib_cma/dapl_ib_util.c (working copy) @@ -330,6 +330,13 @@ DAT_RETURN dapls_ib_close_hca(IN DAPL_HC hca_ptr-ib_hca_handle = IB_INVALID_HANDLE; } + dapl_os_lock(g_hca_lock); + if (g_ib_thread_state != IB_THREAD_RUN) { + dapl_os_unlock(g_hca_lock); + goto bail; + } + dapl_os_unlock(g_hca_lock); + /* * Remove hca from async and CQ event processing list * Wakeup work thread to remove from polling list @@ -342,10 +349,12 @@ DAT_RETURN dapls_ib_close_hca(IN DAPL_HC struct timespec sleep, remain; sleep.tv_sec = 0; sleep.tv_nsec = 1000; /* 10 ms */ + write(g_ib_pipe[1], w, sizeof w); dapl_dbg_log(DAPL_DBG_TYPE_UTIL, ib_thread_destroy: wait on hca %p destroy\n); nanosleep (sleep, remain); } +bail: return (DAT_SUCCESS); } Index: dapl/openib_cma/dapl_ib_cm.c === --- dapl/openib_cma/dapl_ib_cm.c(revision 5489) +++ dapl/openib_cma/dapl_ib_cm.c(working copy) @@ -306,15 +306,6 @@ static int dapli_cm_active_cb(struct dap destroy = conn-destroy; conn-in_callback = conn-destroy; dapl_os_unlock(conn-lock); - if (destroy) { - dapl_dbg_log(DAPL_DBG_TYPE_CM, - active_cb: DESTROY conn %p id %d \n, -conn, conn-cm_id ); - if (conn-ep) - conn-ep-cm_handle = IB_INVALID_HANDLE; - - dapl_os_free(conn, sizeof(*conn)); - } return(destroy); } @@ -389,12 +380,6 @@ static int dapli_cm_passive_cb(struct da destroy = conn-destroy; conn-in_callback = conn-destroy; dapl_os_unlock(conn-lock); - if (destroy) { - if (conn-ep) - conn-ep-cm_handle = IB_INVALID_HANDLE; - - dapl_os_free(conn, sizeof(*conn)); - } return(destroy); } @@ 
-1080,10 +1065,21 @@ void dapli_cma_event_cb(void) ret = dapli_cm_passive_cb(conn,event); else ret = dapli_cm_active_cb(conn,event); - - if (ret) + + /* destroy both qp and cm_id */ + if (ret) { + dapl_dbg_log(DAPL_DBG_TYPE_CM, + cma_cb: DESTROY conn %p + cm_id %p qp %p\n, +conn, conn-cm_id, +conn-cm_id-qp); + + if (conn-cm_id-qp) + rdma_destroy_qp(conn-cm_id); + rdma_destroy_id(conn-cm_id); - + dapl_os_free(conn, sizeof(*conn)); + } break; case RDMA_CM_EVENT_CONNECT_RESPONSE: default: @@ -1095,7 +1091,7 @@ void dapli_cma_event_cb(void) } rdma_ack_cm_event(event); } else { - dapl_dbg_log(DAPL_DBG_TYPE_WARN, + dapl_dbg_log(DAPL_DBG_TYPE_CM, cm_event: ERROR: rdma_get_cm_event() %d %d %s\n, ret, errno, strerror(errno)); } Index: dapl/openib_cma/dapl_ib_util.h === --- dapl/openib_cma/dapl_ib_util.h (revision 5489) +++ dapl/openib_cma/dapl_ib_util.h (working copy) @@ -295,7 +295,8 @@ dapl_convert_errno( IN int err, IN const if (!err) return DAT_SUCCESS; #if DAPL_DBG -if ((err != EAGAIN) (err != ETIME) (err != ETIMEDOUT)) +if ((err != EAGAIN) (err != ETIME) + (err != ETIMEDOUT) (err != EINTR)) dapl_dbg_log (DAPL_DBG_TYPE_ERR, %s %s\n, str, strerror(err)); #endif Index: dapl/openib_cma/dapl_ib_cq.c === --- dapl/openib_cma/dapl_ib_cq.c(revision 5489) +++ dapl/openib_cma/dapl_ib_cq.c(working copy) @@ -498,7 +498,10
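The dapl_convert_errno() hunk above adds EINTR to the set of errno values that are not logged as errors in debug builds. As archived, the condition seems to have lost its && operators. A minimal sketch of the intended filter follows; the function name is hypothetical, and ETIME is assumed to be available as it is on Linux.

```c
#include <assert.h>
#include <errno.h>

/*
 * Returns 1 when an errno value should be reported as an error, 0 when it
 * is success or one of the routine transient conditions
 * (EAGAIN, ETIME, ETIMEDOUT, EINTR).
 */
static int should_log_errno(int err)
{
    if (err == 0)
        return 0;
    if (err != EAGAIN && err != ETIME &&
        err != ETIMEDOUT && err != EINTR)
        return 1;
    return 0;
}
```

An interrupted wait is a normal outcome for a thread that is woken through a pipe or a signal, so treating EINTR like a timeout keeps the debug log free of false errors.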
Re: [dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal
Roland Dreier wrote:

> Hmm. Can you put a number on how much better RDMA write with immediate is on current HCA hardware? How does using the underlying OpenIB verbs ability to post a list of work requests compare (ie posting an RDMA write followed by a send in one verbs call)? Maybe post multiple is a better direction for DAT.
>
> - R.

With post multiple, unlike immediate data, you don't have the ability to distinguish between a normal receive and an RDMA write completion indication on the other end. This is the unique part of the service, and post multiple cannot provide it.

Yes, post multiple would be a nice option for DAT; it is just a different service. It would also be required to conform to the semantic rules of the bundled operations, so you could not do any optimization tricks under the covers with an IB rdma_write_immediate operation.

-arlin
Re: [dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal
Michael Krause wrote:

> RDMA Write with Immediate is part of the IB Extended Transport Header. It is a fixed-sized quantity and not one subject to change, i.e. increasing its size. Your argument above reinforces that the particular application need is IB-specific and thus should not be part of a general API but a transport-specific API. If the application will only operate optimally using immediate data, then it is only suitable for an IB fabric. This reinforces the need for a transport-specific API.

I agree. I will move the IB immediate data service back into the extension interface and update the OpenIB uDAPL provider patch.

> Those applications that simply want to enable completion notification when a RDMA Write has occurred can use a general purpose API that is interconnect independent and whose code is predicated upon a RDMA Write - Send set of operations. This will enable application portability across all interconnect types.

I will defer this to Arkady to draft.

-arlin
Re: [dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal
Roland Dreier wrote:

> Michael> So, here we have a long discussion on attempting to
> Michael> perpetuate a concept that is not universal across
> Michael> transports and was deemed to have minimal value that most
> Michael> wanted to see removed from the architecture.
>
> But this discussion is being driven by an application developer who does see value in immediate data. Arlin, can you quantify the benefit you see from RDMA write with immediate vs. RDMA write followed by a send?

We need speed and simplicity.

The use case is a very latency-sensitive application that requires immediate notification of RDMA write completion on the remote node without ANY latency penalties associated with combining operations, HCA priority rules across QPs, wire congestion, etc. It is an application that has no requirement for messaging outside of remote RDMA write completion notifications. The application would not have to register and manage additional message buffers on either side; we can just size the queues accordingly and post zero-byte messages.

We need something that would be equivalent to sitting there polling on the last byte of inbound data. But since data ordering within an operation is not guaranteed, that is not an option. So RDMA write with immediate data is the simplest and most efficient method for indicating RDMA write completion that we have available today. In fact, I would like to see the immediate data increased in size to make it even more useful.

-arlin
Re: [dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal
Sean Hefty wrote:

> The requirement is to provide an API that supports RDMA writes with immediate data. A send that follows an RDMA write is not immediate data, and the API should not be constructed around trying to make it so.
>
> To be clear, I believe that write with immediate should be part of the normal APIs, rather than an extension, but should be designed around those devices that provide it natively.

I totally agree. A standard RDMA write with immediate API can be very useful to RDMA applications based on the requirements (native support) set forth in my earlier email.

It is analogous to the new dat_ep_post_send_with_invalidate() call: a call that supports a native iWARP transport operation but makes no provision to help other transports emulate it. So other transports simply return NOT_SUPPORTED and add it natively in the future if it makes sense.

-arlin
RE: [openib-general] RE: [RFC] DAT 2.0 immediate data proposal
> But this penalizes users, who need to deal with two ways of handling post calls and completions. I do not think we are too far from consensus. A transport-independent App will allocate 4 bytes extra for buffers that can match immediate data. The completion data will indicate where the immediate data is returned (the Consumer cannot request this on posting), and the completion event will carry 4 bytes for immediate data. The rest is ironing out details for a complete specification. This is no different than for any other new functionality proposed. And except for wasting 4 bytes per buffer or completion, I do not see how it penalizes IB. Moreover, if the App knows that the Provider returns immediate data in the completion event, it can avoid any penalty.

There is no penalty to the user if you just provide native features via extensions. Your extension will provide the best possible interface for your native capabilities.

I think we are further from consensus than we first thought. Right now we have a new post recv, different delivery mechanisms, and a requirement to allocate an extra 4 bytes of user data.

The only requirement to support immediate data on IB is a new post send and write immediate data call and a new event data construct. The normal post_recv can be used unchanged and can already process normal and immediate data. There is no requirement on the user to allocate and manage an extra 4 bytes in the receive buffer. In fact, you can post a receive with no buffer.

In order to support immediate data via iWARP, you now have a requirement to use a special new receive post, new user buffer constructs to place the data, and a new delivery method that has to be checked via provider attributes or at event time.

Is there any way to get this closer? If not, I would recommend going back to an extension interface for immediate data.

-arlin
Re: [openib-general] RE: [RFC] DAT 2.0 immediate data proposal
ok, maybe we should back up and start over.

This is exactly why immediate data was initially proposed as an extension instead of a general API. We start to penalize native IB features based on the requirements of other RDMA interfaces that have to emulate the feature anyway. What prevents the next RDMA interface that comes along from requiring other variations of the interface due to implementation implications?

This is an IB-specific feature that does not map well onto iWARP, so let's just call it what it is and let IB providers supply immediate data capabilities via the extension interface.

-arlin

Caitlin Bestler wrote:

>> Maybe we need to just go back to one model and always deliver via the event? With the post_recv_immed requirements, other transports have a mechanism to emulate and create the necessary resources on the recv side to place idata and copy to event when operation is completed. Would this work for iWARP? Two different models for receiving idata should be avoided if at all possible.
>
> Always delivering by the event is not feasible for an iWARP vendor. If you are working over RDMAC verbs then the work completion is no longer accessible by the time the Work Completion is reaped. So copying from the receive buffer to the event does not work since the location of the receive buffer is now known only to the application.
>
> The same problem exists in the opposite direction for InfiniBand HCAs using standard verbs. They cannot copy from the CQE to the receive buffer.
>
> So the user is stuck checking a flag or the event type to know where their data is. This is not terribly user friendly, but it is the best that can be offered if we want to enable this optimization. The need to check the flag does reduce the value of the optimization though.
>
>>> 6. Is dto_completion_data xfer_length include immediate_data size or not?
>>
>> no
>
> Then how does the receiver know how much data there is? Even if an iWarp Provider attempts to optimize immediate placement into the CQ, it will end up setting the xfer_length whenever the packet is received out of order. So it is far simpler for the application to simply know that the data will be in the buffer, and that the xfer_length will be set. It doesn't need to worry about whether they were set by the cq_poll verb or by the hardware.
>
>>> 11. Need to cleanup operation description to make it clear that Send|RDMA_write and immediate data part is a single atomic operation. The current "followed by" language is misleading. Make it explicit that there is a single local DTO completion and single remote DTO completion.
>>
>> Ok, I will clean that up
>
> The best mapping available over RDMAC-compliant firmware for an iWARP NIC would be to post two operations (RDMA Write followed by a short Send). That would require additional space in the send and completion queues since a completion for the write can only be suppressed for a successful completion. Whether these extra slots were required would be an IA attribute.
>
> And the requirement is that nothing for that QP can come between the iWARP Write and the Send. How the provider does that is up to it. Options include locking over both posts and a composite work request. Anyone working over existing RDMAC-compliant verbs will have to use the first approach.
>
>>> 12. Is your intention that post_recv_immed can ONLY accept immediate data and is not capable to recv any message?
>>
>> No, the intention is to extend the post_recv to handle 32bit idata which may arrive with or without other send or rdma_write data. Does it make more sense to add a dto_flags to the existing post_recv? How does this map to iWARP?
>
> When the data can be sent as an immediate OR as data, then when received it can be placed into the receive buffer or even potentially directly into the CQ when everything aligns just right.
>
> But an iWARP sender has to place the immediate value as the first four bytes of a Send message. There is no other mapping that makes sense. Shoving the rest of the message up is complex, as is using the last four bytes of the message since the last four bytes *could* cross a DDP Segment boundary, and would require the user to provide a buffer that was 4 bytes larger.
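The emulation described above, where an iWARP sender carries the immediate value as the first four bytes of a Send message, can be sketched as a pair of pack and unpack helpers. This is an illustration only. The function names are hypothetical and the big-endian encoding of the 32-bit value is an assumption of this sketch; the thread does not specify a byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMMED_LEN 4  /* bytes reserved in front of the payload */

/*
 * Build a Send message: 4-byte immediate value, then the payload.
 * Returns the message length, or 0 if the buffer is too small.
 */
static size_t pack_immed(uint8_t *buf, size_t bufsz, uint32_t immed,
                         const void *payload, size_t len)
{
    assert(buf != NULL);

    if (bufsz < IMMED_LEN || len > bufsz - IMMED_LEN)
        return 0;

    buf[0] = (uint8_t)(immed >> 24);
    buf[1] = (uint8_t)(immed >> 16);
    buf[2] = (uint8_t)(immed >> 8);
    buf[3] = (uint8_t)immed;
    if (len)
        memcpy(buf + IMMED_LEN, payload, len);
    return IMMED_LEN + len;
}

/*
 * Split a received message back into the immediate value and the payload.
 * Returns 1 on success, 0 if the message is shorter than the prefix.
 */
static int unpack_immed(const uint8_t *buf, size_t msglen, uint32_t *immed,
                        const uint8_t **payload, size_t *len)
{
    if (msglen < IMMED_LEN)
        return 0;

    *immed = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
             ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    *payload = buf + IMMED_LEN;
    *len = msglen - IMMED_LEN;
    return 1;
}
```

The sketch also shows the cost being argued over in the thread: the receiver has to post a buffer at least four bytes long even when it expects no payload, whereas the native IB operation can complete a receive that was posted with no buffer at all.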
Re: [openib-general] RE: [RFC] DAT 2.0 extension proposal
Kanevsky, Arkady wrote:

> Arlin,
>
> 1. Does it mean that existing DAT providers will have to be modified so they report DAT_NOT_IMPLEMENTED for each extension?

No. During the open, a DAT library built to support extensions makes a query call to verify that the provider supports extensions and sets a global flag accordingly. This flag is checked via our single dat_extension call in dat_api. Take a look at the patch for all the details.

> 2. Why is there DAT_INVALID in DAT_DTOS?

No reason. I can get rid of it. I will go ahead and keep this in sync with the latest 1.3 (2.0) definitions.

> 3. Do you want to use DAT_EXTENSION_DATA or DAT_EXT_DATA?

sure.

> 4. The proposed operations are operations on EP and they are DTOs. Why not define DAT_DTO_EXT_OP instead of DAT_EXT_OP?

Yes, it makes more sense if we decide to limit these extensions to DTO types.

> My concern is that if these are not DTOs then we have a new event stream type for extensions and we need to define rules for this event stream, including ordering rules and interactions with other event streams, provider attributes for stream mixing and so on... If we restrict extensions to DTO operation extensions we avoid all these issues and simplify the APIs. On the negative side, these extensions are restrictive.

I have no problem limiting this proposal and work to DTO extensions. However, we should get consensus on this.

> 5. Memory protection extension for atomic operations
>
> 6. error returns for extensions?

yes and yes; I will work these into the next patch and update the proposal.

-arlin
Re: [openib-general] RE: [RFC] DAT 2.0 extension proposal
Arlin Davis wrote:

> Kanevsky, Arkady wrote:
>
>> 5. Memory protection extension for atomic operations
>>
>> 6. error returns for extensions?
>
> yes and yes; I will work these into the next patch and update the proposal.

For error returns I am thinking about carving up the return type, adding a new mask, and an extension get-type macro. Suggestions on carving up the following? Carve into type or subtype? Other suggestions?

type:  DAT_RETURN_CLASS   DAT_RETURN_TYPE   DAT_RETURN_SUBTYPE
bits:  31-30              29-16             15-0

-arlin
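The bit layout proposed above (class in bits 31-30, type in bits 29-16, subtype in bits 15-0) can be written out as masks and accessors. This is a sketch of the layout only. The DEMO_* names and helper functions are hypothetical and are not taken from the DAT headers or from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks for a 32-bit return value split as class/type/subtype. */
#define DEMO_RETURN_CLASS_MASK    0xC0000000u  /* bits 31-30 */
#define DEMO_RETURN_TYPE_MASK     0x3FFF0000u  /* bits 29-16 */
#define DEMO_RETURN_SUBTYPE_MASK  0x0000FFFFu  /* bits 15-0  */

#define DEMO_RETURN_CLASS_SHIFT   30
#define DEMO_RETURN_TYPE_SHIFT    16

/* Compose a return value from its three fields; extra bits are masked off. */
static uint32_t demo_make_return(uint32_t cls, uint32_t type, uint32_t subtype)
{
    return ((cls << DEMO_RETURN_CLASS_SHIFT) & DEMO_RETURN_CLASS_MASK) |
           ((type << DEMO_RETURN_TYPE_SHIFT) & DEMO_RETURN_TYPE_MASK) |
           (subtype & DEMO_RETURN_SUBTYPE_MASK);
}

static uint32_t demo_get_class(uint32_t ret)
{
    return (ret & DEMO_RETURN_CLASS_MASK) >> DEMO_RETURN_CLASS_SHIFT;
}

static uint32_t demo_get_type(uint32_t ret)
{
    return (ret & DEMO_RETURN_TYPE_MASK) >> DEMO_RETURN_TYPE_SHIFT;
}

static uint32_t demo_get_subtype(uint32_t ret)
{
    return ret & DEMO_RETURN_SUBTYPE_MASK;
}
```

Carving extension errors out of the 14-bit type field or the 16-bit subtype field is the open question in the message. Either choice would amount to reserving a value range inside one of these masks and adding a matching get-type macro.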
Re: [openib-general] RE: [RFC][PATCH] OpenIB uDAPL extension proposal - sample immed data and atomic api's
Kanevsky, Arkady wrote:

> comments inline.
>
>> As mentioned on the con-call, there are two separate items to consider while looking at the proposal. The first is the ability to extend DAT for specific provider value-add and the second is to validate the need for general atomic and immediate data functionality in the basic set of API's for all providers. I included atomics and immediate data as examples since it is specific to one provider (IB), it includes operations that require new ops, events, and event data types, and it also provides a working model to validate the extension model from request to completion events. I would like to concentrate on getting consensus on the extension proposal first if possible. Just try to think of the actual operations as some opaque dat_ext_foobar_op().
>
> The thing that bothers me is that we already have several APIs that are transport specific. While some are possible to implement on other transports, others, like Socket CM, are not. So I view both of your specific extensions as transport specific and hence prefer to add them as normal APIs, not extensions.

That would work for me.

> The secondary goal is that a Provider can add extensions without requiring a change to DAT. These fall into 3 categories.
>
> 1. New memory types including privileges and protection attributes. We can add an extension entry to these structures. We need to check if this is sufficient. Think of shared memory for example. I am assuming no changes to PZ.
>
> 2. New DTOs. The main issue is not DTOs but their completions and async errors. This is why Immediate data is better handled by incorporating it into the DAT spec while atomics can be handled by extensions. That is, the completion will return the extension and the Consumer will do the secondary switch on the extension type. Extensions should not impact backwards compatibility. We had not looked at errors. But assume a simple model in which async errors break the connection and we can return an extension error, with extensions defining a new reason. Again, details need to be polished.
>
> 3. New connection types or CM models... New connections seem to have little impact on the existing API, assuming that the EP type can be extended. The new connection can even restrict which DTOs it can handle. The CM model is more problematic.

Nice summary. Yes, we need to be thorough when fleshing out all the requirements for extensions in general. I am not sure how much I can share at this point regarding any other extensions, but if we think in general terms we should cover all the necessary requirements. Do you want to update the proposal based on your statements above? I would be happy to work it into a real patch for feasibility and to provide feedback based on future extensibility.

> Arlin, it would be nice to consider some of your other extensions that are not transport specific to see how they will fit before we make the final decision. This should give us an idea of how extensible the DAT extension model is.
>
>>> In general, extension route was intended for RNIC|HCA providers to expose HW capabilities beyond IBTA, iWARP and VIA standards. The standard RDMA functionality is best handled via spec addition. DAT 2.0 does it for FMR, remote and local memory invalidation as well as others.
>>
>> True, but the extension route is not fully defined, documented, nor implemented. This is what I would like to work on getting completed in time for 2.0 if possible. BTW: The existing implementation actually uses dapl_provider->extension to store the hca_ptr but the specification states that it is reserved for the provider's private use (8.2.1 in DAPL1.2 spec). This is why I had to define another extension_func in the patch.
>
> I had posted a complete list of changes/additions to DAT 2.0 about a month ago. But we had not yet discussed the version change from 1.3 to 2.0 nor how much backwards compatibility the spec will provide.
>
>>> 2. What is IMMED_EVENT? Is it just immediate data without any payload? I suggest changing the name so it will not use EVENT. Just call it NO_PAYLOAD. Do you want to support 2 different ways to deliver immediate data? One in the event and one in the data payload? Why? I would think that just an event way will do.
>>
>> This was modeled after the immediate data discussions on the DAT reflector based on iWARP requirements. http://groups.yahoo.com/group/dat-discussions/message/3285
>
> I recall it now. I want to consider a few usage cases.
>
> 1. Existing app running on the Provider with extensions. Want to make sure we do not require any App changes beyond recompile due to extensions.

agree

> 2. App wants to be modified to use Immediate data. How big an impact does it have on existing code? For example, buffer size allocation and completion handling.

It really depends on transport capabilities. Our current thinking has two delivery mechanisms for the two transports (event and payload) which is not optimal. If we can come to a consensus on
Re: [openib-general] RE: [RFC][PATCH] OpenIB uDAPL extension proposal - sample immed data and atomic api's
Kanevsky, Arkady wrote:

Arlin, nice proposal, thanks. I have one high level question and a few specific technical ones.

1. Why do you want to provide this functionality via extension instead of as part of the new DAT spec, say 2.0? This will allow Consumers to use all events, operations, and Provider/IA functionality uniformly instead of via 2 separate layers. This will also ensure that this basic functionality can be provided by all DAPL Providers the same way on the DAPL and DAT layers. DAPL 2.0 is not done yet, so we have time to incorporate that. DAPL 2.0 already introduced new functionality which is easy to beef up for your proposal. See DAT_DTOS for example. DAT_EVENT is also modified to handle remote invalidation, so a small addition for Immediate data and Atomic ops is sensible. This should simplify the proposal significantly, as you will not need to introduce any new EXT structures.

As mentioned on the con-call, there are two separate items to consider while looking at the proposal. The first is the ability to extend DAT for specific provider value-add and the second is to validate the need for general atomic and immediate data functionality in the basic set of APIs for all providers. I included atomics and immediate data as examples since they are specific to one provider (IB), they include operations that require new ops, events, and event data types, and they also provide a working model to validate the extension model from request to completion events. I would like to concentrate on getting consensus on the extension proposal first if possible. Just try to think of the actual operations as some opaque dat_ext_foobar_op().

In general, the extension route was intended for RNIC|HCA providers to expose HW capabilities beyond the IBTA, iWARP and VIA standards. The standard RDMA functionality is best handled via spec addition. DAT 2.0 does it for FMR, remote and local memory invalidation, as well as others.

True, but the extension route is not fully defined, documented, nor implemented. This is what I would like to work on getting completed in time for 2.0 if possible. BTW: The existing implementation actually uses dapl_provider->extension to store the hca_ptr, but the specification states that it is reserved for the provider's private use (8.2.1 in DAPL1.2 spec). This is why I had to define another extension_func in the patch.

I had posted a complete list of changes/additions to DAT 2.0 about a month ago. But we have not yet discussed the version change from 1.3 to 2.0, nor how much backwards compatibility the spec will provide.

2. What is IMMED_EVENT? Is it just immediate data without any payload? I suggest changing the name so it will not use EVENT. Just call it NO_PAYLOAD. Do you want to support two different ways to deliver immediate data? One in the event and one in the data payload? Why? I would think that just the event way will do.

This was modeled after the immediate data discussions on the DAT reflector based on iWARP requirements. http://groups.yahoo.com/group/dat-discussions/message/3285

3. I suggest beefing up DAT_DTO_COMPLETION_EVENT_DATA and DAT_DTOS to convey which operation completed and to return Immediate data if the completed operation had immediate data. Since we already modified these 2 structs as part of the DAT 2.0 change, let's add your proposal to the change. This will allow Consumers to use a single approach to deal with completions: an extension to the current one, but not a structural one. No need for DAT_EXTENSION_DATA, DAT_EXT_EVENT_TYPE, DAT_EXT_OP, nor the whole mechanism for extended ops.

You still need extension types for the other value-add operations/events that will not be accepted as standard and are vendor specific. I would like to defer the rest of the questions for now since they touch on actual operations and not the extension mechanism. Although, I do need to think about how to extend memory registration privileges. Any suggestions?

4. What is the purpose of DAT_EXT_WRITE_CONFIRM_FLAG? Is it to expose the IB round trip semantic? iWARP does not support immediate data. One can try to format the payload to pass immediate data. Is that what you had in mind? What is the semantic meaning of the completion with this flag set? Without the flag set? Are extended flags additional values for COMPLETION_FLAGS? 2.4.1 talks about extended flags, but where they are passed in is not defined. DAT 2.0 extended them already for the FMR barrier. I would prefer to follow that route rather than creating separate extension completion flags.

5. Why do you need RECV_IMMED? If Immed data is delivered in the event, no new Recv operation is needed. If the Consumer asks for immediate data in the payload, where in the payload will it be? If this is needed for a local match for a remote RDMA_Write to handle immediate data, let's state so. What happens on a mismatch between the local and remote op? That is, a recv was posted for Send and an RDMA_Write arrived? Vice versa?

6. I see extension for
[openib-general] [PATCH] uDAPL openib_cma disconnect processing fix
James,

Here is a patch to fix up the disconnect event processing and a change to dtest to validate. Tested with dtest and dapltest.

-arlin

Signed-off-by: Arlin Davis [EMAIL PROTECTED]

Index: test/dtest/dtest.c
===================================================================
--- test/dtest/dtest.c (revision 4759)
+++ test/dtest/dtest.c (working copy)
@@ -862,15 +862,31 @@ disconnect_ep()
 
     if (connected)
     {
-        LOGPRINTF("%d dat_ep_disconnect\n", getpid());
-        ret = dat_ep_disconnect( h_ep, DAT_CLOSE_DEFAULT );
-        if(ret != DAT_SUCCESS) {
-            fprintf(stderr, "%d Error dat_ep_disconnect: %s\n",
-                    getpid(),DT_RetToString(ret));
-        }
-        else {
+        /*
+         * Only the client needs to call disconnect. The server _should_ be able to
+         * just wait on the EVD associated with connection events for a disconnect
+         * request and exit then.
+         */
+        if ( !server ) {
+            LOGPRINTF("%d dat_ep_disconnect\n", getpid());
+            ret = dat_ep_disconnect( h_ep, DAT_CLOSE_DEFAULT );
+            if(ret != DAT_SUCCESS) {
+                fprintf(stderr, "%d Error dat_ep_disconnect: %s\n",
+                        getpid(),DT_RetToString(ret));
+            }
+            else {
             LOGPRINTF("%d dat_ep_disconnect completed\n", getpid());
+            }
         }
+
+        ret = dat_evd_wait( h_conn_evd, DAT_TIMEOUT_INFINITE, 1, &event, &nmore );
+        if(ret != DAT_SUCCESS) {
+            fprintf(stderr, "%d Error dat_evd_wait: %s\n",
+                    getpid(),DT_RetToString(ret));
+        }
+        else {
+            LOGPRINTF("%d dat_evd_wait for h_conn_evd completed\n", getpid());
+        }
     }
 
     /* destroy service point */
Index: dapl/openib_cma/dapl_ib_cm.c
===================================================================
--- dapl/openib_cma/dapl_ib_cm.c (revision 4759)
+++ dapl/openib_cma/dapl_ib_cm.c (working copy)
@@ -35,7 +35,7 @@
  *
  *   Description:
  *
- *   The uDAPL openib provider - connection management
+ *   The OpenIB uCMA provider - uCMA connection management
  *
  *   Source Control System Information
@@ -287,6 +287,12 @@ static int dapli_cm_active_cb(struct dap
         break;
     case RDMA_CM_EVENT_DISCONNECTED:
+        /* validate EP handle */
+        if (!DAPL_BAD_HANDLE(conn->ep, DAPL_MAGIC_EP))
+            dapl_evd_connection_callback(conn,
+                                         IB_CME_DISCONNECTED,
+                                         NULL,
+                                         conn->ep);
         break;
     default:
         dapl_dbg_log(
@@ -364,6 +370,13 @@ static int dapli_cm_passive_cb(struct da
         break;
     case RDMA_CM_EVENT_DISCONNECTED:
+        /* validate SP handle context */
+        if (!DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_PSP) ||
+            !DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_RSP))
+            dapls_cr_callback(conn,
+                              IB_CME_DISCONNECTED,
+                              NULL,
+                              conn->sp);
         break;
     default:
         dapl_dbg_log(DAPL_DBG_TYPE_ERR, "passive_cb: "
@@ -496,21 +509,10 @@ dapls_ib_disconnect(IN DAPL_EP *ep_ptr,
                  "disconnect: ID %p ret %d\n",
                  ep_ptr->cm_handle, ret);
 
-    /*
-     * uDAPL does NOT expect disconnect callback from provider
-     * with abrupt close. uDAPL will callback with DISC event when
-     * from provider returns. So, if callback is expected from
-     * rdma_cma then block and don't post the event during callback.
+    /*
+     * DAT event notification occurs from the callback
+     * Note: will fire even if DREQ goes unanswered on timeout
      */
-    if (close_flags != DAT_CLOSE_ABRUPT_FLAG)
-    {
-        if (ep_ptr->cr_ptr)
-            dapls_cr_callback(conn, IB_CME_DISCONNECTED, NULL,
-                              ((DAPL_CR *)ep_ptr->cr_ptr)->sp_ptr);
-        else
-            dapl_evd_connection_callback(conn, IB_CME_DISCONNECTED,
-                                         NULL, ep_ptr);
-    }
     return DAT_SUCCESS;
 }
 
@@ -537,11 +539,8 @@ dapls_ib_disconnect_clean(IN DAPL_EP *ep
                           IN DAT_BOOLEAN active,
                           IN const ib_cm_events_t ib_cm_event)
 {
-    /*
-     * Clean up outstanding connection state
-     */
-    dapls_ib_disconnect(ep_ptr, DAT_CLOSE_ABRUPT_FLAG);
-
+    /* nothing to do */
+    return;
 }
 
 /*
@@ -592,7 +591,11
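The passive-side check in the patch forwards the disconnect only when the service point handle carries either the PSP or the RSP magic. The following is a minimal, self-contained sketch of that kind of check. The struct layout, the magic values, and the names `sketch_handle`, `SKETCH_BAD_HANDLE`, and `sp_handle_is_valid` are assumptions made for illustration; they are not the DAPL definitions.

```c
#include <stddef.h>

/* Hypothetical magic values; the real DAPL_MAGIC_* constants are not shown here. */
enum sketch_magic {
    SKETCH_MAGIC_EP  = 0x1001,
    SKETCH_MAGIC_PSP = 0x1002,
    SKETCH_MAGIC_RSP = 0x1003
};

/* Hypothetical handle header: every object starts with its type magic. */
struct sketch_handle {
    unsigned int magic;
};

/* Non-zero when the handle is NULL or carries a different magic number. */
#define SKETCH_BAD_HANDLE(h, m) \
    ((h) == NULL || (h)->magic != (unsigned int)(m))

/* A service point is acceptable if it is a valid PSP or a valid RSP. */
int sp_handle_is_valid(const struct sketch_handle *sp)
{
    return !SKETCH_BAD_HANDLE(sp, SKETCH_MAGIC_PSP) ||
           !SKETCH_BAD_HANDLE(sp, SKETCH_MAGIC_RSP);
}

/* An endpoint is acceptable only with the EP magic. */
int ep_handle_is_valid(const struct sketch_handle *ep)
{
    return !SKETCH_BAD_HANDLE(ep, SKETCH_MAGIC_EP);
}
```

Under these assumptions, a handle that was already freed or that belongs to a different object type fails the check, so a late disconnect callback is dropped instead of being delivered to a stale object.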
Re: [openib-general] Re: [RFC][PATCH] OpenIB uDAPL extension proposal - sample immed data and atomic api's
James Lentini wrote:

On Fri, 23 Dec 2005, Arlin Davis wrote:

arlin> A single entry point is still there with this patch, I just
arlin> defined it a little different with a function definition for
arlin> better DAT API mappings. The idea was to replace the existing
arlin> pvoid extension definition with this new one. Can you give me
arlin> an idea of how you would map these extended DAT calls to this
arlin> pvoid function definition?

For uDAPL, the DAT_PROVIDER structure is defined as follows:

struct dat_provider
{
    const char *device_name;
    DAT_PVOID extension;
    ...

You could create a well known extensions API by defining a structure with several function pointers

struct dat_atomic_extensions
{
    DAT_RETURN (*cmp_and_swap_func)(
        IN DAT_EP_HANDLE ep_handle,
        IN DAT_UINT64 cmp_value,
        IN DAT_UINT64 swap_value,
        IN DAT_LMR_TRIPLE *local_iov,
        IN DAT_DTO_COOKIE user_cookie,
        IN DAT_RMR_TRIPLE *remote_iov,
        IN DAT_COMPLETION_FLAGS completion_flags);
    ...
}

and require the dat_provider's extensions member to point to your new extension struct. To make the API easier to use, you could also create macros, similar to the standard DAT macros, to reach inside an object's provider structure and call the correct extension function.

#define dat_ep_post_cmp_and_swap(ep, cmp, swap, local_iov, cookie, remote_iov, flags) \
    (*DAT_HANDLE_TO_EXTENSION (ep)->cmp_and_swap_func) (\
        (ep), \
        (cmp), \
        (swap), \
        (local_iov),\
        (cookie), \
        (remote_iov), \
        (flags))

A drawback to this approach is that adding new extensions requires synchronizing with the original extension specification document. To eliminate that issue, you could require that the dat_provider's extension member point to a typed list of these sorts of extension structures.

The other drawback is that the consumer calls directly into a table with no validation of provider extensions nor handles.
The method I am proposing uses the existing dat_api layer for handle validation, a provider extension validation during the open, and provider extension operation validation with the extension operation parameter in the new DAT_EXTENSION_FUNC typedef. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
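The contrast drawn in this message, a single validated extension entry point versus direct calls into a function table, can be sketched roughly as follows. This is an illustration under stated assumptions, not the API from the patch: every name here (`ext_endpoint`, `ext_provider`, `ext_dispatch`, `EXT_EP_MAGIC`, the `ext_op` values) is hypothetical, and the "atomic" operation simply acts on a local 64-bit variable.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_EP_MAGIC 0x45504D47u /* hypothetical endpoint handle magic */

typedef enum { EXT_OK = 0, EXT_INVALID_HANDLE, EXT_NOT_SUPPORTED } ext_status;
typedef enum { EXT_OP_CMP_AND_SWAP = 0, EXT_OP_FETCH_AND_ADD, EXT_OP_COUNT } ext_op;

struct ext_endpoint;

/* One entry point per provider; the operation is selected by 'op'. */
typedef ext_status (*ext_func)(struct ext_endpoint *ep, ext_op op,
                               uint64_t *target, uint64_t arg1, uint64_t arg2);

struct ext_provider {
    unsigned int supported_ops; /* bit i set when operation i is offered */
    ext_func extension_func;
};

struct ext_endpoint {
    unsigned int magic;
    const struct ext_provider *provider;
};

/* Common layer: validate the handle, the provider, and the operation
 * before anything reaches provider code. */
ext_status ext_dispatch(struct ext_endpoint *ep, ext_op op,
                        uint64_t *target, uint64_t arg1, uint64_t arg2)
{
    if (ep == NULL || ep->magic != EXT_EP_MAGIC)
        return EXT_INVALID_HANDLE;
    if (ep->provider == NULL || ep->provider->extension_func == NULL)
        return EXT_NOT_SUPPORTED;
    if ((unsigned int)op >= (unsigned int)EXT_OP_COUNT ||
        (ep->provider->supported_ops & (1u << (unsigned int)op)) == 0)
        return EXT_NOT_SUPPORTED;
    return ep->provider->extension_func(ep, op, target, arg1, arg2);
}

/* Sample provider that offers only compare-and-swap on a local value. */
ext_status sample_extension_func(struct ext_endpoint *ep, ext_op op,
                                 uint64_t *target, uint64_t cmp, uint64_t swap)
{
    (void)ep;
    if (op != EXT_OP_CMP_AND_SWAP || target == NULL)
        return EXT_NOT_SUPPORTED;
    if (*target == cmp)
        *target = swap;
    return EXT_OK;
}
```

With a function table the consumer would call `cmp_and_swap_func` directly, so a stale handle or an unimplemented slot would surface only inside provider code, or as a NULL call. In this sketch the same mistakes are rejected in `ext_dispatch` with a status code.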
Re: [openib-general] uDAPL disconnect events
Jimmy Hill wrote:

I'm running with the latest OpenIB Gen2 uDAPL code (CMA version) and have encountered a problem with disconnect events. The basic problem is that both sides have to call dat_ep_disconnect in order to break down a connection cleanly. It should be possible for just one side (i.e., client) to call disconnect and the other side wait for and see the disconnect event. That does not appear to be working. It does however work that way in the old reference implementation (as it should). I have code that depends on that functionality and as a result, can not move it to OpenIB Gen2 uDAPL yet. Is this a known problem?

Yes, this is a problem. The uDAPL event should be processed from the uCMA event callback. I will work on a fix.

Changing the flag to DAT_CLOSE_GRACEFUL_FLAG does not change the behavior. The attached copy of dtest.c is still using the default flag value. I have attached a modified version of the dtest test program which demonstrates the problem. The client will disconnect and exit cleanly. The server will hang waiting for the disconnect event.

thanks,
jimmy

Jimmy Hill [EMAIL PROTECTED]
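The behavior requested in this thread, where one side disconnects and the other side only waits for the event, can be modeled with a small in-memory sketch. It assumes a callback-driven design in which the connection manager callback, not the disconnect call itself, posts the event on each side. The names `sim_ep`, `sim_cm_callback`, `sim_disconnect`, and `sim_evd_poll` are hypothetical and do not correspond to uDAPL or uCMA functions.

```c
#include <stddef.h>

enum sim_event { SIM_EVENT_NONE = 0, SIM_EVENT_DISCONNECTED };

/* Hypothetical endpoint: a peer pointer plus a trivial event counter. */
struct sim_ep {
    struct sim_ep *peer;
    int connected;
    int pending_disconnect_events;
};

void sim_connect(struct sim_ep *a, struct sim_ep *b)
{
    a->peer = b;
    b->peer = a;
    a->connected = b->connected = 1;
    a->pending_disconnect_events = b->pending_disconnect_events = 0;
}

/* Stand-in for the CM event callback: the only place events are posted. */
void sim_cm_callback(struct sim_ep *ep)
{
    if (ep != NULL && ep->connected) {
        ep->connected = 0;
        ep->pending_disconnect_events++;
    }
}

/* Disconnect requested by one side; the "CM" then notifies both sides. */
int sim_disconnect(struct sim_ep *ep)
{
    struct sim_ep *peer;

    if (ep == NULL || !ep->connected)
        return -1;
    peer = ep->peer;
    sim_cm_callback(ep);
    sim_cm_callback(peer);
    return 0;
}

/* Non-blocking stand-in for waiting on the connection event queue. */
enum sim_event sim_evd_poll(struct sim_ep *ep)
{
    if (ep == NULL || ep->pending_disconnect_events == 0)
        return SIM_EVENT_NONE;
    ep->pending_disconnect_events--;
    return SIM_EVENT_DISCONNECTED;
}
```

In this model the server never calls `sim_disconnect`, yet it still sees exactly one disconnect event after the client disconnects, which is the behavior described above for the old reference implementation.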