[ewg] Not seeing any SDP performance changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
Jim,

Using netperf with TCP_STREAM and TCP_RR, I'm not seeing any changes in SDP throughput or CPU utilization comparing OFED 1.3 beta and OFED 1.2.5. Looks like I need to set a non-zero value in /sys/module/ib_sdp/sdp_zcopy_thresh? Do you plan to enable this by default soon?

I tried

  echo 4096 > /sys/module/ib_sdp/sdp_zcopy_thresh

on RHEL4 and then tried netperf, and got an Oops.

Unable to handle kernel NULL pointer dereference at RIP:
Nov/30 10:33 am 80163ff0{put_page+0}
Nov/30 10:33 am PML4 1a3047067 PGD 1a7a6d067 PMD 0
Nov/30 10:33 am Oops: [1] SMP
Nov/30 10:33 am CPU 0
Nov/30 10:33 am Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc rdma_ucm(U) rds(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) mlx4_ib(U) mlx4_core(U) ds yenta_socket pcmcia_core dm_mirror dm_multipath dm_mod joydev button battery ac uhci_hcd ehci_hcd shpchp ib_mthca(U) ib_ipoib(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) md5 ipv6 e1000 floppy ata_piix libata sg ext3 jbd mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Nov/30 10:33 am Pid: 6802, comm: netperf241 Not tainted 2.6.9-55.ELlargesmp
Nov/30 10:33 am RIP: 0010:[80163ff0] 80163ff0{put_page+0}
Nov/30 10:33 am RSP: 0018:0101a7bcbbc0 EFLAGS: 00010203
Nov/30 10:33 am RAX: RBX: 0001 RCX: 02 02
Nov/30 10:33 am RDX: 0101b0b43e80 RSI: 0202 RDI: 00 00
Nov/30 10:33 am RBP: 0101b85761c0 R08: R09: 00 00
Nov/30 10:33 am R10: 0246 R11: a02e0e36 R12: 0101a4b330 80
Nov/30 10:33 am R13: 0101a7bcbd58 R14: R15: 000100 00
Nov/30 10:33 am FS: 002a95696940() GS:80500380() knlGS:000 0
Nov/30 10:33 am CS: 0010 DS: ES: CR0: 8005003b
Nov/30 10:33 am CR2: CR3: 00101000 CR4: 06 e0
Nov/30 10:33 am Process netperf241 (pid: 6802, threadinfo 0101a7bca000, task 0101a70df030)
Nov/30 10:33 am Stack: a02e110a 0100 0 0529780
Nov/30 10:33 am 00010246 0246 8013feac 0 800ffe0
Nov/30 10:33 am 0101a7bcbe88
Nov/30 10:33 am Call Trace:a02e110a{:ib_sdp:sdp_sendmsg+724} fff f801478b2{queue_delayed_work+101}
Nov/30 10:33 am a02c6200{:ib_addr:queue_req+122} 802a 7ecb{sock_sendmsg+271}
Nov/30 10:33 am 80169a61{do_no_page+916} 801359a8{autoremove_wake_function+0}
Nov/30 10:33 am 802a7c53{sockfd_lookup+16} 802a939a{ sys_sendto+195}
Nov/30 10:33 am 801242b9{do_page_fault+577} 801934c8 {dnotify_parent+34}
Nov/30 10:33 am 80179335{vfs_read+248} 8011026a{system_call+126}
Nov/30 10:33 am
Nov/30 10:33 am Code: 8b 07 48 89 fa f6 c4 80 74 3b 48 8b 57 10 8b 02 48 89 d1 f6
Nov/30 10:33 am RIP 80163ff0{put_page+0} RSP 0101a7bcbbc0
Nov/30 10:33 am CR2:
Nov/30 10:33 am 0Kernel panic - not syncing: Oops

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: [ofa-general] Not seeing any SDP performance changes in OFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
Scott Weitzenkamp (sweitzen) wrote:

Using netperf with TCP_STREAM and TCP_RR, I'm not seeing any changes in SDP throughput or CPU utilization comparing OFED 1.3 beta and OFED 1.2.5. Looks like I need to set a non-zero value in /sys/module/ib_sdp/sdp_zcopy_thresh? Do you plan to enable this by default soon?

I know there wasn't universal agreement as to the need, but there _are_ native SDP tests in netperf2 now, which I think would be better to use where possible since they will have correct headers, lest someone look at a cut and paste ages from now and mistakenly think it was actually TCP rather than SDP.

happy benchmarking,

rick jones

unless of course one wants to test the LD_PRELOAD mechanism...
[ewg] Re: [ofa-general] how to use Intel MPI with dapl2?
Scott Weitzenkamp (sweitzen) wrote:

How did you configure your servers to run Intel MPI with v2 libraries?

I only installed the DAPL 2.0 libs. Did you happen to see the following message (I_MPI_DEBUG=50) before failover to sockets?

I_MPI: [0] I_MPI_dat_ia_openv_wrap(): DAPL version compatibility requirement check failed; required DAPL 1.2, provided DAPL 2.0

-arlin
[ewg] [PATCH 2/5] nes: provider listener cleanup
If an error occurs during the provider listen call, the reference count can be off. This will prevent the listener from being destroyed properly. This is fixed by correcting the reference counts when a problem is detected.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 3005cb1..933f31c 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -668,7 +668,7 @@ int send_syn(struct nes_cm_node *cm_node, u32 sendack)
 	options = (union all_known_options *)&optionsbuffer[optionssize];
 	options->as_windowscale.optionnum = OPTION_NUMBER_WINDOW_SCALE;
 	options->as_windowscale.length = sizeof(struct option_windowscale);
-	options->as_windowscale.shiftcount = cm_node->tcp_cntxt.snd_wscale;
+	options->as_windowscale.shiftcount = cm_node->tcp_cntxt.rcv_wscale;
 	optionssize += sizeof(struct option_windowscale);
 	if (sendack && !(NES_DRV_OPT_SUPRESS_OPTION_BC & nes_drv_opt)
@@ -1387,15 +1387,12 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 		case NES_CM_STATE_CLOSED:
 			break;
 		case NES_CM_STATE_LISTENING:
-			if (!(tcph->syn)) {
-				nes_debug(NES_DBG_CM, "Received an ack without a SYN on a listening port\n");
-				send_reset(cm_node);
-				/* send_reset bumps refcount, this should have been a new node */
-				rem_ref_cm_node(cm_core, cm_node);
-				return -1;
-			} else {
-				nes_debug(NES_DBG_CM, "Received an ack on a listening port (syn-ack maybe?)\n");
-			}
+			nes_debug(NES_DBG_CM, "Received an ACK on a listening port (SYN %d)\n", tcph->syn);
+			cm_node->tcp_cntxt.loc_seq_num = ntohl(tcph->ack_seq);
+			send_reset(cm_node);
+			/* send_reset bumps refcount, this should have been a new node */
+			rem_ref_cm_node(cm_core, cm_node);
+			return -1;
 			break;
 		case NES_CM_STATE_TSA:
 			nes_debug(NES_DBG_CM, "Received a packet with the ack bit set while in TSA state\n");
@@ -1832,6 +1829,10 @@ int mini_cm_recv_pkt(struct nes_cm_core *cm_core, struct nes_vnic *nesvnic,
 		cm_node = make_cm_node(cm_core, nesvnic, &nfo, listener);
 		if (!cm_node) {
 			nes_debug(NES_DBG_CM, "Unable to allocate node\n");
+			if (listener) {
+				nes_debug(NES_DBG_CM, "unable to allocate node and decrementing listener refcount\n");
+				atomic_dec(&listener->ref_count);
+			}
 			ret = -1;
 			goto out;
 		}
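The last hunk of the patch above follows a common pattern: a reference taken on behalf of an object that then fails to be created must be handed back on the error path. A minimal user-space sketch of that pattern follows. The `listener` and `node` types and all `listener_*`/`node_*` helpers are hypothetical stand-ins, not the nes driver's structures, and C11 atomics are used in place of the kernel's `atomic_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's listener and connection node. */
struct listener {
	atomic_int ref_count;
};

struct node {
	struct listener *listener;
};

/* Take one reference on the listener. */
static void listener_get(struct listener *l)
{
	atomic_fetch_add(&l->ref_count, 1);
}

/* Drop one reference on the listener. */
static void listener_put(struct listener *l)
{
	int old = atomic_fetch_sub(&l->ref_count, 1);

	assert(old > 0);	/* dropping below zero would be a bug */
	(void)old;
}

/*
 * Create a node bound to a listener. A reference is taken first; if the
 * allocation fails, the error path gives it back so the listener can
 * still be destroyed later. `fail_alloc` only simulates an allocation
 * failure for the purpose of this sketch.
 */
static struct node *node_create(struct listener *l, int fail_alloc)
{
	struct node *n;

	listener_get(l);
	n = fail_alloc ? NULL : malloc(sizeof(*n));
	if (!n) {
		listener_put(l);	/* undo the reference on failure */
		return NULL;
	}
	n->listener = l;
	return n;
}

/* Destroy a node and release the reference it held. */
static void node_destroy(struct node *n)
{
	listener_put(n->listener);
	free(n);
}
```

With a listener whose count starts at 1, a failed `node_create()` leaves the count at 1, while a successful one raises it to 2 until `node_destroy()` is called.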
[ewg] [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0 firmware support
Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:

  git://git.openfabrics.org/~swise/ofed-1.2.5 stevo

These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.2.5 (stay tuned for ofed-1.3 patches soon). The patches are all accepted upstream and were posted here:

  http://www.spinics.net/lists/netdev/msg47492.html

and here:

  http://www.spinics.net/lists/netdev/msg48240.html

Also, please pull version 1.1.0 of libcxgb3 from:

  git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5

The library and drivers need to be included together, as both are needed to support the Chelsio 5.0 firmware.

Also: after you integrate these, can you crank a daily OFED-1.2.5.3 build including all this?

Thanks,

Steve.
[ewg] RE: [PATCH] IB/sdp Fix a kernel panic in put_page() that was passing NULL
The question is, how did that page get unset? My understanding is that the get_user_pages() call in sdp_bz_setup() should have incremented the count for each page in the range. Since there is supposed to be only one thread doing bzcopy at a time (to preserve order), the sdp_bz_cleanup() ought to be able to call without checking. Simple netperf testing uses a single send thread so it should just work.

The problem that I am chasing now is that while a thread is sleeping waiting for credits, the socket is not locked. Another thread hops in and tries to do a zero copy. Since there is only one active context, and it is associated with the socket structure, it will be trampled by the second thread. Thread 2 blocks waiting for credit, thread 1 wakes up and decrements thread 2's page and bad things follow.

I've got that fixed, but there are some cleanup issues that are not quite working.

Thoughts?

Thanks,
Jim

Jim Mott
Mellanox Technologies Ltd.
mail: [EMAIL PROTECTED]
Phone: 512-294-5481

-----Original Message-----
From: Ralph Campbell [mailto:[EMAIL PROTECTED]
Sent: Friday, November 30, 2007 5:07 PM
To: Jim Mott
Cc: EWG
Subject: [PATCH] IB/sdp Fix a kernel panic in put_page() that was passing NULL

The new bzcopy_state() was trying to free unset bz->pages[i] entries.

Signed-off-by: Dave Olson [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/sdp/sdp_main.c b/drivers/infiniband/ulp/sdp/sdp_main.c
index 809f7b8..35c4dd3 100644
--- a/drivers/infiniband/ulp/sdp/sdp_main.c
+++ b/drivers/infiniband/ulp/sdp/sdp_main.c
@@ -1212,7 +1212,8 @@ static inline struct bzcopy_state *sdp_bz_cleanup(struct bzcopy_state *bz)
 	if (bz->pages) {
 		for (i = bz->cur_page; i < bz->page_cnt; i++)
-			put_page(bz->pages[i]);
+			if (bz->pages[i])
+				put_page(bz->pages[i]);
 		kfree(bz->pages);
 	}
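The guard added by the quoted patch can be illustrated outside the kernel. The following is only a sketch under stated assumptions: `page_stub`, `bz_state`, `put_page_stub()` and `bz_cleanup()` are hypothetical user-space stand-ins for `struct page`, `struct bzcopy_state`, `put_page()` and `sdp_bz_cleanup()`, and the return value (a count of released pages) exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pinned page with a reference count. */
struct page_stub {
	int refs;
};

/* Hypothetical stand-in for the per-send zero-copy state. */
struct bz_state {
	struct page_stub **pages;	/* may contain NULL (never pinned) slots */
	size_t cur_page;		/* first page not yet released */
	size_t page_cnt;		/* number of slots in pages[] */
};

/* Drop one reference; like put_page(), this must not be given NULL. */
static void put_page_stub(struct page_stub *p)
{
	p->refs--;
}

/*
 * Release every remaining pinned page, skipping slots that were never
 * filled in, then free the array. Returns how many pages were released.
 */
static size_t bz_cleanup(struct bz_state *bz)
{
	size_t released = 0;

	if (bz->pages) {
		for (size_t i = bz->cur_page; i < bz->page_cnt; i++) {
			if (bz->pages[i]) {	/* the guard added by the patch */
				put_page_stub(bz->pages[i]);
				released++;
			}
		}
		free(bz->pages);
		bz->pages = NULL;	/* makes a second cleanup a no-op */
	}
	return released;
}
```

Note that this guard only avoids the NULL dereference. It does not address the race described above, where a second thread overwrites the single per-socket context while the first thread sleeps waiting for credits.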
[ewg] [PATCH 1/5] nes: accelerated loopback support
This patch allows accelerated loopback connections to be made through the driver. Prior to this patch, iWarp accelerated loopback requests were not handled.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c
index 4f7ae5c..a5e0bb5 100644
--- a/drivers/infiniband/hw/nes/nes.c
+++ b/drivers/infiniband/hw/nes/nes.c
@@ -175,6 +175,8 @@ static int nes_inetaddr_event(struct notifier_block *notifier,
 				nes_write_indexed(nesdev,
 						NES_IDX_DST_IP_ADDR+(0x10*PCI_FUNC(nesdev->pcidev->devfn)), 0);
+				nes_manage_arp_cache(netdev, netdev->dev_addr,
+						ntohl(nesvnic->local_ipaddr), NES_ARP_DELETE);
 				nesvnic->local_ipaddr = 0;
 				return NOTIFY_OK;
 				break;
@@ -191,6 +193,8 @@ static int nes_inetaddr_event(struct notifier_block *notifier,
 				nes_write_indexed(nesdev,
 						NES_IDX_DST_IP_ADDR+(0x10*PCI_FUNC(nesdev->pcidev->devfn)),
 						ntohl(ifa->ifa_address));
+				nes_manage_arp_cache(netdev, netdev->dev_addr,
+						ntohl(nesvnic->local_ipaddr), NES_ARP_ADD);
 				return NOTIFY_OK;
 				break;
 			default:
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 4023a2c..3005cb1 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -1616,6 +1616,7 @@ struct nes_cm_node * mini_cm_connect(struct nes_cm_core *cm_core,
 	struct nes_cm_node *cm_node;
 	struct nes_cm_listener *loopbackremotelistener;
 	struct nes_cm_node *loopbackremotenode;
+	struct nes_cm_info loopback_cm_info;
 	u16 mpa_frame_size = sizeof(struct ietf_mpa_frame) +
 			ntohs(mpa_frame->priv_data_len);
@@ -1632,6 +1633,7 @@ struct nes_cm_node * mini_cm_connect(struct nes_cm_core *cm_core,
 	// set our node side to client (active) side
 	cm_node->tcp_cntxt.client = 1;
+	cm_node->tcp_cntxt.rcv_wscale = NES_CM_DEFAULT_RCV_WND_SCALE;
 	if (cm_info->loc_addr == cm_info->rem_addr) {
 		loopbackremotelistener = find_listener(cm_core, cm_node->rem_addr,
@@ -1639,13 +1641,14 @@ struct nes_cm_node * mini_cm_connect(struct nes_cm_core *cm_core,
 		if (loopbackremotelistener == NULL) {
 			create_event(cm_node, NES_CM_EVENT_ABORTED);
 		} else {
-			u16 temp;
-			temp = cm_info->loc_port;
-			cm_info->loc_port = cm_info->rem_port;
-			cm_info->rem_port = temp;
-			loopbackremotenode = make_cm_node(cm_core, nesvnic, cm_info,
+			loopback_cm_info = *cm_info;
+			loopback_cm_info.loc_port = cm_info->rem_port;
+			loopback_cm_info.rem_port = cm_info->loc_port;
+			loopback_cm_info.cm_id = loopbackremotelistener->cm_id;
+			loopbackremotenode = make_cm_node(cm_core, nesvnic, &loopback_cm_info,
 					loopbackremotelistener);
 			loopbackremotenode->loopbackpartner = cm_node;
+			loopbackremotenode->tcp_cntxt.rcv_wscale = NES_CM_DEFAULT_RCV_WND_SCALE;
 			cm_node->loopbackpartner = loopbackremotenode;
 			memcpy(loopbackremotenode->mpa_frame_buf, mpa_frame->priv_data,
 					mpa_frame_size);
@@ -1654,6 +1657,14 @@ struct nes_cm_node * mini_cm_connect(struct nes_cm_core *cm_core,
 			// we are done handling this state, set node to a TSA state
 			cm_node->state = NES_CM_STATE_TSA;
+			cm_node->tcp_cntxt.rcv_nxt = loopbackremotenode->tcp_cntxt.loc_seq_num;
+			loopbackremotenode->tcp_cntxt.rcv_nxt = cm_node->tcp_cntxt.loc_seq_num;
+			cm_node->tcp_cntxt.max_snd_wnd = loopbackremotenode->tcp_cntxt.rcv_wnd;
+			loopbackremotenode->tcp_cntxt.max_snd_wnd = cm_node->tcp_cntxt.rcv_wnd;
+			cm_node->tcp_cntxt.snd_wnd = loopbackremotenode->tcp_cntxt.rcv_wnd;
+			loopbackremotenode->tcp_cntxt.snd_wnd = cm_node->tcp_cntxt.rcv_wnd;
+			cm_node->tcp_cntxt.snd_wscale = loopbackremotenode->tcp_cntxt.rcv_wscale;
+			loopbackremotenode->tcp_cntxt.snd_wscale =
RE: [ewg] RE: [ofa-general] [PATCH 3/5] nes: fix link reset for certain phy types
+
+while (((nes_read32(nesdev->regs+NES_SOFTWARE_RESET)
+		& 0x0040) != 0x0040) && (i++ < 5000)) {
+}

Is there a better way to wait for the read?

Typically, the reset is pretty quick and is complete within a few loops. The i++ counter is there to prevent a driver hang in case the reset fails for some reason.

Thanks,
Glenn.

- Sean
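One way to make the bounded wait discussed above easier to reason about is to wrap it in a helper that reports whether the bit was actually seen, so the caller can tell a completed reset from a timeout. This is only a user-space sketch: `poll_until_set()`, `read_reg_fn`, `fake_reg` and `fake_read()` are hypothetical names, and the register read is abstracted behind a callback rather than calling `nes_read32()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Callback that reads the register being polled (hypothetical). */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll a register until all bits in `mask` are set, giving up after
 * `max_tries` reads. Returns true if the bits were seen, false on
 * timeout, so the caller can log or handle a failed reset.
 */
static bool poll_until_set(read_reg_fn read_reg, void *ctx,
			   uint32_t mask, unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		if ((read_reg(ctx) & mask) == mask)
			return true;
		/* Kernel code would typically pause briefly here, for
		 * example with udelay(), instead of spinning flat out. */
	}
	return false;
}

/* Fake register for illustration: the bit appears after `ready_after` reads. */
struct fake_reg {
	unsigned int reads;
	unsigned int ready_after;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	r->reads++;
	return r->reads > r->ready_after ? 0x40u : 0u;
}
```

Called as `poll_until_set(fake_read, &reg, 0x40u, 5000)`, the helper returns as soon as the bit appears and otherwise stops after 5000 reads. The mask value here is illustrative only.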